CN111741531B - Optimization method for optimal operation state of communication equipment under 5G base station - Google Patents

Optimization method for optimal operation state of communication equipment under 5G base station

Info

Publication number
CN111741531B
Authority
CN
China
Prior art keywords
state
time
real
communication equipment
optimization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010863911.2A
Other languages
Chinese (zh)
Other versions
CN111741531A (en)
Inventor
李传煌
倪郑威
李军
毛建洋
梁刚
陈青松
诸葛斌
鲁佳
陈超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Sunwave Communications Co Ltd
Original Assignee
Zhejiang Gongshang University
Sunwave Communications Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University and Sunwave Communications Co Ltd
Publication of CN111741531A
Application granted
Publication of CN111741531B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/04 Wireless resource allocation
    • H04W72/044 Wireless resource allocation based on the type of the allocated resource
    • H04W72/0473 Wireless resource allocation based on the type of the allocated resource the resource being transmission power
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W52/00 Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W52/04 TPC
    • H04W52/18 TPC being performed according to specific parameters
    • H04W52/20 TPC being performed according to specific parameters using error rate
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W52/00 Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W52/04 TPC
    • H04W52/18 TPC being performed according to specific parameters
    • H04W52/26 TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service]
    • H04W52/265 TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service] taking into account the quality of service QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W52/00 Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W52/04 TPC
    • H04W52/18 TPC being performed according to specific parameters
    • H04W52/26 TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service]
    • H04W52/267 TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service] taking into account the information rate
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/50 Allocation or scheduling criteria for wireless resources
    • H04W72/54 Allocation or scheduling criteria for wireless resources based on quality criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/50 Allocation or scheduling criteria for wireless resources
    • H04W72/54 Allocation or scheduling criteria for wireless resources based on quality criteria
    • H04W72/542 Allocation or scheduling criteria for wireless resources based on quality criteria using measured or perceived quality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/50 Allocation or scheduling criteria for wireless resources
    • H04W72/54 Allocation or scheduling criteria for wireless resources based on quality criteria
    • H04W72/543 Allocation or scheduling criteria for wireless resources based on quality criteria based on requested quality, e.g. QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/04 Wireless resource allocation
    • H04W72/044 Wireless resource allocation based on the type of the allocated resource
    • H04W72/0446 Resources in time domain, e.g. slots or frames
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/04 Wireless resource allocation
    • H04W72/044 Wireless resource allocation based on the type of the allocated resource
    • H04W72/046 Wireless resource allocation based on the type of the allocated resource the resource being in the space domain, e.g. beams

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a method for optimizing the operating state of communication equipment under a 5G base station. The communication equipment under the 5G base station is divided into equipment whose state is updated in non-real time and equipment whose state is updated in real time. For equipment with a non-real-time update state, the power supply quantity and the uncontrollable parameter set of every time slot are known; an optimization problem is constructed and solved, and the operating state of the equipment is optimized according to the solution. For equipment with a real-time update state, only real-time information is available, that is, the power supply quantity and the uncontrollable parameters of a time slot become known only when that slot arrives; a Markov decision process is constructed, its optimal strategy is solved, and the operating state of the equipment is optimized according to that strategy. Because the invention covers both the real-time and the non-real-time update state, the overall performance of the communication equipment over its whole operation can be optimized.

Description

Optimization method for optimal operation state of communication equipment under 5G base station
Technical Field
The invention relates to the field of network communication, in particular to an optimization method for the optimal running state of communication equipment under a 5G base station.
Background
Conventional communication mechanisms usually assume that a device has enough energy to perform the required operations and do not take the device's energy into account. When the energy storage capacity of the device is small and its power supply is uncertain, such mechanisms increase the risk that the device runs out of power.
To solve this problem, the application studies the design of an energy-harvesting-aware communication mechanism, in which the communication process and the related parameters are adjusted dynamically as the stored and supplied energy of the equipment changes. A suitable modeling approach and analysis method is chosen according to the power supply characteristics, the energy storage characteristics and the data generation characteristics of the equipment. For equipment whose power supply is controllable (for example, equipment powered by a dedicated energy source), whose stored energy is known and whose generated data volume is predictable, the communication process is simulated from the known information under the online condition, a suitable mathematical model is built from the characteristics of the service scenario and the relevant performance indexes, and the communication process is optimized with mathematical tools. For equipment whose information is uncontrollable and unpredictable, the operation steps and the energy management of the equipment are optimized in real time during communication, using models such as a Markov decision process and algorithms such as dynamic programming.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an optimization method of the optimal operation state of communication equipment under a 5G base station.
The technical scheme adopted by the application is as follows: the invention provides an optimization method of an optimal operation state of communication equipment under a 5G base station, wherein the communication equipment under the 5G base station is divided into communication equipment in a non-real-time updating state and communication equipment in a real-time updating state;
for communication devices that update status in non-real time, the power supply quantity [formula image 001] and the uncontrollable parameter set [formula image 002] of each time slot i are known; an optimization problem is constructed and solved, with the following specific steps:
a) determining an optimization objective: the equipment performance index under the 5G base station is quantized as [formula image 003], where [formula image 004] is the controllable parameter set;
b) determining power demand guarantee constraints: the electric quantity contained in the equipment at the beginning of each time slot can guarantee the electric power requirement of the time slot;
c) modeling an optimization problem: the optimal performance index is obtained under the power demand guarantee constraints, and the problem is modeled as follows:
[formula image 005]
where N is the total number of time slots, B_1 is the amount of power the device contains at the beginning of the 1st time slot, and [formula image 006] is the energy consumed to perform the operation when the controllable parameter set [formula image 004] is selected under the uncontrollable parameter set [formula image 007];
d) solving the optimization problem in the step c), and realizing the operation state optimization of the communication equipment under the 5G base station according to the solving result;
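As a hedged sketch of steps a) to c), the model can be written as follows. The symbols x_i (controllable parameter set), y_i (uncontrollable parameter set), E_i (power supplied in slot i), f (performance index) and g (energy consumed) are assumed notation chosen for illustration; only N and B_1 appear in the text. The sketch further assumes that the objective is the sum of the per-slot performance and that power supplied during slot i becomes usable from slot i+1.

```latex
\begin{aligned}
\max_{x_1,\dots,x_N}\quad & \sum_{i=1}^{N} f(x_i, y_i) \\
\text{s.t.}\quad & g(x_i, y_i) \;\le\; B_1 + \sum_{j=1}^{i-1}\bigl(E_j - g(x_j, y_j)\bigr),
\qquad i = 1,\dots,N
\end{aligned}
```

Under these assumptions the right-hand side of each constraint is the power held at the beginning of slot i, so the N constraints state that every slot can be served from the power available when it starts.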
for communication equipment with a real-time update state, the equipment has only real-time information, that is, the power supply quantity and the uncontrollable parameter set of a time slot can be obtained only when that time slot arrives; the running state is optimized by constructing a Markov decision process and solving its optimal strategy, with the following specific steps:
A) determining state space, action space and reward: in the Markov decision process, if the state of the communication equipment in the real-time update state consists of the power supply quantity [formula image 008], the battery energy storage [formula image 009] and the uncontrollable parameter set [formula image 010], and the action taken is to select the controllable parameter set [formula image 011] for that equipment, then the reward is the equipment performance index of interest;
B) determining decision rules and policies: the current state-action history is [formula image 012], where t denotes the t-th time slot; under the decision rule [formula image 013], the action is determined by the current state-action history; the strategy is expressed as [formula image 014];
C) determining an optimization target and modeling the problem: the quality of a strategy [formula image 015] is evaluated by the expected sum of rewards rather than by the sum of rewards itself; when the initial state is [formula image 016], the expected sum of rewards from the 1st slot to the N-th slot, [formula image 017], is as follows:
[formula image 018]
where [formula image 019] is the reward of the t-th slot and [formula image 020] is the expectation under the strategy [formula image 015]; [formula image 021] and [formula image 022] are elements of the state random sequence and of the action random sequence, respectively; the ultimate goal is to find the optimal strategy [formula image 023] such that
[formula image 024]
where [formula image 025] represents the set of all possible policies and [formula image 026] is the state space;
D) solving the optimization target of step C) to obtain the optimal strategy, and optimizing the operation state of the communication equipment under the 5G base station according to the optimal strategy.
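For orientation, the objective of step C) can be sketched in standard finite-horizon Markov decision process notation. The symbols below (policy π, policy set Π, state space S, random state S_t, random action A_t, reward r_t, value v_N^π) are assumed notation for illustration and are not taken from the formula images.

```latex
v_N^{\pi}(s_1) \;=\; \mathbb{E}^{\pi}\!\left[\,\sum_{t=1}^{N} r_t(S_t, A_t) \;\middle|\; S_1 = s_1\right],
\qquad
v_N^{\pi^{*}}(s) \;\ge\; v_N^{\pi}(s)
\quad \text{for all } \pi \in \Pi,\; s \in \mathcal{S}.
```

The first expression is the expected sum of rewards over the N slots when starting from s_1 and following π; the second states that an optimal strategy π* is at least as good as any other strategy from every initial state.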
Further, the controllable parameter sets of the communication device with the non-real-time update state and the communication device with the real-time update state comprise the transmission power and the coding rate.
Further, the uncontrollable parameter sets of the communication device with the non-real-time updating state and the communication device with the real-time updating state comprise channel conditions, generated data amount, allocated time resources and space resources.
Further, the device performance indexes under the 5G base station include an error rate, throughput and quality of service (QoS).
Further, the step b) is specifically as follows: suppose the amount of power the device contains at the beginning of time slot i is [formula image 027]; then
[formula image 028]
where [formula image 029] is the energy consumed to perform the operation when the controllable parameter set [formula image 004] is selected under the uncontrollable parameter set [formula image 007]; to guarantee the power demand of the equipment, the following energy constraints must hold:
[formula image 030]
where N is the total number of time slots, corresponding to N energy constraints; that is, for any time slot, the power the device holds guarantees the power requirement of that time slot.
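The energy constraints of step b) can be checked mechanically. The Python sketch below is a minimal illustration under assumptions that are not in the source: the battery is updated as B_{i+1} = B_i + E_i - g_i, power supplied during a slot is usable only from the next slot, and the function names `battery_levels` and `is_feasible` are hypothetical.

```python
from typing import List, Sequence


def battery_levels(b1: int, supply: Sequence[int], consumption: Sequence[int]) -> List[int]:
    """Return B_1 .. B_{N+1} under the assumed update B_{i+1} = B_i + E_i - g_i."""
    levels = [b1]
    for e, g in zip(supply, consumption):
        levels.append(levels[-1] + e - g)
    return levels


def is_feasible(b1: int, supply: Sequence[int], consumption: Sequence[int]) -> bool:
    """Check the N energy constraints: the consumption of slot i must not exceed B_i."""
    levels = battery_levels(b1, supply, consumption)
    # zip stops at the shorter sequence, so B_{N+1} is not compared with anything.
    return all(g <= b for g, b in zip(consumption, levels))


# Example: three slots, initial battery 5, one unit supplied per slot.
plan_ok = is_feasible(5, [1, 1, 1], [2, 3, 2])
plan_bad = is_feasible(1, [0, 0], [1, 1])
```

In the first plan the battery levels are 5, 4, 2, 1 and every slot can be served; in the second the battery is empty at the start of slot 2, so the plan violates the constraint.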
Further, in the step d), if the optimization problem is a convex optimization problem, it is solved with a standard method for convex optimization problems; if it is not a convex optimization problem, a standard method for convex optimization problems is combined with a genetic algorithm to reduce the chance of converging to a suboptimal solution.
Further, the standard method for convex optimization problems is Newton's method or the interior point method.
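As a small illustration of Newton's method on a convex problem, the Python sketch below minimizes a one-dimensional toy objective. The objective h(p) = p^2 - 4 log(1 + p) is invented for this example (a made-up energy cost minus a rate-like gain in the transmit power p) and does not come from the patent; the function name `newton_minimize` is likewise hypothetical.

```python
def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Minimize a smooth, strictly convex 1-D function with Newton's method.

    grad and hess are the first and second derivative of the objective.
    """
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)  # Newton step: gradient scaled by inverse curvature
        x -= step
        if abs(step) < tol:
            break
    return x


# Toy objective h(p) = p**2 - 4*log(1 + p) for p > -1 (illustrative only).
def grad(p):
    return 2.0 * p - 4.0 / (1.0 + p)


def hess(p):
    return 2.0 + 4.0 / (1.0 + p) ** 2


p_star = newton_minimize(grad, hess, x0=0.5)
```

Setting the derivative to zero gives p^2 + p - 2 = 0, so the minimizer for p > -1 is p = 1, which the iteration reaches from the starting point 0.5. A real instance of the patent's problem is multi-dimensional and constrained, where an interior point method would be the more natural choice.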
Further, in the step A), the value set of the power supply quantity of the device is denoted [formula image 031], the value set of the battery energy storage of the device is denoted [formula image 032], and the value set of the uncontrollable parameter set is denoted [formula image 033]; the state space is represented as [formula image 034]; a state [formula image 035] is an element of the state space, [formula image 036]; the value set of the controllable parameter set, [formula image 037], is the action space; any controllable parameter set is an element of [formula image 037] and is called an action; selecting a controllable parameter set is selecting an action in the Markov decision process.
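The construction of the state space and the action space can be sketched in Python. The sketch assumes that the state space is the Cartesian product of the three value sets, which the text suggests but does not spell out, and all numeric values and names below are hypothetical.

```python
from itertools import product

# Hypothetical, coarsely discretized value sets (illustrative numbers only).
supply_values = (0, 1)             # possible power supplied in a slot
battery_values = (0, 1, 2)         # possible battery energy storage
channel_values = ("bad", "good")   # a single uncontrollable parameter

# Assumed state space: Cartesian product of the three value sets.
state_space = list(product(supply_values, battery_values, channel_values))

# Action space: value set of the controllable parameter set,
# here pairs of (transmission power, coding rate).
action_space = list(product((1, 2), (0.5, 0.75)))

# A state is one element of the state space; an action is one element of the action space.
example_state = (1, 2, "good")
example_action = (2, 0.5)
```

With these value sets there are 2 x 3 x 2 = 12 states and 2 x 2 = 4 actions.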
Further, in the step D), the optimal strategy of the Markov decision process is obtained by dynamic programming, value iteration, policy iteration or linear programming.
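Dynamic programming for a finite number of time slots amounts to backward induction. The Python sketch below applies it to a deliberately simplified model in which the state is only the battery level, the action is the number of energy units spent in the slot, and the harvested energy is independent across slots. The reward table, the harvest probabilities and the function name `backward_induction` are assumptions for illustration; the patent's state additionally contains the power supply quantity and the uncontrollable parameter set.

```python
def backward_induction(n_slots, b_max, rewards, harvest_probs):
    """Finite-horizon dynamic programming on a toy battery model.

    rewards[a]       reward for spending a energy units in a slot
    harvest_probs[e] probability that e units are harvested during a slot
    Returns (value, policy): value[b] is the optimal expected reward sum from
    slot 1 with battery b; policy[t][b] is the optimal action in slot t+1.
    """
    value = [0.0] * (b_max + 1)  # value after the last slot is zero
    policy = []
    for _ in range(n_slots):     # iterate from the last slot back to the first
        new_value, decision = [], []
        for b in range(b_max + 1):
            best_a, best_q = 0, float("-inf")
            for a in range(min(b, len(rewards) - 1) + 1):  # cannot spend more than b
                expected = sum(
                    p * value[min(b_max, b - a + e)]
                    for e, p in enumerate(harvest_probs)
                )
                q = rewards[a] + expected
                if q > best_q:
                    best_a, best_q = a, q
            new_value.append(best_q)
            decision.append(best_a)
        value = new_value
        policy.insert(0, decision)
    return value, policy


value, policy = backward_induction(
    n_slots=2, b_max=2, rewards=[0.0, 1.0, 1.5], harvest_probs=[0.5, 0.5]
)
```

For two slots and a full battery of 2 units, the optimal expected reward sum is 2.25, obtained by spending 1 unit in the first slot instead of the 2 units that would maximize the immediate reward.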
The invention has the following beneficial effects: different communication equipment has to attend to different performance indexes in different operating environments, and conventional optimization schemes can hardly handle the various performance indexes in a unified, general way; the same communication equipment also has two different states during operation, real-time update and non-real-time update, and conventional schemes do not handle performance optimization in the real-time update state well. The invention quantifies the device performance index of interest as [formula image 038], which makes it applicable to optimizing performance indexes under different conditions; and because it covers both the real-time and the non-real-time update state, the overall performance of the communication equipment over its whole operation can be optimized.
Drawings
Fig. 1 is a flowchart of an optimization method for an optimal operating state of a communication device under a 5G base station according to the present invention.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
As shown in fig. 1, in the method for optimizing the optimal operating state of the communication device in the 5G base station, the communication device in the 5G base station is divided into two cases, namely a non-real-time update state communication device and a real-time update state communication device; the controllable parameter group of the communication device in the non-real-time updating state and the communication device in the real-time updating state comprises transmission power and coding rate. The uncontrollable parameter groups of the communication equipment in the non-real-time updating state and the communication equipment in the real-time updating state are external conditions of the communication equipment, and comprise channel conditions, generated data volume, allocated time resources and space resources.
For communication devices that update status in non-real time, the power supply quantity [formula image 039] and the uncontrollable parameter set [formula image 040] of each time slot i are known. On this basis, an optimization problem is first constructed and then solved: a) determining an optimization objective; b) describing the power demand guarantee constraints; c) modeling the optimization problem; d) solving the optimization problem. The specific steps are as follows:
a) determining an optimization objective;
the device performance index of interest under the 5G base station is quantized as [formula image 041]; the goal of the optimization is to select the most suitable [formula image 042] so that [formula image 041] is maximized. The specific form and nature of [formula image 043] depend on the device performance index of interest, such as bit error rate, throughput and quality of service (QoS). Here [formula image 040] is the uncontrollable parameter set, determined by the external environment or the service scenario, and [formula image 042] is the controllable parameter set; the performance [formula image 044] is determined jointly by [formula image 042] and [formula image 040], and [formula image 042] is adjusted according to [formula image 040] so that [formula image 043] is maximized. In the embodiment of the invention, the time for sending one frame is taken as the length of a time slot, the number of information bits contained in the frame is [formula image 045], and the frame error rate is [formula image 046]. If the performance index of interest is the average number of correctly decoded information bits in a slot, then
[formula image 047]
where [formula image 045] depends only on controllable parameters (coding rate, bandwidth, etc.), while the frame error rate depends on both controllable parameters (transmission power, etc.) and uncontrollable parameters (channel fading, etc.). That is, [formula image 045] is determined by [formula image 042], and [formula image 046] is determined jointly by [formula image 042] and [formula image 040].
b) describing the power demand guarantee constraints;
suppose the amount of power the device contains at the beginning of time slot i is [formula image 048]; then
[formula image 049]
where [formula image 050] is the energy consumed to perform the operation when the controllable parameter set [formula image 042] is selected under the uncontrollable parameter set [formula image 040]; to guarantee the power demand of the equipment, the following energy constraints must hold:
[formula image 051]
where N is the total number of time slots, corresponding to N energy constraints; that is, for any time slot, the power the device holds guarantees the power requirement of that time slot;
c) modeling the optimization problem;
to make the overall performance of the communication equipment under the 5G base station as good as possible over the whole operation, that is, to obtain the optimal performance index under the power demand guarantee constraints, the optimization problem is modeled as follows:
[formula image 052]
d) solving the optimization problem; if the optimization problem is a convex optimization problem, it is solved with a standard method for convex optimization problems (Newton's method, the interior point method, etc.); if it is not a convex optimization problem, one approach is to transform it by inspection into a convex optimization problem and solve that, and another is to combine Newton's method, the interior point method and similar methods with other algorithms such as a genetic algorithm, for example the R genetic optimization algorithm that uses derivatives, which reduces the chance of converging to a suboptimal solution; the operation state of the communication equipment under the 5G base station is then optimized according to the solution. Some non-convex optimization problems with a particular structure can also be solved directly, for example with projected gradient descent, alternating minimization, the expectation-maximization algorithm or stochastic optimization.
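Steps a) to d) can be tied together in a small worked example. The Python sketch below takes the performance index of the embodiment (average number of correctly decoded bits, that is, frame bits times one minus the frame error rate) and solves a tiny instance by exhaustive search over discrete power levels. Several parts are assumptions made only for illustration: the frame error rate model exp(-power x gain), the energy consumption equal to the chosen power, the rule that supplied power is usable from the next slot, and the names `expected_bits` and `solve_offline`. Exhaustive search stands in for the solvers named in step d) and is only practical for very small instances.

```python
import itertools
import math


def expected_bits(frame_bits, power, gain):
    """Average correctly decoded bits: frame_bits * (1 - FER).

    The FER model exp(-power * gain) is a hypothetical placeholder.
    """
    frame_error_rate = math.exp(-power * gain)
    return frame_bits * (1.0 - frame_error_rate)


def solve_offline(b1, supply, gains, power_levels, frame_bits=100):
    """Exhaustively search power plans that satisfy the per-slot energy constraints."""
    best_plan, best_total = None, float("-inf")
    for plan in itertools.product(power_levels, repeat=len(supply)):
        battery, total, feasible = b1, 0.0, True
        for power, harvested, gain in zip(plan, supply, gains):
            if power > battery:      # the slot cannot be served from the stored power
                feasible = False
                break
            total += expected_bits(frame_bits, power, gain)
            battery += harvested - power
        if feasible and total > best_total:
            best_plan, best_total = plan, total
    return best_plan, best_total


# Two slots, 2 units in the battery, no supply, identical channel gains.
plan, total = solve_offline(b1=2, supply=[0, 0], gains=[1.0, 1.0], power_levels=(0, 1, 2))
```

Because the toy performance function is concave in the power, splitting the stored energy evenly across the two slots, plan (1, 1), beats spending everything in one slot.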
For communication equipment with a real-time update state, the equipment has only real-time information, that is, the power supply quantity and the uncontrollable parameters of a time slot can be obtained only when that slot arrives. On this basis, a Markov decision process is constructed and its optimal strategy is solved. The specific steps are: a) describing the state space, action space and rewards; b) describing decision rules and strategies; c) describing the optimization target and modeling the problem; d) solving the optimal strategy of the Markov decision process.
a) describing the state space, action space and rewards;
the value set of the power supplied to the device is denoted [formula image 053], the value set of the power contained in the device is denoted [formula image 054], and the value set of the uncontrollable parameter set is denoted [formula image 055]; the state space can be represented as [formula image 056]; a state [formula image 057] is an element of the state space, [formula image 058]; the value set of the controllable parameter set, [formula image 059], is the action space; any controllable parameter set is therefore an element of [formula image 059] and is called an action; selecting a controllable parameter set is selecting an action in the Markov decision process;
in the Markov decision process, the benefit obtained by taking the action [formula image 060] in the state [formula image 057] is defined as the reward [formula image 061]; if the controllable parameter set [formula image 065] is selected when the communication equipment in the real-time update state has the power supply quantity [formula image 062], the battery energy storage [formula image 063] and the uncontrollable parameter set [formula image 064], the reward is the device performance index of interest at that time, that is,
[formula image 066]
where [formula image 067] is the performance index of interest of the communication equipment in the real-time update state;
b) describing decision rules and strategies; a decision rule is the method of selecting an action in a given time slot i; specifically, the current state-action history is [formula image 068], where t denotes the t-th time slot; under the decision rule [formula image 069], the action is determined by the current state-action history; a strategy is a sequence of decision rules, denoted [formula image 070], that is, [formula image 071], where N is the total number of time slots; the decision rule is Markovian and deterministic, that is, the choice of action depends only on the current state;
c) describing the optimization target and modeling the problem;
since the power supplied to the device and the uncontrollable parameter set are random in practice, the quality of a strategy [formula image 070] is evaluated here by the expected sum of rewards rather than by the sum of rewards itself; it is denoted [formula image 072], that is,
[formula image 073]
where [formula image 074] and [formula image 075] are elements of the state random sequence [formula image 076] and of the action random sequence [formula image 077], respectively, [formula image 078] is the reward of the t-th slot, and [formula image 079] is the expectation under the strategy [formula image 070]; the ultimate goal is to find the optimal strategy [formula image 080] such that
[formula image 081]
where [formula image 082] represents the set of all possible policies;
d) solving the optimal strategy of the Markov decision process; for a standard Markov decision process, methods such as dynamic programming, value iteration, policy iteration or linear programming can be used to find an optimal strategy; in addition, greedy strategies have in many cases been shown to reach a locally optimal solution. Therefore, when the performance loss is within an acceptable range, a locally optimal strategy of low computational complexity, such as a greedy strategy, can be adopted, and the operation state of the communication equipment under the 5G base station is optimized according to the strategy obtained.
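The trade-off between a greedy strategy and an optimal one can be seen on a toy model. In the Python sketch below the greedy rule picks, in every slot, the feasible action with the largest immediate reward. The battery model, the reward table and the names `greedy_action` and `run_greedy` are assumptions for illustration and are not taken from the patent.

```python
def greedy_action(battery, rewards):
    """Pick the feasible action (energy units to spend) with the largest immediate reward."""
    feasible = range(min(battery, len(rewards) - 1) + 1)
    return max(feasible, key=lambda a: rewards[a])


def run_greedy(b1, harvests, rewards, b_max):
    """Run the greedy rule over a known harvest sequence and return (total reward, actions)."""
    battery, total, actions = b1, 0.0, []
    for harvested in harvests:
        action = greedy_action(battery, rewards)
        total += rewards[action]
        actions.append(action)
        # Assumed battery update: spend first, then store the harvest, capped at b_max.
        battery = min(b_max, battery - action + harvested)
    return total, actions


# Two slots, battery of 2 units, nothing harvested, concave reward table.
total, actions = run_greedy(b1=2, harvests=[0, 0], rewards=[0.0, 1.0, 1.5], b_max=2)
```

Here the greedy rule spends both units in the first slot and collects 1.5 in total, whereas spending one unit per slot would collect 2.0. The rule needs only one comparison per feasible action in each slot, which is the low computational complexity the text refers to, at the price of this performance loss.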
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.

Claims (7)

1. The optimization method of the optimal running state of the communication equipment under the 5G base station is characterized in that the communication equipment under the 5G base station is divided into communication equipment in a non-real-time updating state and communication equipment in a real-time updating state;
for communication devices that update status in non-real time, the power supply quantity [formula image 001] and the uncontrollable parameter set [formula image 002] of each time slot i are known; an optimization problem is constructed and solved, with the following specific steps:
a) determining an optimization objective: the equipment performance index under the 5G base station is quantized as [formula image 003], where [formula image 004] is the controllable parameter set;
b) determining power demand guarantee constraints: the power the equipment contains at the beginning of each time slot guarantees the power requirement of that time slot; specifically: suppose the amount of power the device contains at the beginning of time slot i is [formula image 005]; then
[formula image 006]
where [formula image 007] is the energy consumed to perform the operation when the controllable parameter set [formula image 009] is selected under the uncontrollable parameter set [formula image 008]; to guarantee the power demand of the equipment, the following energy constraints must hold:
[formula image 010]
where N is the total number of time slots, corresponding to N energy constraints; that is, for any time slot, the power the device holds guarantees the power requirement of that time slot;
c) modeling an optimization problem: the optimal performance index is obtained under the power demand guarantee constraints, and the problem is modeled as follows:
[formula image 011]
where N is the total number of time slots, B_1 is the amount of power the device contains at the beginning of the 1st time slot, and [formula image 012] is the energy consumed to perform the operation when the controllable parameter set [formula image 004] is selected under the uncontrollable parameter set [formula image 008];
d) solving the optimization problem in the step c), and realizing the operation state optimization of the communication equipment under the 5G base station according to the solving result;
for communication equipment in the real-time update state, the equipment has only real-time information, that is, the power supply quantity and the uncontrollable parameter set of a time slot become available only when that time slot arrives; the operation state is optimized by constructing a Markov decision process and solving for its optimal policy, with the following specific steps:
A) determining the state space, the action space and the reward: the set of values of the power supply quantity of the device is denoted
Figure 730142DEST_PATH_IMAGE013
; the set of values of the battery energy storage of the device is denoted
Figure 228119DEST_PATH_IMAGE014
; the set of values of the uncontrollable parameter set is denoted
Figure 468608DEST_PATH_IMAGE015
; the state space is expressed as
Figure 684825DEST_PATH_IMAGE016
; a state
Figure 364069DEST_PATH_IMAGE017
is an element of the state space, that is,
Figure 716552DEST_PATH_IMAGE018
; the set of values of the controllable parameter set is
Figure 393521DEST_PATH_IMAGE019
which is the action space; every controllable parameter set is an element of
Figure 126729DEST_PATH_IMAGE019
and is referred to as an action; selecting a controllable parameter set means selecting an action in the Markov decision process; in the Markov decision process, if the state of the communication equipment in the real-time update state consists of the power supply quantity
Figure 547346DEST_PATH_IMAGE020
the battery energy storage
Figure 754336DEST_PATH_IMAGE021
and the uncontrollable parameter set
Figure 602207DEST_PATH_IMAGE022
and the action taken is to select, for that equipment, the controllable parameter set
Figure 324175DEST_PATH_IMAGE023
then the reward is the device performance index of interest;
B) determining the decision rule and the policy: let the current state-action history be
Figure 548483DEST_PATH_IMAGE024
where t denotes the t-th time slot; under the decision rule
Figure 609980DEST_PATH_IMAGE025
the action is determined by the current state-action history; the policy is expressed as
Figure 894331DEST_PATH_IMAGE026
C) determining the optimization objective and modeling the problem: the expected sum of rewards is used to evaluate how good a policy
Figure 103595DEST_PATH_IMAGE027
is; when the initial state is
Figure 866015DEST_PATH_IMAGE028
the expected sum of rewards from the 1st time slot to the N-th time slot, denoted
Figure 782018DEST_PATH_IMAGE029
is as follows:
Figure 237270DEST_PATH_IMAGE030
where
Figure 933831DEST_PATH_IMAGE031
is the reward of the t-th time slot;
Figure 499941DEST_PATH_IMAGE032
is the expectation under the policy
Figure 270451DEST_PATH_IMAGE027
; and
Figure 693342DEST_PATH_IMAGE033
and
Figure 080461DEST_PATH_IMAGE034
are, respectively, elements of the state random sequence and the action random sequence; the ultimate goal is to find the optimal policy
Figure 919104DEST_PATH_IMAGE035
such that
Figure 075279DEST_PATH_IMAGE036
where
Figure 669072DEST_PATH_IMAGE037
denotes the set of all possible policies and
Figure 277907DEST_PATH_IMAGE038
is the state space;
D) solving the optimization objective of step C) to obtain the optimal policy, and optimizing the operation state of the communication equipment under the 5G base station according to the optimal policy.
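Because the formulas of step C) survive here only as figure placeholders, one conventional way to write such an objective is shown below. The symbols ($\pi$, $s$, $X_t$, $Y_t$, $r_t$, $v_N^{\pi}$, $\Pi$, $S$) follow common textbook notation for finite-horizon Markov decision processes and are assumptions; the patent's own symbols may differ.

```latex
v_N^{\pi}(s) = \mathbb{E}_s^{\pi}\!\left[ \sum_{t=1}^{N} r_t(X_t, Y_t) \right],
\qquad
v_N^{\pi^{*}}(s) \ge v_N^{\pi}(s)
\quad \text{for all } \pi \in \Pi \text{ and all } s \in S .
```

Here $X_t$ and $Y_t$ stand for the random state and action of the t-th time slot, and $\pi^{*}$ is an optimal policy in the sense that no other policy achieves a larger expected sum of rewards from any initial state.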
2. The method of claim 1, wherein the controllable parameter set of the communication equipment in the non-real-time update state and of the communication equipment in the real-time update state includes transmission power and coding rate.
3. The method of claim 1, wherein the uncontrollable parameter set of the communication equipment in the non-real-time update state and of the communication equipment in the real-time update state comprises channel conditions, the amount of data generated, and the allocated time resources and space resources.
4. The method of claim 1, wherein the device performance indicators under the 5G base station include bit error rate, throughput, and quality of service (QoS).
5. The method according to claim 1, wherein in step d), if the optimization problem is a convex optimization problem, a standard solution method for convex optimization problems is used; if the optimization problem is not a convex optimization problem, a standard solution method for convex optimization problems is combined with a genetic algorithm so as to reduce the chance of converging to a suboptimal solution.
6. The method as claimed in claim 5, wherein the solution method for the convex optimization problem is Newton's method or the interior point method.
7. The method as claimed in claim 1, wherein in step D), the optimal policy of the Markov decision process is obtained by dynamic programming, value iteration, policy iteration or linear programming.
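The dynamic programming option named in claim 7 can be sketched as backward induction over the time slots. The toy model below is hypothetical throughout: the state is taken to be the triple (power supply quantity, battery energy storage, channel gain), the supply and the channel gain are assumed to be drawn independently in each slot, and the reward function, probabilities and capacities are invented for illustration. The sketch searches only policies that depend on the current slot and state; for finite-horizon Markov decision processes with finite state and action sets this restriction is known not to lose optimality.

```python
from functools import lru_cache
from itertools import product
from math import log2

# Everything below is a hypothetical toy model, not taken from the source.
N = 3                              # total number of time slots
B_MAX = 2                          # battery capacity, in energy units
SUPPLY = {0: 0.5, 1: 0.5}          # probability of each power supply quantity per slot
CHANNEL = {0.5: 0.5, 2.0: 0.5}     # probability of each channel gain per slot
ACTIONS = (0, 1, 2)                # energy units spent on transmission in a slot


def reward(gain, action):
    """Assumed performance index of one slot: a Shannon-style throughput."""
    return log2(1.0 + gain * action)


def feasible_actions(supply, battery):
    """An action may not spend more energy than is available in the slot."""
    return [a for a in ACTIONS if a <= supply + battery]


@lru_cache(maxsize=None)
def optimal_value(t, supply, battery, gain):
    """Return (best expected reward sum from slot t to slot N, best action at t)."""
    if t > N:
        return 0.0, None
    best = (float("-inf"), None)
    for a in feasible_actions(supply, battery):
        # Unused energy is stored, up to the battery capacity.
        next_battery = min(B_MAX, supply + battery - a)
        # Average the continuation value over next slot's supply and channel.
        future = sum(
            ps * pg * optimal_value(t + 1, s2, next_battery, g2)[0]
            for (s2, ps), (g2, pg) in product(SUPPLY.items(), CHANNEL.items())
        )
        total = reward(gain, a) + future
        if total > best[0]:
            best = (total, a)
    return best
```

For example, `optimal_value(1, 1, 2, 2.0)` gives the optimal expected reward sum over all N slots, and the action to take in slot 1, for a device that starts with one unit of supply, a full battery and a good channel. Memoizing on `(t, state)` is what keeps the recursion equivalent to a backward sweep over a value table.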
CN202010863911.2A 2020-08-12 2020-08-25 Optimization method for optimal operation state of communication equipment under 5G base station Active CN111741531B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020108090162 2020-08-12
CN202010809016 2020-08-12

Publications (2)

Publication Number Publication Date
CN111741531A CN111741531A (en) 2020-10-02
CN111741531B CN111741531B (en) 2020-11-24

Family

ID=72658804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010863911.2A Active CN111741531B (en) 2020-08-12 2020-08-25 Optimization method for optimal operation state of communication equipment under 5G base station

Country Status (1)

Country Link
CN (1) CN111741531B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107809764A (en) * 2017-09-21 2018-03-16 浙江理工大学 A kind of multiple affair detection method based on Markov chain

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106713346B (en) * 2017-01-13 2021-01-12 电子科技大学 WLAN protocol design and analysis method based on wireless radio frequency energy transmission
CN107105438B (en) * 2017-04-20 2020-06-26 成都瑞沣信息科技有限公司 QoS-based data and energy integrated transmission strategy design method
CN108880893B (en) * 2018-06-27 2021-02-09 重庆邮电大学 Mobile edge computing server combined energy collection and task unloading method
CN110113195B (en) * 2019-04-26 2021-03-30 山西大学 Method for joint unloading judgment and resource allocation in mobile edge computing system

Also Published As

Publication number Publication date
CN111741531A (en) 2020-10-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant