CN113259968A - Intelligent calculation method for power distribution network equipment based on information freshness

Publication number
CN113259968A
Authority
CN
China
Prior art keywords
aoi
distribution network
transmission
average
power distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110401915.3A
Other languages
Chinese (zh)
Inventor
宁鑫
陈俊
邓元实
张睿
李巍巍
罗洋
朱轲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd
Original Assignee
Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd
Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

In the technical solution of the invention, an information-freshness index, the age of information (AoI) fluctuation coefficient, is proposed to address the problem that delay performance alone cannot guarantee timely service processing in the information acquisition and control computation of power distribution network equipment. The average AoI and the maximum AoI of the system are calculated from the model of the intelligent computing system for power distribution network equipment. Based on these two performance indexes, the average local computation rate, the average transmission rate and the average remote computation rate of the system are calculated by solving the optimization problem of minimizing the fluctuation coefficient. The method thus accounts for the delay of intelligent computation and transmission in line with the characteristics of the operation and maintenance system for power distribution network equipment, proposes the AoI fluctuation coefficient as the information-freshness metric for operation and maintenance services, and optimizes the transmission and computation rates against that metric.

Description

Intelligent calculation method for power distribution network equipment based on information freshness
Technical Field
The invention relates to the technical field of communication networks, in particular to an intelligent calculation method for power distribution network equipment based on information freshness.
Background
With China's rapid economic development and rising level of urbanization, power distribution network equipment has become large in scale, fast-changing and unevenly developed. The multi-source, heterogeneous, multi-scale information that characterizes the state of the power distribution network can support operation and maintenance only after a large amount of computation for feature extraction and fusion, and the demand for computing resources keeps growing as artificial intelligence is applied to distribution network operation and maintenance. At the same time, the operation and maintenance of power distribution network equipment places strict requirements on data computation and transmission delay. Multi-Access Edge Computing (MEC) can sink data storage and computing resources to the edge of the network, reducing transmission delay.
In the operation and maintenance services of power distribution network equipment, delay is usually emphasized as the main quality-of-service index. However, low delay and high throughput do not guarantee the freshness of the acquired information, which reduces the accuracy and reliability of state analysis of power distribution network equipment. Information freshness is characterized by the Age of Information (AoI), i.e. the time that has elapsed, counted from the moment the information was generated at the sending end, when the receiving end receives it; AoI is affected by communication throughput and transmission delay. Meanwhile, in the communication, acquisition and computing network of power distribution network equipment, the time-varying nature of the wireless channel must be considered and communication resources in a random network must be optimized, so as to improve the computational efficiency of MEC, the accuracy of equipment state analysis and, in practice, the stability of the power distribution network. Because of the varied environments of power distribution network equipment, time-varying wireless channels and ultra-reliable low-latency service requirements, operation and maintenance of power distribution network equipment still faces major challenges.
Disclosure of Invention
The inventors have found that the operation and maintenance of power distribution network equipment faces the following challenges:
1) Power distribution network equipment is numerous, and so are the sensors that monitor its state. Because the wireless channel is time-varying and sensor energy is limited, the communication and transmission scheme has to be optimized while the computation and transmission loads are uncertain, and energy efficiency has to be improved.
2) Traditional MEC offloading and communication schemes focus on transmission delay and computation delay and neglect the freshness of the information received at the receiving end. Even when the total computation delay and transmission delay are both short, there is no guarantee that the receiving end can update its state normally. Computation offloading and communication schemes that optimize information freshness are lacking, which affects the monitoring accuracy of power distribution network equipment.
3) In monitoring services for power distribution network equipment, the statistical characteristics of the wireless channel and the interference channel differ greatly from those of a traditional public network, and the information generated by the services follows different random distributions. This makes information freshness difficult to analyze and directly affects its optimization.
In view of the characteristics of monitoring and operation and maintenance of power distribution network equipment, the method of the invention mainly addresses the optimization of communication and computing resources during computation offloading and communication transmission, minimizes AoI under an energy constraint, and verifies the algorithm's advantage in information freshness using measured channel data and the statistical random distributions of the information.
AoI has been widely studied as a metric in various MEC applications. In 5G, work on Ultra-Reliable Low-Latency Communication (URLLC) tends to focus on average AoI performance, whereas URLLC performance is determined by extreme events. For power distribution network equipment monitoring services in particular, state accuracy depends on information freshness, and for URLLC services information freshness is governed mainly by the impact of high-latency events on equipment monitoring. The invention therefore first analyzes the statistical characteristics of the maximum AoI in the power distribution network equipment monitoring system, then optimizes communication and computing resources according to those characteristics, which improves the offloading capability of the system and ultimately the monitoring accuracy of the power distribution network equipment.
In view of this, the invention aims to provide an intelligent calculation method for power distribution network equipment based on information freshness, so as to improve the monitoring accuracy of the power distribution network equipment.
To this end, the invention provides an intelligent calculation method for a power distribution network equipment communication network based on information freshness. The specific technical solution is as follows:
step A, calculate the transmission rate and the signal-to-noise ratio of the system under short-packet communication according to the model of the intelligent computing system for power distribution network equipment;
step B, define the AoI expression according to successful status updates;
step C, calculate the average AoI of the system according to the model of the intelligent computing system for power distribution network equipment;
step D, calculate the maximum AoI of the system according to the model of the intelligent computing system for power distribution network equipment;
step E, define the AoI fluctuation coefficient from the two calculated performance indexes, and calculate the average local computation rate, the average transmission rate and the average remote computation rate of the system, finally obtaining the optimal value of the AoI fluctuation coefficient.
Wherein, step A specifically includes:
A1, given that the services of power distribution network equipment are URLLC services requiring finite-blocklength transmission, the transmission rate is calculated according to finite-blocklength transmission theory as
Figure BDA0003020705350000031
where $L_n$ is the length of the data packet sent in the nth transmission and computation, the effective payload of a data packet does not exceed 250 bytes, and the decoding error probability satisfies $\varepsilon > 0$.
Figure BDA0003020705350000032
denotes the transmission rate of the kth sensor in the nth transmission.
Figure BDA0003020705350000033
is the signal-to-noise ratio of the nth sensor transmission channel, and $\mathrm{erfc}^{-1}(\cdot)$ is the inverse complementary error function.
A2, the signal-to-noise ratio is calculated as
Figure BDA0003020705350000041
where
Figure BDA0003020705350000042
is the uplink transmit power of the nth sensor,
Figure BDA0003020705350000043
is the uplink channel gain of the nth sensor, $N_0$ is the noise power spectral density, and $W$ is the transmission bandwidth.
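The quantities in step A can be illustrated numerically. The sketch below is not the patent's exact expression, which is contained in the figures above. It assumes the commonly used normal approximation for finite-blocklength AWGN channels and takes the blocklength in channel uses, whereas the text speaks of a packet length $L_n$. The function names `uplink_snr` and `finite_blocklength_rate` and all parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def uplink_snr(tx_power_w, channel_gain, noise_psd_w_per_hz, bandwidth_hz):
    """Received SNR (linear scale): signal power over noise power N0 * W."""
    return tx_power_w * channel_gain / (noise_psd_w_per_hz * bandwidth_hz)


def finite_blocklength_rate(snr, blocklength, eps):
    """Approximate achievable rate in bits per channel use.

    ASSUMPTION: standard normal approximation for the AWGN channel,
        R ~= log2(1 + snr) - sqrt(V / n) * Qinv(eps),
    with dispersion V = (1 - (1 + snr)**-2) * log2(e)**2 and
    Qinv(eps) = sqrt(2) * erfcinv(2 * eps).
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    capacity = math.log2(1.0 + snr)
    dispersion = (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2
    # Inverse Gaussian Q-function, identical to sqrt(2) * erfcinv(2 * eps).
    q_inv = NormalDist().inv_cdf(1.0 - eps)
    return max(0.0, capacity - math.sqrt(dispersion / blocklength) * q_inv)


if __name__ == "__main__":
    gamma = uplink_snr(0.1, 1e-9, 4e-21, 1e6)  # illustrative values only
    print(finite_blocklength_rate(gamma, blocklength=2000, eps=1e-5))
```

Under this approximation a smaller error probability or a shorter block lowers the rate below $\log_2(1+\gamma)$, which is the reason short-packet URLLC traffic cannot be dimensioned with the Shannon capacity alone.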
Wherein, step B specifically includes:
B1, according to successful status updates, the AoI is defined as
Figure BDA0003020705350000044
where $\tau_k$ denotes the AoI of the kth sensor at any continuous time $t \geq 0$, $I_{\{\cdot\}}$ is the indicator function, and
Figure BDA0003020705350000045
is a Bernoulli random variable distributed as $B(1, 1-\varepsilon)$, where
Figure BDA0003020705350000046
means that the receiving end successfully receives the update and refreshes the state, and
Figure BDA0003020705350000047
means that the status update fails.
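The role of the Bernoulli success variable in B1 can be seen in a small slotted simulation. This is a sketch under stated assumptions, not the patent's continuous-time expression: one update attempt per slot is assumed, a successful update is taken to carry data generated one slot earlier, and the name `simulate_aoi` is hypothetical.

```python
import random


def simulate_aoi(num_slots, eps, seed=0):
    """Slotted AoI trace under Bernoulli(1 - eps) successful updates.

    ASSUMPTION: one update attempt per slot; a successful update carries
    data generated one slot earlier, so the age resets to 1.
    """
    rng = random.Random(seed)
    age = 0
    trace = []
    for _ in range(num_slots):
        age += 1                       # the stored information grows one slot older
        success = rng.random() >= eps  # True with probability 1 - eps
        if success:
            age = 1                    # fresh update received, age resets
        trace.append(age)
    return trace


if __name__ == "__main__":
    trace = simulate_aoi(num_slots=20, eps=0.3)
    print(trace, "mean:", sum(trace) / len(trace), "peak:", max(trace))
```

A larger decoding error probability produces longer runs of failed updates, so both the mean and the peak of the trace grow.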
Wherein, step C specifically includes:
C1, calculate the average AoI for local computation:
Figure BDA0003020705350000048
where $C_i$ denotes the ith local computation time and $\lambda_l$ is the local service rate.
C2, calculate the average AoI for computation on the MEC server:
Figure BDA0003020705350000049
where $\lambda_t$ is the average transmission rate and $\lambda_c$ is the computation rate of the MEC server.
C3, calculate the average AoI of the partial offloading scheme:
Figure BDA00030207053500000410
Wherein
Figure BDA0003020705350000051
ζ=ηssδ,
Figure BDA0003020705350000052
$\eta_s$ is the intensity of the Poisson distribution, and $\delta$ is the error probability.
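The closed-form averages of C1 to C3 can be cross-checked numerically against a recorded or simulated sample path. The sketch below is a generic estimator, not one of the patent's formulas: it integrates the sawtooth AoI curve between consecutive deliveries, assuming updates are delivered in generation order. The name `time_average_aoi` and its arguments are hypothetical.

```python
def time_average_aoi(generation_times, delivery_times):
    """Time-average AoI over the window [first delivery, last delivery].

    ASSUMPTION: updates are listed in delivery order, are delivered in the
    order they were generated (FCFS, no overtaking), and the last delivery
    is strictly later than the first.
    """
    if len(generation_times) != len(delivery_times) or len(delivery_times) < 2:
        raise ValueError("need at least two matched updates")
    area = 0.0
    for i in range(len(delivery_times) - 1):
        start, end = delivery_times[i], delivery_times[i + 1]
        age_start = start - generation_times[i]  # age right after delivery i
        age_end = end - generation_times[i]      # age just before delivery i + 1
        area += 0.5 * (age_start + age_end) * (end - start)  # trapezoid area
    return area / (delivery_times[-1] - delivery_times[0])


if __name__ == "__main__":
    # Three updates generated at t = 0, 2, 3 and delivered at t = 1, 5, 6.
    print(time_average_aoi([0, 2, 3], [1, 5, 6]))
```

Feeding the estimator with delivery times produced by a local-only, MEC-only or partial-offloading simulation gives three empirical averages that can be compared with the analytical values.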
Wherein, step D specifically includes:
D1, according to the distributed computing model and the application scenario of intelligent computing for power distribution network equipment, the equipment is in a busy processing state at both the transmitting end and the receiving end, and the conditional distribution of the AoI obtained from the system is
Figure BDA0003020705350000053
where $T_i$ is the update time of status $i$, $T_i = W_i + S_i$, $W_i$ is the waiting time, $S_i$ is the service time, and $F_i$ is the inter-arrival time.
D2, the peak AoI is then calculated as
Figure BDA0003020705350000054
where $\gamma$ is the signal-to-noise ratio and $\tau_i$ is the AoI at time $i$.
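The relationship between $W_i$, $S_i$, $F_i$ and the peak AoI can be sketched for a single first-come-first-served queue. This illustrates the standard queueing identity (the age just before update $i$ is delivered equals $F_i + W_i + S_i$) and is not the patent's distribution-level result. The name `peak_aoi_sequence` is hypothetical, and the queue is assumed empty when the first update arrives.

```python
def peak_aoi_sequence(interarrivals, services):
    """Peak AoI A_i = F_i + W_i + S_i for a FCFS single-server queue.

    interarrivals[i] is F_i, the time between generation of update i - 1
    and update i; services[i] is S_i.
    ASSUMPTION: the queue is empty when update 0 arrives, and the update
    generated before update 0 has already been delivered.
    """
    peaks = []
    prev_wait = prev_service = 0.0
    for i, (f, s) in enumerate(zip(interarrivals, services)):
        # Lindley recursion: unfinished work that update i finds on arrival.
        wait = 0.0 if i == 0 else max(0.0, prev_wait + prev_service - f)
        peaks.append(f + wait + s)  # age just before update i is delivered
        prev_wait, prev_service = wait, s
    return peaks


if __name__ == "__main__":
    # Updates generated at t = 2, 3, 4 and delivered at t = 5, 6, 7.
    print(peak_aoi_sequence([2, 1, 1], [3, 1, 1]))
```

Drawing `interarrivals` and `services` from the distributions assumed for the sensors and the MEC server turns this into a Monte Carlo estimate of the peak-AoI statistics.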
Wherein, step E specifically includes:
E1, the peak AoI directly affects the error rate in executing power distribution network control signals, and the stability of AoI directly affects the long-term stability of the power distribution network control system. The technical solution of this patent therefore defines an AoI fluctuation coefficient, expressed as
Figure BDA0003020705350000055
where
Figure BDA0003020705350000056
is the average AoI calculated for the MEC server.
E2, the AoI fluctuation coefficient is optimized to ensure that data from the equipment sensors in the power distribution network is delivered effectively to the MEC server and that status updates complete effectively.
E3, the optimization problem is solved with a heuristic artificial intelligence method: it is modelled as a Markov decision process and, using multi-agent deep reinforcement learning, converted into a partially observable Markov process. The observation set is denoted $O = \{O_1, \ldots, O_j, \ldots, O_V\}$ and the action set is denoted $A = \{A_1, \ldots, A_j, \ldots, A_V\}$. The observation of each agent computing unit at time $t$ is $o_j(t) \in O_j$, which can be described as
Figure BDA0003020705350000061
In the action set, the action of the jth agent computing unit at time $t$ is $a_j(t) \in A_j$, given by
Figure BDA0003020705350000062
E4, a reward function is determined. It is a function of the state and the action and measures the effect of a computing unit taking an action in a given state. In the training phase, when an agent computing unit selects an action, the corresponding reward value is fed back to it and the unit updates its state. Each agent computing unit learns an optimal policy from the successive rewards so as to maximize the reward; the reward is built from the AoI fluctuation coefficient and the average AoI.
E5, a deep reinforcement gradient-descent algorithm based on multi-agent computing units is used. It adopts an actor-critic method, and its parameters are initialized as follows:
(1) initialize the actor and critic evaluation networks of each agent, and obtain the action set for the training phase;
(2) initialize the replay memory size of each agent.
E6, parameters are then updated starting from this initialization: receive the initial observation spaces $O_m$ and $O_j$ ($j = 1, \ldots, V$) and set $\gamma = 0$.
E7, at each iteration $t$, the current input state, the new observation and the next input state are obtained:
<1> for each computing agent $j$, select the action $a_j(t) = \mu_j(o_j(t))$ under the current policy $\pi_j$ and obtain the current input state $s_j(t)$;
<2> execute the joint action $a(t) = \{a_1(t), a_2(t), \ldots, a_V(t)\}$ and obtain the reward $\theta_m$ from the established reward function; each computing agent unit obtains a new observation $o'_j$ and input state $s'_j$.
E8, for each computing agent $j$, the transition is processed as follows:
<1> if the number of stored transitions is less than $N_\gamma$, store $\{s_j(t), a_j(t), \gamma(t), s'_j\}$ in the buffer;
<2> otherwise:
① replace the earliest transition held in the buffer;
② randomly select a mini-batch of transitions from the buffer;
③ update the parameter matrix of the critic evaluation network by minimizing the loss;
④ update the parameters of the actor evaluation network by maximizing the policy objective function;
⑤ according to
Figure BDA0003020705350000071
and
Figure BDA0003020705350000072
update the target network parameters of the actor and the critic.
E9, when the iteration ends, the optimal signal-to-noise ratio is obtained as $\gamma = \gamma + \gamma'(t)$, which yields the optimal AoI fluctuation coefficient.
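The buffering and target-update mechanics of step E8 can be sketched without a deep-learning framework. The replay buffer below follows the text directly: transitions are stored until the capacity is reached, then the earliest one is replaced, and mini-batches are drawn at random. The `soft_update` rule is an assumption: the patent's update formulas are contained in the figures above, and the sketch substitutes the Polyak averaging commonly used with actor-critic target networks. The names `ReplayBuffer`, `soft_update` and `tau` are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions (state, action, reward, next_state)."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen drops its oldest item once it is full,
        # which matches "replace the earliest transition held in the buffer".
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        self._items.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Uniformly sample a mini-batch without replacement."""
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


def soft_update(target_params, online_params, tau):
    """ASSUMPTION: Polyak averaging, target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target_params, online_params)]


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=3)
    for step in range(5):
        buf.store(state=step, action=0, reward=-float(step), next_state=step + 1)
    print(len(buf), buf.sample(2))
    print(soft_update([0.0, 2.0], [1.0, 4.0], tau=0.5))
```

In a multi-agent setting each computing agent $j$ would hold its own buffer and its own actor and critic target parameters; the critic and actor gradient steps of ③ and ④ are omitted here because they depend on network definitions the text does not give.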
Compared with the prior art, the technical scheme has the following advantages:
In the technical solution of the invention, an information-freshness index, the age of information (AoI) fluctuation coefficient, is proposed to address the problem that delay performance alone cannot guarantee timely service processing in the information acquisition and control computation of power distribution network equipment. The average AoI and the maximum AoI of the system are calculated from the model of the intelligent computing system for power distribution network equipment. Based on these two performance indexes, the average local computation rate, the average transmission rate and the average remote computation rate of the system are calculated by solving the optimization problem of minimizing the fluctuation coefficient. The method thus accounts for the delay of intelligent computation and transmission in line with the characteristics of the operation and maintenance system for power distribution network equipment, proposes the AoI fluctuation coefficient as the information-freshness metric for operation and maintenance services, and optimizes the transmission and computation rates (including local and remote computation) against that metric.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of an intelligent calculation method for power distribution network equipment based on information freshness according to an embodiment of the present invention;
Fig. 2 compares the optimization schemes of three algorithms according to an embodiment of the present invention, showing the average AoI under different signal-to-noise ratio conditions;
Fig. 3 compares the optimization schemes of four algorithms according to an embodiment of the present invention, showing the extreme-value AoI for different values of K.
Detailed Description
The inventors consider that an MEC-based power distribution communication network can be used. On the one hand, applying MEC to the power distribution communication system effectively raises the data computation rate and supports stable operation. On the other hand, intelligent devices can compute quickly, which further reduces the computation delay and transmission delay of data during operation and maintenance of power distribution network equipment, improves the information freshness of the system and reduces its age of information.
In addition, the technical solution of the invention can optimize the users' AoI fluctuation coefficients with a deep reinforcement gradient-descent algorithm based on multi-agent computing units. The algorithm adopts an actor-critic method so that data from equipment sensors in the power distribution network is delivered effectively to the MEC server and status updates complete effectively.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1, an embodiment of the present invention provides an intelligent calculation method for power distribution network equipment based on information freshness. The method includes:
step A, calculate the transmission rate and the signal-to-noise ratio of the system under short-packet communication according to the model of the intelligent computing system for power distribution network equipment;
step B, define the AoI expression according to successful status updates;
step C, calculate the average AoI of the system according to the model of the intelligent computing system for power distribution network equipment;
step D, calculate the maximum AoI of the system according to the model of the intelligent computing system for power distribution network equipment;
step E, define the AoI fluctuation coefficient from the two calculated performance indexes, and calculate the average local computation rate, the average transmission rate and the average remote computation rate of the system, finally obtaining the optimal value of the AoI fluctuation coefficient.
In practical applications, the communication system may be a power distribution communication system, and includes a Base Station (BS) equipped with an edge server and various electrical devices as intelligent devices. Wherein the base station provides wireless access services and the edge server provides computing services. Various types of electrical devices as smart devices may specifically include: a distribution monitoring terminal device, a distribution automation terminal device, or a distribution transformer monitoring terminal device.
To address the problem that delay performance alone cannot accurately guarantee timely service processing in information acquisition and control computation for power distribution network equipment, the technical scheme of the invention proposes an information freshness index, namely the age of information (AoI) fluctuation coefficient. The average information age and the maximum information age of the system are calculated from the model of the intelligent computing system for power distribution network equipment, and from these two performance indexes the average local computation rate, the average transmission rate and the average remote computation rate of the system are obtained by solving the optimization problem of minimizing the fluctuation coefficient. In this way the scheme accounts for both intelligent computation delay and transmission delay in line with the characteristics of the operation and maintenance system of power distribution network equipment, and optimizes the transmission and computation rates (local and remote) against the proposed index.
According to the application, on one hand, the MEC is applied to a power distribution communication system, so that the data calculation rate is effectively improved, and the stable operation is realized; on the other hand, the intelligent device can perform optimal calculation, further reduce the calculation delay and the transmission delay of the data during operation and maintenance of the power distribution network device, improve the information freshness of the system, and reduce the average information age of the system.
Specifically, the following specific embodiments are given in the present application in combination with an actual application scenario:
1. System model
Consider K sensors across multiple devices of a power distribution network, which send their status information (i.e. monitored data) to an edge server (computing node). Because each computation and transmission has a deadline, the number of transmissions and computations is denoted n,
Figure BDA0003020705350000151
and the time of the n-th transmission is denoted $t_n$. The time for a successful status update (comprising three parts: information upload time, computation time, and action transfer time) can then be expressed as $T_n = t_n - t_{n-1}$. As the number of sensors grows, each sensor can no longer be given its own orthogonal communication resource, so non-orthogonal multiple access (NOMA) technology needs to be adopted [10]. Meanwhile, power distribution network equipment monitoring is a URLLC service that requires finite-blocklength communication, with at most 250 bytes of effective information per transmission. According to finite-blocklength transmission theory, the transmission rate can be expressed as
Figure BDA0003020705350000152
where $L_n$ is the length of the data packet transmitted in the n-th transmission and computation, the effective information of the packet does not exceed 250 bytes, and the decoding error probability satisfies ε > 0.
Figure BDA0003020705350000153
represents the transmission rate of the k-th sensor in the n-th transmission.
Figure BDA0003020705350000154
is the signal-to-noise ratio of the k-th sensor's transmission channel in the n-th transmission, which can be calculated according to equation (2)
Figure BDA0003020705350000155
where
Figure BDA0003020705350000156
is the uplink transmit power of the sensor in the n-th transmission,
Figure BDA0003020705350000157
is the uplink channel gain; $N_0$ is the noise power spectral density, W is the transmission bandwidth, and $\mathrm{erfc}^{-1}(\cdot)$ is the inverse complementary error function. During transmission,
Figure BDA0003020705350000158
and the transmit power
Figure BDA0003020705350000159
does not exceed the maximum transmit power, i.e.
Figure BDA00030207053500001510
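The rate and SNR relations described above can be explored numerically. The following is a minimal sketch that assumes the widely used normal approximation from finite-blocklength theory and an SNR of the form p·h/(N0·W); the exact expressions of equations (1) and (2) are given in the figures and may differ. All function names and example values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def q_inv(eps: float) -> float:
    """Inverse Gaussian Q-function, equal to sqrt(2) * erfc^{-1}(2 * eps)."""
    return NormalDist().inv_cdf(1.0 - eps)


def snr_linear(p_watt: float, gain: float, n0: float, bandwidth_hz: float) -> float:
    """Assumed SNR form: received power over noise power N0 * W (linear scale)."""
    return p_watt * gain / (n0 * bandwidth_hz)


def fbl_rate(snr: float, blocklength: int, eps: float) -> float:
    """Normal-approximation rate in bits per channel use (an assumption here).

    rate ~ log2(1 + snr) - sqrt(V / n) * Q^{-1}(eps) * log2(e),
    with channel dispersion V = 1 - (1 + snr)^{-2}.
    """
    capacity = math.log2(1.0 + snr)
    dispersion = 1.0 - 1.0 / (1.0 + snr) ** 2
    penalty = math.sqrt(dispersion / blocklength) * q_inv(eps) * math.log2(math.e)
    return max(capacity - penalty, 0.0)


if __name__ == "__main__":
    gamma = snr_linear(p_watt=0.1, gain=1e-6, n0=1e-15, bandwidth_hz=1e6)
    for n in (100, 500, 2000):
        print(n, round(fbl_rate(gamma, n, eps=1e-5), 4))
```

Under this approximation the achievable rate approaches the Shannon capacity from below as the blocklength grows, which is why short packets (at most 250 bytes here) pay a noticeable rate penalty.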
To ensure that the transmitted state can be updated, the following must hold:
Figure BDA00030207053500001511
where $D_k$ represents the data that need to be transmitted for each state update; that is, the information transferred in each update must stay within what the maximum transmission rate allows. The state update interval $S_n$ should be no less than the minimum interval, i.e. $S_n \ge S_{\min}$. The AoI of the k-th sensor at any continuous time $t \ge 0$ is denoted $\tau_k(t)$. If the k-th sensor successfully updates its state at the n-th transmission, the AoI is reset to 0 at $t = t_n$. The AoI is therefore expressed as
Figure BDA0003020705350000161
where $I_{\{\cdot\}}$ is the indicator function, and
Figure BDA0003020705350000162
is a Bernoulli-distributed variable obeying B(1, 1−ε), where
Figure BDA0003020705350000163
indicates that the receiving end successfully receives and updates the status, and
Figure BDA0003020705350000164
indicates that the state update fails. Between two consecutive update instants the AoI increases linearly, which can be expressed as
$\tau_k(t) = \tau_k(t_{n-1}) + t - t_{n-1}, \quad t_{n-1} \le t < t_n$ (4)
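The reset-and-grow behaviour of the AoI described above can be sketched in a few lines. This is a simplified, slotted illustration under stated assumptions: updates are attempted once per slot of fixed length, each attempt succeeds independently with probability 1 − ε, and the function name and parameters are hypothetical.

```python
import random


def simulate_aoi(num_slots: int, slot_len: float, eps: float, seed: int = 0) -> list:
    """Return the AoI sampled at the end of each slot.

    The age grows linearly by slot_len per slot (the linear-growth relation)
    and is reset to 0 whenever the Bernoulli(1 - eps) update succeeds.
    """
    rng = random.Random(seed)
    age = 0.0
    trace = []
    for _ in range(num_slots):
        age += slot_len                  # linear growth between updates
        if rng.random() < 1.0 - eps:     # successful reception and update
            age = 0.0                    # AoI reset at the update instant
        trace.append(age)
    return trace


if __name__ == "__main__":
    trace = simulate_aoi(num_slots=20, slot_len=1.0, eps=0.3, seed=42)
    print(trace)
    print("average sampled AoI:", sum(trace) / len(trace))
```

A larger decoding error probability ε lengthens the runs between resets, which raises both the average and the peak of the sampled AoI.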
2. Distributed computing model
In addition to the AoI of status updates, the AoI of local and remote computation must also be considered. In local computation the amount of computation is small, so the maximum AoI is not considered; for ease of calculation and in line with practical conditions, the analysis uses the average AoI. Let the i-th local computation time be $C_i$. The computation times are independent and identically distributed exponential random variables with mean $E[C_i] = 1/\lambda_l$, where $\lambda_l$ is the local service rate. Because $C_i$ and $C_{i-1}$ are independent of each other, the mean value is
Figure BDA0003020705350000165
Its second moment is
Figure BDA0003020705350000166
According to the definition of locally calculated average AoI, i.e.
Figure BDA0003020705350000167
For remote computation, a zero-wait processing mode is adopted: the arrival process of the computation queue equals the departure process of the channel transmission. Under zero-wait processing the computation queue follows a Poisson process, so the computation queue and the edge computing server form an M/M/1 system. The mean transmission time is $E[F_i] = 1/\lambda_t$, where $\lambda_t$ is the average transmission rate, and the mean service time is $E[B_i] = 1/\lambda_c$, where $\lambda_c$ is the computing rate of the MEC server. Because transmission times are exponentially distributed, the intervals $F_i$ and $F_{i-1}$ are independent of each other, which gives
Figure BDA0003020705350000168
and
Figure BDA0003020705350000169
According to the service system of the MEC server, the update time of state i is $T_i$, which is divided into two parts, the waiting time $W_i$ and the service time $S_i$, i.e. $T_i = W_i + S_i$. The waiting time $W_i$ depends on the (i−1)-th state update time $T_{i-1}$ and the arrival interval $F_i$, so that
$E[T_i F_{i-1}] = E[W_i F_{i-1}] + E[B_i]\,E[F_{i-1}]$ (6)
From $E[B_i] = 1/\lambda_c$ it follows that
Figure BDA00030207053500001610
From equations (6) and (7), the average AoI calculated by the MEC server is obtained as
Figure BDA0003020705350000171
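For orientation, the average AoI of a first-come-first-served M/M/1 queue has a well-known closed form in the queueing literature. The sketch below evaluates that textbook expression with the arrival rate taken as λ_t and the service rate as λ_c; whether it coincides term by term with equation (8) in the figure is an assumption, and the function name is hypothetical.

```python
def mm1_average_aoi(lambda_t: float, lambda_c: float) -> float:
    """Textbook average AoI of an FCFS M/M/1 queue.

    lambda_t: arrival (transmission) rate, lambda_c: service (MEC computing) rate.
    Formula: (1 / lambda_c) * (1 + 1 / rho + rho**2 / (1 - rho)), rho = lambda_t / lambda_c.
    """
    if lambda_t <= 0.0 or lambda_c <= 0.0:
        raise ValueError("rates must be positive")
    rho = lambda_t / lambda_c
    if rho >= 1.0:
        raise ValueError("queue is unstable: need lambda_t < lambda_c")
    return (1.0 / lambda_c) * (1.0 + 1.0 / rho + rho ** 2 / (1.0 - rho))


if __name__ == "__main__":
    for lam in (0.2, 0.5, 0.8):
        print(lam, round(mm1_average_aoi(lam, 1.0), 4))
```

The expression is large both when updates arrive too rarely (small ρ) and when the queue is congested (ρ close to 1), which is the trade-off the rate optimization has to balance.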
3. Average AoI Performance analysis
In most cases of MEC-assisted computation, however, a partial offloading mode is used. On this basis, the AoI of the partial computation offloading mode for power distribution network equipment monitoring is analyzed next. Because the amount of computation is relatively small, the derivation assumes a fixed computation time, i.e. each task has a fixed size and is allocated fixed computing resources. For partial offloading, if the local computation and the transmission process are regarded as one process, the remote computation can be regarded as a GI/D/1 queuing model provided that $\lambda_c \ge \lambda_t$, where the local computation time is $1/\lambda_t$ and the remote computation time is $1/\lambda_c$. Note that the arrival process of the remote computation queue is the same as the departure process of the concatenation of the local server and the transmission channel. The computation times of the local and remote servers are deterministic, while the channel transmission time is exponentially distributed. Three expectations in equation (9) therefore need to be calculated:
Figure BDA0003020705350000172
Since the local computation time is constant, i.e. $L_i = 1/\lambda_t$, $E[B_i]$ can be derived as
Figure BDA0003020705350000173
From $E[F_i]$, $E[F_i F_{i-1}]$ can be derived, giving
Figure BDA0003020705350000174
Figure BDA0003020705350000175
Next, $E[T_i F_{i-1}]$ is calculated. When status i is updated, its waiting time $W_i$ is determined by the system time of status i−1 in the MEC system and the arrival interval. Specifically, when $T_{i-1} > F_i$, status i−1 is still waiting or being processed in the queue when status i arrives, so $W_i = T_{i-1} - F_i$. Otherwise the waiting time is $W_i = 0$. The waiting time of the i-th status message can thus be defined as
$W_i = (T_{i-1} - F_i)^+$ (13)
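The waiting-time relation in equation (13), together with $T_i = W_i + S_i$, is a recursion that can be evaluated directly on a sequence of arrival intervals and service times. The sketch below does this; the function name and the sample numbers are illustrative assumptions.

```python
def waiting_and_system_times(interarrivals, services):
    """Apply W_i = max(T_{i-1} - F_i, 0) and T_i = W_i + S_i update by update.

    interarrivals[i] is F_i (time since the previous arrival) and
    services[i] is S_i; the system is assumed empty before the first update.
    """
    waits, system_times = [], []
    prev_system_time = 0.0
    for f_i, s_i in zip(interarrivals, services):
        w_i = max(prev_system_time - f_i, 0.0)  # (T_{i-1} - F_i)^+
        t_i = w_i + s_i                         # T_i = W_i + S_i
        waits.append(w_i)
        system_times.append(t_i)
        prev_system_time = t_i
    return waits, system_times


if __name__ == "__main__":
    w, t = waiting_and_system_times([1.0, 1.0, 5.0], [2.0, 2.0, 1.0])
    print("waiting times:", w)
    print("system times:", t)
```

In the example the second update arrives while the first is still in service and therefore waits, whereas the third arrives after the system has emptied and waits zero time.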
From $T_i = W_i + S_i$, $E[T_i F_{i-1}]$ is obtained as
$E[T_i F_{i-1}] = E[(W_i + S_i) F_{i-1}] = E[W_i F_{i-1}] + E[S_i F_{i-1}]$ (14)
From $T_i = W_i + S_i$ and equation (13), $W_i$ is expressed as
Figure BDA0003020705350000176
where $T_{i-2}$ is the sum of the service time and the waiting time of the (i−2)-th status message and is independent of $S_{i-1}$, $F_i$ and $F_{i-1}$. The probability density function of the times $T_i$, $T_{i-1}$ and $T_{i-2}$ in the GI/M/1 queue is
Figure BDA0003020705350000177
where δ satisfies the following equation
$\delta = L'(\eta_s - \eta_s\delta)$ (17)
where $L'(\cdot)$ is the Laplace transform. When M > 0, the probability density function of $M_i$ is
Figure BDA0003020705350000181
From the probability density function of $M_i$, δ can be found:
Figure BDA0003020705350000182
Let $\zeta = \eta_s - \eta_s\delta$. From the GI/M/1 probability density function in equation (16), one obtains
$E[T_i \mid M_{i-1} = m_i] = E[((T_{i-2} - m_i)^+ + F_{i-1} - M_i)^+]$ (20)
Integrating over the probability density gives
Figure BDA0003020705350000183
where
Figure BDA0003020705350000184
and
Figure BDA0003020705350000185
From equation (6), one obtains
Figure BDA0003020705350000186
where
Figure BDA0003020705350000187
From equation (22), one obtains
Figure BDA0003020705350000188
The average AoI of the partial offloading scheme, obtained from equations (11), (12) and (21), is
Figure BDA0003020705350000189
4. Peak AoI Performance analysis
In the intelligent computing scheme for power distribution network equipment, the control effect often depends on the peak AoI and on how far the AoI fluctuates around its average, so analyzing the peak AoI in edge computing is very important. According to the distributed computing model and the application scenario of intelligent computing for power distribution network equipment, both the transmitting end and the receiving end are in a busy processing state, so the conditional distribution of AoI for the system is
Figure BDA00030207053500001810
Using the probability density of $F_i$, the unconditional probability is obtained, i.e.
Figure BDA0003020705350000191
Then, removing the dependence on $T_i$ gives:
Figure BDA0003020705350000192
Finally, the peak AoI is obtained as
Figure BDA0003020705350000193
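Peak and average AoI can also be measured empirically from a trace of update generation and delivery times, which is useful for checking closed-form results. The sketch below assumes the standard queueing definition in which the age at the receiver equals the current time minus the generation time of the latest delivered update, and first-come-first-served delivery; the function name and the sample trace are hypothetical.

```python
def aoi_peaks_and_average(gen_times, dep_times):
    """Return (peaks, average AoI) over the window [dep_times[0], dep_times[-1]].

    gen_times[i] is when update i was generated and dep_times[i] is when it
    was delivered; both lists are increasing and dep_times[i] >= gen_times[i].
    """
    n = len(gen_times)
    if n < 2 or n != len(dep_times):
        raise ValueError("need at least two updates with matching lengths")
    peaks, area = [], 0.0
    for i in range(1, n):
        start_age = dep_times[i - 1] - gen_times[i - 1]  # age just after delivery i-1
        peak_age = dep_times[i] - gen_times[i - 1]       # age just before delivery i
        peaks.append(peak_age)
        # The age grows linearly between deliveries, so each piece is a trapezoid.
        area += 0.5 * (start_age + peak_age) * (dep_times[i] - dep_times[i - 1])
    average = area / (dep_times[-1] - dep_times[0])
    return peaks, average


if __name__ == "__main__":
    peaks, avg = aoi_peaks_and_average([0.0, 2.0, 4.0], [1.0, 3.0, 5.0])
    print("peaks:", peaks, "average:", avg)
```

Comparing the spread of the measured peaks with the measured average gives an empirical view of how strongly the AoI fluctuates, without relying on any particular closed-form coefficient.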
The peak AoI directly affects the error rate with which power distribution network control signals are executed, while the stability of AoI directly affects the long-term stability of the power distribution network control system. The scheme therefore adopts a new metric, the AoI fluctuation coefficient. The average AoI represents the long-term speed of control signal processing in the power distribution network, and the execution efficiency of the system must also be guaranteed over short time scales. The AoI fluctuation coefficient of the m-th user is defined as
Figure BDA0003020705350000194
5. Optimization problem and solution
In order to ensure that data of equipment sensors in the power distribution network can be effectively transmitted to the MEC server and state updating can be effectively achieved, the actual needs of monitoring of the power distribution network equipment are considered, and the optimization problem is determined as
Figure BDA0003020705350000195
where $u_{nk}(t)$ indicates whether the k-th task is assigned to device n at time t: if $u_{nk}(t) = 1$ the task is served, otherwise $u_{nk}(t) = 0$. Γ is the longest average AoI allowed for a service in the power distribution network system, which is related to the computation and action delay of long-term power distribution network control services.
In a power distribution network control system, control spans multiple regions, so decisions are made jointly using the information of several control units. The optimization problem of equation (30) is relatively complex, so it is solved with a heuristic artificial intelligence method: a Markov decision process is adopted, and multi-agent deep reinforcement learning converts the problem of equation (30) into a partially observable Markov decision process. Let the number of computing units handled be V, corresponding to V sets of states in the state set S. The observation set can be expressed as $O = \{O_1, \ldots, O_j, \ldots, O_V\}$ and the action set as $A = \{A_1, \ldots, A_j, \ldots, A_V\}$. The state set S describes the state configurations of the agent computing units; $O_j$ is the observation space of the j-th ($j = 1, \ldots, V$) agent computing unit, containing its observation of the state $s(t) \in S$ at time t; $A_j$ is the action space of the j-th ($j = 1, \ldots, V$) agent computing unit. For each given state $s \in S$, each agent computing unit uses the policy $\pi_j: S \to A_j$ to select an action from its action space based on its observation. The environment state at time t is denoted $s(t) \in S$ and can be written as $s(t) = \{\lambda_{1c}(t), \lambda_{2c}(t), \ldots, \lambda_{Vc}(t), \lambda_{1t}(t), \lambda_{2t}(t), \ldots, \lambda_{Vt}(t), \lambda_{1l}(t), \lambda_{2l}(t), \ldots, \lambda_{Vl}(t)\}$. In the observation space, the observation of each agent computing unit at time t is $o_j(t) \in O_j$, which can be described as
Figure BDA0003020705350000201
In the action set, the action of the j-th agent computing unit at time t is $a_j(t) \in A_j$:
Figure BDA0003020705350000202
The reward function is a function of state and action and measures the effect of the action taken by a computing unit in a given state. In the training phase, when an agent computing unit selects an action, the corresponding reward value is fed back to the agent and the agent computing unit updates its state. Each agent computing unit learns the optimal policy from the successive rewards so as to maximize the reward; the reward is built from the AoI fluctuation coefficient of equation (29) and the average AoI of equation (24).
Based on this design, a deep reinforcement learning gradient descent algorithm based on multi-agent computing units is proposed. The algorithm adopts an actor-critic method and can be expressed as follows.
Initialization
(1) Initialize the actor and critic evaluation networks of each agent and obtain the action set for the training phase;
(2) Initialize the size of the replay memory of each agent.
Parameter updating
(1) For each overall iteration:
Receive the initial observation spaces $O_m$ and $O_j$ ($j = 1, \ldots, V$) and set γ = 0.
(2) For each iteration t:
<1> For each computing agent j, select the action $a_j(t) = \mu_j(o_j(t))$ under the current policy $\pi_j$ and obtain the current input state $s_j(t)$;
<2> Execute $a(t) = \{a_1(t), a_2(t), \ldots, a_V(t)\}$ and obtain the reward $\theta_m$ from the established reward function; each computing agent unit obtains a new observation $o'_j$ and a new input state $s'_j$.
(3) For each computing agent j:
<1> If the number of stored transitions is less than $N_\gamma$, store $\{s_j(t), a_j(t), \gamma(t), s'_j\}$ in the buffer;
<2> Otherwise:
① replace the transition stored earliest in the buffer;
② randomly select a mini-batch of transitions from the buffer;
③ update the parameter matrix of the critic evaluation network with loss minimization as the objective;
④ update the parameters of the actor evaluation network by maximizing the policy objective function;
⑤ according to
Figure BDA0003020705350000211
and
Figure BDA0003020705350000212
update the target network parameters of the actor and the critic.
<3> γ = γ + γ′(t)
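The buffer handling and target-network step of the procedure above can be sketched without any learning framework. The code below is an illustration under stated assumptions, not the patented implementation: the class and function names are hypothetical, network parameters are represented as plain lists of floats, and the target update is assumed to be the soft (Polyak) update commonly used in actor-critic methods, since the actual update equations are given only in the figures.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size transition store; the oldest entry is dropped when full."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        # With maxlen set, appending to a full deque evicts the earliest item.
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int, rng=random):
        # Uniform mini-batch without replacement.
        items = list(self.buffer)
        return rng.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, tau: float):
    """Assumed target update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online_params, target_params)]


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=3)
    for step in range(5):
        buf.store(state=step, action=0, reward=-float(step), next_state=step + 1)
    print(len(buf), buf.sample(2, rng=random.Random(0)))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

A bounded deque gives the "replace the earliest transition" behaviour for free, and a small τ keeps the target networks changing slowly, which is the usual reason for maintaining separate target parameters.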
6. Simulation result
Considering how power distribution network equipment is distributed, the equipment served by one edge computing agent is not spread over too large an area, so the service range used in the simulation is 0.1 km. The simulation considers cooperation among multiple edge agents, with the number of cooperating edge agents V = 9. The wireless channel bandwidth is W = 20 MHz, the transmission power of each acquisition sensor is p = 20 dBm, and the noise variance is $\sigma^2 = -100$ dBm. In the simulation, the wireless channel state follows a finite-state Markov chain, $C = \{c_1, c_2, c_3\}$, whose transition probability matrix is
Figure BDA0003020705350000213
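A three-state finite-state Markov channel of this kind can be simulated by repeatedly sampling the next state from the row of the transition matrix that corresponds to the current state. In the sketch below the matrix values are purely hypothetical placeholders, because the actual matrix is given only in the figure; the function name is also an assumption.

```python
import random

# Hypothetical 3x3 transition matrix for states c1, c2, c3 (each row sums to 1).
P_EXAMPLE = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]


def simulate_channel(transition_matrix, num_steps: int, start: int = 0, seed: int = 0):
    """Return the visited state indices, beginning with the start state."""
    rng = random.Random(seed)
    states = range(len(transition_matrix))
    state = start
    path = [state]
    for _ in range(num_steps):
        # Draw the next state using the current state's row as weights.
        state = rng.choices(states, weights=transition_matrix[state])[0]
        path.append(state)
    return path


if __name__ == "__main__":
    print(simulate_channel(P_EXAMPLE, num_steps=20, seed=1))
```

Each state index would then be mapped to a channel gain or SNR level when the simulated channel is fed into the rate and AoI calculations.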
Other simulation environment settings are shown in Table 1.
TABLE 1 Simulation environment parameters and values
Parameter | Value
Input data length l of each status update | 20 KB
Number of CPU cycles n required for each state update | 150 bits
Main frequency f_m of the CPU of the edge computing server | 20 GHz
Main frequency f_s of the CPU of the local computing unit device | 1.5 GHz
Sensor to edge server bandwidth W | 20 MHz
Distance d from sensor to edge server | 0.1 km
Sensor transmission power p | 20 dBm
Variance σ² of complex white Gaussian noise | -100 dBm
Duration t_l of each symbol | 0.006 ms
Actor learning rate for deep reinforcement learning | Decay from 0.0002 to 0.0000001
Critic learning rate for deep reinforcement learning | Decay from 0.002 to 0.000001
Sensor distribution density d_s | 30 pieces/square meter
Discount factor for reward function | Increase from 0.8 to 0.99
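To give a feel for the scale of the values in Table 1, the following back-of-the-envelope sketch estimates how long one status update would take to process locally and at the edge server. It rests on two loudly stated assumptions that the table does not confirm: the "150" entry is read as CPU cycles per bit, and 1 KB is taken as 1024 bytes. The function and constant names are hypothetical.

```python
def compute_time_s(data_bytes: int, cycles_per_bit: float, cpu_hz: float) -> float:
    """Processing time = total required CPU cycles / CPU clock frequency."""
    return data_bytes * 8 * cycles_per_bit / cpu_hz


DATA_BYTES = 20 * 1024      # 20 KB per status update (assuming 1 KB = 1024 bytes)
CYCLES_PER_BIT = 150        # assumption: the table's "150" means cycles per bit
F_EDGE_HZ = 20e9            # edge server CPU, 20 GHz
F_LOCAL_HZ = 1.5e9          # local computing unit CPU, 1.5 GHz

if __name__ == "__main__":
    t_local = compute_time_s(DATA_BYTES, CYCLES_PER_BIT, F_LOCAL_HZ)
    t_edge = compute_time_s(DATA_BYTES, CYCLES_PER_BIT, F_EDGE_HZ)
    print(f"local: {t_local * 1e3:.3f} ms, edge: {t_edge * 1e3:.3f} ms")
```

Under these assumptions the edge server processes an update roughly thirteen times faster than the local unit, so the benefit of offloading depends on whether the transmission time stays below that gap.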
As shown in fig. 2, three optimization schemes are compared in terms of the average AoI under different SNR conditions. The first uses multi-objective optimization with the average AoI as the optimization objective, denoted "average AoI optimization" in fig. 2. The second uses a single-computing-unit reinforcement learning method, denoted "single-computational-unit average AoI" in fig. 2. The third is the multi-computing-unit deep reinforcement learning method proposed here, whose optimization index is the AoI fluctuation coefficient, denoted "multi-agent AoI fluctuation coefficient" in fig. 2. As fig. 2 shows, the proposed scheme obtains a smaller average AoI.
As shown in fig. 3, four optimization schemes are compared in terms of the extreme AoI for different values of K. The first uses multi-objective optimization with the average AoI as the optimization objective, denoted "average AoI multi-objective optimization" in fig. 3. The second uses a single-computing-unit reinforcement learning method, denoted "single agent average AoI" in fig. 3. The third is the multi-computing-unit deep reinforcement learning method proposed here, whose optimization index is the AoI fluctuation coefficient, denoted "multiple agent AoI fluctuation coefficient" in fig. 3. The fourth scheme is denoted "single agent AoI fluctuation coefficient". As fig. 3 shows, the proposed scheme obtains a smaller extreme AoI.
7. Summary of the invention
In view of the characteristics of the operation and maintenance system of power distribution network equipment, the scheme considers intelligent computation delay and transmission delay, proposes an information freshness index based on the operation and maintenance service, and optimizes the transmission and computation rates (local and remote) against this index. This provides theoretical support for subsequent communication design and equipment optimization, and a practical reference for actual system deployment.
In the description, the parts are described in a progressive manner, each part emphasizes its differences from the other parts, and identical or similar portions of the parts may be understood by cross-reference.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (1)

1. The intelligent calculation method of the power distribution network equipment based on information freshness is characterized by comprising the following steps:
step A, calculating the transmission rate and the signal-to-noise ratio of the system under short packet communication according to a model of the intelligent computing system of the power distribution network equipment;
step B, defining the AoI expression according to successful state updates;
step C, calculating the average information age of the system according to the model of the intelligent computing system of the power distribution network equipment;
step D, calculating the maximum information age of the system according to the model of the intelligent computing system of the power distribution network equipment;
step E, defining the AoI fluctuation coefficient according to the two calculated performance indexes, and calculating the average local computation rate, the average transmission rate and the average remote computation rate of the system to finally obtain the optimal value of the AoI fluctuation coefficient.
Wherein, step A specifically includes:
A1. Power distribution network equipment monitoring is a URLLC service that requires finite-blocklength communication; according to finite-blocklength transmission theory, the transmission rate is calculated as
Figure FDA0003020705340000011
where $L_n$ is the length of the data packet transmitted in the n-th transmission and computation, the effective information of the packet does not exceed 250 bytes, and the decoding error probability satisfies ε > 0.
Figure FDA0003020705340000012
represents the transmission rate of the k-th sensor in the n-th transmission.
Figure FDA0003020705340000013
is the signal-to-noise ratio of the sensor's transmission channel in the n-th transmission, and $\mathrm{erfc}^{-1}(\cdot)$ is the inverse complementary error function.
A2. The signal-to-noise ratio is calculated as
Figure FDA0003020705340000014
where
Figure FDA0003020705340000015
is the uplink transmit power of the sensor in the n-th transmission,
Figure FDA0003020705340000016
is the uplink channel gain of the sensor in the n-th transmission; $N_0$ is the noise power spectral density and W is the transmission bandwidth.
Wherein, step B specifically includes:
B1. According to successful status updates, the AoI expression is defined as
Figure FDA0003020705340000021
where $\tau_k$ denotes the AoI of the k-th sensor at any continuous time $t \ge 0$, $I_{\{\cdot\}}$ is the indicator function, and
Figure FDA0003020705340000022
is a Bernoulli-distributed variable obeying B(1, 1−ε), where
Figure FDA0003020705340000023
indicates that the receiving end successfully receives and updates the status, and
Figure FDA0003020705340000024
indicates that the state update fails.
Wherein, step C specifically includes:
C1. Calculate the locally computed average AoI
Figure FDA0003020705340000025
where $C_i$ denotes the i-th local computation time and $\lambda_l$ is the local service rate.
C2. Calculate the average AoI computed by the MEC server
Figure FDA0003020705340000026
where $\lambda_t$ is the average transmission rate and $\lambda_c$ is the computing rate of the MEC server.
C3. Calculate the average AoI of the partial offloading scheme
Figure FDA0003020705340000027
where
Figure FDA0003020705340000028
$\xi = \eta_s - \eta_s\delta$,
Figure FDA0003020705340000029
$\eta_s$ is the intensity of the Poisson distribution and δ is the error probability.
Wherein, step D specifically includes:
D1. According to the distributed computing model and the application scenario of intelligent computing for power distribution network equipment, both the transmitting end and the receiving end are in a busy processing state, and the conditional distribution of AoI for the system is
Figure FDA0003020705340000031
where $T_i$ is the update time of state i, $T_i = W_i + S_i$, $W_i$ is the waiting time, $S_i$ is the service time, and $F_i$ is the arrival interval.
D2. The peak AoI is then calculated and is expressed as
Figure FDA0003020705340000032
where γ is the signal-to-noise ratio and $\tau_i$ is the AoI at time i.
Wherein, step E specifically includes:
E1. The peak AoI directly affects the error rate with which power distribution network control signals are executed, and the stability of AoI directly affects the long-term stability of the power distribution network control system. This technical scheme therefore defines an AoI fluctuation coefficient, expressed as
Figure FDA0003020705340000033
Figure FDA0003020705340000034
is the average AoI calculated by the MEC server.
E2. To ensure that data from equipment sensors in the power distribution network are transmitted to the MEC server effectively and that state updates are achieved effectively, the AoI fluctuation coefficient is optimized.
E3. The optimization problem is solved with a heuristic artificial intelligence method: a Markov decision process is adopted, and multi-agent deep reinforcement learning converts the optimization problem into a partially observable Markov decision process. The observation set is expressed as $O = \{O_1, \ldots, O_j, \ldots, O_V\}$ and the action set as $A = \{A_1, \ldots, A_j, \ldots, A_V\}$. In the observation space, the observation of each agent computing unit at time t is $o_j(t) \in O_j$, which can be described as
Figure FDA0003020705340000035
In the action set, the action of the j-th agent computing unit at time t is $a_j(t) \in A_j$:
Figure FDA0003020705340000041
E4. Determine the reward function. The reward is a function of state and action and measures the effect of the action taken by a computing unit in a given state. In the training phase, when an agent computing unit selects an action, the corresponding reward value is fed back to the agent and the agent computing unit updates its state. Each agent computing unit learns an optimal policy from the successive rewards so as to maximize the reward, and the reward is built from the AoI fluctuation coefficient and the average AoI.
E5. A deep reinforcement learning gradient descent algorithm based on multi-agent computing units is used. It adopts an actor-critic method and initializes each parameter as follows:
(1) initialize the actor and critic evaluation networks of each agent and obtain the action set for the training phase;
(2) initialize the size of the replay memory of each agent.
E6. Parameter updating then proceeds from this initialization: the initial observation spaces $O_m$ and $O_j$ ($j = 1, \ldots, V$) are received and γ is set to 0.
E7. In each iteration t, the current input state is obtained, followed by the new observation and input state:
<1> for each computing agent j, select the action $a_j(t) = \mu_j(o_j(t))$ under the current policy $\pi_j$ and obtain the current input state $s_j(t)$;
<2> execute $a(t) = \{a_1(t), a_2(t), \ldots, a_V(t)\}$ and obtain the reward $\theta_m$ from the established reward function; each computing agent unit obtains a new observation $o'_j$ and a new input state $s'_j$.
E8. For each computing agent j, the transitions are handled as follows:
<1> if the number of stored transitions is less than $N_\gamma$, store $\{s_j(t), a_j(t), \gamma(t), s'_j\}$ in the buffer;
<2> otherwise:
① replace the transition stored earliest in the buffer;
② randomly select a mini-batch of transitions from the buffer;
③ update the parameter matrix of the critic evaluation network with loss minimization as the objective;
④ update the parameters of the actor evaluation network by maximizing the policy objective function;
⑤ according to
Figure FDA0003020705340000051
and
Figure FDA0003020705340000052
update the target network parameters of the actor and the critic.
E9. When the iteration ends, the optimal signal-to-noise ratio is obtained through the update γ = γ + γ′(t), which yields the optimal AoI fluctuation coefficient.
CN202110401915.3A 2021-04-14 2021-04-14 Intelligent calculation method for power distribution network equipment based on information freshness Pending CN113259968A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110401915.3A CN113259968A (en) 2021-04-14 2021-04-14 Intelligent calculation method for power distribution network equipment based on information freshness

Publications (1)

Publication Number Publication Date
CN113259968A true CN113259968A (en) 2021-08-13

Family

ID=77220825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110401915.3A Pending CN113259968A (en) 2021-04-14 2021-04-14 Intelligent calculation method for power distribution network equipment based on information freshness

Country Status (1)

Country Link
CN (1) CN113259968A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113783941A (en) * 2021-08-23 2021-12-10 华北电力大学(保定) Method for minimizing average AoI in large-scale MIMO-MEC
CN113784353A (en) * 2021-08-24 2021-12-10 华北电力大学(保定) Method, apparatus and storage medium for status update system
CN114039918A (en) * 2021-10-09 2022-02-11 广东技术师范大学 Information age optimization method and device, computer equipment and storage medium
CN114615684A (en) * 2022-02-25 2022-06-10 哈尔滨工业大学(深圳) Information age optimization method and device of closed-loop system and storage medium
CN114786174A (en) * 2022-03-09 2022-07-22 西安电子科技大学 Information age minimization method and system for hidden transmission system of Internet of things
CN115052325A (en) * 2022-06-07 2022-09-13 华北电力大学(保定) Multi-frequency heterogeneous wireless communication network access selection algorithm suitable for transformer substation service
CN115361734A (en) * 2022-07-14 2022-11-18 鹏城实验室 Power and IRS phase shift joint optimization method and device based on information timeliness
CN115361705A (en) * 2022-07-14 2022-11-18 鹏城实验室 NOMA network task processing method and system for guaranteeing information timeliness
CN115842926A (en) * 2021-11-29 2023-03-24 北京航空航天大学 Remote video timeliness optimization method based on improved SARL
CN116193367A (en) * 2023-04-27 2023-05-30 北京航空航天大学 Unmanned aerial vehicle ad hoc network reliable transmission timeliness evaluation and calculation method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110235532A1 (en) * 2010-03-24 2011-09-29 Skyhook Wireless, Inc. System and Method for Resolving Multiple Location Estimate Conflicts in a WLAN-Positioning System
US20200027022A1 (en) * 2019-09-27 2020-01-23 Satish Chandra Jha Distributed machine learning in an information centric network
CN111262947A (en) * 2020-02-10 2020-06-09 深圳清华大学研究院 Calculation-intensive data state updating implementation method based on mobile edge calculation
CN111526495A (en) * 2020-04-22 2020-08-11 华中科技大学 Internet of vehicles AoI optimization task unloading method based on improved genetic algorithm
CN111884947A (en) * 2020-07-29 2020-11-03 电子科技大学 Data packet management method based on information age at receiving end
CN112437131A (en) * 2020-11-10 2021-03-02 西北农林科技大学 Data dynamic acquisition and transmission method considering data correlation in Internet of things

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王紫荆: "面向物联网的高效路由与调度算法研究", 《中国知网优秀硕士学位论文全文库》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113783941A (en) * 2021-08-23 2021-12-10 华北电力大学(保定) Method for minimizing average AoI in large-scale MIMO-MEC
CN113783941B (en) * 2021-08-23 2023-04-18 华北电力大学(保定) Method for minimizing average AoI based on large-scale MIMO-MEC
CN113784353A (en) * 2021-08-24 2021-12-10 华北电力大学(保定) Method, apparatus and storage medium for status update system
CN113784353B (en) * 2021-08-24 2023-06-30 华北电力大学(保定) Method, apparatus and storage medium for a state updating system
CN114039918A (en) * 2021-10-09 2022-02-11 广东技术师范大学 Information age optimization method and device, computer equipment and storage medium
CN115842926A (en) * 2021-11-29 2023-03-24 北京航空航天大学 Remote video timeliness optimization method based on improved SARL
CN115842926B (en) * 2021-11-29 2024-06-18 北京航空航天大学 Remote video timeliness optimization method based on improved SARL
CN114615684A (en) * 2022-02-25 2022-06-10 哈尔滨工业大学(深圳) Information age optimization method and device of closed-loop system and storage medium
CN114615684B (en) * 2022-02-25 2023-07-25 哈尔滨工业大学(深圳) Information age optimization method and device of closed-loop system and storage medium
CN114786174A (en) * 2022-03-09 2022-07-22 西安电子科技大学 Information age minimization method and system for hidden transmission system of Internet of things
CN115052325A (en) * 2022-06-07 2022-09-13 华北电力大学(保定) Multi-frequency heterogeneous wireless communication network access selection algorithm suitable for transformer substation service
CN115052325B (en) * 2022-06-07 2023-05-19 华北电力大学(保定) Multi-frequency heterogeneous wireless communication network access selection method suitable for substation service
CN115361734A (en) * 2022-07-14 2022-11-18 鹏城实验室 Power and IRS phase shift joint optimization method and device based on information timeliness
CN115361705A (en) * 2022-07-14 2022-11-18 鹏城实验室 NOMA network task processing method and system for guaranteeing information timeliness
CN115361705B (en) * 2022-07-14 2024-04-12 鹏城实验室 NOMA network task processing method and system for guaranteeing information timeliness
CN115361734B (en) * 2022-07-14 2024-05-14 鹏城实验室 Power and IRS phase shift combined optimization method and device based on information timeliness
CN116193367A (en) * 2023-04-27 2023-05-30 北京航空航天大学 Unmanned aerial vehicle ad hoc network reliable transmission timeliness evaluation and calculation method
CN116193367B (en) * 2023-04-27 2023-07-25 北京航空航天大学 Unmanned aerial vehicle ad hoc network reliable transmission timeliness evaluation and calculation method

Similar Documents

Publication Publication Date Title
CN113259968A (en) Intelligent calculation method for power distribution network equipment based on information freshness
CN113612843B (en) MEC task unloading and resource allocation method based on deep reinforcement learning
Fadlullah et al. HCP: Heterogeneous computing platform for federated learning based collaborative content caching towards 6G networks
Xiong et al. Resource allocation based on deep reinforcement learning in IoT edge computing
CN113242568B (en) Task unloading and resource allocation method in uncertain network environment
CN108920280B (en) Mobile edge computing task unloading method under single-user scene
CN110418416B (en) Resource allocation method based on multi-agent reinforcement learning in mobile edge computing system
US11831708B2 (en) Distributed computation offloading method based on computation-network collaboration in stochastic network
CN112105062B (en) Mobile edge computing network energy consumption minimization strategy method under time-sensitive condition
CN110890930B (en) Channel prediction method, related equipment and storage medium
CN102592171A (en) Method and device for predicting cognitive network performance based on BP (Back Propagation) neural network
Heydari et al. Dynamic task offloading in multi-agent mobile edge computing networks
CN114697333B (en) Edge computing method for energy queue equalization
CN112911647A (en) Calculation unloading and resource allocation method based on deep reinforcement learning
CN116366576A (en) Method, device, equipment and medium for scheduling computing power network resources
Bie et al. Queue management algorithm for satellite networks based on traffic prediction
CN116321255A (en) Compression and user scheduling method for high-timeliness model in wireless federal learning
Jiang et al. Age-of-Information-Based Computation Offloading and Transmission Scheduling in Mobile-Edge-Computing-Enabled IoT Networks
Liang et al. Stochastic Stackelberg Game Based Edge Service Selection for Massive IoT Networks
CN117880852A (en) Intelligent service migration method for edge Internet of vehicles system
CN110912588B (en) Downlink time-varying channel prediction method based on improved Prony method
CN114337881B (en) Wireless spectrum intelligent sensing method based on multi-unmanned aerial vehicle distribution and LMS
CN114615705B (en) Single-user resource allocation strategy method based on 5G network
Wang et al. Towards adaptive packet scheduler with deep-q reinforcement learning
CN113783941B (en) Method for minimizing average AoI based on large-scale MIMO-MEC

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210813)