CN113259968A - Intelligent calculation method for power distribution network equipment based on information freshness

Info

Publication number: CN113259968A
Application number: CN202110401915.3A
Authority: CN
Inventors: 宁鑫; 陈俊; 邓元实; 张睿; 李巍巍; 罗洋; 朱轲
Original assignee: Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd
Current assignee: Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd
Priority date: 2021-04-14
Filing date: 2021-04-14
Publication date: 2021-08-13

Abstract

The technical scheme of the invention addresses the problem that delay performance alone cannot guarantee timely service processing in the information acquisition and control calculation of power distribution network equipment, and proposes an index of information freshness, namely the information age fluctuation coefficient. The average information age and the maximum information age of the system are calculated from the model of the intelligent calculation system of the power distribution network equipment. From these two performance indexes, the average local calculation rate, the average transmission rate and the average remote calculation rate of the system are calculated by solving the optimization problem that minimizes the fluctuation coefficient. In this way, based on the characteristics of the operation and maintenance system of power distribution network equipment, the delay of intelligent calculation and transmission is taken into account, the information age fluctuation coefficient is provided as the information-freshness index for operation and maintenance services, and the transmission and calculation rates are optimized according to this index.

Description

Intelligent calculation method for power distribution network equipment based on information freshness

Technical Field

The invention relates to the technical field of communication networks, in particular to an intelligent calculation method for power distribution network equipment based on information freshness.

Background

With the rapid development of China's economy and the continuing rise in urbanization, power distribution network equipment has become large in scale, fast-changing and unevenly developed. Operating and maintaining the network relies on multi-source, heterogeneous, multi-scale information that characterizes its state, and extracting and fusing the features of this information requires a large amount of calculation. The demand for computing resources keeps growing as artificial intelligence technology is applied to the operation and maintenance of the power distribution network. At the same time, operation and maintenance of the equipment place strict requirements on data calculation and transmission delay. Multi-Access Edge Computing (MEC) can sink data storage and computing resources to the edge of the network, reducing transmission delay.

In the operation and maintenance services of power distribution network equipment, delay is usually emphasized as the main index of service quality. However, low delay and high throughput do not guarantee the freshness of the acquired information, and stale information reduces the accuracy and reliability of equipment state analysis. Information freshness is characterized by the Age of Information (AoI, the information age), i.e. the time that has elapsed, at the moment the receiving end receives a piece of information, since that information was generated at the sending end; it is affected by both communication throughput and transmission delay. Moreover, in the communication, acquisition and calculation network of power distribution network equipment, the time-varying characteristic of the wireless channel must be considered and communication resources in a random network must be optimized, so as to improve the calculation efficiency of the MEC, the accuracy of equipment state analysis, and in practice the stability of the power distribution network. Because of the differing environments of the equipment, time-varying wireless channels and ultra-reliable low-delay service requirements, operation and maintenance of power distribution network equipment still face great challenges.

Disclosure of Invention

The inventors have found that the following challenges are faced in the maintenance of the operation of power distribution network equipment:

1) Power distribution network equipment is numerous, and so are the sensors that monitor its state. Because the wireless channel is time-varying and sensor energy is limited, the communication transmission scheme must be optimized while the calculation and transmission loads are uncertain, so that energy is used more efficiently.

2) Traditional MEC offloading and communication schemes focus on transmission delay and calculation delay and neglect the freshness of the information received at the receiving end. Even when the total calculation delay and transmission delay are both short, there is no guarantee that the receiving end can update its state normally. Calculation offloading and communication schemes that optimize information freshness are lacking, which affects the monitoring accuracy of power distribution network equipment.

3) In the monitoring service of power distribution network equipment, the statistical characteristics of the wireless channel and the interference channel differ greatly from those of a traditional public network, and the information generated by the service follows different random distributions. This makes information freshness difficult to analyze and directly affects its optimization.

In view of the characteristics of monitoring, operation and maintenance of power distribution network equipment, the method provided by the invention mainly solves the problem of optimizing communication resources and computing resources during calculation offloading and communication transmission, minimizes AoI under energy constraints, and verifies the advantage of the algorithm in terms of information freshness using measured channel data and the statistical random distribution of the information.

AoI is widely studied as a metric in various MEC applications. In 5G Ultra-Reliable Low-Latency Communication (URLLC), studies tend to focus on average AoI performance, whereas URLLC performance is determined by extreme events. For power distribution network equipment monitoring services in particular, state accuracy depends on information freshness, and for URLLC services information freshness mainly depends on the impact of high-latency events on equipment monitoring. The invention therefore first analyzes the statistical characteristics of the maximum AoI in the power distribution network equipment monitoring system, then optimizes communication and computing resources according to those characteristics, improving the offloading capacity of the system and ultimately the monitoring accuracy of the power distribution network equipment.

In view of this, the invention aims to provide an intelligent calculation method for power distribution network equipment based on information freshness, so as to improve the monitoring accuracy of the power distribution network equipment.

Based on the above purpose, the invention provides an intelligent calculation method in a power distribution network equipment communication network based on information freshness, and the specific technical scheme is as follows:

step A, calculating the transmission rate and the signal-to-noise ratio of the system under short-packet communication according to the model of the intelligent calculation system of the power distribution network equipment;

step B, defining the AoI expression according to successful state updates;

step C, calculating the average information age of the system according to the model of the intelligent calculation system of the power distribution network equipment;

step D, calculating the maximum information age of the system according to the model of the intelligent calculation system of the power distribution network equipment;

step E, defining the AoI fluctuation coefficient according to the two calculated performance indexes, and calculating the average local calculation rate, the average transmission rate and the average remote calculation rate of the system, so as to finally obtain the optimal value of the AoI fluctuation coefficient.

Wherein, step A specifically includes:

A1, the monitoring service of the power distribution network equipment is detected to be a URLLC service, which requires finite-block-length communication transmission, and the transmission rate is calculated according to finite-block-length transmission theory,

where L_n is the length of the data packet transmitted in the nth transmission and calculation, the effective transmission information of a data packet does not exceed 250 bytes, and the decoding error probability satisfies ε > 0. The rate on the left-hand side is the transmission rate of the kth sensor in the nth transmission, the signal-to-noise ratio is that of the transmission channel of the kth sensor in the nth transmission, and erfc⁻¹(·) is the inverse complementary error function.

A2, the signal-to-noise ratio is calculated from the uplink transmit power of the sensor in the nth transmission and the corresponding uplink channel gain, where N₀ is the power spectral density of the noise and W is the transmission bandwidth.
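The rate and signal-to-noise-ratio equations of step A are not reproduced in this text, so the following Python sketch illustrates the idea with the widely used normal approximation for finite block lengths rather than the patent's exact expression. The function names, the SNR form `p * h / (N0 * W)` and the choice of the normal approximation are assumptions made for illustration; the only link to the text is the use of the inverse complementary error function, which appears here through the identity Q⁻¹(ε) = √2 · erfc⁻¹(2ε).

```python
import math
from statistics import NormalDist


def snr(tx_power_w, channel_gain, noise_psd_w_per_hz, bandwidth_hz):
    """Assumed SNR form: received power divided by noise power N0 * W."""
    return tx_power_w * channel_gain / (noise_psd_w_per_hz * bandwidth_hz)


def q_inverse(eps):
    """Inverse Gaussian Q-function, equal to sqrt(2) * erfc^-1(2 * eps)."""
    return NormalDist().inv_cdf(1.0 - eps)


def short_packet_rate(gamma, block_length, eps):
    """Normal-approximation rate in bits per channel use (assumed form).

    gamma: linear SNR, block_length: channel uses per packet,
    eps: decoding error probability, 0 < eps < 1.
    """
    shannon = math.log2(1.0 + gamma)
    dispersion = 1.0 - 1.0 / (1.0 + gamma) ** 2
    penalty = math.sqrt(dispersion / block_length) * q_inverse(eps) / math.log(2.0)
    # The approximation can go negative for very short blocks; clamp at zero.
    return max(shannon - penalty, 0.0)
```

For example, `short_packet_rate(10.0, 2000, 1e-5)` stays below the Shannon limit `log2(11)` and moves toward it as the block length grows, which is the short-packet penalty that step A accounts for.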

Wherein, step B specifically includes:

B1, according to successful state updates, the AoI expression is defined as

where τ^k denotes the AoI of the kth sensor at any continuous time t ≥ 0, I_{·} is the indicator function, and the update indicator is a Bernoulli-distributed variable obeying B(1, 1−ε), whose value indicates either that the receiving end successfully receives the information and updates the state, or that the state update fails.

Wherein, step C specifically includes:

C1, calculating the locally calculated average AoI

where C_i denotes the ith local calculation time and λ_l is the local service rate.

C2, calculating the average AoI calculated by the MEC server

where λ_t is the average transmission rate and λ_c is the computing rate of the MEC server.

C3, calculating the average AoI of the partial offloading scheme

where ζ = η_s − η_s δ, η_s is the intensity of the Poisson distribution, and δ is the error probability.

Wherein, step D specifically includes:

D1, according to the distributed computing model and the application scenario of intelligent calculation for power distribution network equipment, the transmitting end and the receiving end are both in a busy processing state, and the conditional distribution of AoI is obtained from the system as

where T_i is the update time of state i, T_i = W_i + S_i, W_i is the waiting time, S_i is the service time, and F_i is the inter-arrival time.

D2, calculating the peak AoI, which is expressed as

where γ is the signal-to-noise ratio and τ_i is the AoI at time i.

Wherein, step E specifically includes:

E1, the peak AoI directly affects the error rate with which power distribution network control signals are executed, and the stability of AoI directly affects the long-term stability of the power distribution network control system. The technical scheme therefore defines an AoI fluctuation coefficient, expressed as

in which the average AoI is the one calculated by the MEC server.

E2, to ensure that the data of equipment sensors in the power distribution network are effectively transmitted to the MEC server and that state updates are effectively achieved, the AoI fluctuation coefficient is optimized.

E3, the optimization problem is solved with a heuristic artificial intelligence method: a Markov decision process is adopted, and multi-agent deep reinforcement learning converts the optimization problem into a partially observable Markov process. The observation set is O = {O_1, …, O_j, …, O_V} and the action set is A = {A_1, …, A_j, …, A_V}. In the observation space, the observation of each agent computing unit at time t is o_j(t) ∈ O_j, which can be described as

In the action set, the action of the jth agent computing unit at time t is a_j(t) ∈ A_j.

E4, determining the reward function, which is a function of the state and the action and measures the effect of a computing unit executing an action in a given state. In the training phase, when an agent computing unit selects an action, the corresponding reward value is fed back to the agent and the agent computing unit updates its state. Each agent computing unit learns an optimal policy from the successive rewards so that the reward is maximized; the reward is built from the AoI fluctuation coefficient and the average AoI.

E5, a multi-agent deep reinforcement learning gradient-descent algorithm based on the agent computing units is adopted; it uses the actor-critic method, and each parameter is initialized as follows:

(1) initializing the actor and critic estimation networks of each agent, and obtaining the action set for the training stage;

(2) initializing the size of the replay memory of each agent;

E6, performing parameter updating on the basis of the initialization: receiving the initial observation spaces O_m and O_j (j = 1, …, V) and setting γ = 0.

E7, in each iteration t, obtaining the current input state, the new observed value and the new input state:

<1> for each compute agent j, selecting the action a_j(t) = μ_j(O_j(t)), executing the current policy π_j, and obtaining the current input state s_j(t);

<2> executing a(t) = {a_1(t), a_2(t), …, a_V(t)} and obtaining the reward θ_m according to the established function, each compute agent obtaining a new observed value o′_j and input state s′_j.

E8, for each compute agent j, the transitions are processed as follows.

<1> if the number of transitions is less than N_γ, storing {s_j(t), a_j(t), γ(t), s′_j} in the buffer;

<2> otherwise:

first, replacing the transition that has been held longest in the buffer;

second, randomly selecting a mini-batch of transitions from the buffer;

third, updating the parameter matrix of the critic evaluation network with loss minimization as the target;

fourth, updating the parameters of the actor evaluation network by maximizing the policy objective function;

fifth, updating the target network parameters of the actor and the critic according to their respective update rules.

E9, when the iteration ends, the optimal signal-to-noise ratio is obtained as γ = γ + γ′(t), from which the optimal AoI fluctuation coefficient is obtained.
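The buffer handling in step E8 (store transitions until the buffer is full, then overwrite the oldest one and sample a mini-batch at random) can be sketched in Python as below. This is a minimal illustration under assumptions: the class name `ReplayBuffer`, its method names and the `seed` parameter are hypothetical, and `capacity` plays the role of N_γ.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest entry once full,
        # which matches "replace the transition held longest in the buffer".
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        self._items.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a random mini-batch without replacement."""
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)
```

One buffer per compute agent j would be created at initialization; the critic and actor updates then consume the batches returned by `sample`.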

Compared with the prior art, the technical scheme has the following advantages:

The technical scheme of the invention addresses the problem that delay performance alone cannot guarantee timely service processing in the information acquisition and control calculation of power distribution network equipment, and proposes an index of information freshness, namely the information age (AoI) fluctuation coefficient. The average information age and the maximum information age of the system are calculated from the model of the intelligent calculation system of the power distribution network equipment. From these two performance indexes, the average local calculation rate, the average transmission rate and the average remote calculation rate of the system are calculated by solving the optimization problem that minimizes the fluctuation coefficient. In this way, based on the characteristics of the operation and maintenance system of power distribution network equipment, the delay of intelligent calculation and transmission is taken into account, the AoI fluctuation coefficient is provided as the information-freshness index for operation and maintenance services, and the transmission and calculation rates (including local and remote calculation) are optimized according to this index.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 is a flowchart of an intelligent calculation method for power distribution network equipment based on information freshness according to an embodiment of the present invention;

Fig. 2 compares the optimization schemes of three algorithms according to an embodiment of the present invention, showing the average AoI obtained under different SNR conditions;

Fig. 3 compares the optimization schemes of four algorithms according to an embodiment of the present invention, showing the extreme AoI for different values of K.

Detailed Description

In view of the characteristics of monitoring, operation and maintenance of power distribution network equipment, the method solves the problem of optimizing communication resources and computing resources during calculation offloading and communication transmission, minimizes AoI under energy constraints, and verifies the advantage of the algorithm in terms of information freshness using measured channel data and the statistical random distribution of the information.

The inventors consider that an MEC-based power distribution communication network can be used. On one hand, applying MEC to the power distribution communication system effectively improves the data calculation rate and achieves stable operation. On the other hand, the intelligent devices can calculate quickly, which further reduces the calculation delay and transmission delay of data during operation and maintenance of the power distribution network equipment, improves the information freshness of the system, and reduces the information age of the system.

In addition, the technical scheme of the invention can optimize the AoI fluctuation coefficient of users through a multi-agent deep reinforcement learning gradient-descent algorithm based on the agent computing units. The algorithm adopts the actor-critic method to ensure that the data of equipment sensors in the power distribution network are effectively transmitted to the MEC server and that state updates are effectively achieved.

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

Referring to Fig. 1, an embodiment of the present invention provides an intelligent calculation method for power distribution network equipment based on information freshness. The method comprises steps A to E, each with the specific sub-steps A1 to A2, B1, C1 to C3, D1 to D2 and E1 to E9 set out in the Disclosure of Invention above.

In practical applications, the communication system may be a power distribution communication system, and includes a Base Station (BS) equipped with an edge server and various electrical devices as intelligent devices. Wherein the base station provides wireless access services and the edge server provides computing services. Various types of electrical devices as smart devices may specifically include: a distribution monitoring terminal device, a distribution automation terminal device, or a distribution transformer monitoring terminal device.

According to the application, on one hand, the MEC is applied to a power distribution communication system, so that the data calculation rate is effectively improved, and the stable operation is realized; on the other hand, the intelligent device can perform optimal calculation, further reduce the calculation delay and the transmission delay of the data during operation and maintenance of the power distribution network device, improve the information freshness of the system, and reduce the average information age of the system.

Specifically, the following specific embodiments are given in the present application in combination with an actual application scenario:

1. System model

Consider K sensors distributed over multiple devices of the power distribution network, each sending its state information (i.e., monitored data) to an edge server (computing node). Because each round of calculation and transmission has a deadline, the transmissions and calculations are indexed by n,

and the time of the nth transmission is denoted t_n. The time for a successful state update (which comprises three parts: information upload time, calculation time and action delivery time) can be expressed as T_n = t_n − t_{n−1}. As the number of sensors increases, a single sensor's transmission can no longer be given its own orthogonal communication resource, so Non-Orthogonal Multiple Access (NOMA) technology [10] needs to be adopted. Meanwhile, the monitoring service of the power distribution network equipment is a URLLC service, which requires finite-block-length communication transmission with at most 250 bytes of effective transmission information. According to finite-block-length transmission theory, the transmission rate can be expressed as

which gives the transmission rate of the kth sensor in the nth transmission.

The signal-to-noise ratio of the transmission channel of the kth sensor in the nth transmission can be calculated according to equation (2)

where the two quantities are the uplink transmit power of the sensor in the nth transmission and the uplink channel gain; N₀ is the power spectral density of the noise, W is the transmission bandwidth, and erfc⁻¹(·) is the inverse complementary error function. During transmission,

the transmit power must not exceed the maximum transmit power, i.e.

To ensure that the transmitted state can be updated, the following must hold

where D^k is the data that must be transmitted for each state update; that is, the information transferred each time must be less than what the maximum transmission rate allows. The state update interval S_n should be no smaller than the minimum interval, i.e. S_n ≥ S_min. The AoI of the kth sensor at any continuous time t ≥ 0 is denoted τ^k. If the kth sensor successfully updates its state at the nth transmission, then at t = t_n the AoI is reset to 0. The expression of AoI is therefore

where I_{·} is the indicator function and the update indicator is a Bernoulli-distributed variable obeying B(1, 1−ε), whose value indicates either that the receiving end successfully receives the information and updates the state, or that the state update fails. Between two consecutive instants the AoI increases linearly, which can be expressed as

τ^k(t_n) = τ^k(t_{n−1}) + t − t_{n−1} (4)
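The reset-and-grow behaviour described above can be sketched in Python. This is a minimal illustration, not the patent's formula: the function names, the `seed` parameter and the convention that the age starts at 0 at time 0 are assumptions. Each update succeeds with probability 1 − ε, matching the Bernoulli B(1, 1−ε) indicator.

```python
import random


def draw_successes(count, eps, seed=None):
    """Bernoulli(1 - eps) success flags, one per update attempt."""
    rng = random.Random(seed)
    return [rng.random() >= eps for _ in range(count)]


def aoi_at_updates(update_times, successes, initial_age=0.0, start_time=0.0):
    """AoI of one sensor evaluated at each update instant t_n.

    Between instants the age grows linearly with elapsed time; a successful
    update resets it to 0, a failed one leaves it growing.
    """
    ages = []
    age, last_t = initial_age, start_time
    for t, ok in zip(update_times, successes):
        age += t - last_t      # linear growth since the previous instant
        if ok:
            age = 0.0          # successful state update resets the age
        ages.append(age)
        last_t = t
    return ages
```

With update instants 1, 2, 4 and outcomes fail, success, fail, the ages at those instants are 1, 0 and 2.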

2. Distributed computing model

In addition to the AoI of state updates, the AoI of local and remote calculation must also be considered. In local calculation the amount of calculation is small, so the maximum AoI is not considered; for convenience of calculation and in line with the practical situation, the average AoI is used. Let the ith local calculation time be C_i. The calculation times are independent and identically distributed exponential variables with mean E[C_i] = 1/λ_l, where λ_l is the local service rate. Because C_i and C_{i−1} are independent of each other, the mean value of their product is

Its second moment is

The locally calculated average AoI then follows from its definition, i.e.
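As a worked check of the local-calculation moments, the first three lines below use only the stated assumption that the C_i are independent exponential variables with rate λ_l. The last line additionally assumes zero-wait operation (a new local calculation starts the moment the previous one finishes), which the text does not state for local calculation; it is a sketch of the standard result under that assumption, not a reproduction of the missing equation, and the symbol \bar{\Delta}_l is introduced here for illustration.

```latex
\begin{aligned}
E[C_i] &= \frac{1}{\lambda_l}, \\
E[C_i C_{i-1}] &= E[C_i]\,E[C_{i-1}] = \frac{1}{\lambda_l^{2}}, \\
E[C_i^{2}] &= \operatorname{Var}(C_i) + E[C_i]^{2} = \frac{2}{\lambda_l^{2}}, \\
\bar{\Delta}_l &= \frac{E[C_i C_{i-1}] + \tfrac{1}{2}E[C_i^{2}]}{E[C_i]}
               = \frac{2}{\lambda_l}
\qquad \text{(assumed zero-wait form)}.
\end{aligned}
```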

For remote calculation, a zero-waiting processing mode is adopted: the arrival process of the calculation queue equals the departure process of channel transmission. Under zero-wait processing the calculation queue follows a Poisson process, so the calculation queue and the edge computing server form an M/M/1 system. The mean time required for transmission is E[F_i] = 1/λ_t, where λ_t is the average transmission rate, and the mean service time is E[B_i] = 1/λ_c, where λ_c is the computing rate of the MEC server. Because transmission times are exponentially distributed and the intervals F_i and F_{i−1} are independent of each other, one obtains

and

According to the service system of the MEC server, the update time of state i is T_i, which consists of two parts, the waiting time W_i and the service time S_i, i.e. T_i = W_i + S_i. The waiting time W_i depends on the (i−1)th state update time T_{i−1} and the inter-arrival time F_i, so that

E[T_i F_{i−1}] = E[W_i F_{i−1}] + E[B_i] E[F_{i−1}] (6)

From E[B_i] = 1/λ_c one can find

From equations (6) and (7), the average AoI calculated by the MEC server is obtained as
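For reference, the well-known closed form of the average AoI of a first-come-first-served M/M/1 queue can be evaluated as below. The sketch assumes that the MEC-server model in the text reduces to a plain M/M/1 system with arrival rate λ_t and service rate λ_c; whether the patent's own equation coincides with this form cannot be confirmed from the text, and the function name is hypothetical.

```python
def average_aoi_mm1(lambda_t, lambda_c):
    """Average AoI of an FCFS M/M/1 queue.

    lambda_t: update arrival (transmission) rate,
    lambda_c: service (MEC computing) rate; stability needs lambda_t < lambda_c.
    """
    if not 0.0 < lambda_t < lambda_c:
        raise ValueError("require 0 < lambda_t < lambda_c for a stable queue")
    rho = lambda_t / lambda_c  # server utilization
    return (1.0 + 1.0 / rho + rho ** 2 / (1.0 - rho)) / lambda_c
```

With λ_t = 0.5 and λ_c = 1 the function returns 3.5. The value grows both when updates are too rare (small ρ) and when the queue is congested (ρ close to 1), which is why the transmission and computing rates have to be tuned jointly.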

3. Average AoI Performance analysis

In most MEC-assisted calculation, however, partial offloading is used. On this basis, the AoI of the partial calculation offloading mode for power distribution network equipment monitoring is analyzed next. Since the amount of calculation is relatively small, the derivation assumes a fixed calculation time, i.e. each task has a fixed size and is allocated fixed computing resources. For partial offloading, if local calculation and transmission are treated as one process, remote calculation can be regarded as a GI/D/1 queuing model provided λ_c ≥ λ_t, with local calculation time 1/λ_t and remote calculation time 1/λ_c. Note that the arrival process of the remote calculation queue is the same as the departure process of the cascade of the local server and the transmission channel. The calculation times of the local and remote servers are deterministic, while the channel transmission time is exponentially distributed. Three expectations in equation (9) therefore need to be calculated,

Since the local calculation time is constant, i.e. L_i = 1/λ_t, E[B_i] can be derived as

From E[F_i], E[F_i F_{i−1}] can be derived, giving

Next, E[T_i F_{i−1}] is calculated. When state i is updated, its waiting time W_i is determined by the system time of the (i−1)th state in the MEC system and the inter-arrival time. Specifically, when T_{i−1} > F_i, i.e. state i−1 is still waiting in the queue or being processed when state i arrives, W_i = T_{i−1} − F_i. Otherwise the waiting time is W_i = 0. The waiting time of the ith state message can therefore be defined as

W_i = (T_{i−1} − F_i)⁺ (13)

From T_i = W_i + S_i, E[T_i F_{i−1}] is obtained as

E[T_i F_{i−1}] = E[(W_i + S_i) F_{i−1}] = E[W_i F_{i−1}] + E[S_i F_{i−1}] (14)
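The waiting-time recursion of equation (13) together with T_i = W_i + S_i can be traced on a sample path with a few lines of Python. The function name and the convention that the first update finds an empty system (zero waiting time) are assumptions made for this sketch.

```python
def system_times(inter_arrivals, service_times):
    """System times T_i = W_i + S_i with W_i = max(T_{i-1} - F_i, 0).

    inter_arrivals[i] is F_i, the gap between update i-1 and update i;
    inter_arrivals[0] is unused because the first update never waits.
    """
    times = []
    prev_t = 0.0
    for i, (f, s) in enumerate(zip(inter_arrivals, service_times)):
        wait = 0.0 if i == 0 else max(prev_t - f, 0.0)
        prev_t = wait + s
        times.append(prev_t)
    return times
```

For inter-arrival times 1, 1, 3 and service times 2, 2, 1 the system times are 2, 3 and 1: the second update waits one time unit for the first to finish, while the third arrives to an empty server.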

From T_i = W_i + S_i and equation (13), W_i is expressed as

where T_{i−2}, the sum of the service time and waiting time of the (i−2)th information state, is independent of S_{i−1}, F_i and F_{i−1}. The probability density function of T_i, T_{i−1} and T_{i−2} under GI/M/1 is

where δ satisfies the following relation

δ = L′(η_s − η_s δ) (17)

where L′(·) is the Laplace transform. For m > 0, the probability density function of M_i is

From the probability density function of M_i, δ can be found,
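Equation (17) defines δ only implicitly, so in practice it is found numerically. The sketch below uses plain fixed-point iteration; the function names are hypothetical, and the two example transforms (exponential and deterministic intervals) are assumptions chosen for illustration, since the transform actually used in the derivation is not reproduced in this text.

```python
import math


def solve_delta(transform, eta, tol=1e-12, max_iter=10_000):
    """Solve delta = transform(eta - eta * delta) by fixed-point iteration.

    Starting from 0 the iterates increase toward the smallest root, which is
    the one of interest when the queue is stable.
    """
    delta = 0.0
    for _ in range(max_iter):
        new = transform(eta - eta * delta)
        if abs(new - delta) < tol:
            return new
        delta = new
    raise RuntimeError("fixed-point iteration did not converge")


def exponential_transform(rate):
    """Laplace transform of an exponential interval with the given rate."""
    return lambda s: rate / (rate + s)


def deterministic_transform(interval):
    """Laplace transform of a constant interval."""
    return lambda s: math.exp(-s * interval)
```

With exponential intervals of rate 0.5 and η_s = 1 the iteration converges to δ = 0.5, the ratio of the two rates, as expected for an M/M/1 system; ζ = η_s − η_s δ then follows directly.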

Let ζ = η_s − η_s δ. From the GI/M/1 probability density function, equation (16), one obtains

E[T_i | M_{i−1} = m_i] = E[((T_{i−2} − m_i)⁺ + F_{i−1} − M_i)⁺] (20)

Integrating over the probability density gives

where

and

From equation (6), one obtains

where

From equation (22), one obtains

The average AoI of the partial offloading scheme, obtained from (11), (12) and (21), is

4. Peak AoI Performance analysis

In the intelligent calculation scheme for power distribution network equipment, the control effect often depends on the fluctuation range between the peak and the average of AoI, so analyzing the peak AoI in edge computing is very important. According to the distributed computing model and the application scenario of intelligent calculation for power distribution network equipment, the transmitting end and the receiving end are both in a busy processing state, so the conditional distribution of AoI can be obtained from the system as

Using the probability density of F_i to obtain the unconditional probability, i.e.

Then, removing the dependence on T_i gives:

Finally, the peak AoI is obtained as
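To make the notion of peak AoI concrete: the age just before update i is delivered equals the inter-arrival time F_i plus the system time T_i, so peaks can be read directly from a sample path. The Python sketch below also gives the standard mean peak AoI of an FCFS M/M/1 queue for comparison. Both functions are illustrative assumptions; the patent's own peak expression involves the signal-to-noise ratio and is not reproduced in this text.

```python
def sample_peak_aoi(inter_arrivals, system_times):
    """Peak ages F_i + T_i for updates i >= 1 of one sample path.

    The first update is skipped because no earlier update defines its peak.
    """
    return [f + t for f, t in zip(inter_arrivals[1:], system_times[1:])]


def peak_aoi_mm1(lambda_t, lambda_c):
    """Mean peak AoI of an FCFS M/M/1 queue: E[F] + E[T]."""
    if not 0.0 < lambda_t < lambda_c:
        raise ValueError("require 0 < lambda_t < lambda_c for a stable queue")
    return 1.0 / lambda_t + 1.0 / (lambda_c - lambda_t)
```

For λ_t = 0.5 and λ_c = 1 the mean peak is 4, compared with an average AoI of 3.5 for the same queue; the gap between the two is the kind of spread that a fluctuation coefficient is meant to capture.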

The peak AoI directly affects the error rate with which distribution network control signals are executed, while the stability of AoI directly affects the long-term stability of the distribution network control system. The scheme therefore adopts a new metric, the AoI fluctuation coefficient. The average AoI represents the speed of control-signal processing over the long term, and the execution efficiency of the system must also be guaranteed over short periods. The AoI fluctuation coefficient of the mth user is defined as

5. Optimization problem and solution

In order to ensure that data of equipment sensors in the power distribution network can be effectively transmitted to the MEC server and state updating can be effectively achieved, the actual needs of monitoring of the power distribution network equipment are considered, and the optimization problem is determined as

Wherein u is_nk(t) is an indicator that the kth task is assigned at device n at time t if u_nkIf (t) is 1, then the task is serviced, otherwise u_nk(t) is 0.Γ is the value of the longest average AoI of a service in the distribution network system, which is related to the time delay of the calculation and action of the long-term distribution network control service.

In a power distribution network control system, the control system is trans-regional, so the decision process is jointly decided by using the information of a plurality of control unitsThe optimization problem of the formula (30) is relatively complex, so that a heuristic artificial intelligence method is adopted for optimization, a Markov decision process is adopted, and multi-agent deep reinforcement learning is adopted to convert the problem of the formula (30) into a partially observed Markov process; let the number of computational units it handles be V, which represents V state sets S in a state set, whose observation set can be expressed as O ═ { O ═ O₁,…,O_j,…,O_VIts action set can be represented as a ═ a₁,…,A_j,…,A_VA state set S describes the state configuration of a plurality of proxy computing units; o is_jState S (t) e S observed at time t by each proxy calculation unit representing the space observed by the jth (j ═ 1, …, V) proxy calculation unit; a. the_jIs the action space of the jth (j ═ 1, …, V) proxy compute unit, which uses the policy pi for each given state S ∈ S for each proxy compute unit_j:S→A_jAn action is selected from the action space based on the observation space. The state of the environmental state at time t may be denoted as S (t) e S, which may be denoted as S (t) λ ═ λ_1c(t),λ_2c(t),…,λ_vc(t),λ_1t(t),λ_2t(t)…,λ_vt(t),λ_1l(t),λ_2l(t)…,λ_vl(t) }. Since the observation space of each proxy computing unit at the time t is o in the observation space_j(t)∈O_jWhich can be described as

The reward function is a function of state and action, and it measures how effective an action taken by a computing unit is in a given state. In the training phase, when an agent computing unit selects an action, the corresponding reward function value is fed back to the agent, and the agent computing unit updates its state. Each agent computing unit learns its optimal policy from the successive rewards. The reward is built from the AoI fluctuation coefficient of equation (29) and the average AoI of equation (24), and the policy is trained to maximize it.
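The role of the reward can be illustrated with a minimal sketch. Equations (24) and (29) are not reproduced in this passage, so the sketch makes two loudly stated assumptions: the AoI is tracked in discrete time slots, and the fluctuation coefficient is taken as (peak AoI - average AoI) / average AoI. The function names, the weight `weight_avg`, and the slot model are hypothetical and are not taken from the patent.

```python
from statistics import mean


def aoi_trace(delivery_slots, delays, horizon):
    """Discrete-time AoI trace (assumed model): the age grows by one per
    slot and drops to the update's delay in the slot where it is delivered."""
    delivered = dict(zip(delivery_slots, delays))
    age, trace = 0, []
    for t in range(horizon):
        age = delivered[t] if t in delivered else age + 1
        trace.append(age)
    return trace


def reward(trace, weight_avg=0.5):
    """Hypothetical reward: negative weighted sum of an ASSUMED fluctuation
    coefficient, (peak - mean) / mean, and the average AoI of the trace."""
    avg = mean(trace)
    peak = max(trace)
    fluctuation = (peak - avg) / avg if avg > 0 else 0.0
    return -(fluctuation + weight_avg * avg)


# Two example traces: irregular deliveries versus regular deliveries.
bursty = aoi_trace(delivery_slots=[2, 5], delays=[1, 2], horizon=6)
steady = aoi_trace(delivery_slots=[1, 3, 5], delays=[1, 1, 1], horizon=6)
print(bursty, reward(bursty))
print(steady, reward(steady))
```

Under these assumptions, a trace with regular deliveries receives a higher (less negative) reward than a trace with irregular deliveries, which is the behavior the agent computing units are trained toward.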

Based on this design, a deep reinforcement learning gradient descent algorithm based on multiple agent computing units is proposed. The algorithm adopts the actor-critic method and can be expressed as follows.

Initialization

(2) Initialize the size of the replay memory of each agent;

Parameter updating

(1) For each episode (outer iteration)

Receive the initial observations O_m and O_j (j = 1, …, V), and set γ = 0.

(2) For each time step t

<2> Execute a(t) = {a₁(t), a₂(t), …, a_v(t)}, obtain the reward θ_m according to the established function, and let each agent computing unit obtain a new observation o'_j and input state s'_j.

(3) For each computing agent j

<1> If the number of stored transitions is less than N_γ,

then store {s_j(t), a_j(t), γ(t), s_j'} in the buffer;

<2> otherwise,

replace the earliest transition held in the buffer.

Randomly select a mini-batch of transitions from the buffer;

according to

and

update the target network parameters of the actor and the critic.

<3> γ = γ + γ′(t)
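The buffer handling in step (3) can be sketched as follows. This is a minimal illustration only: the class name `ReplayBuffer`, its method names, and the use of a seeded random generator are assumptions made for the example, and the actor and critic update rules, whose equations are not reproduced in this passage, are deliberately left out.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen discards its oldest entry automatically once
        # full, which matches "replace the earliest transition in the buffer".
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        """Append one transition; the oldest one is dropped when at capacity."""
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a random mini-batch without replacement."""
        batch_size = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: capacity plays the role of N_γ for one computing agent j.
buf = ReplayBuffer(capacity=3, seed=0)
for i in range(5):
    buf.store(state=i, action=i, reward=float(i), next_state=i + 1)
batch = buf.sample(2)
print(len(buf), batch)
```

In a multi-agent setting, one such buffer would be kept per computing agent, and the sampled mini-batch would feed the actor and critic updates.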

6. Simulation results

Given how distribution network equipment is deployed, the area served by one edge computing agent should not be too large, so the service range used in the simulation is 0.1 km. The simulation also accounts for cooperation among multiple edge agents, with the number of cooperating edge agents set to V = 9. The wireless channel bandwidth is W = 20 MHz, the transmission power of each acquisition sensor is p = 20 dBm, and the noise variance is σ² = -100 dBm. In the simulation, the wireless channel state follows a finite-state Markov chain with state set C = {c₁, c₂, c₃}, whose transition probability matrix is

Other simulation environment settings are shown in Table 1.

Table 1 Simulation environment parameters and values

Parameter	Value
Input data length l of each status update	20 KB
Number of CPU cycles n required for each state update	150 bits
Main frequency f_m of the edge computing server CPU	20 GHz
Main frequency f_s of the local computing unit CPU	1.5 GHz
Sensor to edge server bandwidth W	20 MHz
Distance d from sensor to edge server	0.1 km
Sensor transmission power p	20 dBm
Variance σ² of complex white Gaussian noise	-100 dBm
Duration t_l of each symbol	0.006 ms
Actor learning rate for deep reinforcement learning	Decay from 0.0002 to 0.0000001
Critic learning rate for deep reinforcement learning	Decay from 0.002 to 0.000001
Sensor distribution density d_s	30 per square meter
Discount factor for reward function	Increase from 0.8 to 0.99
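To show how the channel settings interact, the following sketch simulates a three-state Markov channel and evaluates a Shannon-capacity rate using W, p and σ² from Table 1. The transition matrix `TRANSITION` and the per-state channel gains `CHANNEL_GAIN_DB` are hypothetical placeholders, since the matrix and gains used in the simulation are not reproduced in this passage. The use of the Shannon formula is likewise an assumption of the example.

```python
import math
import random

BANDWIDTH_HZ = 20e6   # W = 20 MHz (Table 1)
TX_POWER_DBM = 20.0   # p = 20 dBm (Table 1)
NOISE_DBM = -100.0    # sigma^2 = -100 dBm (Table 1)

# HYPOTHETICAL values, not taken from the patent: gains for c1, c2, c3
# and a row-stochastic transition matrix between those states.
CHANNEL_GAIN_DB = [-110.0, -100.0, -90.0]
TRANSITION = [
    [0.7, 0.3, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.3, 0.7],
]


def dbm_to_watt(dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((dbm - 30) / 10)


def rate_bps(state):
    """Shannon rate W * log2(1 + SNR) for the given channel state index."""
    gain = 10 ** (CHANNEL_GAIN_DB[state] / 10)
    snr = dbm_to_watt(TX_POWER_DBM) * gain / dbm_to_watt(NOISE_DBM)
    return BANDWIDTH_HZ * math.log2(1 + snr)


def simulate_channel(steps, start=0, seed=0):
    """Sample a state path of the finite-state Markov chain."""
    rng = random.Random(seed)
    state, path = start, []
    for _ in range(steps):
        path.append(state)
        state = rng.choices(range(3), weights=TRANSITION[state])[0]
    return path


path = simulate_channel(steps=10)
rates_mbps = [rate_bps(s) / 1e6 for s in path]
print(path)
print([round(r, 1) for r in rates_mbps])
```

With the assumed gains, a better channel state yields a higher rate, so the state path directly drives how quickly each status update can be delivered.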

Fig. 2 compares the average AoI obtained by three optimization schemes under different SNR conditions. The first scheme uses multi-objective optimization with the average AoI as the optimization objective and is denoted "average AoI optimization" in fig. 2. The second scheme uses a single-computing-unit reinforcement learning method and is denoted "single-computational-unit average AoI" in fig. 2. The third scheme is the multi-computing-unit deep reinforcement learning method proposed here, whose optimization index is the AoI fluctuation coefficient; it is denoted "multi-agent AoI fluctuation coefficient" in fig. 2. As can be seen from fig. 2, the proposed scheme obtains a smaller average AoI.

Fig. 3 compares the extreme-value AoI of four optimization schemes for different values of K. The first scheme uses multi-objective optimization with the average AoI as the optimization objective and is denoted "average AoI multi-objective optimization" in fig. 2. The second scheme uses a single-computing-unit reinforcement learning method and is denoted "single agent average AoI" in fig. 2. The third scheme is the multi-computing-unit deep reinforcement learning method proposed here, whose optimization index is the AoI fluctuation coefficient; it is denoted "multiple agent AoI fluctuation coefficient" in fig. 3. The fourth scheme is denoted "single agent AoI fluctuation coefficient". As can be seen from fig. 3, the proposed scheme obtains a smaller extreme-value AoI.

7. Summary of the invention

Based on the characteristics of the operation and maintenance system of power distribution network equipment, the delay of intelligent computation and transmission is taken into account, a metric of information freshness is proposed for the operation and maintenance service, and the transmission and computation rates (including local and remote computation) are optimized according to this metric. This provides theoretical support for subsequent communication design and equipment optimization, as well as a practical reference for actual system deployment.

The parts of this description are presented progressively. Each part emphasizes its differences from the other parts, and for identical or similar content the parts may be referred to one another.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. The intelligent calculation method of the power distribution network equipment based on information freshness is characterized by comprising the following steps:

step B, according to the successful state updating, defining AoI expression;

Wherein, step A specifically includes: