CN110971471A

CN110971471A - Power communication backbone network fault recovery method and device based on state perception

Info

Publication number: CN110971471A
Application number: CN201911394553.9A
Authority: CN
Inventors: 吴海洋; 缪巍巍; 江凇; 李伟; 郭波; 贾平; 蒋春霞; 陈兵; 汤震; 张懿; 李箐
Original assignee: State Grid Jiangsu Electric Power Co Ltd Zhenjiang Power Supply Branch; Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Current assignee: State Grid Jiangsu Electric Power Co Ltd Zhenjiang Power Supply Branch; Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2020-04-07
Anticipated expiration: 2039-12-30
Also published as: CN110971471B

Abstract

The invention discloses a state-perception-based method and device for fault recovery in a power communication backbone network. The method obtains the number of unsaturated nodes in the power communication backbone network that actually flip into saturated nodes at the current moment; predicts, from that number, the number of unsaturated nodes that will flip into saturated nodes at the next moment; and takes the predicted number as the input of a reinforcement learning algorithm. When a link e fails, the reinforcement learning algorithm proposes a new network topology as a recovery strategy, establishes a new, more effective recovery strategy according to the feedback obtained, and performs fault recovery according to the final recovery strategy. The invention shows better performance on the power communication backbone network fault recovery problem.

Description

Power communication backbone network fault recovery method and device based on state perception

Technical Field

The invention relates to the technical field of intelligent power grids and network communication, in particular to a method and a device for recovering a fault of a power communication backbone network based on state perception.

Background

Modern power grids rely on a corresponding communication backbone network, through which the safe operation and authority control of the grid are ensured. As an important piece of infrastructure, the communication backbone network differs from general communication networks in its bandwidth, delay and reliability characteristics, and its fault diagnosis, location and recovery are active research topics. The mainstream approach is to alleviate channel congestion by building a fault model of the communication network and its cache capacity; typical examples are the fault detection and recovery method based on distributed deployment (DM-FDR) and the fault detection and recovery method based on a genetic algorithm (GA-CFRM).

However, existing methods cannot guarantee that smart grid services are completely isolated when a link failure occurs, and therefore cannot recover from the failure at minimal cost. Reinforcement learning, one of the core technologies of artificial intelligence, is highly effective at analyzing large-scale data. As a tool suited to analyzing rapidly growing, large-scale heterogeneous data, reinforcement learning is well suited to predicting and handling communication backbone network faults.

Disclosure of Invention

The invention provides a state-sensing-based power communication backbone network fault recovery method, which solves the problems that rapid isolation cannot be realized when a link fault occurs in the conventional power communication backbone network, and the fault recovery cost is high.

In order to achieve the above purpose, the invention adopts the following technical scheme: a power communication backbone network fault recovery method based on artificial intelligence and state perception comprises the following steps:

acquiring the number of unsaturated nodes in the power communication backbone network that actually flip into saturated nodes at the current moment;

correcting and predicting, through weighting parameters, the number of unsaturated nodes that actually flip into saturated nodes, to obtain the number of unsaturated nodes that flip into saturated nodes at the next moment;

and taking the number of unsaturated nodes that flip into saturated nodes at the next moment as the input of the reinforcement learning algorithm; when a link e fails, the reinforcement learning algorithm proposes a new network topology as a recovery strategy, establishes a new, more effective recovery strategy according to the feedback obtained, and performs fault recovery according to the final recovery strategy.

Further, the number of unsaturated nodes that actually flip into saturated nodes is:

N_act(t)＝p_r(k,q)·N_max(t)

in the formula, k and q are node numbers; p_r(k, q) represents the flip probability of node k when at least M_max packets are dropped at node q; n_k indicates the number of packets at node k; M_max represents the maximum number of packets that a node can receive; n_q indicates the number of packets at node q,

representing the maximum number of packets at node k,

represents the maximum number of packets at node q;

where m_size is the packet size; B is the channel transmission rate; Δt is the time interval;

N_max(t)＝N_v(t)

N_max(t) represents the maximum number of unsaturated nodes that can flip into saturated nodes at time t, and N_v(t) represents the number of nodes that can communicate with the current node at time t.

Further, the number of unsaturated nodes that flip into saturated nodes at the next moment is N_act(t+Δt):

N_act(t+Δt)＝p_r(k,q)·N_max(t+Δt)

N_max(t+Δt)＝N′_v(t+Δt)：

N′_v(t+Δt) is the average of the sum of the predicted value of the number of nodes with which the current node can communicate and the predicted values of the number of nodes with which all reachable nodes around the current node can communicate;

under the normalization condition u + b + d = 1, the simultaneous system of the following three equations is solved:

where v and s are nodes whose numbers of communicable nodes at time t are known to be N_v(t) and N_s(t) respectively; N_v(t+Δt) is the predicted value of the number of nodes that can communicate with the current node v at time t+Δt; N_s(t+Δt) is the predicted value of the number of nodes that can communicate with node s at time t+Δt, node s being a node that communicates with the current node v; b represents the weight of the flip-state perception result of the current node v; d represents the influence weight of the flip-state perception result of node s; u represents the empirical influence weight of other unknown factors on the link between the two nodes; and N_v(t+Δt), N_s(t+Δt), b, d and u are obtained by solving the system.

Further, when a link e fails, the reinforcement learning algorithm proposes a new network topology as a recovery strategy, establishes a new, more effective recovery strategy according to the feedback obtained, and performs fault recovery according to the final recovery strategy; the specific steps include:

1) when a link e ∈ E fails, extracting the services affected by the failure from the new network topology, where E represents the set of links;

2) recalculating the risk value of each service in the power communication network under the new network topology, sorting the risk values by size, removing links whose risk is above a threshold, and then solving the shortest path between nodes to serve as the service path;

3) if the service path exists, performing fault recovery through the service path; otherwise, continuing to update the network structure, recalculating the risk values and solving for the path until a shortest path is found and fault recovery is achieved.

A power communication backbone network fault recovery device based on artificial intelligence and state perception comprises:

an actual flip node acquisition module, used for acquiring the number of unsaturated nodes in the power communication backbone network that actually flip into saturated nodes at the current moment;

a next-moment flip node prediction module, used for correcting and predicting, through weighting parameters, the number of unsaturated nodes that actually flip into saturated nodes, to obtain the number of unsaturated nodes that flip into saturated nodes at the next moment;

and a fault recovery module, used for taking the number of unsaturated nodes that flip into saturated nodes at the next moment as the input of the reinforcement learning algorithm; when a link e fails, the reinforcement learning algorithm proposes a new network topology as a recovery strategy, establishes a new, more effective recovery strategy according to the feedback obtained, and performs fault recovery according to the final recovery strategy.

N_act(t)＝p_r(k,q)·N_max(t)

representing the maximum number of packets at node k,

represents the maximum number of packets at node q;

N_max(t)＝N_v(t)

N_act(t+Δt)＝p_r(k,q)·N_max(t+Δt)

N_max(t+Δt)＝N′_v(t+Δt)：

where v and s are nodes whose numbers of communicable nodes at time t are known to be N_v(t) and N_s(t) respectively; N_v(t+Δt) is the predicted value of the number of nodes that can communicate with the current node v at time t+Δt; N_s(t+Δt) is the predicted value of the number of nodes that can communicate with node s at time t+Δt, node s being a node that communicates with the current node v; b represents the weight of the flip-state perception result of the current node v; d represents the influence weight of the flip-state perception result of node s; u represents the empirical influence weight of other unknown factors on the link between the two nodes; and N_v(t+Δt), N_s(t+Δt), b, d and u are obtained by solving.

The invention achieves the following beneficial effects: by combining node state perception with reinforcement learning, the invention obtains an automatic fault recovery method and applies it to the power communication backbone network, greatly reducing human intervention and improving the reliability of network services. When a link fault occurs in the power communication backbone network, rapid isolation is achieved and the fault recovery cost is low;

compared with the traditional DM-FDR and GA-CFRM methods, the method performs better on the power communication backbone network fault recovery problem.

Drawings

FIG. 1 is a schematic diagram of an infrastructure network and corresponding serving network;

FIG. 2 is a schematic diagram of the reinforcement learning algorithm architecture for the power communication backbone network;

FIG. 3 is a flow chart of the RL-FRA algorithm.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.

Referring to FIG. 1, a basic power communication backbone network is shown, consisting of the actual physical nodes A-F and the various service networks carried on it. Corresponding network services run on the network in different topologies, including service network 1 formed by nodes a₁-d₁ and service network 2 formed by nodes a₂-e₂.

Example 1:

a power communication backbone network fault recovery method based on artificial intelligence and state perception comprises the following steps:

Step 1, acquiring the number of unsaturated nodes in the power communication backbone network that actually flip into saturated nodes at the current moment t;

The cache occupancy of a node is a direct indicator of the communication state of the network. For the power communication backbone network, however, the state of a single node is not enough to reflect the congestion of all surrounding nodes. If only one link is congested, a reasonable change in forwarding can effectively relieve the congestion. Nodes in the network can be classified into saturated nodes and unsaturated nodes according to their cache occupancy. When a node judges its own state, it can determine the corresponding congestion state from the information returned by the other nodes within its communication range. Ideally, the upper limit on the number of unsaturated nodes that flip into saturated nodes can be determined from the information received from peripheral nodes within a period of time:

N_max(t)＝N_v(t) (1)

In formula (1), N_max(t) represents the maximum number of unsaturated nodes that can flip into saturated nodes at time t, and N_v(t) represents the number of nodes that can communicate with the current node v at time t, i.e. the node congestion status. Consider that the maximum amount of data that can be transmitted on a link between two nodes over a time interval Δt is B × Δt, where B is the channel transmission rate. If the residual cache capacity of a node is larger than B × Δt, the probability that the node flips into a saturated node is 0; otherwise the node is in a transition state, and the maximum number of data packets M_max that a node in the transition state can receive at that moment is:

where m_size is the packet size.
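The threshold test described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name and the choice of units (bits, bits per second, seconds) are assumptions made for the example, and the formula for M_max is not reproduced because it is not available in the text.

```python
def in_transition_state(residual_cache_bits, rate_bps, dt_s):
    """Return True when a node may flip into a saturated node.

    Per the description: if the residual cache capacity is larger than
    B * Δt, the flip probability is 0; otherwise the node is in the
    transition state. Units are an assumption of this sketch.
    """
    if residual_cache_bits < 0 or rate_bps < 0 or dt_s < 0:
        raise ValueError("inputs must be non-negative")
    return residual_cache_bits <= rate_bps * dt_s


# Hypothetical values: a 1000 bit/s channel observed over 1 s.
roomy_node = in_transition_state(2000, 1000, 1.0)   # cache exceeds B * Δt
tight_node = in_transition_state(500, 1000, 1.0)    # cache below B * Δt
```

Only nodes for which this check returns True need a non-zero flip probability to be evaluated.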

According to the routing principle, for any two nodes k and q, when at least M_max data packets are dropped at node q, node k will flip, and the flip probability of node k is:

representing the maximum number of packets at node k,

represents the maximum number of packets at node q;

Therefore, the number of unsaturated node flips that actually occur is given by equation (4):

N_act(t)＝p_r(k,q)·N_max(t) (4)

In the formula, N_act(t) represents the number of node flips that actually occur. The more flips actually occur, the higher the congestion of the node; the congestion degree of a node can therefore be evaluated automatically from the number of flips that actually occur (namely, the flip-state perception result of the node).
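Equations (1) and (4) can be illustrated with a short sketch. The flip probability p_r(k, q) is taken as a given input here, since its formula is not reproduced in the text; the function name and the sample values are hypothetical.

```python
def actual_flip_count(p_r, n_v):
    """N_act(t) = p_r(k, q) * N_max(t), with N_max(t) = N_v(t).

    p_r: flip probability of node k, supplied by the caller (assumed known).
    n_v: number of nodes that can communicate with the current node at time t.
    """
    if not 0.0 <= p_r <= 1.0:
        raise ValueError("p_r must be a probability in [0, 1]")
    if n_v < 0:
        raise ValueError("n_v must be non-negative")
    n_max = n_v              # equation (1)
    return p_r * n_max       # equation (4)


# Hypothetical example: 8 communicable neighbours, flip probability 0.25.
n_act = actual_flip_count(0.25, 8)
```

A larger returned value indicates a more congested node, in line with the interpretation given above.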

Step 2, correcting and predicting, through weighting parameters, the number of unsaturated nodes that actually flip into saturated nodes, to obtain the number of unsaturated nodes that flip into saturated nodes at the next moment;

The state information of network nodes is highly dynamic, so the number N_v(t) of nodes that can communicate with the current node v at time t cannot, on its own, accurately predict the state of node v at the next time t+Δt, i.e. the number N_v(t+Δt) of nodes that can communicate with the current node v at time t+Δt. The invention improves the prediction accuracy of the node state by quantifying the uncertainty of the node state. For nodes v and s, the numbers of nodes with which each can communicate at time t are known to be N_v(t) and N_s(t); the corresponding N_v(t+Δt) and N_s(t+Δt) at time t+Δt are predicted from N_v(t), N_s(t) and the weight array (b, d, u). Here b represents the weight with which the flip-state perception result of the current node v in step 1 (i.e. the number N_v(t) of nodes that can communicate with the current node v at time t) is positively correlated with the congestion state of the node at the next time; d represents the influence weight of the flip-state perception result of the peer node s that is in communication with node v; and u represents the empirical influence weight of other unknown factors on the link between the two nodes. The three weight factors satisfy the normalization requirement. To correct for the influence of other factors and improve the prediction accuracy of the node states N_v(t+Δt) and N_s(t+Δt) at time t+Δt, u is normalized to:

The other two weighting factors are calculated according to the following formulas:

N_v(t+Δt) and N_s(t+Δt) between two nodes are obtained by solving equations (5), (6) and (7) simultaneously under the normalization condition (i.e., u + b + d = 1). After the equations have been set up pairwise for every node and its peripheral nodes, the solutions N_v(t+Δt), N_s(t+Δt) and u, b, d for each node are found iteratively in a mean-field manner, where N_v(t+Δt) is initialized with N_v(t), N_s(t+Δt) is initialized with N_s(t), and u, b, d are initialized with 1, 0, respectively.

The state information N_s(t) of all reachable nodes around the current node v is sensed, and the prediction is corrected according to the weighting parameters to obtain N_s(t+Δt). The final predicted value N′_v(t+Δt) of the current node v at time t+Δt is the average of the sum of the predicted value N_v(t+Δt) of the number of nodes with which the current node can communicate and the predicted values N_s(t+Δt) of the number of nodes with which all peripheral reachable nodes can communicate. The predicted value N′_v(t+Δt) of the number of nodes with which the current node can communicate at the next moment is used to correct N_v(t), so as to predict more accurately the number N_act(t+Δt) of unsaturated nodes that flip into saturated nodes at the next moment.

The number N_act(t+Δt) of flip nodes at time t+Δt is predicted as:

N_act(t+Δt)＝p_r(k,q)·N_max(t+Δt) (8)

N_max(t+Δt)＝N′_v(t+Δt) (9)

The number of flip nodes that actually occur in step 1 is corrected and predicted through the weighting parameters to obtain the number of flip nodes at the next moment; N_max(t+Δt) represents the maximum number of unsaturated nodes that can flip into saturated nodes at time t+Δt;
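Equations (8) and (9) can be sketched as below. Two assumptions are made for illustration: the per-node predictions N_v(t+Δt) and N_s(t+Δt) are taken as already solved (the weight equations (5) to (7) are not reproduced in the text), and "the average of the sum" is read as the arithmetic mean over the current node's prediction together with those of its reachable neighbours. All names are hypothetical.

```python
def corrected_prediction(n_v_pred, neighbor_preds):
    """N'_v(t+Δt): mean of the current node's predicted value and the
    predicted values of all reachable peripheral nodes (assumed reading)."""
    values = [n_v_pred] + list(neighbor_preds)
    return sum(values) / len(values)


def predicted_flip_count(p_r, n_v_pred, neighbor_preds):
    """N_act(t+Δt) = p_r(k, q) * N_max(t+Δt), with N_max(t+Δt) = N'_v(t+Δt)."""
    n_max_next = corrected_prediction(n_v_pred, neighbor_preds)  # equation (9)
    return p_r * n_max_next                                      # equation (8)


# Hypothetical predictions: current node 4.0, two neighbours 2.0 and 3.0.
n_v_corrected = corrected_prediction(4.0, [2.0, 3.0])
n_act_next = predicted_flip_count(0.5, 4.0, [2.0, 3.0])
```

The resulting value is what step 3 consumes as the input to the reinforcement learning algorithm.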

Step 3, taking the number of unsaturated nodes that flip into saturated nodes at the next moment as the input of the reinforcement learning algorithm; when a link e fails, the reinforcement learning algorithm proposes a new network topology as a recovery strategy, establishes a new, more effective recovery strategy according to the feedback obtained, and performs fault recovery according to the final recovery strategy.

The reinforcement learning algorithm for the power communication backbone network is shown in FIG. 2.

The input of the algorithm is the predicted number of flip nodes at the next moment, and the output is a newly learned network topology that serves as the recovery strategy. Reinforcement learning is a machine learning method in which a learning end (agent) learns a strategy. During learning, the learning end proposes a possible new network topology as the trial action a for a specific failure. High-risk links are removed from the network structure, and the reasonableness of the resulting network structure is then calculated to reach a conclusion about action a, which serves as the supervision information s for reinforcement learning. The learning end judges from the information s, following the recursion of the reinforcement learning algorithm, whether the current action yields a positive gain r. Through learning, the learning end finally acquires a series of effective network topology transformation strategies, which ensures that a new replacement structure can be recovered from the faulty topology in one step.
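The trial-and-feedback loop described above can be illustrated with a deliberately simple stand-in. The sketch below is not the RL-FRA algorithm: it swaps in a basic epsilon-greedy value update over a fixed list of candidate topologies, and the evaluate callback (standing in for the feedback s and the gain r), the parameter values and all names are assumptions made for the example.

```python
import random


def learn_recovery_topology(candidates, evaluate, episodes=100,
                            epsilon=0.1, alpha=0.5, seed=0):
    """Pick the candidate topology with the highest learned gain.

    candidates: list of hashable topology identifiers (trial actions a).
    evaluate:   callable returning a numeric gain r for a candidate,
                a placeholder for the feedback computed from the network.
    """
    rng = random.Random(seed)
    # Try every candidate once so each has an initial value estimate.
    value = {a: evaluate(a) for a in candidates}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(candidates)          # explore
        else:
            action = max(candidates, key=value.get)  # exploit
        gain = evaluate(action)
        # Move the estimate toward the observed gain.
        value[action] += alpha * (gain - value[action])
    return max(candidates, key=value.get)


# Hypothetical gains for three candidate topologies.
gains = {"topology_1": 0.2, "topology_2": 1.0, "topology_3": -0.5}
best = learn_recovery_topology(list(gains), gains.get)
```

With deterministic gains the estimates never move after the first pass, so the loop only matters when the feedback is noisy; a real implementation would also condition on the failure and on the predicted flip count.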

When a link e fails, the reinforcement learning algorithm proposes a new network topology as a recovery strategy, establishes a new, more effective recovery strategy according to the feedback obtained, and performs fault recovery according to the final recovery strategy, as shown in FIG. 3; the specific steps include:

1) when a link e ∈ E fails, extracting the services affected by the failure from the new network topology; the communication network topology is assumed to be represented by G′(V, E), where V represents the set of node devices and E represents the set of links.

3) if the service path exists, fault recovery is performed through the service path; otherwise, the network structure continues to be updated, the risk values are recalculated and the path is solved again until a shortest path is found and fault recovery is achieved.
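One pass of the path-solving part of this procedure (drop the failed link, remove links whose risk is above a threshold, then solve the shortest path) can be sketched as follows. The risk values are taken as given inputs because their calculation is not reproduced in the text; the link costs, the threshold, the node names and the use of Dijkstra's algorithm are assumptions of this sketch.

```python
import heapq


def shortest_path(adj, src, dst):
    """Dijkstra over adj = {node: {neighbor: cost}}; returns a node list or None."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None


def recover_service(links, risk, failed, src, dst, threshold):
    """Return a service path avoiding the failed link and high-risk links,
    or None when no path exists under the current topology."""
    adj = {}
    for (u, v), cost in links.items():
        if (u, v) == failed or risk[(u, v)] > threshold:
            continue
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return shortest_path(adj, src, dst)


# Hypothetical topology: the service A -> D loses link (B, D).
links = {("A", "B"): 1.0, ("B", "D"): 1.0, ("A", "C"): 2.0,
         ("C", "D"): 2.0, ("A", "D"): 5.0}
risk = {("A", "B"): 0.1, ("B", "D"): 0.1, ("A", "C"): 0.1,
        ("C", "D"): 0.1, ("A", "D"): 0.9}
path = recover_service(links, risk, ("B", "D"), "A", "D", threshold=0.5)
```

When the function returns None, the procedure above calls for updating the network structure, recalculating the risk values and solving again; that outer loop is left to the caller here because the update rule is not specified in the text.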

Example 2:

N_act(t)＝p_r(k,q)·N_max(t)

representing the maximum number of packets at node k,

represents the maximum number of packets at node q;

N_max(t)＝N_v(t)

N_act(t+Δt)＝p_r(k,q)·N_max(t+Δt)

N_max(t+Δt)＝N′_v(t+Δt)：

where v and s are nodes whose numbers of communicable nodes at time t are known to be N_v(t) and N_s(t) respectively; N_v(t+Δt) is the number of nodes that can communicate with the current node v at time t+Δt; N_s(t+Δt) represents the number of nodes that can communicate with node s at time t+Δt, node s being a node that communicates with the current node v; b represents the weight of the flip-state perception result of the current node v; d represents the influence weight brought by saturated nodes in the flip-state perception result of node s; u represents the empirical influence weight of other unknown factors on the link between the two nodes; and N_v(t+Δt), N_s(t+Δt), b, d and u are obtained by solving.

1) when a link e ∈ E fails, extracting the services affected by the failure from the new network topology; E denotes the set of links.

Simulation results in a virtual environment show that, compared with the traditional DM-FDR and GA-CFRM methods, the proposed method (RL-FRA) improves service recovery capability and shortens recovery time in networks of different scales; the advantage is particularly evident when the network grows beyond 500 nodes.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims

1. A power communication backbone network fault recovery method based on artificial intelligence and state perception is characterized in that: the method comprises the following steps:

2. The method for recovering the fault of the power communication backbone network based on artificial intelligence and state perception as claimed in claim 1, wherein: the number of unsaturated nodes that actually flip into saturated nodes is:

N_act(t)＝p_r(k,q)·N_max(t)

representing the maximum number of packets at node k,

represents the maximum number of packets at node q;

N_max(t)＝N_v(t)

3. The method for recovering the fault of the power communication backbone network based on artificial intelligence and state perception as claimed in claim 2, wherein: the number of unsaturated nodes that flip into saturated nodes at the next moment is N_act(t+Δt):

N_act(t+Δt)＝p_r(k,q)·N_max(t+Δt)

N_max(t+Δt)＝N_v'(t+Δt)：

N_v'(t+Δt) is the average of the sum of the predicted value of the number of nodes with which the current node can communicate and the predicted values of the number of nodes with which all reachable nodes around the current node can communicate;

where v and s are nodes whose numbers of communicable nodes at time t are known to be N_v(t) and N_s(t) respectively; N_v(t+Δt) is the predicted value of the number of nodes that can communicate with the current node v at time t+Δt; N_s(t+Δt) is the predicted value of the number of nodes that can communicate with node s at time t+Δt, node s being a node that communicates with the current node v; b represents the weight of the flip-state perception result of the current node v; d represents the influence weight of the flip-state perception result of node s; u represents the empirical influence weight of other unknown factors on the link between the two nodes; and N_v(t+Δt), N_s(t+Δt), b, d and u are obtained by solving.

4. The method for recovering the fault of the power communication backbone network based on artificial intelligence and state perception as claimed in claim 1, wherein: when a certain link e fails, the reinforcement learning algorithm provides a new network topology structure as a recovery strategy, a new more effective recovery strategy is established according to the obtained feedback, and the failure recovery is carried out according to the final recovery strategy, which specifically comprises the following steps:

5. A power communication backbone network fault recovery device based on artificial intelligence and state perception, characterized in that the device comprises:

6. The device for recovering the fault of the power communication backbone network based on artificial intelligence and state perception of claim 5, wherein: the number of unsaturated nodes that actually flip into saturated nodes is:

N_act(t)＝p_r(k,q)·N_max(t)

representing the maximum number of packets at node k,

represents the maximum number of packets at node q;

N_max(t)＝N_v(t)

7. The device for recovering the fault of the power communication backbone network based on artificial intelligence and state perception of claim 5, wherein: the number of unsaturated nodes that flip into saturated nodes at the next moment is N_act(t+Δt):

N_act(t+Δt)＝p_r(k,q)·N_max(t+Δt)

N_max(t+Δt)＝N_v'(t+Δt)：

8. The device for recovering the fault of the power communication backbone network based on artificial intelligence and state perception of claim 5, wherein: when a certain link e fails, the reinforcement learning algorithm provides a new network topology structure as a recovery strategy, a new more effective recovery strategy is established according to the obtained feedback, and the failure recovery is carried out according to the final recovery strategy, which specifically comprises the following steps: