CN116526477B - Method and device for determining power grid reconstruction strategy, computer equipment and storage medium

Info

Publication number: CN116526477B
Application number: CN202310787133.7A
Authority: CN
Inventors: 李鹏; 黄文琦; 梁凌宇; 张焕明; 赵翔宇; 戴珍
Original assignee: Southern Power Grid Digital Grid Research Institute Co Ltd
Current assignee: Southern Power Grid Digital Grid Research Institute Co Ltd
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2024-03-26
Anticipated expiration: 2043-06-30
Also published as: CN116526477A

Abstract

The application relates to a method, a device, computer equipment and a storage medium for determining a power grid reconstruction strategy. The method comprises the following steps: acquiring power grid operation data of a target area under the condition of power grid faults of the target area; acquiring a grid initial state matrix of a target area according to the grid operation data; acquiring a power grid reconstruction strategy of a target area according to a power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model. By adopting the method, the timeliness of determining the power grid reconstruction strategy can be improved.

Description

Method and device for determining power grid reconstruction strategy, computer equipment and storage medium

Technical Field

The present disclosure relates to the field of power grid technologies, and in particular, to a method and apparatus for determining a power grid reconstruction policy, a computer device, and a storage medium.

Background

When a power system fails, it must be restored to service through reconstruction operations such as fault isolation and switch control. As disasters become more frequent and the requirements on power supply reliability rise, determining the reconstruction strategy of the power system in a timely manner plays an important role in ensuring its normal operation.

In the conventional technology, the target profit value of reconstructing each power grid node is calculated by traversing power grid outage data, and the Nash equilibrium solution of each power grid node is calculated using an improved particle swarm algorithm and these target profit values, so that an optimal power grid node is determined and the current power grid is reconstructed according to the optimal power grid node.

However, the conventional technology has a problem that the timeliness of determining the power grid reconstruction strategy is low.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a method, an apparatus, a computer device, and a storage medium for determining a power grid reconstruction policy, which can improve the timeliness of determining the power grid reconstruction policy.

In a first aspect, the present application provides a method for determining a power grid reconstruction policy. The method comprises the following steps:

acquiring power grid operation data of a target area under the condition of power grid faults of the target area;

Acquiring a grid initial state matrix of the target area according to the grid operation data;

acquiring a power grid reconstruction strategy of the target area according to the power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model.

In one embodiment, a sample grid initial state matrix of the target area is obtained according to a training sample set;

performing an iterative operation, the iterative operation comprising: inputting the initial state matrix of the sample power grid into the initial random Monte Carlo tree search model to obtain an initial power grid reconstruction strategy of the target area, and pruning nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive reward values of all nodes in the initial random Monte Carlo tree search model to obtain an intermediate random Monte Carlo tree search model;

The intermediate random Monte Carlo tree search model is used as a new initial random Monte Carlo tree search model, the iterative operation is returned to be executed until a preset convergence condition is reached, and the intermediate random Monte Carlo tree search model corresponding to the convergence condition is determined as the random Monte Carlo tree search model; the convergence condition is determined according to the power grid reconstruction constraint condition of the target area.

In one embodiment, the pruning processing is performed on the nodes related to the initial power grid reconstruction strategy in the initial random monte carlo tree search model according to the recursive reward values of the nodes in the initial random monte carlo tree search model, so as to obtain an intermediate random monte carlo tree search model, including:

and pruning nodes with recursive rewards smaller than a preset threshold value in nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model to obtain the intermediate random Monte Carlo tree search model.

In one embodiment, the inputting the sample grid initial state matrix into the initial random monte carlo tree search model to obtain an initial grid reconstruction strategy of the target area includes:

Determining initial state information of grid nodes corresponding to all nodes in the initial random Monte Carlo tree search model according to the initial state matrix of the sample grid;

and starting searching from the root node of the initial random Monte Carlo tree searching model to obtain an initial power grid reconstruction strategy of the target area.

In one embodiment, the method further comprises:

and acquiring the training sample set according to the historical power grid reconstruction event record of the target area or the power grid reconstruction simulation result of the target area.

In one embodiment, according to the grid operation data, acquiring a grid initial state matrix of the target area includes:

acquiring a power grid state matrix according to the power grid operation data; the grid state matrix characterizes the line state among all grid nodes in the target area;

acquiring a node characteristic matrix according to the power grid operation data; the node characteristic matrix characterizes characteristic information of each power grid node in the target area;

and acquiring a power grid initial state matrix of the target area according to the power grid state matrix and the node characteristic matrix.

In one embodiment, the method further comprises:

carrying out power grid reconstruction on the power grid of the target area according to the power grid reconstruction strategy, and obtaining a power grid reconstruction result;

and according to the power grid reconstruction result, acquiring an evaluation result corresponding to the power grid reconstruction strategy.

In a second aspect, the application further provides a device for determining the power grid reconstruction strategy. The device comprises:

the first acquisition module is used for acquiring power grid operation data of a target area under the condition of power grid faults of the target area;

the second acquisition module is used for acquiring a grid initial state matrix of the target area according to the grid operation data;

the third acquisition module is used for acquiring a power grid reconstruction strategy of the target area according to the power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:

In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:

According to the above method, device, computer equipment and storage medium for determining the power grid reconstruction strategy, an initial random Monte Carlo tree search model is established in advance, and the nodes related to the initial power grid reconstruction strategy in that model are pruned according to the recursive reward values of the power grid nodes in the initial power grid reconstruction strategy output by the model, so that a trained random Monte Carlo tree search model is obtained. Therefore, when a power grid fault occurs in the target area, the power grid operation data of the target area are acquired, the power grid initial state matrix of the target area is obtained from the operation data, and the matrix is input into the trained random Monte Carlo tree search model to obtain the power grid reconstruction strategy of the target area.

Drawings

FIG. 1 is an application environment diagram of a method for determining a power grid reconstruction policy in one embodiment;

FIG. 2 is a first flow chart of the method for determining the power grid reconstruction strategy in one embodiment;

FIG. 3 is a second flow chart of the method for determining the power grid reconstruction strategy in one embodiment;

FIG. 4 is a schematic diagram of a training process of a random Monte Carlo tree search model in one embodiment;

FIG. 5 is a third flow chart of the method for determining the power grid reconstruction strategy in one embodiment;

FIG. 6 is a fourth flow chart of the method for determining the power grid reconstruction strategy in one embodiment;

FIG. 7 is a fifth flow chart of the method for determining the power grid reconstruction strategy in one embodiment;

FIG. 8 is a first block diagram of the device for determining the power grid reconstruction strategy in one embodiment;

FIG. 9 is a second block diagram of the device for determining the power grid reconstruction strategy in one embodiment;

FIG. 10 is a third block diagram of the device for determining the power grid reconstruction strategy in one embodiment;

FIG. 11 is a fourth block diagram of the device for determining the power grid reconstruction strategy in one embodiment;

FIG. 12 is a fifth block diagram of the device for determining the power grid reconstruction strategy in one embodiment;

FIG. 13 is a sixth block diagram of the device for determining the power grid reconstruction strategy in one embodiment;

FIG. 14 is a block diagram of a power grid reconstruction policy determination device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

The method for determining the power grid reconstruction strategy provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1. The computer device may be a terminal, and its internal structure may be as shown in FIG. 1. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements the method for determining the power grid reconstruction strategy. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen; a key, a trackball, or a touch pad arranged on the housing of the computer device; or an external keyboard, touch pad, or mouse.

In one embodiment, as shown in FIG. 2, a method for determining a power grid reconstruction strategy is provided. The method is described by taking its application to the terminal in FIG. 1 as an example, and includes the following steps:

S201, under the condition of grid faults of the target area, acquiring the grid operation data of the target area.

The power grid faults of the target area include faults such as generator set faults, a short circuit or open circuit of a power generation line, and overload of power grid nodes. The power grid operation data of the target area include power grid switch states, power flow data, node load importance, load recovery gain parameters, and the like. Optionally, in this embodiment, the power grid operation data of the target area may be obtained through an operation monitoring system, a relay protection system, or a human-machine interaction interface, which is not limited in this embodiment.

S202, acquiring a grid initial state matrix of a target area according to grid operation data.

The grid initial state matrix is used for representing the grid characteristics and the node characteristics of the target area when the grid fails.

Optionally, the power grid operation data can be normalized, and the normalized power grid operation data is formed into a power grid initial state matrix of the target area according to preset generation logic; or, the power grid operation data can be screened according to a preset rule, and the characteristics of the screened power grid operation data are extracted, so that a power grid initial state matrix of the target area is obtained.
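
As a non-limiting illustration of the normalization option described above, the following minimal sketch scales each column of the operation data to the range [0, 1]. The function name `min_max_normalize`, the choice of min-max scaling, and the sample readings are assumptions made for illustration; the embodiment does not specify a particular normalization or generation logic.

```python
def min_max_normalize(rows):
    """Scale every column of a list-of-rows table to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column has zero span; use 1.0 so the division is defined.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [(value - low) / span for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


# Hypothetical readings: one row per grid node, columns = [voltage in kV, load in MW].
raw_operation_data = [
    [10.0, 2.0],
    [10.5, 4.0],
    [11.0, 6.0],
]
initial_state_matrix = min_max_normalize(raw_operation_data)
```

Scaling per column keeps quantities with different units, such as voltage and load, on a comparable footing before they are assembled into one matrix.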

S203, acquiring a power grid reconstruction strategy of a target area according to a power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model.

The pruning process is used to mitigate the overfitting of the Monte Carlo tree search model caused by excessive decision branches.

In this embodiment, an initial random Monte Carlo tree search model may be established according to a Monte Carlo tree search algorithm and trained to obtain a trained random Monte Carlo tree search model; the power grid reconstruction strategy of the target area is then obtained according to the power grid initial state matrix and this preset random Monte Carlo tree search model. The training process of the random Monte Carlo tree search model includes: pruning the nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to the output of that model; providing an unbiased estimate of the reward value of the expected reconstruction strategy based on historical data and a random (stochastic) optimization method; and evaluating the initial power grid reconstruction strategy and optimizing the random Monte Carlo tree search model according to that evaluation. Illustratively, the random optimization method may be the sample average approximation (SAA) method.

Illustratively, evaluating the initial power grid reconstruction strategy may consist of evaluating each single-step search of the Monte Carlo tree and, for a single-step search that violates a business constraint, feeding back a penalty term. The penalty term comprises $V_{pen,bl}$, representing the island operation penalty, $V_{pen,op}$, representing the repeated switch operation penalty, and $V_{pen,pf}$, representing the power balance penalty.
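
As a non-limiting illustration, the following minimal sketch shows one way the three penalty components could be collected for a single search step. Combining them by a plain sum, the unit penalty size, the mismatch tolerance, and the names `StepPenalty` and `evaluate_step` are all assumptions for illustration; the source text does not state how the components are combined.

```python
from dataclasses import dataclass


@dataclass
class StepPenalty:
    island: float = 0.0          # corresponds to V_pen,bl
    repeat_switch: float = 0.0   # corresponds to V_pen,op
    power_balance: float = 0.0   # corresponds to V_pen,pf

    def total(self) -> float:
        # Assumption: the components are simply added together.
        return self.island + self.repeat_switch + self.power_balance


def evaluate_step(creates_island: bool, repeats_switch: bool, mismatch_mw: float,
                  unit_penalty: float = 1.0, tolerance_mw: float = 1e-3) -> StepPenalty:
    """Return the penalties incurred by one single-step search (hypothetical rules)."""
    mismatch = abs(mismatch_mw)
    return StepPenalty(
        island=unit_penalty if creates_island else 0.0,
        repeat_switch=unit_penalty if repeats_switch else 0.0,
        power_balance=mismatch if mismatch > tolerance_mw else 0.0,
    )
```

For example, `evaluate_step(True, False, 0.5).total()` evaluates to 1.5 under these assumed rules, while a step that violates nothing yields 0.0.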

According to the above method for determining the power grid reconstruction strategy, an initial random Monte Carlo tree search model is established in advance, and the nodes related to the initial power grid reconstruction strategy in that model are pruned according to the recursive reward value of each power grid node in the initial power grid reconstruction strategy output by the model, so that a trained random Monte Carlo tree search model is obtained. When the power grid of the target area fails, the power grid operation data of the target area are acquired, the power grid initial state matrix of the target area is obtained from these data, and the matrix is input into the trained random Monte Carlo tree search model to obtain the power grid reconstruction strategy of the target area. Because the trained random Monte Carlo tree search model is obtained by this pruning, repeated searching is avoided during the search, the search efficiency of the random Monte Carlo tree search model is improved, and the timeliness of determining the power grid reconstruction strategy is improved.

The training process of the random Monte Carlo tree search model will be described in detail below, and in one embodiment, as shown in FIG. 3, the method further includes:

S301, acquiring a sample power grid initial state matrix of a target area according to a training sample set.

In this embodiment, the sample power grid operation data of the target area are extracted from the training sample set by a preset sampling method, which may be, for example, simple random sampling, stratified random sampling, or cluster random sampling. The sample power grid initial state matrix of the target area is then obtained from the sample power grid operation data. Optionally, the sample power grid operation data can be normalized, and the normalized data formed into the sample power grid initial state matrix of the target area according to preset generation logic; or the sample power grid operation data can be screened according to a preset rule, and features extracted from the screened data, so that the sample power grid initial state matrix of the target area is obtained.
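
As a non-limiting illustration of one of the sampling methods named above, the following minimal sketch draws a stratified random sample from a list of records. Grouping by a fault-type label, the function name `stratified_sample`, and the fixed seed are assumptions for illustration only.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_stratum, seed=0):
    """Draw up to `per_stratum` records at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        group = strata[label]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# Hypothetical training records labelled by fault type.
records = [
    {"id": 1, "fault": "line"}, {"id": 2, "fault": "line"}, {"id": 3, "fault": "line"},
    {"id": 4, "fault": "generator"}, {"id": 5, "fault": "overload"},
]
sample = stratified_sample(records, key=lambda r: r["fault"], per_stratum=2)
```

Sampling per stratum keeps rare fault types represented even when one fault type dominates the sample set.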

Optionally, in this embodiment, the training sample set may be obtained according to a historical grid reconstruction event record of the target area or a grid reconstruction simulation result of the target area. As an alternative embodiment, the power grid reconstruction simulation may include power flow verification, simulation data generation, expected fault set generation, etc., so as to generate a training sample set from the power grid reconstruction simulation result.

S302, performing iterative operation, wherein the iterative operation comprises the following steps: inputting a sample power grid initial state matrix into an initial random Monte Carlo tree search model to obtain an initial power grid reconstruction strategy of a target area, and pruning nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all nodes in the initial random Monte Carlo tree search model to obtain an intermediate random Monte Carlo tree search model.

In this embodiment, the sample power grid initial state matrix is input into the initial random Monte Carlo tree search model. As shown in FIG. 4, paths are recursively selected and marked starting from a preset node position, an unmarked node is selected for access, and the search tree structure of the initial random Monte Carlo tree search model is updated, yielding an initial search path from which the initial power grid reconstruction strategy is determined. Back propagation is then performed along the initial search path: the recursive reward values of all nodes related to the initial search path are calculated from the reward values of the nodes; nodes to be pruned, which share a parent node with the searched nodes, are marked according to these recursive reward values; and the marked nodes are pruned from the initial random Monte Carlo tree search model to obtain an intermediate random Monte Carlo tree search model.

Illustratively, the reward estimation function may be obtained according to a preset historical statistical model. The recursive reward value in back propagation may be calculated as shown in formula 1:

(1)

Wherein $Q(v_t)_{r+1}$ is the reward value for searching to the t-th node in the (r+1)-th iterative operation, $Q(v_t)_r$ is the reward value for searching to the t-th node in the r-th iterative operation, the two remaining terms are the evaluation index values for searching to the (t+1)-th node and to the t-th node in the r-th iterative operation, respectively, and K is a constant.

Optionally, in this embodiment, pruning may be performed on a node, where the recursive reward value is smaller than a preset threshold, in the nodes related to the initial power grid reconstruction policy in the initial random monte carlo tree search model, to obtain an intermediate random monte carlo tree search model.
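
As a non-limiting illustration of threshold-based pruning, the following minimal sketch removes, for each node on a search path, the child nodes whose recursive reward value is below a threshold. The `Node` class, the function name `prune_siblings`, and the choice to always keep the nodes that lie on the search path itself are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    recursive_reward: float = 0.0
    children: list = field(default_factory=list)


def prune_siblings(path, threshold):
    """Prune low-reward children of the nodes on `path`; return the pruned names."""
    on_path = {id(node) for node in path}
    removed = []
    for parent in path:
        kept = []
        for child in parent.children:
            # Assumption: nodes on the search path are never pruned.
            if id(child) in on_path or child.recursive_reward >= threshold:
                kept.append(child)
            else:
                removed.append(child.name)
        parent.children = kept
    return removed


# Hypothetical tree: the search went root -> a; b and c are siblings of a.
a, b, c = Node("a", 0.9), Node("b", 0.1), Node("c", 0.5)
root = Node("root", 1.0, [a, b, c])
pruned = prune_siblings([root, a], threshold=0.3)
```

After the call, only sibling `b` has been removed, so later iterations no longer spend search effort on that branch.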

S303, taking the intermediate random Monte Carlo tree search model as a new initial random Monte Carlo tree search model, returning to execute iterative operation until reaching a preset convergence condition, and determining the intermediate random Monte Carlo tree search model corresponding to the convergence condition as a random Monte Carlo tree search model; the convergence condition is determined according to the power grid reconstruction constraint condition of the target area.

In this embodiment, the power grid reconstruction constraint conditions of the target area are obtained according to the training sample set and used as the convergence condition. The intermediate random Monte Carlo tree search model is taken as a new initial random Monte Carlo tree search model and the iterative operation is executed again; when the power grid reconstruction strategy output by the intermediate random Monte Carlo tree search model satisfies the constraint conditions, the current intermediate random Monte Carlo tree search model is determined to be the random Monte Carlo tree search model. The power grid reconstruction constraint conditions of the target area include a power balance constraint, a node voltage constraint, a branch power constraint, and a radial constraint, which are described in detail below:

(1) The power flow of the power grid reconstruction process is calculated by the forward-backward sweep method. Taking as the criterion that the active power required by the node loads and the active power generated by the generator sets remain balanced during power grid reconstruction, the power balance constraint is set as shown in formula 2:

(2)

Wherein $N_{bus}$ represents the total number of nodes of the power grid of the target area and $N_{gen}$ represents the total number of generators of the power grid of the target area; the two remaining symbols represent the active power required by the load of the i-th node and the active power produced by the k-th generator, respectively.

(2) The node voltage constraint is shown in formula 3:

(3)

Wherein the three symbols represent the node voltage, the lower limit of the node voltage, and the upper limit of the node voltage, respectively.

(3) The branch power constraint is shown in formula 4:

(4)

Wherein the three symbols represent the actual active power of the line flowing from node i to node j, the actual reactive power of the line flowing from node i to node j, and the rated power capacity of the line flowing from node i to node j, respectively.

(4) The radial constraint is shown in formulas 5 and 6:

(5)

(6)

Wherein $N_b$ is the total number of nodes of the power grid of the target area, $N_s$ is the number of power grid root nodes of the target area, and $M_{i,cir}$ is the open/closed state of branch m in the i-th power supply loop of the regional power grid, taking the value 1 when the line is closed and 0 when it is open.
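
As a non-limiting illustration, the four kinds of constraint described above can be sketched as simple checks. The exact formulas 2 to 6 are not reproduced here; the sketch assumes commonly used forms (load equals generation within a tolerance, voltage between its limits, apparent power within rated capacity, closed branch count equal to node count minus root count, and at least one open branch per loop). All function names and tolerances are hypothetical.

```python
import math


def power_balanced(load_mw, gen_mw, tol=1e-6):
    """Total load active power equals total generated active power."""
    return math.isclose(sum(load_mw), sum(gen_mw), abs_tol=tol)


def voltages_within_limits(voltages, v_min, v_max):
    """Every node voltage lies between its lower and upper limit."""
    return all(v_min <= v <= v_max for v in voltages)


def branch_within_capacity(p, q, s_rated):
    """Assumed form: P^2 + Q^2 must not exceed the squared rated capacity."""
    return p * p + q * q <= s_rated * s_rated


def is_radial(closed_branch_count, n_nodes, n_roots, loops):
    """Assumed form: a spanning forest with every loop opened at least once.

    `loops` holds one list of 0/1 branch states (1 = closed) per supply loop.
    """
    tree_sized = closed_branch_count == n_nodes - n_roots
    loops_broken = all(sum(states) <= len(states) - 1 for states in loops)
    return tree_sized and loops_broken
```

A candidate reconstruction strategy would be accepted only if all four checks pass, which matches the role of the constraint conditions as the convergence condition.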

In the embodiment, in the training process of the random Monte Carlo tree search model, the nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model are pruned according to the recursive rewarding values of the nodes in the initial random Monte Carlo tree search model, so that repeated searching of the random Monte Carlo tree search model in the searching process can be avoided in the process of determining the power grid reconstruction strategy of the target area, the searching efficiency of the random Monte Carlo tree search model is improved, and the timeliness of the power grid reconstruction strategy of the target area is ensured.
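
As a non-limiting illustration, the iterative training described in this embodiment (search, prune, then repeat until the output strategy satisfies the constraint conditions) can be sketched as a loop. The callables `search`, `prune`, and `satisfies_constraints`, as well as the iteration cap, are hypothetical placeholders and not part of the source text.

```python
def train(model, sample_state, search, prune, satisfies_constraints,
          max_iterations=100):
    """Repeat search-and-prune until the strategy meets the constraints.

    Returns (trained_model, strategy, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        # One iterative operation: obtain a strategy, then prune the model.
        strategy = search(model, sample_state)
        model = prune(model, strategy)  # the intermediate model
        if satisfies_constraints(strategy):
            return model, strategy, iteration
        # Otherwise the intermediate model becomes the new initial model.
    raise RuntimeError("convergence condition not reached")
```

The iteration cap is a safeguard added for the sketch, so that a sample for which no feasible strategy is found does not loop forever.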

In the scenario that the initial state matrix of the sample power grid is input into the initial random Monte Carlo tree search model to obtain the initial power grid reconstruction strategy of the target area, searching can be started from the root node of the initial random Monte Carlo tree search model to obtain the initial power grid reconstruction strategy of the target area. In one embodiment, as shown in fig. 5, S302 includes:

S401, determining initial state information of grid nodes corresponding to all nodes in an initial random Monte Carlo tree search model according to a sample grid initial state matrix.

In this embodiment, the sample initial state matrix of the power grid is used to represent initial state information of each power grid node when the power grid fails, the search tree structure of the initial random monte carlo tree search model includes a plurality of nodes, the power grid of the target area includes a plurality of power grid nodes, and in the training process of the random monte carlo tree search model, the sample initial state matrix of the power grid of the target area is input into the initial random monte carlo tree search model, and the corresponding relation between each node in the search tree structure and each power grid node can be determined first, so that initial state information of the power grid node corresponding to each node in the initial random monte carlo tree search model is determined according to the sample initial state matrix of the power grid.

S402, searching from a root node of an initial random Monte Carlo tree search model to obtain an initial power grid reconstruction strategy of a target area.

In this embodiment, the search starts with the root node of the initial random Monte Carlo tree search model as the first parent node. The child node to be searched in the next step is determined according to the evaluation index value of each child node of the parent node, and that child node becomes the next parent node. In this way the search path corresponding to the current iterative operation is obtained, and the initial power grid reconstruction strategy of the target area is obtained according to the nodes included in the search path. Optionally, the child node with the largest evaluation index value among the child nodes may be used as the next parent node. Illustratively, the calculation formula of the evaluation index value of each node is shown in formula 7:

(7)

Wherein UCT represents the evaluation index value; E is the expectation operator; T is the search step length; $Q(v_t)$ is the reward value for searching to node $v_t$; $N(v_p)$ is the number of times the parent node is accessed in the search; $N(v_c)$ is the number of times the child node is accessed in the search; $V_{pen,i}$ are the penalty terms for violating constraints; and $\gamma$ is a constant coefficient with $\gamma > 0$ (illustratively, $\gamma = 0.8$ in this embodiment). In this embodiment, the search path of the initial random Monte Carlo tree search model is determined by the evaluation index value of each child node, and the global expected search reward replaces the single-step reward value $Q(v_t)$ of a traditional random Monte Carlo tree search model, which speeds up the global optimization of the random Monte Carlo tree search model and prevents the reconstruction search from falling into a locally optimal solution.
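
As a non-limiting illustration of selecting the child node with the largest evaluation index value, the following minimal sketch uses the widely known UCT form (average reward plus an exploration bonus) and subtracts a constraint penalty. This is not formula 7 of this embodiment, which is not reproduced here; the way the penalty enters the score, the dictionary layout of a child, and the function names are assumptions.

```python
import math


def uct_score(total_reward, child_visits, parent_visits, penalty=0.0, gamma=0.8):
    """Standard UCT-style score with an assumed subtractive penalty."""
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    exploit = total_reward / child_visits
    explore = gamma * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit - penalty + explore


def select_child(children, parent_visits, gamma=0.8):
    """Return the child with the largest score."""
    return max(
        children,
        key=lambda c: uct_score(c["reward"], c["visits"], parent_visits,
                                c.get("penalty", 0.0), gamma),
    )
```

The exploration bonus shrinks as a child is visited more often, so rarely visited branches are still examined, while a penalty steers the search away from steps that violate constraints.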

In this embodiment, initial state information of grid nodes corresponding to each node in an initial random monte carlo tree search model is determined according to a sample grid initial state matrix, so that each node in the initial random monte carlo tree search model corresponds to each grid node one by one, the accuracy of searching each node by the initial random monte carlo tree search model is ensured, searching is started from a root node of the initial random monte carlo tree search model, the searching process is more ordered, and node omission in the searching process is avoided.

In the above scenario of acquiring the grid initial state matrix of the target area according to the grid operation data, the grid state matrix and the node feature matrix may be acquired according to the grid operation data, and then the grid initial state matrix may be acquired according to the grid state matrix and the node feature matrix. In one embodiment, as shown in fig. 6, S202 includes:

S501, acquiring a power grid state matrix according to power grid operation data; the grid state matrix characterizes the line states between the grid nodes in the target area.

The line state among the grid nodes comprises information such as a switching state, active power, reactive power, resistance, reactance and the like among the grid nodes.

Optionally, in this embodiment, the grid state matrix may be denoted as G_ij = [δ_ij, P_ij, Q_ij, r_ij, x_ij], where G_ij represents the state matrix of the line flowing from grid node i to grid node j; δ_ij represents the switching state of that line, taking the value 1 when the line is closed and 0 otherwise; P_ij and Q_ij represent the active power and reactive power of the line, respectively; and r_ij and x_ij represent the resistance and reactance of the line, respectively.
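As a non-limiting illustration, the per-line state vector G_ij can be held in a small Python record. The class name `LineState`, its field names and the helper `build_grid_state` are hypothetical, and the numeric values in any usage are placeholders rather than data from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineState:
    closed: bool  # switching state: delta_ij = 1 if closed, else 0
    p: float      # active power P_ij
    q: float      # reactive power Q_ij
    r: float      # resistance r_ij
    x: float      # reactance x_ij

    def as_row(self) -> list:
        """Return [delta_ij, P_ij, Q_ij, r_ij, x_ij]."""
        return [1.0 if self.closed else 0.0, self.p, self.q, self.r, self.x]


def build_grid_state(lines: dict) -> dict:
    """Map each directed line (i, j) to its state row G_ij."""
    return {(i, j): state.as_row() for (i, j), state in lines.items()}
```

Keying the rows by the ordered pair (i, j) keeps the direction "from grid node i to grid node j" explicit.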

S502, acquiring a node characteristic matrix according to power grid operation data; the node characteristic matrix characterizes characteristic information of each power grid node in the target area.

The characteristic information of the power grid node comprises information such as a voltage value, active power, importance degree, load recovery gain coefficient and the like of the power grid node.

Optionally, in this embodiment, the node feature matrix N_i of grid node i may be expressed in terms of the following quantities: U_i, the voltage value of grid node i; the active power of the power supply connected to grid node i; the active power of the load at grid node i; z_i, the load importance degree of grid node i; and η_i, the load recovery gain coefficient of grid node i, which is obtained by solving from historical recovery data.

S503, acquiring a grid initial state matrix of the target area according to the grid state matrix and the node characteristic matrix.

In this embodiment, the grid initial state matrix of the target area is obtained according to the set of grid state matrices and the set of node feature matrices. Optionally, the grid initial state matrix may be represented as F = [G, N], where F is the initial state matrix of the grid reconstruction process, G is the set of grid state matrices, and N is the set of node feature matrices.
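As a non-limiting illustration, steps S502 and S503 can be sketched in Python as follows. The record `NodeFeatures`, its field names (in particular `p_source` and `p_load` for the two active-power quantities) and the function `build_initial_state` are hypothetical, and representing F as a pair (G, N) of dictionaries is an assumption made only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeFeatures:
    voltage: float        # U_i
    p_source: float       # active power of the power supply connected to node i
    p_load: float         # active power of the load at node i
    importance: float     # z_i, load importance degree
    recovery_gain: float  # eta_i, load recovery gain coefficient

    def as_row(self) -> list:
        return [self.voltage, self.p_source, self.p_load,
                self.importance, self.recovery_gain]


def build_initial_state(line_rows: dict, node_features: dict) -> tuple:
    """Assemble F = [G, N] from per-line rows and per-node features."""
    G = {key: list(row) for key, row in sorted(line_rows.items())}
    N = {i: nf.as_row() for i, nf in sorted(node_features.items())}
    return G, N
```

Sorting the keys gives a deterministic ordering, which is convenient if the rows are later stacked into dense arrays.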

In this embodiment, the power grid state matrix and the node feature matrix are each obtained from the power grid operation data, and the power grid initial state matrix of the target area is then obtained from both. Compared with obtaining the power grid initial state matrix from only one of the two, this captures both the power grid state and the node features, so the power grid state information represented in the power grid initial state matrix is more comprehensive and accurate.

After the power grid reconstruction strategy of the target area is obtained, an evaluation result corresponding to the power grid reconstruction strategy can be obtained according to the power grid reconstruction result of the power grid reconstruction strategy. In one embodiment, as shown in fig. 7, the method further includes:

S601, carrying out power grid reconstruction on a power grid of a target area according to a power grid reconstruction strategy, and obtaining a power grid reconstruction result.

In this embodiment, the power grid of the target area may be reconfigured according to the power grid reconfiguration policy of the target area, so as to obtain power grid operation data of the target area after power grid reconfiguration, and obtain a power grid reconfiguration result according to the preset processing logic and the power grid operation data. The preset processing logic may include requirements for various types of data, so that the power grid operation data may be screened according to the requirements for various types of data, unreasonable data or invalid data may be removed, and the screened power grid operation data may be determined as a power grid reconstruction result.

S602, according to the power grid reconstruction result, acquiring an evaluation result corresponding to the power grid reconstruction strategy.

In this embodiment, the power grid reconstruction result may be evaluated according to a preset evaluation index and a preset evaluation method, so as to obtain the evaluation result corresponding to the power grid reconstruction strategy. Optionally, the preset evaluation index may include the power grid load loss amount, the power grid reconstruction recovery time and the load electricity satisfaction degree. Optionally, the preset evaluation method may include the following: when the power grid load loss amount is smaller than a first preset threshold, the evaluation result corresponding to the power grid load loss amount is qualified, and otherwise it is unqualified; when the power grid reconstruction recovery time is less than a preset time, the evaluation result corresponding to the power grid reconstruction recovery time is qualified, and otherwise it is unqualified; and when the load electricity satisfaction degree, which may be obtained from the electricity users in the target area, is larger than a second preset threshold, the evaluation result corresponding to the load electricity satisfaction degree is qualified, and otherwise it is unqualified.
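As a non-limiting illustration, the threshold-based evaluation method described above can be sketched in Python. The names `Thresholds` and `evaluate`, and the units implied by any numeric values, are hypothetical; only the comparison directions follow the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    max_load_loss: float      # first preset threshold
    max_recovery_time: float  # preset time
    min_satisfaction: float   # second preset threshold


def _verdict(ok: bool) -> str:
    return "qualified" if ok else "unqualified"


def evaluate(load_loss: float, recovery_time: float,
             satisfaction: float, th: Thresholds) -> dict:
    """Evaluate each index separately, using strict comparisons as described."""
    return {
        "load_loss": _verdict(load_loss < th.max_load_loss),
        "recovery_time": _verdict(recovery_time < th.max_recovery_time),
        "satisfaction": _verdict(satisfaction > th.min_satisfaction),
    }
```

Because the comparisons are strict, a value exactly equal to its threshold is reported as unqualified, matching the "not smaller than" and "not larger than" wording.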

In this embodiment, according to the power grid reconstruction result, an evaluation result corresponding to the power grid reconstruction policy is obtained, and the output result of the random monte carlo tree search model can be evaluated, so that the random monte carlo tree search model is adjusted according to the evaluation result, and the accuracy of the power grid reconstruction policy is improved.

An embodiment of the present disclosure is described below in connection with a specific grid reconstruction strategy determination scenario, the method comprising the steps of:

S1, acquiring a training sample set according to a historical power grid reconstruction event record of a target area or a power grid reconstruction simulation result of the target area.

S2, acquiring a sample power grid initial state matrix of the target area according to the training sample set.

S3, performing iterative operation, wherein the iterative operation comprises the following steps: determining initial state information of grid nodes corresponding to all nodes in an initial random Monte Carlo tree search model according to a sample grid initial state matrix; starting searching from a root node of an initial random Monte Carlo tree searching model to obtain an initial power grid reconstruction strategy of a target area; and pruning the nodes with the recursive rewarding value smaller than the preset threshold value in the nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model to obtain an intermediate random Monte Carlo tree search model.

S4, taking the intermediate random Monte Carlo tree search model as a new initial random Monte Carlo tree search model, returning to execute iterative operation until reaching a preset convergence condition, and determining the intermediate random Monte Carlo tree search model corresponding to the convergence condition as a random Monte Carlo tree search model; the convergence condition is determined according to the power grid reconstruction constraint condition of the target area.

S5, under the condition of grid faults of the target area, acquiring the grid operation data of the target area.

S6, acquiring a power grid state matrix according to power grid operation data; acquiring a node characteristic matrix according to the power grid operation data; the power grid state matrix characterizes the line state among all power grid nodes in the target area; the node characteristic matrix characterizes characteristic information of each power grid node in the target area.

S7, acquiring a power grid initial state matrix of the target area according to the power grid state matrix and the node characteristic matrix.

S8, acquiring a power grid reconstruction strategy of the target area according to the power grid initial state matrix and a preset random Monte Carlo tree search model.

S9, carrying out power grid reconstruction on the power grid of the target area according to a power grid reconstruction strategy, and obtaining a power grid reconstruction result.

S10, according to the power grid reconstruction result, acquiring an evaluation result corresponding to the power grid reconstruction strategy.
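As a non-limiting illustration, the iterative search-and-prune loop of steps S3 and S4 can be sketched in Python. Several parts are assumptions made only for this sketch: a greedy walk on per-node rewards stands in for the evaluation-index search, the recursive reward is computed by a discounted back-propagation R_k = r_k + d·R_(k+1) along the search path, and the convergence condition is simplified to "no node was pruned in the last iteration" with an iteration cap. The description only states that the convergence condition is determined according to the grid reconstruction constraint conditions. All names are hypothetical, and node names are assumed to be unique.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.remove targets the exact node
class TreeNode:
    name: str
    reward: float
    children: list = field(default_factory=list)


def greedy_path(root: TreeNode) -> list:
    """Stand-in search: follow the child with the largest reward."""
    path = [root]
    while path[-1].children:
        path.append(max(path[-1].children, key=lambda c: c.reward))
    return path


def recursive_rewards(path: list, discount: float = 0.9) -> dict:
    """Back-propagate rewards from the leaf to the root along the path."""
    values, running = {}, 0.0
    for node in reversed(path):
        running = node.reward + discount * running
        values[node.name] = running
    return values


def prune_once(root: TreeNode, threshold: float, discount: float = 0.9) -> list:
    """Remove the first path node whose recursive reward is below threshold."""
    path = greedy_path(root)
    values = recursive_rewards(path, discount)
    for parent, child in zip(path, path[1:]):
        if values[child.name] < threshold:
            parent.children.remove(child)  # the subtree below goes with it
            return [child.name]
    return []


def train(root: TreeNode, threshold: float, max_iters: int = 100) -> TreeNode:
    """Repeat search and pruning until nothing is pruned (assumed convergence)."""
    for _ in range(max_iters):
        if not prune_once(root, threshold):
            break
    return root
```

In this sketch each iteration's pruned tree plays the role of the intermediate model that becomes the new initial model for the next iteration.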

It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to that order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are likewise not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiment of the application also provides a determining device of the power grid reconstruction strategy for realizing the determining method of the power grid reconstruction strategy. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation in the embodiments of the determining device for one or more power grid reconstruction strategies provided below may refer to the limitation of the determining method for the power grid reconstruction strategy hereinabove, and will not be described herein.

In one embodiment, as shown in fig. 8, there is provided a determining apparatus of a power grid reconstruction policy, including: a first acquisition module 10, a second acquisition module 11, and a third acquisition module 12, wherein:

a first obtaining module 10, configured to obtain power grid operation data of a target area in case of a power grid failure of the target area;

the second obtaining module 11 is configured to obtain a grid initial state matrix of the target area according to the grid operation data;

the third obtaining module 12 is configured to obtain a power grid reconstruction policy of the target area according to the power grid initial state matrix and a preset random monte carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model.

The determining device for the power grid reconstruction policy provided in this embodiment may execute the above method embodiment, and its implementation principle and technical effects are similar, and are not described herein again.

In one embodiment, as shown in fig. 9, the apparatus further comprises: a fourth acquisition module 13, a first execution module 14 and a second execution module 15, wherein:

And a fourth obtaining module 13, configured to obtain a sample grid initial state matrix of the target area according to the training sample set.

A first execution module 14, configured to execute an iterative operation, where the iterative operation includes: inputting a sample power grid initial state matrix into an initial random Monte Carlo tree search model to obtain an initial power grid reconstruction strategy of a target area, and pruning nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all nodes in the initial random Monte Carlo tree search model to obtain an intermediate random Monte Carlo tree search model.

The second execution module 15 is configured to take the intermediate random monte carlo tree search model as a new initial random monte carlo tree search model, return to perform iterative operation until a preset convergence condition is reached, and determine the intermediate random monte carlo tree search model corresponding to the convergence condition as a random monte carlo tree search model; the convergence condition is determined according to the power grid reconstruction constraint condition of the target area.

In one embodiment, as shown in fig. 10, the first execution module 14 includes: a processing unit 141, wherein:

and the processing unit 141 is configured to prune nodes, in the initial random monte carlo tree search model, that have a recursive prize value smaller than a preset threshold among nodes related to the initial power grid reconstruction strategy, to obtain an intermediate random monte carlo tree search model.

In one embodiment, as shown in fig. 11, the first execution module 14 includes: a determination unit 142 and a first acquisition unit 143, wherein:

the determining unit 142 is configured to determine initial state information of grid nodes corresponding to each node in the initial random monte carlo tree search model according to the initial state matrix of the sample grid.

The first obtaining unit 143 is configured to perform a search from a root node of the initial random monte carlo tree search model, so as to obtain an initial power grid reconstruction policy of the target area.

In one embodiment, as shown in fig. 12, the apparatus further comprises: a fifth acquisition module 16, wherein:

and a fifth obtaining module 16, configured to obtain a training sample set according to the historical grid reconstruction event record of the target area or the grid reconstruction simulation result of the target area.

In one embodiment, as shown in fig. 13, the second obtaining module 11 includes: a second acquisition unit 111, a third acquisition unit 112, and a fourth acquisition unit 113, wherein:

a second obtaining unit 111, configured to obtain a grid state matrix according to the grid operation data; the grid state matrix characterizes the line states between the grid nodes in the target area.

A third obtaining unit 112, configured to obtain a node feature matrix according to the grid operation data; the node characteristic matrix characterizes characteristic information of each power grid node in the target area.

And the fourth obtaining unit 113 is configured to obtain a grid initial state matrix of the target area according to the grid state matrix and the node feature matrix.

In one embodiment, as shown in fig. 14, the apparatus further comprises: a sixth acquisition module 17 and a seventh acquisition module 18, wherein:

and a sixth obtaining module 17, configured to perform grid reconstruction on the grid of the target area according to a grid reconstruction policy, and obtain a grid reconstruction result.

And a seventh obtaining module 18, configured to obtain an evaluation result corresponding to the power grid reconstruction policy according to the power grid reconstruction result.

The above-mentioned modules in the determining device of the power grid reconstruction strategy may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:

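acquiring power grid operation data of a target area in the case of a power grid fault in the target area;

acquiring a grid initial state matrix of the target area according to the grid operation data;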

acquiring a power grid reconstruction strategy of a target area according to a power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model.

In one embodiment, the processor when executing the computer program further performs the steps of:

acquiring a sample power grid initial state matrix of a target area according to the training sample set;

performing an iterative operation, the iterative operation comprising: inputting a sample power grid initial state matrix into an initial random Monte Carlo tree search model to obtain an initial power grid reconstruction strategy of a target area, and pruning nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all nodes in the initial random Monte Carlo tree search model to obtain an intermediate random Monte Carlo tree search model;

taking the intermediate random Monte Carlo tree search model as a new initial random Monte Carlo tree search model, returning to execute the iterative operation until a preset convergence condition is reached, and determining the intermediate random Monte Carlo tree search model corresponding to the convergence condition as the random Monte Carlo tree search model; the convergence condition is determined according to the power grid reconstruction constraint condition of the target area;

pruning the nodes whose recursive reward value is smaller than the preset threshold among the nodes related to the initial power grid reconstruction strategy in the initial random Monte Carlo tree search model, to obtain an intermediate random Monte Carlo tree search model;

determining initial state information of grid nodes corresponding to all nodes in the initial random Monte Carlo tree search model according to the sample grid initial state matrix;

searching from the root node of the initial random Monte Carlo tree search model to obtain an initial power grid reconstruction strategy of the target area;

acquiring a training sample set according to the historical power grid reconstruction event record of the target area or the power grid reconstruction simulation result of the target area;

acquiring a power grid state matrix according to power grid operation data; the power grid state matrix represents the line state among all power grid nodes in the target area;

carrying out power grid reconstruction on the power grid of the target area according to the power grid reconstruction strategy, and obtaining a power grid reconstruction result.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon which, when executed by a processor, performs the steps of the method embodiments described above.

In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of the method embodiments described above.

It should be noted that, user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the computer program, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric random access memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration and not limitation, RAM may take a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be, without limitation, general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, and the like.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described specifically and in detail, but are not thereby to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims

1. A method for determining a power grid reconstruction strategy, the method comprising:

acquiring a power grid reconstruction strategy of the target area according to the power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model, wherein the recursive rewards of all power grid nodes are counter-propagated along an initial search path corresponding to the initial power grid reconstruction strategy, and are calculated according to rewards of all power grid nodes.

2. The method according to claim 1, wherein the method further comprises:

acquiring a sample power grid initial state matrix of the target area according to a training sample set;

3. The method according to claim 2, wherein the pruning processing is performed on the nodes related to the initial power grid reconstruction strategy in the initial random monte carlo tree search model according to the recursive prize value of each node in the initial random monte carlo tree search model, so as to obtain an intermediate random monte carlo tree search model, including:

4. The method of claim 2, wherein inputting the sample grid initial state matrix into the initial random monte carlo tree search model results in an initial grid reconstruction strategy for the target area, comprising:

5. The method according to any one of claims 2-4, further comprising:

6. The method of claim 1, wherein obtaining a grid initial state matrix for the target area from the grid operation data comprises:

7. The method according to any one of claims 1-4, further comprising:

8. A device for determining a power grid reconstruction strategy, the device comprising:

the third acquisition module is used for acquiring a power grid reconstruction strategy of the target area according to the power grid initial state matrix and a preset random Monte Carlo tree search model; the random Monte Carlo tree search model is obtained by pruning nodes related to an initial power grid reconstruction strategy in the initial random Monte Carlo tree search model according to recursive rewards of all power grid nodes in the initial power grid reconstruction strategy output by the initial random Monte Carlo tree search model, wherein the recursive rewards of all power grid nodes are counter-propagated along an initial search path corresponding to the initial power grid reconstruction strategy, and are calculated according to rewards of all power grid nodes.

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.