Disclosure of Invention
In view of the above, it is necessary to provide a power distribution network fault recovery method, apparatus, control device and storage medium.
In a first aspect, a method for recovering a fault of a power distribution network is provided, and the method includes:
under the condition that the power distribution network fails, acquiring the current observable state of the power distribution network, wherein the observable state comprises fault state parameters and environmental parameters causing the power distribution network to fail;
inquiring a historical self-healing action database according to the current observable state to obtain a control action set corresponding to the current observable state, wherein a plurality of corresponding relations between the observable state and the control action are stored in the historical self-healing action database;
sequentially simulating and executing the control actions in the control action set, and acquiring the troubleshooting overhead corresponding to the executed control actions after each execution;
determining an optimal control action from the control action set according to the obtained multiple troubleshooting overheads;
and performing fault recovery on the power distribution network based on the optimal control action.
In an optional embodiment of the present application, acquiring, after each execution, the troubleshooting overhead corresponding to the executed control action includes: after each execution, determining an execution overhead corresponding to the executed control action and a compensation value of the flexible load in the power distribution network for the execution overhead; and determining the troubleshooting overhead according to the execution overhead and the compensation value.
In an optional embodiment of the present application, determining the troubleshooting overhead according to the execution overhead and the compensation value includes: determining a first difference between the execution overhead and the compensation value to obtain the troubleshooting overhead.
In an optional embodiment of the present application, determining an optimal control action from the control action set according to the obtained plurality of troubleshooting overheads includes: calculating a second difference between each of the troubleshooting overheads and a preset troubleshooting overhead to obtain a plurality of second differences; and determining the control action in the control action set that corresponds to the minimum of the second differences as the optimal control action.
In an optional embodiment of the present application, the observable state includes the distributed power source power, the flexible load power and the line damage state in the power distribution network.
In an optional embodiment of the present application, the method further includes: normalizing the distributed power source power, the flexible load power and the line damage state to obtain the observable state.
In an optional embodiment of the present application, the set of control actions comprises an action state of a tie switch in the power distribution network.
In a second aspect, a power distribution network fault recovery device is provided, the device comprising: the system comprises a state acquisition module, a control action determination module, an overhead determination module, an optimal action determination module and a fault recovery module.
The state acquisition module is used for acquiring the current observable state of the power distribution network under the condition that the power distribution network fails, wherein the observable state comprises fault state parameters and environmental parameters causing the power distribution network to fail;
the control action determining module is used for inquiring a historical self-healing action database according to the current observable state to obtain a control action set corresponding to the current observable state, and a plurality of corresponding relations between the observable state and the control action are stored in the historical self-healing action database;
the overhead determining module is used for sequentially simulating and executing the control actions in the control action set and acquiring the troubleshooting overhead corresponding to the executed control action after each execution;
the optimal action determining module is used for determining an optimal control action from the control action set according to the obtained plurality of troubleshooting overheads;
the fault recovery module is used for carrying out fault recovery on the power distribution network based on the optimal control action.
In an optional embodiment of the present application, the overhead determination module comprises: a first overhead module and a second overhead module. The first overhead module is used for determining, after each execution, the execution overhead corresponding to the executed control action and the compensation value of the flexible load in the power distribution network for the execution overhead; the second overhead module is used for determining, after each execution, the troubleshooting overhead according to the execution overhead and the compensation value.
In an optional embodiment of the present application, the second overhead module is specifically configured to determine a first difference between the execution overhead and the compensation value, so as to obtain the troubleshooting overhead.
In an optional embodiment of the present application, the optimal action determination module comprises: a first action determination submodule and a second action determination submodule. The first action determination submodule is used for calculating a second difference between each of the troubleshooting overheads and the preset troubleshooting overhead to obtain a plurality of second differences; the second action determination submodule is used for determining the control action in the control action set that corresponds to the minimum of the second differences as the optimal control action.
In an optional embodiment of the present application, the observable state includes the distributed power source power, the flexible load power and the line damage state in the power distribution network.
In an optional embodiment of the present application, the device further includes a processing module, where the processing module is configured to normalize the distributed power source power, the flexible load power and the line damage state to obtain the observable state.
In an optional embodiment of the present application, the set of control actions comprises an action state of a tie switch in the power distribution network.
In a third aspect, a control device is provided, comprising a memory storing a computer program and a processor implementing the steps of the above method when executing the computer program.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the above method.
The embodiment of the application provides a power distribution network fault recovery method. When the power distribution network fails, the observable state of the power distribution network is acquired in real time, and a historical self-healing action database is queried according to the current observable state to obtain a control action set. The control actions in the set are then simulated in sequence, the troubleshooting overhead of each is obtained, and an optimal control action is determined from the plurality of troubleshooting overheads. Finally, fault recovery is performed on the power distribution network based on the optimal control action. Throughout the fault recovery process, no new fault recovery model needs to be established and no large amount of calculation is needed; it is only necessary to query the historical self-healing action database, simulate the candidate control actions, and select the optimal control action according to the troubleshooting overheads. The method thereby addresses the technical problems of the prior art, namely the large calculation amount and high complexity of the solving process and the difficulty of finding an optimal solution, and achieves the technical effect of greatly reducing the calculation amount and the computational complexity while ensuring that an optimal solution can be found.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
At present, post-disaster recovery of a power distribution network is generally handled by traditional optimization planning methods, for example by solving the recovery problem with a commercial optimization solver. However, in an extreme disaster the fault may evolve through different stages, and different fault recovery models must be continuously established or updated to suit each stage. Establishing a fault recovery model is complex and time-consuming, so existing fault recovery models are built only for a very limited set of typical working conditions and scenarios. In an extreme external environment or in an actual fault recovery process, the complex environment, the variety of fault types and the strong coupling between control actions make the solving process computationally heavy and complex, and an optimal solution is difficult to find.
In view of this, the embodiment of the present application provides a power distribution network fault recovery method. When the power distribution network fails, the observable state of the power distribution network is acquired in real time, and a historical self-healing action database is queried according to the current observable state to obtain a control action set. The control actions in the set are then simulated in sequence, the troubleshooting overhead of each is obtained, and an optimal control action is determined from the plurality of troubleshooting overheads. Finally, fault recovery is performed on the power distribution network based on the optimal control action. Throughout the fault recovery process, no new fault recovery model needs to be established and no large amount of calculation is needed; it is only necessary to query the historical self-healing action database, simulate the candidate control actions, and select the optimal control action according to the troubleshooting overheads. The method thereby addresses the technical problems of the prior art, namely the large calculation amount and high complexity of the solving process and the difficulty of finding an optimal solution, and achieves the technical effect of greatly reducing the calculation amount and the computational complexity while ensuring that an optimal solution can be found.
In the following, a brief description will be given of an implementation environment related to the power distribution network fault recovery method provided by the embodiment of the present application.
Referring to fig. 1, the power distribution network fault recovery method provided in the embodiment of the present application is applied to a power distribution network, where the power distribution network includes a power distribution network body, monitoring equipment, control equipment and a control terminal. The monitoring equipment is arranged at different positions of the power distribution network body and is used for detecting operation signals and environment signals of different nodes or devices in the power distribution network body. The control equipment is in signal connection with different electrical devices in the power distribution network body and is used for controlling the operation of those devices. The control terminal is in signal connection with the monitoring equipment and the control equipment respectively, and is used for sending control instructions to the control equipment according to the operation signals and environment signals collected by the monitoring equipment, so as to control the actions of the control equipment. The monitoring equipment may include voltage acquisition equipment, current acquisition equipment, a resistance meter, temperature acquisition equipment, a humidity meter, a meteorological instrument and the like, which is not specifically limited in the embodiment of the present application.
Referring to fig. 2, an embodiment of the present application provides a power distribution network fault recovery method that may be applied to the power distribution network described above. The following embodiment takes the control terminal in fig. 1 as the executing entity to describe the fault recovery of the power distribution network in detail, and includes the following steps 201 to 205:
step 201, under the condition that the power distribution network has a fault, the control terminal obtains a current observable state of the power distribution network, wherein the observable state comprises fault state parameters and environment parameters causing the power distribution network to have the fault.
The monitoring equipment is used for acquiring parameters such as voltage, current, resistance, on-off of a switch and the like of the power distribution network in real time, so that the fault state parameters are obtained. The monitoring equipment is used for acquiring the temperature, the humidity, the illumination and the like in the power distribution network in real time, so that the environmental parameters which cause the power distribution network to break down are obtained. It should be noted that the fault state parameters are not limited to voltage, current, resistance, and switch on/off parameters, but may include any other parameters that can characterize the fault state of the power distribution network. Similarly, the environmental parameters are not limited to temperature, humidity and illumination parameters, but may also include any other parameters that can represent the environment causing the power distribution network to malfunction.
Step 202, the control terminal queries a historical self-healing action database according to the current observable state to obtain a control action set corresponding to the current observable state, and the historical self-healing action database stores a plurality of corresponding relations between the observable state and the control action.
The historical self-healing action database may be pre-stored in the control terminal or stored in another storage device of the power distribution network; in either case the control terminal can access it in real time and query all of the correspondences it contains. The database includes a plurality of correspondences, and each correspondence associates one observable state with one or more control actions. The control terminal first performs information matching in the historical self-healing action database to find the stored observable state that is the same as the current observable state, and then determines, according to the correspondence, the control actions that were historically executed in that observable state. Because several different control actions may have been executed in the same observable state during historical fault recovery, these control actions together constitute the control action set.
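The lookup in step 202 can be sketched in Python as follows. This is a minimal illustration, not the stored database format: the dictionary layout, the tuple encoding of states and actions, and the rounding used to match states (`ndigits`) are all assumptions introduced here, since the text only requires finding the stored state that is the same as the current one.

```python
from typing import Dict, List, Tuple

# Hypothetical encodings (not specified in the text):
State = Tuple[float, ...]   # normalized observable state
Action = Tuple[int, ...]    # one 0/1 entry per tie switch


def query_control_actions(db: Dict[State, List[Action]],
                          state: State,
                          ndigits: int = 2) -> List[Action]:
    """Return the distinct control actions historically executed in `state`.

    States are matched after rounding to `ndigits` decimals (an assumption,
    so that measurement noise does not defeat the equality match).
    """
    key = tuple(round(v, ndigits) for v in state)
    action_set: List[Action] = []
    for action in db.get(key, []):
        if action not in action_set:   # keep each action once, in order
            action_set.append(action)
    return action_set


# Toy database: the first state was handled with two different actions.
history_db = {
    (0.40, 0.75, 1.0): [(0, 1), (1, 0), (0, 1)],
    (0.10, 0.20, 0.0): [(0, 0)],
}
control_action_set = query_control_actions(history_db, (0.401, 0.749, 1.0))
```

An unmatched state returns an empty list, so a caller would need a fallback (for example a nearest-state search) that the text does not describe.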
And step 203, the control terminal sequentially simulates and executes the control actions in the control action set, and acquires the troubleshooting overhead corresponding to the executed control actions after each execution.
A power grid fault model is stored in the control terminal in advance. The control terminal inputs the control actions in the control action set to the power grid fault model one at a time; each control action acts on the model and changes its state. The control terminal then determines the troubleshooting overhead of that control action according to how far the power grid fault model has recovered and other factors such as the electric energy spent on fault recovery. The troubleshooting overhead refers to the electricity cost or other expenditure required to clear the power grid fault. For example, when the troubleshooting overhead is an electricity cost, it may be determined from the amount of electricity needed to execute the control action when the fault occurs.
And step 204, the control terminal determines the optimal control action from the control action set according to the obtained plurality of troubleshooting overheads.
Each time the power grid fault model executes a control action, the control terminal obtains one troubleshooting overhead; after the control actions have each been executed, a plurality of troubleshooting overheads are obtained. The control terminal determines the minimum of these troubleshooting overheads, looks up the control action in the control action set that corresponds to that minimum, and takes that control action as the optimal control action.
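Steps 203 and 204 amount to evaluating every candidate on the fault model and keeping the cheapest one. The sketch below illustrates this under stated assumptions: `simulate` stands in for the pre-stored power grid fault model, and `toy_fault_model` with its cost figures is purely hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Action = Tuple[int, ...]  # hypothetical encoding: one 0/1 entry per tie switch


def select_optimal_action(actions: Sequence[Action],
                          simulate: Callable[[Action], float]
                          ) -> Tuple[Action, List[float]]:
    """Simulate each control action in turn and return the one with the
    smallest troubleshooting overhead, together with all overheads."""
    if not actions:
        raise ValueError("control action set is empty")
    overheads = [simulate(a) for a in actions]          # step 203
    best = min(range(len(actions)), key=overheads.__getitem__)  # step 204
    return actions[best], overheads


def toy_fault_model(action: Action) -> float:
    # Hypothetical stand-in for the grid fault model: an open switch (1)
    # costs 3.0 units and a closed switch (0) costs 1.0 unit.
    return sum(3.0 if s == 1 else 1.0 for s in action)


best_action, costs = select_optimal_action([(1, 1), (0, 1), (0, 0)],
                                           toy_fault_model)
```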
Step 205, the control terminal performs fault recovery on the power distribution network based on the optimal control action.
After the optimal control action is determined, the control terminal adjusts the power distribution network body through the control equipment, for example by switching certain node switches or restarting certain controllers, so that the fault in the power distribution network is eliminated and the power distribution network returns to normal operation.
In an optional embodiment of the present application, the observable state includes the distributed power source power, the flexible load power and the line damage state in the power distribution network.
On the one hand, the control terminal collects, in real time, the current and voltage of the distributed power sources and the flexible loads in the power distribution network through the current acquisition equipment and voltage acquisition equipment in the monitoring equipment, and then calculates the distributed power source power and the flexible load power from the power calculation formula. A flexible load is a load in the power distribution network that can be interrupted. Optionally, the observable state may be expressed as:
$$S_t = \{\, P_{i,t}^{DG},\; P_j^t,\; F_{l,t} \mid i \in N_{DG},\; j \in N_L,\; l \in N_f \,\} \tag{1}$$
wherein, in formula (1), $S_t$ represents the observable state of the power distribution network at time t, $P_{i,t}^{DG}$ represents the output power of the i-th distributed power source at time t, $P_j^t$ represents the power of the j-th flexible load at time t, $F_{l,t}$ represents the damage state of the l-th line at time t, $N_{DG}$ represents the set of distributed power sources, $N_L$ represents the set of flexible loads, and $N_f$ represents the set of vulnerable lines.
On the other hand, the control terminal may detect whether a line in the power distribution network is damaged through an infrared detector or similar equipment in the monitoring equipment. Alternatively, the control terminal may compare the current data and voltage data collected in real time through the current acquisition equipment and voltage acquisition equipment with a preset normal working current range and a preset normal working voltage range to determine whether the line is damaged. The damage state takes one of two values: damaged or not damaged.
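The range comparison described above can be illustrated with a short Python sketch. The function name, the units and the example ranges are assumptions chosen for illustration; the text does not give concrete thresholds.

```python
from typing import Tuple


def line_damage_state(current_a: float,
                      voltage_v: float,
                      current_range: Tuple[float, float],
                      voltage_range: Tuple[float, float]) -> int:
    """Return 1 (damaged) if either measurement leaves its preset normal
    working range, otherwise 0 (not damaged)."""
    i_lo, i_hi = current_range
    v_lo, v_hi = voltage_range
    in_range = (i_lo <= current_a <= i_hi) and (v_lo <= voltage_v <= v_hi)
    return 0 if in_range else 1


# Hypothetical 10 kV feeder with a 5 A to 400 A normal current band.
NORMAL_CURRENT = (5.0, 400.0)
NORMAL_VOLTAGE = (9500.0, 10500.0)
healthy = line_damage_state(120.0, 10000.0, NORMAL_CURRENT, NORMAL_VOLTAGE)
broken = line_damage_state(0.0, 0.0, NORMAL_CURRENT, NORMAL_VOLTAGE)
```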
In an optional embodiment of the present application, the set of control actions comprises an action state of a tie switch in the power distribution network.
The tie switch is an important control device in the power distribution network: it controls whether different power transmission lines or electrical devices operate, and controls the electrical connection between them. At the same time, the tie switch is a relatively fragile, easily damaged node in the power distribution network, so performing fault recovery with the tie switch as the control point gives wider coverage and a better recovery effect. A tie switch has only two states, closed and open, and the control action may be expressed, for example, by the following action function:
$$a_t = \{\, \mu_{m,t+1} \mid m \in N_{con} \,\} \tag{2}$$
wherein $a_t$ represents the control action, $\mu_{m,t+1}$ is the state variable of the m-th tie switch at time t+1 and takes only the value 0 or 1, where 0 represents closed and 1 represents open, and $N_{con}$ represents the set of branches containing tie switches in the power distribution network.
In an optional embodiment of the present application, the method further includes: the control terminal normalizes the distributed power source power, the flexible load power and the line damage state to obtain the observable state.
On the one hand, taking the flexible load power as an example, the normalized value can be determined by the following formula:
$$\tilde{P}_j^t = \frac{P_j^t - P_{j,\min}}{P_{j,\max} - P_{j,\min}} \tag{3}$$
wherein $P_{j,\max}$ and $P_{j,\min}$ are the maximum and minimum values of the flexible load power $P_j^t$, which can be obtained from the historical self-healing action database or by querying historical power grid data records.
On the other hand, for the line damage state, if the line is disconnected due to a fault at time t, the line damage state takes the value 1; otherwise it takes the default value 0. After normalization, all state values therefore lie in [0, 1], which reduces the amount of calculation and improves the efficiency of power distribution network fault recovery in the embodiment of the present application.
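A minimal Python sketch of this normalization step follows, assuming min-max scaling with bounds taken from historical records. The function names, the clamping of out-of-range readings and the flat list layout of the resulting state are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(value: float, lo: float, hi: float) -> float:
    """Scale `value` into [0, 1] using historical bounds lo < hi."""
    if hi <= lo:
        raise ValueError("upper bound must exceed lower bound")
    scaled = (value - lo) / (hi - lo)
    # Clamp readings outside the historical range (an assumption).
    return min(1.0, max(0.0, scaled))


def build_observable_state(dg_powers: Sequence[float],
                           load_powers: Sequence[float],
                           lines_damaged: Sequence[bool],
                           dg_bounds: Sequence[Tuple[float, float]],
                           load_bounds: Sequence[Tuple[float, float]]
                           ) -> List[float]:
    """Concatenate normalized DG powers, flexible load powers and the
    0/1 line damage states into one observable state vector."""
    state = [min_max_normalize(p, lo, hi)
             for p, (lo, hi) in zip(dg_powers, dg_bounds)]
    state += [min_max_normalize(p, lo, hi)
              for p, (lo, hi) in zip(load_powers, load_bounds)]
    state += [1.0 if damaged else 0.0 for damaged in lines_damaged]
    return state


state = build_observable_state([50.0], [30.0], [True, False],
                               [(0.0, 100.0)], [(10.0, 50.0)])
```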
Referring to fig. 3, in an alternative embodiment of the present application, step 203 includes steps 301-302:
step 301, after each execution, the control terminal determines an execution overhead corresponding to the executed control action and a compensation value of the flexible load in the power distribution network to the execution overhead.
After each control action is executed on the power grid fault model, the control terminal calculates the execution overhead incurred in executing the control action and the compensation value of the flexible load in the power distribution network. The execution overhead refers to the electric energy consumed by the power distribution network when the control action is executed on the power grid fault model. A flexible load is a load in the power distribution network that can be interrupted; that is, it can be de-energized during actual fault recovery and then consumes no power, which is why it offsets part of the execution overhead. In this embodiment, the simulation result of the power grid fault model is combined with these practical factors to determine the troubleshooting overhead, which improves the reliability of the method when fault recovery is actually performed.
Step 302, after each execution, the control terminal determines a troubleshooting overhead according to the execution overhead and the compensation value corresponding to the executed control action.
The control terminal may determine a first difference between the execution overhead and the compensation value to obtain the troubleshooting overhead. The troubleshooting overhead may be determined by the following formula:
$$r_{t+1} = \sum_{i \in N_{DG}} c_i^{DG} P_{i,t}^{DG} - \sum_{j \in N_{IL}} c_j^{IL} P_j^t \tag{4}$$
wherein $r_{t+1}$ represents the troubleshooting overhead, $c_i^{DG}$ is the cost per unit output power of the i-th distributed power source, $P_{i,t}^{DG}$ represents the output power of the i-th distributed power source at time t, $c_j^{IL}$ is the cost per unit power of the j-th flexible load, $P_j^t$ represents the power of the j-th flexible load at time t, $N_{DG}$ represents the set of distributed power sources, and $N_{IL}$ is the set of flexible loads in the system.
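The first-difference calculation of step 302 can be sketched as below: the execution overhead is the cost-weighted distributed power source output, and the compensation value is the cost-weighted flexible load power. The function name and the numeric values are illustrative assumptions.

```python
from typing import Sequence


def troubleshooting_overhead(dg_power: Sequence[float],
                             dg_unit_cost: Sequence[float],
                             load_power: Sequence[float],
                             load_unit_cost: Sequence[float]) -> float:
    """First difference: execution overhead minus flexible load compensation."""
    if len(dg_power) != len(dg_unit_cost):
        raise ValueError("one unit cost is required per distributed source")
    if len(load_power) != len(load_unit_cost):
        raise ValueError("one unit cost is required per flexible load")
    execution = sum(c * p for c, p in zip(dg_unit_cost, dg_power))
    compensation = sum(c * p for c, p in zip(load_unit_cost, load_power))
    return execution - compensation


# Two distributed sources and one interruptible load (made-up figures).
overhead = troubleshooting_overhead(dg_power=[100.0, 50.0],
                                    dg_unit_cost=[0.5, 0.8],
                                    load_power=[20.0],
                                    load_unit_cost=[1.0])
```

Here the execution overhead is 50 + 40 = 90 and the compensation is 20, so the overhead is 70 in whatever cost unit is used.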
Optionally, in the embodiment of the present application, the troubleshooting overhead may also be calculated by a power generation ratio method based on the outage loss characteristic of the flexible loads, wherein $\lambda_k$ is the outage loss per unit of electric energy when the flexible load at node k of the power distribution network loses power, $L$ represents the power generation ratio, and $\beta_k$ is the weight of the flexible load at node k of the power distribution network.
Referring to fig. 4, in an alternative embodiment of the present application, step 204 includes steps 401-402:
Step 401, the control terminal calculates a second difference between each of the troubleshooting overheads and the preset troubleshooting overhead to obtain a plurality of second differences.
The preset troubleshooting overhead is stored in the control terminal in advance. Each time the control terminal obtains a troubleshooting overhead, it compares that overhead with the stored preset troubleshooting overhead and calculates their difference to obtain a second difference. After the comparison has been made for each control action, a plurality of second differences are obtained. The preset troubleshooting overhead may be determined from historical experience or from a comprehensive consideration of various factors, which is not specifically limited in this embodiment.
Step 402, the control terminal determines that the control action corresponding to the minimum value in the plurality of second difference values in the control action set is the optimal control action.
Each second difference represents the difference between the troubleshooting overhead obtained when the corresponding control action is executed and the preset troubleshooting overhead. The smaller the second difference, the lower the troubleshooting overhead is relative to the preset value; in particular, a negative second difference means that the troubleshooting overhead of the control action is below the preset troubleshooting overhead, and the more negative it is, the better the control action.
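Steps 401 and 402 can be sketched as follows. The second difference is taken here as a signed value (overhead minus preset), which is an interpretation consistent with the remark that negative differences indicate better actions; the action labels and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

A = TypeVar("A")


def select_by_preset(actions: Sequence[A],
                     overheads: Sequence[float],
                     preset: float) -> Tuple[A, List[float]]:
    """Return the action whose signed second difference is smallest."""
    if not actions or len(actions) != len(overheads):
        raise ValueError("need one overhead per control action")
    second_diffs = [o - preset for o in overheads]               # step 401
    best = min(range(len(second_diffs)), key=second_diffs.__getitem__)
    return actions[best], second_diffs                           # step 402


chosen, diffs = select_by_preset(["close_S1", "close_S2", "close_S3"],
                                 [12.0, 8.0, 15.0],
                                 preset=10.0)
```

Because the preset is a constant, this selects the same action as taking the minimum overhead directly; the second differences additionally show whether any candidate beats the preset budget.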
Referring to fig. 5, in an optional embodiment of the present application, the control terminal may further train a neural network with the DDQN algorithm to obtain the above optimal control action:
Step 501, the control terminal determines an action valuation function according to the fault state parameters of the power distribution network and the environmental parameters causing the power distribution network to fail.
The control terminal uses two neural networks to fit, respectively, a fault state valuation function $V(s_t)$ for the observable state and an advantage function $A(s_t,a_t)$ for each control action in the current control action set, where $V(s_t)$ is a function of the fault state parameters of the power distribution network and $A(s_t,a_t)$ is a function of the environmental parameters causing the power distribution network to fail. The action valuation function is obtained from the two:
$$Q(s_t,a_t) = V(s_t) + \Big( A(s_t,a_t) - \frac{1}{|\mathcal{A}|}\sum_{a' \in \mathcal{A}} A(s_t,a') \Big) \tag{6}$$
wherein $Q(s_t,a_t)$ represents the action valuation function, $\mathcal{A}$ represents the control action set, $|\mathcal{A}|$ represents the number of control actions in the control action set, $V(s_t)$ represents the fault state valuation function, and $A(s_t,a_t)$ represents the advantage function.
In this embodiment, only one optimal control action can be selected in each observable state, so only a single Q value is observed, and that value cannot be uniquely decomposed into a state value V and an advantage value A. The advantage term is therefore defined as the advantage of the individual action minus the average advantage of all actions in the current state, which removes the redundant degree of freedom and improves the stability of the algorithm.
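The mean-subtracted combination of the state value and the action advantages described above can be illustrated in a few lines of Python. The inputs stand in for the outputs of the two fitted networks; the function name and the sample numbers are assumptions for illustration.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float,
                     advantages: Sequence[float]) -> List[float]:
    """Combine V(s) with per-action advantages A(s, a), subtracting the
    mean advantage so that the V/A decomposition is unique."""
    if not advantages:
        raise ValueError("at least one control action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# V(s) = 2.0 and three candidate control actions.
q_values = dueling_q_values(2.0, [1.0, 3.0, 5.0])
```

Adding a constant to every advantage leaves the resulting Q values unchanged, which is the redundant degree of freedom the subtraction removes.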
Step 502, the control terminal determines an optimal action valuation function according to the action valuation function.
Optionally, the control terminal obtains a target value of the optimal action valuation function by using the Bellman equation (formula (7)), and updates the action valuation function according to the current observable state (formula (8)). In formulas (7) and (8), the discount factor λ ∈ [0,1] and the learning rate α satisfies 0 < α ≤ 1.
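As an illustration of the target value and update in step 502, the following Python sketch uses the standard double Q-learning form. That form is an assumption here, since the equations themselves are not legible in this text: the online estimate picks the next action, a separate target estimate scores it, and the current value moves toward the target by the learning rate. The symbols `discount` and `learning_rate` correspond to λ and α; all other names are hypothetical.

```python
from typing import Sequence


def double_q_target(reward: float,
                    next_q_online: Sequence[float],
                    next_q_target: Sequence[float],
                    discount: float) -> float:
    """Bellman target: the online values choose the next action and the
    target values evaluate it (double Q-learning, assumed form)."""
    best_next = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + discount * next_q_target[best_next]


def q_update(q_old: float, target: float, learning_rate: float) -> float:
    """Move the current estimate toward the target by the learning rate."""
    return q_old + learning_rate * (target - q_old)


target = double_q_target(reward=1.0,
                         next_q_online=[0.2, 0.9],
                         next_q_target=[0.5, 0.4],
                         discount=0.5)
new_q = q_update(q_old=1.0, target=target, learning_rate=0.5)
```

In this example the online values select the second action, whose target-network value is 0.4, so the target is 1.0 + 0.5 × 0.4 = 1.2 and the updated estimate is 1.1.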
Referring to fig. 6, in an alternative embodiment of the present application, step 502 includes steps 601-603:
Step 601, the control terminal introduces an ε-greedy strategy to select control actions from the control action set, wherein the strategy is parameterized by a fixed constant and a random number, T is the total number of training iterations, and t is the current training iteration.
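A minimal ε-greedy sketch in Python is given below. The linear decay of ε over the training iterations and the start and end values are assumptions made for illustration, because the exact schedule is not recoverable from the text; only the use of a random number, the total iteration count T and the current iteration t come from the description.

```python
import random
from typing import Sequence


def epsilon_at(t: int, total: int,
               eps_start: float = 1.0, eps_end: float = 0.05) -> float:
    """Assumed schedule: decay epsilon linearly from eps_start to eps_end
    as the current iteration t approaches the total count T."""
    frac = min(1.0, max(0.0, t / total))
    return eps_start + (eps_end - eps_start) * frac


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """With probability epsilon explore a random action index; otherwise
    exploit the action with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


rng = random.Random(0)
greedy_choice = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0, rng=rng)
explore_choice = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0, rng=rng)
```

Early in training ε is large and the control terminal mostly explores; as t approaches T it mostly exploits the learned values.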
Step 602, the control terminal screens observable state samples in the historical self-healing action database according to a preset model.
To eliminate short-term temporal correlation between samples, experience replay is used to store the historical self-healing action database. An experience pool with capacity N is established, and in each training period the samples of correspondences between the observable state and the control action of the power distribution network are stored in the experience pool. When the number of samples exceeds the replay start capacity M, a small batch of samples is randomly drawn from the experience pool to train the artificial neural network; training on randomly drawn samples helps avoid problems such as overfitting. If the number of samples exceeds the maximum capacity of the experience pool, the earliest sample is removed and the new sample is stored, which ensures that the neural network learns from the latest observable states.
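The experience pool described above can be sketched with a bounded deque. The class and method names are hypothetical; `capacity` and `start_size` correspond to N and M in the text.

```python
import random
from collections import deque
from typing import Any, Deque, List


class ExperiencePool:
    """Fixed-capacity pool: the oldest sample is dropped when full, and
    sampling only starts once more than `start_size` samples are stored."""

    def __init__(self, capacity: int, start_size: int) -> None:
        self.buffer: Deque[Any] = deque(maxlen=capacity)  # evicts oldest
        self.start_size = start_size

    def store(self, sample: Any) -> None:
        self.buffer.append(sample)

    def ready(self) -> bool:
        return len(self.buffer) > self.start_size

    def sample(self, batch_size: int, rng: random.Random) -> List[Any]:
        """Draw a random mini-batch without replacement, or nothing if the
        pool has not yet passed the replay start size."""
        if not self.ready():
            return []
        return rng.sample(list(self.buffer),
                          min(batch_size, len(self.buffer)))


pool = ExperiencePool(capacity=3, start_size=2)
for step in range(5):
    pool.store(("state", "action", step))
batch = pool.sample(2, random.Random(0))
```

After five stores into a pool of capacity 3, only the three most recent samples remain, which matches the eviction rule in the text.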
Step 603, the control terminal determines a loss function according to the action estimation function of each control action.
The DDQN performs a forward pass to obtain the Q values of all control actions. In this embodiment, the loss function is the mean square error between the target Q value and the predicted Q value output by the neural network, and the neural network loss function is determined from formulas (7) to (9).
The neural network is trained by mini-batch gradient descent. Monitoring equipment then acquires the real-time observable state of the power distribution network, and the trained neural network selects the action a_k with the maximum estimated value; this action is the optimal control strategy, that is, the optimal control action described above.
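The loss formula itself is not reproduced in this text. As a minimal sketch, a mean square error between target and predicted Q values over a mini-batch could be computed as follows; the function name `mse_loss` and the sample values are illustrative assumptions.

```python
import numpy as np

def mse_loss(target_q, predicted_q):
    """Mean square error between target and predicted Q values."""
    target_q = np.asarray(target_q, dtype=float)
    predicted_q = np.asarray(predicted_q, dtype=float)
    return float(np.mean((target_q - predicted_q) ** 2))

# Illustrative mini-batch of two samples: errors of 1 and -2.
loss = mse_loss([1.0, 2.0], [0.0, 4.0])
```

In a training loop this scalar would be minimized by mini-batch gradient descent with respect to the network parameters.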
It should be understood that, although the steps in the flowcharts are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Referring to fig. 7, an embodiment of the present application provides a power distribution network fault recovery apparatus 10, which includes: a state acquisition module 100, a control action determination module 200, an overhead determination module 300, an optimal action determination module 400, and a failure recovery module 500.
The state acquisition module 100 is configured to acquire a current observable state of the power distribution network when the power distribution network fails, where the observable state includes a fault state parameter and an environmental parameter that causes the power distribution network to fail;
the control action determining module 200 is configured to query a historical self-healing action database according to the current observable state to obtain a control action set corresponding to the current observable state, where multiple correspondence relationships between observable states and control actions are stored in the historical self-healing action database;
the overhead determining module 300 is configured to sequentially simulate and execute the control actions in the control action set, and obtain a troubleshooting overhead corresponding to the executed control action after each execution;
the optimal action determining module 400 is configured to determine an optimal control action from the control action set according to the obtained multiple troubleshooting costs;
the fault recovery module 500 is configured to perform fault recovery on the power distribution network based on the optimal control action.
In an optional embodiment of the present application, the overhead determining module 300 includes: a first overhead module and a second overhead module.
The first overhead module is used for determining execution overhead corresponding to the executed control action and a compensation value of the flexible load in the power distribution network to the execution overhead after each execution;
the second overhead module is used for determining the troubleshooting overhead according to the execution overhead corresponding to the executed control action and the compensation value after each execution.
In an optional embodiment of the present application, the second overhead module is specifically configured to determine a first difference between the execution overhead and the compensation value, so as to obtain the troubleshooting overhead.
In an optional embodiment of the present application, the optimal action determination module 400 includes: a first action determination submodule and a second action determination submodule.
The first action determining submodule is used for respectively calculating second differences between the plurality of troubleshooting overheads and a preset troubleshooting overhead to obtain a plurality of second differences;
the second action determining submodule is configured to determine that the control action corresponding to the minimum value in the plurality of second difference values in the control action set is the optimal control action.
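The two difference calculations performed by the overhead determining module and the optimal action determining module can be sketched as follows. The function names, the action labels and all numeric values are hypothetical, and the sign convention of the second difference (troubleshooting overhead minus preset overhead) is an assumption.

```python
def troubleshooting_overhead(execution_overhead, compensation):
    """First difference: execution overhead minus flexible-load compensation."""
    return execution_overhead - compensation

def select_optimal_action(candidates, preset_overhead):
    """Return the action whose second difference is smallest.

    candidates maps each control action to a tuple of
    (execution_overhead, compensation) from its simulated execution.
    """
    second_differences = {
        action: troubleshooting_overhead(execution, compensation) - preset_overhead
        for action, (execution, compensation) in candidates.items()
    }
    return min(second_differences, key=second_differences.get)

# Illustrative candidates: troubleshooting overheads are 8.0, 5.0 and 11.0.
candidates = {
    "close_tie_switch_1": (10.0, 2.0),
    "close_tie_switch_2": (9.0, 4.0),
    "close_tie_switch_3": (12.0, 1.0),
}
best = select_optimal_action(candidates, preset_overhead=3.0)
```

Here the second differences are 5.0, 2.0 and 8.0, so the second candidate is chosen as the optimal control action.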
In an optional embodiment of the present application, the observable state includes distributed power source power, flexible load power, and line damage state in the power distribution network.
In an optional embodiment of the present application, the apparatus further includes a processing module, where the processing module is configured to perform normalization processing on the distributed power source power, the flexible load power, and the line damage state, so as to obtain the observable state.
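The normalization method is not specified in this embodiment. The sketch below assumes min-max scaling of the power quantities against hypothetical rated values (`dg_rated`, `load_rated`) and assumes that the line damage state is encoded as 0/1 flags; all names and numbers are illustrative.

```python
def min_max_normalize(value, lower, upper):
    """Scale value into [0, 1], clipping to the given bounds."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    clipped = min(max(value, lower), upper)
    return (clipped - lower) / (upper - lower)

def build_observable_state(dg_power, flex_load_power, line_damage,
                           dg_rated, load_rated):
    """Concatenate normalized quantities into one observable-state vector."""
    state = [min_max_normalize(p, 0.0, dg_rated) for p in dg_power]
    state += [min_max_normalize(p, 0.0, load_rated) for p in flex_load_power]
    # Assumed encoding: 1 for a damaged line, 0 for an intact line.
    state += [float(flag) for flag in line_damage]
    return state

# Illustrative input: one distributed source, one flexible load, two lines.
state = build_observable_state([50.0], [20.0], [1, 0],
                               dg_rated=100.0, load_rated=40.0)
```

Bringing all components into the same [0, 1] range keeps quantities with large physical magnitudes from dominating the neural network input.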
In an optional embodiment of the application, the set of control actions comprises an action state of a tie switch in the power distribution network.
For specific limitations of the power distribution network fault recovery apparatus 10, reference may be made to the above limitations of the power distribution network fault recovery method, which are not described herein again. The modules in the power distribution network fault recovery apparatus 10 described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the control device in hardware form, or may be stored in a memory in the control device in software form, so that the processor can call and execute the operations corresponding to the modules.
Fig. 8 is a schematic diagram of an internal structure of a control device in an embodiment of the present application, where the control device may be a server. As shown in fig. 8, the control device includes a processor, a memory, and a communication component connected by a system bus. Wherein the processor is used for providing calculation and control capability and supporting the operation of the whole control device. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program can be executed by a processor for implementing a power distribution network fault recovery method provided by the above embodiments. The internal memory provides a cached execution environment for the operating system and computer programs in the non-volatile storage medium. The control device may communicate with other control devices (e.g., STAs) through the communication component.
It will be appreciated by those skilled in the art that the configuration shown in fig. 8 is a block diagram of only the portion of the configuration associated with the present application and does not constitute a limitation on the control device to which the present application is applied; a particular control device may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, there is provided a control device including a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the following steps:
under the condition that a power distribution network fails, acquiring the current observable state of the power distribution network, wherein the observable state comprises failure state parameters and environment parameters causing the power distribution network to fail;
inquiring a historical self-healing action database according to the current observable state to obtain a control action set corresponding to the current observable state, wherein the historical self-healing action database stores a plurality of corresponding relations between observable states and control actions;
sequentially simulating and executing the control actions in the control action set, and acquiring the troubleshooting overhead corresponding to the executed control actions after each execution;
determining an optimal control action from the control action set according to the obtained plurality of troubleshooting costs;
and performing fault recovery on the power distribution network based on the optimal control action.
In one embodiment of the application, the processor when executing the computer program further performs the steps of: after each execution, determining an execution overhead corresponding to the executed control action and a compensation value of a flexible load in the power distribution network to the execution overhead; and after each execution, determining the troubleshooting cost according to the execution cost corresponding to the executed control action and the compensation value.
In one embodiment of the application, the processor when executing the computer program further performs the steps of: and determining a first difference value between the execution overhead and the compensation value to obtain the troubleshooting overhead.
In one embodiment of the application, the processor when executing the computer program further performs the steps of: respectively calculating second differences between the plurality of troubleshooting overheads and a preset troubleshooting overhead to obtain a plurality of second differences; and determining the control action corresponding to the minimum value among the plurality of second differences in the control action set as the optimal control action.
In one embodiment of the present application, the observable state includes distributed power source power, flexible load power, and line damage state in the power distribution network.
In one embodiment of the application, the processor when executing the computer program further performs the steps of: and normalizing the distributed power supply power, the flexible load power and the line damage state to obtain the observable state.
In one embodiment of the application, the set of control actions includes an action state of a tie switch in the power distribution network.
The implementation principle and technical effect of the control device provided in the embodiment of the present application are similar to those of the method embodiment described above, and are not described herein again.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
under the condition that a power distribution network fails, acquiring the current observable state of the power distribution network, wherein the observable state comprises failure state parameters and environment parameters causing the power distribution network to fail;
inquiring a historical self-healing action database according to the current observable state to obtain a control action set corresponding to the current observable state, wherein the historical self-healing action database stores a plurality of corresponding relations between observable states and control actions;
sequentially simulating and executing the control actions in the control action set, and acquiring the troubleshooting overhead corresponding to the executed control actions after each execution;
determining an optimal control action from the control action set according to the obtained plurality of troubleshooting costs;
and performing fault recovery on the power distribution network based on the optimal control action.
In one embodiment of the application, the computer program when executed by the processor further performs the steps of: after each execution, determining an execution overhead corresponding to the executed control action and a compensation value of a flexible load in the power distribution network to the execution overhead; and after each execution, determining the troubleshooting cost according to the execution cost corresponding to the executed control action and the compensation value.
In one embodiment of the application, the computer program when executed by the processor further performs the steps of: and determining a first difference value between the execution overhead and the compensation value to obtain the troubleshooting overhead.
In one embodiment of the application, the computer program when executed by the processor further performs the steps of: respectively calculating second differences between the plurality of troubleshooting overheads and a preset troubleshooting overhead to obtain a plurality of second differences; and determining the control action corresponding to the minimum value among the plurality of second differences in the control action set as the optimal control action.
In one embodiment of the present application, the observable state includes distributed power source power, flexible load power, and line damage state in the power distribution network.
In one embodiment of the application, the computer program when executed by the processor further performs the steps of: and normalizing the distributed power supply power, the flexible load power and the line damage state to obtain the observable state.
In one embodiment of the application, the set of control actions includes an action state of a tie switch in the power distribution network.
The implementation principle and technical effect of the computer-readable storage medium provided by this embodiment are similar to those of the above-described method embodiment, and are not described herein again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of the present specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.