CN114362221B

CN114362221B - Regional intelligent power grid partition evaluation method based on deep reinforcement learning

Info

Publication number: CN114362221B
Application number: CN202210048549.2A
Authority: CN
Inventors: 华昊辰; 陈星莺; 董正涛; 余昆; 甘磊; 梅飞
Original assignee: Hohai University HHU
Current assignee: Hohai University HHU
Priority date: 2022-01-17
Filing date: 2022-01-17
Publication date: 2023-10-13
Anticipated expiration: 2042-01-17
Also published as: CN114362221A

Abstract

The invention discloses a regional smart grid partition evaluation method based on deep reinforcement learning, comprising the following steps: (1) establishing a multi-microgrid system partition evaluation index system; (2) setting the power balance limit of the multi-microgrid system; (3) establishing an evaluation index weight agent; (4) constructing evaluation indices for the intra-area and inter-area partition effects; (5) designing a comprehensive evaluation index for the multi-microgrid partition; (6) designing a multi-microgrid re-partitioning mechanism that accounts for changes in system nodes. The partition evaluation index system of the invention considers multiple indices and is more comprehensive than existing regional smart grid partition indices; the invention adopts a deep reinforcement learning method and determines the weights of the indices from the historical data of each node, so the weights are less affected by bad data and the method is more robust; the invention can update and adjust the partition evaluation index system in a timely manner according to state changes of the network nodes, so as to ensure the rationality and coordination of the partition evaluation indices.

Description

Regional intelligent power grid partition evaluation method based on deep reinforcement learning

Technical Field

The invention relates to the interdisciplinary field of power system theory and artificial intelligence, and in particular to a regional smart grid partition evaluation method based on deep reinforcement learning.

Background

At present, traditional energy sources such as petroleum and natural gas are gradually being exhausted, and problems such as environmental pollution and excessive carbon emissions arising from energy development and use are increasingly severe. Developing renewable energy and building microgrids are effective ways to promote the green transformation of energy systems and to achieve the goals of carbon peaking and carbon neutrality. In a regional smart grid system that comprises multiple microgrids and a high proportion of renewable energy, distributed power sources are connected to loads and energy storage devices through energy routers, shifting energy from one-way supply to two-way interaction. Thanks to advances in Internet of Things sensing and power electronics, distributed renewable energy sources of many kinds can be integrated effectively into the regional smart grid system and can respond quickly to changes in load demand. However, the connection of controllable equipment greatly increases the decision complexity of energy management in the regional smart grid system and the dimension of the system control problem. Therefore, in scenarios with a high proportion of renewable energy, designing reasonable energy scheduling and control strategies that minimize the overall operation and maintenance cost of the smart grid system, promote full local consumption of renewable energy, improve energy efficiency and reduce carbon emissions has become an urgent problem. Applying energy scheduling and control strategies directly to a regional smart grid with a high proportion of renewable energy runs into the curse of dimensionality.

The dimension of the control problem therefore needs to be reduced by decomposing it into several sub-problems: the overall control problem of the regional smart grid system is converted into control problems for several sub-regions, and the system is partitioned so as to achieve "local autonomy first, wide-area interconnection second". How, then, should a reasonable and effective partition be carried out? The boundaries of regional smart grid partitions are typically determined by certain partition indices, so constructing a reasonable, comprehensive and efficient partition index system is particularly important.

The traditional regional intelligent power grid partition index has the following defects:

(1) The partition indices are too simple. Only system power distribution, energy flow and geographic factors are considered; the indices lack a comprehensive view and cannot be matched effectively to actual conditions. Multiple indices need to be incorporated into the evaluation system so that a partition method can be evaluated scientifically and reasonably.

(2) Energy storage is not considered. Conventional partition indices typically take into account line loss, power balance and similar factors. However, with the development of energy storage technology in recent years, more and more energy storage devices are used in regional smart grid systems, and conventional partition indices are not effective in this situation.

(3) The weights are set manually and contain errors. Conventional partition indices generally rely on manually set index weights, which deviate from actual conditions and therefore affect the subsequent partition results.

Disclosure of Invention

Purpose of the invention: the invention aims to provide a regional smart grid partition evaluation method based on deep reinforcement learning that is intelligent, coordinated, social and self-learning.

Technical solution: the regional smart grid partition evaluation method based on deep reinforcement learning provided by the invention comprises the following steps:

(1) Establishing a multi-micro-grid system partition evaluation index system, wherein the multi-micro-grid system partition evaluation index system comprises an intra-area evaluation index system and an inter-area evaluation index system;

(1.1) establishing an intra-area evaluation index system of the multi-micro-grid system partition, wherein the evaluation indexes in the intra-area evaluation index system comprise:

renewable energy utilization rate $E_{in,1}$: $P_c(t)$ is the actual power of the renewable energy source at time $t$, $P_{st}(t)$ is the rated power of the renewable energy source at time $t$, and $t$ is the current time of the multi-microgrid system;

degree of supply and demand balance in an area $E_{in,2}$: $P_{prc}(t)$ is the power generated by the energy supply side at time $t$, and $P_u(t)$ is the user load at time $t$;

distributed power supply adjustable capacity: $E_{in,3} = P_{st}(t) - P_c(t)$, where $P_c(t)$ is the actual power of the renewable energy source at time $t$ and $P_{st}(t)$ is the rated power of the renewable energy source at time $t$;

degree of electrical coupling between nodes: $E_{in,4}$ is 1 when there is energy interaction between the nodes and 0 when there is no interaction between the nodes;

load and distributed power supply matching degree $E_{in,5}$: $P_c(t)$ is the actual power of the renewable energy source at time $t$, and $P_u(t)$ is the user load at time $t$;

energy storage device adjustable capacity: $E_{in,6} = P_{ba,st}(t) - P_{ba,c}(t)$, where $P_{ba,c}(t)$ is the actual power generated by the energy storage device at time $t$ and $P_{ba,st}(t)$ is the rated power of the energy storage device at time $t$.
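The intra-area indices whose formulas are stated explicitly in the list above can be computed directly. The following is a minimal Python sketch, not the patented implementation: the class name `NodeSnapshot`, its field names and the kilowatt units are hypothetical, and only the adjustable capacities and the 0/1 coupling flag are covered.

```python
from dataclasses import dataclass


@dataclass
class NodeSnapshot:
    """Measurements for one partition at time t (all names are hypothetical)."""
    p_c: float        # actual renewable power P_c(t), assumed in kW
    p_st: float       # rated renewable power P_st(t), assumed in kW
    p_ba_c: float     # actual storage power P_ba,c(t), assumed in kW
    p_ba_st: float    # rated storage power P_ba,st(t), assumed in kW
    has_energy_interaction: bool  # True if the nodes exchange energy


def intra_area_indices(snap: NodeSnapshot) -> dict:
    """Return the intra-area indices that have an explicit formula in the text."""
    return {
        # distributed power supply adjustable capacity: P_st(t) - P_c(t)
        "E_in_3": snap.p_st - snap.p_c,
        # electrical coupling between nodes: 1 with interaction, 0 without
        "E_in_4": 1 if snap.has_energy_interaction else 0,
        # energy storage adjustable capacity: P_ba,st(t) - P_ba,c(t)
        "E_in_6": snap.p_ba_st - snap.p_ba_c,
    }


snap = NodeSnapshot(p_c=60.0, p_st=100.0, p_ba_c=20.0, p_ba_st=50.0,
                    has_energy_interaction=True)
indices = intra_area_indices(snap)
```

With the sample values, the distributed supply has 40 kW of headroom and the storage device has 30 kW.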

(1.2) establishing an inter-area evaluation index system of the multi-microgrid system partition, wherein the evaluation indices in the inter-area evaluation index system comprise:

inter-area average electrical distance: $E_{ex,1} = \sum_i L_{cir,i} / N_{cir}$, where $L_{cir,i}$ is the electrical distance between the current partition and its $i$-th surrounding partition, $\sum_i L_{cir,i}$ is the sum of the electrical distances between the current partition and its surrounding partitions, and $N_{cir}$ is the number of surrounding partitions;

inter-area power interaction capacity: $E_{ex,2} = \sum_i P_{cir,i}(t)$, where $P_{cir,i}(t)$ is the interaction power between the partition and its $i$-th surrounding partition at time $t$, and $\sum_i P_{cir,i}(t)$ is the sum of the interaction power between the current partition and its surrounding partitions at time $t$;

power scheduling cost: $E_{ex,3}$ is the electric energy loss incurred for each 1 kWh of electric energy exchanged between the partition and its surrounding partitions;

line loss: $E_{ex,4}$ is the sum of the power losses caused by power interaction between the partition and its surrounding partitions;

whether each sub-area is connected to the backbone network: $E_{ex,5}$ is 1 when the partition is connected to the backbone network and 0 when it is not;

energy storage device degradation cost: $E_{ex,6}$ is the wear of the energy storage device incurred each time it stores or releases 1 kWh of electric energy;

$E_{in,i}$ and $E_{ex,i}$ are the principal components of the corresponding intra-area and inter-area evaluation indices.
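Three of the inter-area indices above reduce to simple aggregates over the surrounding partitions. The sketch below is illustrative only: the function and argument names are hypothetical, and the average electrical distance is taken as the sum of distances divided by the number of surrounding partitions, which is how the index name reads but is an assumption here.

```python
def inter_area_indices(distances, exchanged_power, backbone_connected):
    """Compute E_ex,1, E_ex,2 and E_ex,5 for one partition.

    distances:          electrical distances L_cir,i to each surrounding partition
    exchanged_power:    interaction powers P_cir,i(t) with each surrounding partition
    backbone_connected: True if the partition is connected to the backbone network
    """
    if len(distances) == 0:
        raise ValueError("at least one surrounding partition is required")
    return {
        # assumed form of the average: sum of distances / number of neighbours
        "E_ex_1": sum(distances) / len(distances),
        # inter-area power interaction capacity: sum of P_cir,i(t)
        "E_ex_2": sum(exchanged_power),
        # backbone flag: 1 if connected, 0 otherwise
        "E_ex_5": 1 if backbone_connected else 0,
    }


ex = inter_area_indices([0.5, 1.5, 1.0], [10.0, -4.0, 6.0], True)
```

The remaining inter-area indices (scheduling cost, line loss, storage degradation cost) are per-kWh or summed loss figures that would come from measurements, so they are not modelled here.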

(2) Setting the power balance limit of the multi-microgrid system; the power balance relation corresponding to the power balance limit in step (2) is as follows:

$$P_{trans}(t) = P_{prc}(t) - P_u(t)$$

where $P_{trans}(t)$ represents the interaction power between the microgrid and the backbone network at time $t$, $P_{prc}(t)$ represents the power generated by the energy supply side at time $t$, and $P_u(t)$ represents the user load power at time $t$.
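As a small worked illustration of the balance relation, the sketch below evaluates it for a surplus and a deficit case. The function name and the kilowatt units are assumptions; the sign simply follows the formula, so a positive result means generation exceeds load.

```python
def backbone_exchange_power(p_prc: float, p_u: float) -> float:
    """P_trans(t) = P_prc(t) - P_u(t); inputs assumed to be in kW."""
    return p_prc - p_u


surplus = backbone_exchange_power(120.0, 90.0)  # generation exceeds load
deficit = backbone_exchange_power(70.0, 90.0)   # load exceeds generation
```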

(3) Establishing an evaluation index weight intelligent agent;

(3.1) setting the observed state quantities required by the agent: the state comprises the intra-area and inter-area evaluation indices established in step (1), and the state space is as follows:

$$s_t \in S: \{E_{in,i,t}, E_{ex,i,t}\}$$

in the state space $S$, $E_{in,i,t}$ represents the state quantity of the $i$-th intra-area index at time $t$, $E_{ex,i,t}$ represents the state quantity of the $i$-th inter-area index at time $t$, $s_t$ is the state of the agent at time $t$, and $t$ is the current time of the multi-microgrid system;

(3.2) setting the action values of the agent: the actions of the agent are increasing a weight, decreasing a weight and leaving it unchanged, and the action space is as follows:

$$a_t \in A: \{0, 1, 2\}$$

in the action space $A$, 0 means decreasing the weight, 1 means increasing the weight, and 2 means leaving it unchanged;
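The three-valued action space can be encoded as an enumeration and applied to a single index weight. This is a sketch under stated assumptions: the text does not give a step size or weight bounds, so the `step` of 0.05 and the clamp to the interval [0, 1] are hypothetical choices, as are the names `WeightAction` and `apply_action`.

```python
from enum import IntEnum


class WeightAction(IntEnum):
    """Action codes as given in the action space A."""
    DECREASE = 0
    INCREASE = 1
    KEEP = 2


def apply_action(weight: float, action: WeightAction, step: float = 0.05) -> float:
    """Return the updated weight; step size and [0, 1] clamp are assumptions."""
    if action == WeightAction.DECREASE:
        weight -= step
    elif action == WeightAction.INCREASE:
        weight += step
    # WeightAction.KEEP leaves the weight unchanged
    return min(1.0, max(0.0, weight))


w = apply_action(0.5, WeightAction(1))  # action code 1 increases the weight
```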

(3.3) setting the reward function of the agent: when the agent is in state $s_t$ and takes action $a_t$, the reward obtained is the operating cost of the multi-microgrid system at time $t$, given by:

$$r_t = P_{trans} \times R_t \times \Delta t$$

where $r_t$ represents the operating cost of the multi-microgrid system at time $t$, $R_t$ represents the electricity price for exchange between the multi-microgrid system and the backbone network at time $t$, and $\Delta t$ represents the action time interval;
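The reward is a plain product of exchanged power, price and interval length. The following sketch evaluates it for one step; the function name and the units (kW, currency per kWh, hours) are assumptions chosen so that the product is a cost in currency units.

```python
def operating_cost(p_trans_kw: float, price_per_kwh: float, dt_hours: float) -> float:
    """r_t = P_trans * R_t * dt, with assumed units kW, currency/kWh and hours."""
    return p_trans_kw * price_per_kwh * dt_hours


# a 20 kW deficit for a 15-minute interval at a price of 0.5 per kWh
r_t = operating_cost(-20.0, 0.5, 0.25)
```

Because the sign of $P_{trans}$ carries through, whether a negative value is read as a cost or a revenue depends on the sign convention adopted for exchange with the backbone network, which the text does not spell out.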

(3.4) modelling the actions of the agent with a deep neural network: the state quantities from the state space are input to the deep neural network, and the deep neural network outputs the index weight value $\lambda(s, a)$ for the observed state:

$$\lambda(s, a) = E\left(r_t + \lambda(s_{t+1}, a_{t+1}) \mid s_t, a_t\right)$$

where $\lambda(s, a)$ represents the expected value of taking action $a_t$ when the agent observes state $s_t$, and $E(\cdot)$ denotes the expected value.
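The recursion for $\lambda(s, a)$ has the form of an on-policy temporal-difference target without a discount factor. The sketch below swaps the deep neural network for a lookup table so that it stays dependency-free; it is a tabular stand-in, not the network described in the text. The learning rate `alpha`, the string state labels and the function names are all hypothetical.

```python
from collections import defaultdict

N_ACTIONS = 3  # decrease, increase, keep


def make_table():
    """Map each (hashable) state to a list of lambda(s, a) estimates."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def td_update(table, s, a, r, s_next, a_next, alpha=0.1):
    """Move lambda(s, a) toward the sampled target r + lambda(s', a').

    This estimates E(r_t + lambda(s_{t+1}, a_{t+1}) | s_t, a_t) from samples;
    alpha is an assumed learning rate, and no discount is applied because
    the stated recursion has none.
    """
    target = r + table[s_next][a_next]
    table[s][a] += alpha * (target - table[s][a])
    return table[s][a]


table = make_table()
first = td_update(table, "s0", 1, -2.0, "s1", 2, alpha=0.5)
second = td_update(table, "s0", 1, -2.0, "s1", 2, alpha=0.5)
```

In a deep variant, the table lookup would be replaced by a network forward pass and the update by a gradient step on the squared difference between the estimate and the same target.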

(4) Constructing evaluation indices for the intra-area and inter-area partition effects; the evaluation indices of the intra-area and inter-area partition effects in step (4) are respectively:

$$G_{in} = \sum_i \lambda_{in,i} E_{in,i}, \quad G_{ex} = \sum_i \lambda_{ex,i} E_{ex,i}$$

where $G_{in}$ and $G_{ex}$ are the evaluation indices of the intra-area and inter-area partition effects respectively, and $\lambda_{in,i}$ and $\lambda_{ex,i}$ are the principal component contribution coefficients; a higher index value indicates a better dynamic partition strategy. The partition evaluation index system changes as the state information of the microgrids changes.
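Both $G_{in}$ and $G_{ex}$ are weighted sums of index values, so a single helper covers them. The sketch below is illustrative: the function name and the sample weights and index values are made up, and the indices are assumed to be already on comparable scales.

```python
def weighted_index(weights, indices):
    """Return sum_i lambda_i * E_i for matching lists of weights and indices."""
    if len(weights) != len(indices):
        raise ValueError("weights and indices must have the same length")
    return sum(w * e for w, e in zip(weights, indices))


# hypothetical contribution coefficients and intra-area index values
g_in = weighted_index([0.5, 0.25, 0.25], [2.0, 4.0, 8.0])
```

The same call with the inter-area coefficients and index values gives $G_{ex}$.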

(5) Designing a multi-micro-grid partition comprehensive evaluation index; the formula of the multi-micro-grid partition comprehensive evaluation index in the step (5) is as follows:

where $G_{in,j}$ is the $j$-th intra-area evaluation index, $G_{ex,j}$ is the $j$-th inter-area evaluation index, and $n$ is the number of partitions.

(6) Designing a multi-microgrid re-partitioning mechanism that accounts for changes in system nodes; step (6) is specifically as follows: considering that the nodes of the multi-microgrid system change with the environment and over time, if the comprehensive evaluation index $M_{IE}$ of any partition of the system falls below 90% of its original value, the multi-microgrid system needs to be re-partitioned; if the comprehensive evaluation index $M_{IE}$ of every partition in the system is greater than 90% of its original value, the current partition strategy remains good and re-partitioning is not needed.
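The re-partitioning rule is a per-partition threshold check against the original index values. A minimal sketch follows, assuming positive baseline values; the function name is hypothetical, and the case of exactly 90%, which the text leaves open, is treated here as not requiring re-partitioning.

```python
def needs_repartition(current, baseline, threshold=0.9):
    """Return True if any partition's M_IE fell below threshold * its original value.

    current:  M_IE of each partition now
    baseline: M_IE of each partition when the partition was created (assumed > 0)
    """
    if len(current) != len(baseline):
        raise ValueError("current and baseline must cover the same partitions")
    return any(m < threshold * m0 for m, m0 in zip(current, baseline))


degraded = needs_repartition([0.95, 0.85], [1.0, 1.0])  # second partition dropped
healthy = needs_repartition([0.95, 0.92], [1.0, 1.0])   # all above 90%
```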

A computer storage medium having stored thereon a computer program which, when executed by a processor, implements a regional smart grid partition assessment method based on deep reinforcement learning as described above.

A computer device comprises a memory, a processor and a computer program which is stored in the memory and can be run on the processor, wherein the processor implements the regional smart grid partition evaluation method based on deep reinforcement learning described above when executing the computer program.

The beneficial effects are that: compared with the prior art, the invention has the following advantages:

1. the regional smart grid partition evaluation index system of the invention considers multiple indices, including renewable energy utilization rate, intra-area supply and demand balance, distributed power supply adjustable capacity, degree of electrical coupling between nodes, load and distributed power supply matching degree, energy storage device adjustable capacity, inter-area average electrical distance, inter-area power interaction capacity, power scheduling cost, line loss, whether each sub-area is connected to the backbone network, and energy storage device degradation cost, and is therefore more comprehensive than existing regional smart grid partition indices;

2. the invention adopts a deep reinforcement learning method and determines the weights of the indices from the historical data of each node, so the weights are less affected by bad data and the method is more robust;

3. the invention can update and adjust the partition evaluation index system in time according to the state change of the network nodes so as to ensure the rationality and coordination of the partition evaluation index.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of partitioning a multi-microgrid system according to a partition evaluation index system.

Detailed Description

The technical scheme of the invention is further described below with reference to the accompanying drawings.

As shown in fig. 1 and 2, a regional smart grid partition evaluation method based on deep reinforcement learning includes the following steps:

power scheduling cost: $E_{ex,3}$ is the electric energy loss incurred for each 1 kWh of electric energy exchanged between the partition and its surrounding partitions;

$$P_{trans}(t) = P_{prc}(t) - P_u(t)$$

(3) Establishing an evaluation index weight agent;

$$s_t \in S: \{E_{in,i,t}, E_{ex,i,t}\}$$

$$a_t \in A: \{0, 1, 2\}$$

$$r_t = P_{trans} \times R_t \times \Delta t$$

$$\lambda(s, a) = E\left(r_t + \lambda(s_{t+1}, a_{t+1}) \mid s_t, a_t\right)$$

$$G_{in} = \sum_i \lambda_{in,i} E_{in,i}, \quad G_{ex} = \sum_i \lambda_{ex,i} E_{ex,i}$$

(6) Designing a multi-microgrid re-partitioning mechanism that accounts for changes in system nodes, wherein step (6) is specifically as follows: considering that the nodes of the multi-microgrid system change with the environment and over time, if the comprehensive evaluation index $M_{IE}$ of any partition of the system falls below 90% of its original value, the multi-microgrid system needs to be re-partitioned; if the comprehensive evaluation index $M_{IE}$ of every partition in the system is greater than 90% of its original value, the current partition strategy remains good and re-partitioning is not needed.

Claims

1. The regional intelligent power grid partition evaluation method based on deep reinforcement learning is characterized by comprising the following steps of:

energy storage device adjustable capacity: $E_{in,6} = P_{ba,st}(t) - P_{ba,c}(t)$, where $P_{ba,c}(t)$ is the actual power generated by the energy storage device at time $t$ and $P_{ba,st}(t)$ is the rated power of the energy storage device at time $t$;

energy storage device degradation cost: $E_{ex,6}$ is the wear of the energy storage device incurred each time it stores or releases 1 kWh of electric energy;

$E_{in,i}$ and $E_{ex,i}$ are the principal components of the corresponding intra-area and inter-area evaluation indices;

(2) Setting power balance limit of the multi-micro-grid system;

(3) Establishing an evaluation index weight intelligent agent;

$$s_t \in S: \{E_{in,i,t}, E_{ex,i,t}\}$$

$$a_t \in A: \{0, 1, 2\}$$

$$r_t = P_{trans} \times R_t \times \Delta t$$

$$\lambda(s, a) = E\left(r_t + \lambda(s_{t+1}, a_{t+1}) \mid s_t, a_t\right)$$

where $\lambda(s, a)$ represents the expected value of taking action $a_t$ when the agent observes state $s_t$, and $E(\cdot)$ denotes the expected value;

(4) Constructing an evaluation index system of the intra-zone and inter-zone dividing effect;

(5) Designing a multi-micro-grid partition comprehensive evaluation index;

(6) Designing a multi-microgrid re-partitioning mechanism.

2. The regional smart grid partition evaluation method based on deep reinforcement learning according to claim 1, wherein the power balance relation corresponding to the power balance limit in step (2) is as follows:

$$P_{trans}(t) = P_{prc}(t) - P_u(t)$$

3. The regional smart grid partition evaluation method based on deep reinforcement learning according to claim 1, wherein the evaluation index systems of the intra-region and inter-region partition effects in the step (4) are respectively:

$$G_{in} = \sum_i \lambda_{in,i} E_{in,i}, \quad G_{ex} = \sum_i \lambda_{ex,i} E_{ex,i}$$

4. The regional smart grid partition evaluation method based on deep reinforcement learning according to claim 1, wherein the formula of the multi-micro-grid partition comprehensive evaluation index in step (5) is as follows:

5. The regional smart grid partition evaluation method based on deep reinforcement learning according to claim 1, wherein step (6) is specifically: if the comprehensive evaluation index $M_{IE}$ of any partition of the system is less than 90% of its original value, the multi-microgrid system needs to be re-partitioned; if the comprehensive evaluation index $M_{IE}$ of every partition in the system is greater than 90% of its original value, the current partition strategy remains good and re-partitioning is not needed.

6. A computer storage medium having a computer program stored thereon, which when executed by a processor implements a deep reinforcement learning based regional smart grid partition assessment method as claimed in any one of claims 1 to 5.

7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements a regional smart grid partition assessment method based on deep reinforcement learning as claimed in any one of claims 1 to 5 when the computer program is executed by the processor.