CN112952847A

CN112952847A - Multi-region active power distribution system peak regulation optimization method considering electricity demand elasticity

Info

Publication number: CN112952847A
Application number: CN202110366783.5A
Authority: CN
Inventors: 唐昊; 曹永伦; 王正风; 吴旭; 李智; 吕凯; 谭琦
Original assignee: Hefei University of Technology
Current assignee: Hefei University of Technology; State Grid Anhui Electric Power Co Ltd
Priority date: 2021-04-06
Filing date: 2021-04-06
Publication date: 2021-06-11
Anticipated expiration: 2041-04-06
Also published as: CN112952847B

Abstract

The invention discloses a peak regulation optimization method for a multi-region active power distribution system considering electricity demand elasticity. First, mathematical models of the photovoltaic units, the all-vanadium redox flow battery energy storage system and the multi-region flexible load scheduling units are established in a power elasticity environment. Then, the peak regulation optimization problem of the multi-region active power distribution system considering electricity demand elasticity is formulated as a DTMDP model. Finally, the model is solved by combining reinforcement learning with an intelligent algorithm to obtain a multi-region scheduling optimization control strategy that meets the peak regulation requirements. The hierarchical learning mechanism alleviates the curse of dimensionality of reinforcement learning to a certain extent and speeds up the solution of the scheduling strategy; combining reinforcement learning with an intelligent algorithm further enhances the exploration capability of the algorithm, which helps in finding the optimal peak regulation strategy; and considering electricity demand elasticity allows more of the potential scheduling information of the active power distribution system to be exploited, promoting smooth and safe operation of the system.

Description

Multi-region active power distribution system peak regulation optimization method considering electricity demand elasticity

Technical Field

The invention belongs to the field of scheduling optimization of multi-region active power distribution systems, and in particular relates to a dynamic scheduling optimization method for a multi-region active power distribution system that takes the peak regulation demand of the power grid and the elasticity of electricity demand into account and aims at stable and economic operation of the system.

Background

Currently, research on active power distribution systems focuses on planning and design, hierarchical control, operation management, and the like. Research on planning mainly concerns the optimal configuration of distributed power sources, grid structure planning and similar topics; research on hierarchical coordinated control provides technical support for scheduling and managing various resources, and achieves overall optimization by managing distributed energy sources layer by layer; research on operation management covers multi-enclosure reactive power compensation, scheduling optimization and the like.

Traditional research on power grid peak regulation mainly considers the start-up, shutdown and output regulation of generation-side units; in particular, the joint operation optimization of multi-energy complementary systems containing energy storage devices has been shown to effectively relieve the peak regulation pressure of the system. With the development of active power distribution systems and the application of flexible load scheduling technology, scheduling and optimizing the various high-quality peak regulation resources on the demand side, so as to achieve economic peak regulation and improve energy utilization, has become an important trend in current peak regulation research and an effective supplement to generation-side peak regulation.

Current research mostly analyzes the overall load of a regional power grid and pays less attention to the load characteristics of different industries. The influence of the differences among industry loads and of changes in their proportions is ignored to a certain extent, which makes it difficult to grasp the variation pattern of the power load accurately. Meanwhile, the electricity demand elasticity of different types of users is rarely characterized, which makes it harder, during power system planning, to guide users to choose reasonable times of use and adjust their consumption so as to achieve better peak clipping and valley filling.

Disclosure of Invention

In view of the above shortcomings of the prior art, the invention provides a peak regulation optimization method for a multi-region active power distribution system considering electricity demand elasticity, so that power users can be guided to choose reasonable times of use and adjust their consumption, achieving peak clipping and valley filling, raising the load rate, and improving the safety and stability of power grid operation.

The invention adopts the following technical scheme for solving the technical problems:

The invention relates to a peak regulation optimization method for a multi-region active power distribution system considering electricity demand elasticity, characterized by comprising the following steps:

step 1, constructing a multi-region active power distribution system comprising: a dispatching center, an industrial park dispatching area, a commercial park dispatching area, and a municipal and residential life park dispatching area; any one of the industrial park dispatching area, the commercial park dispatching area and the municipal and residential life park dispatching area is denoted as area i;

area i comprises: an ith PV generating unit, an ith VRB energy storage unit and an ith load scheduling unit; the load types of the ith load scheduling unit comprise: an ith reducible load and an ith rigid load; the load types of the load scheduling unit in the industrial park dispatching area further comprise: a transferable load;

step 2, at any decision time t in a scheduling day, determining the predicted values of the grid peak regulation demand, the photovoltaic output and the various types of load demand of area i based on historical data; wherein the grid peak regulation demand is recorded as

the photovoltaic output is recorded as

the reducible load demand among the various types of load demand is recorded as

and the transferable load demand is recorded as

Step 3, establishing mathematical models of various load scheduling units, VRB energy storage units and PV power generation units in industrial, commercial, municipal and life parks, a region scheduling time attribute mathematical model and a peak shaving task allocation mechanism considering region elasticity amplitude:

step 3.1, establishing mathematical models of various types of loads of multiple areas under the elastic environment:

obtaining the minimum and maximum load reduction constraints of area i at decision time t using formula (1):

in formula (1),

is the maximum curtailable amount of the reducible load of area i at decision time t;

is the actual load reduction of area i at decision time t;

obtaining the output constraints of the transferable load using formulas (2) to (4):

in the formulae (2) to (4),

and

are respectively the maximum allowable load increase and the maximum allowable load decrease of the transferable load in area i at decision time t;

is the load increase of the transferable load of area i at decision time t;

is the load decrease of the transferable load of area i at decision time t;

the cumulative load increase or decrease resulting from the actions taken at each decision time from the initial time up to decision time t is defined as

Suppose a single scheduling day has t_{K-1} decision times; then

is the elastic margin of the transferable load over the remaining period t_{K-1} - t;

obtaining the margin constraint of area i at decision time t using formulas (5) and (6)

and the transfer direction constraint

thereby obtaining the transferable load constraint of area i at decision time t

in formulas (5) and (6), a_acctr is the elastic margin coefficient; e is the natural constant; α_dir is the transfer-action direction selection coefficient;

is the cumulative transferred load increase or decrease up to time t-1;

is the load increase or decrease caused by the action taken on the transferable load at the current decision time t; when the load is increased, the value of

is positive; when the load is reduced, the value of

is negative; and when no action is taken,

is 0;

step 3.2, establishing a mathematical model of the output of the VRB energy storage unit:

establishing the constraints of the VRB energy storage unit within one scheduling day using formulas (7) to (10), comprising: the terminal voltage constraint, the charge/discharge power constraint, the state-of-charge constraint, and the initial/final state-of-charge consistency constraint:

in formulas (7) to (10),

are the upper and lower limits of the terminal voltage of the VRB energy storage unit of area i,

are the minimum and maximum charge/discharge power of the VRB energy storage unit of area i at decision time t,

is the actual charge/discharge power of the VRB energy storage unit of area i at decision time t,

are the upper and lower limits of the remaining capacity of the VRB energy storage unit of area i,

is the remaining capacity of the VRB energy storage unit of area i at decision time t, t_s and t_e are the start and end times of the scheduling day, and C_con is the set expected value of the state of charge of the VRB energy storage unit;

step 3.3, establishing a mathematical model of the photovoltaic power generation output:

obtaining the predicted photovoltaic output power of area i at decision time t using formula (11)

in formula (11), η_pv is the photoelectric conversion efficiency; n_pv is the number of photovoltaic panels; s_pv is the illuminated surface area of a photovoltaic panel;

is the solar radiation intensity of area i at decision time t; α_pv is the temperature conversion coefficient,

is the outdoor temperature of area i at decision time t;

step 3.4, establishing a mathematical model of the regional scheduling time attribute:

obtaining the time attribute T^{i,t} of area i at decision time t using formula (12):

in formula (12),

is the time magnitude parameter of area i; c is a constant; p^{i,t} is the power output of area i at decision time t,

is the maximum output power of area i on the scheduling day;

step 3.5, establishing a peak regulation task allocation mechanism considering the elastic amplitude of the region:

let the elastic amplitude of area i at decision time t be E^{i,t}; the upper bound of E^{i,t} = upper bound of the reducible load + upper bound of the transferable load increase + energy storage discharge margin; the lower bound of E^{i,t} = lower bound of the reducible load + lower bound of the transferable load decrease + energy storage charging margin; the span between the upper and lower bounds constitutes the elastic amplitude of area i;
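The composition of the elastic amplitude in step 3.5 can be illustrated with a short sketch. The class name, field names and sign conventions below are hypothetical (the patent's symbols are not reproduced in this text), and all quantities are assumed to be expressed in the same power unit.

```python
from dataclasses import dataclass


@dataclass
class RegionElasticity:
    """Elastic resources of one area at one decision time (hypothetical names)."""
    reducible_upper: float          # upper bound of the reducible load
    reducible_lower: float          # lower bound of the reducible load
    transfer_increase_upper: float  # upper bound of the transferable load increase
    transfer_decrease_lower: float  # lower bound of the transferable load decrease
    discharge_margin: float         # energy storage discharge margin
    charge_margin: float            # energy storage charging margin

    def upper_bound(self) -> float:
        # Upper bound = reducible upper + transferable increase upper + discharge margin
        return (self.reducible_upper
                + self.transfer_increase_upper
                + self.discharge_margin)

    def lower_bound(self) -> float:
        # Lower bound = reducible lower + transferable decrease lower + charging margin
        return (self.reducible_lower
                + self.transfer_decrease_lower
                + self.charge_margin)

    def amplitude(self) -> float:
        # The elastic amplitude is the span between the two bounds
        return self.upper_bound() - self.lower_bound()
```

A dispatching center could compare `amplitude()` across areas before allocating peak regulation tasks; how the amplitude enters the allocation itself is governed by formula (20), which this sketch does not attempt to reproduce.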

step 4, modeling of continuous variable discretization and uncertain random variable dynamic change processes:

step 4.1, establishing a multi-region power grid peak regulation demand uncertainty model:

at decision time t, the maximum range interval of the random uncertain part of the peak regulation demand instruction issued by the power grid to area i in real time,

is discretized into

in total

levels, wherein

is the maximum upward fluctuation relative to the predicted peak regulation demand power of area i at decision time t,

is the maximum downward fluctuation relative to the predicted peak regulation demand power of area i at decision time t;

are the maximum discrete levels of the upward and downward fluctuations relative to the predicted peak regulation demand power of area i;

obtaining the actual grid peak regulation demand of area i at decision time t using formula (13):

in formula (13),

is the predicted grid peak regulation demand power of area i at decision time t,

is the power level of the uncertain part of the grid peak regulation demand of area i at decision time t,

is the minimum unit power obtained after discretizing the uncertain part of the grid peak regulation instruction of area i at decision time t;

step 4.2, establishing a photovoltaic output uncertainty model:

the maximum range interval of the uncertain part of the photovoltaic output of area i at decision time t,

is discretized into

in total

levels, wherein

is the maximum upward fluctuation relative to the predicted photovoltaic output power of area i at decision time t,

is the maximum downward fluctuation relative to the predicted photovoltaic output power of area i at decision time t;

are the maximum discrete levels of the upward and downward fluctuations relative to the predicted photovoltaic output power of area i;

obtaining the actual photovoltaic output of area i at decision time t using formula (14)

in formula (14),

is the predicted photovoltaic output power of area i at decision time t,

is the power level of the uncertain part of the photovoltaic output of area i at decision time t;

is the minimum unit power obtained after discretizing the uncertain part of the photovoltaic output of area i at decision time t;

step 4.3, establishing an uncertainty model for each type of load demand in the multiple regions:

the maximum range intervals of the random uncertain parts of the reducible load and the transferable load of area i at decision time t,

and

are respectively discretized into the corresponding state levels

and

in total

and

levels, wherein

and

are respectively the maximum upward fluctuations relative to the predicted demand power of the reducible load and the transferable load of area i at decision time t,

and

are respectively the maximum downward fluctuations relative to the predicted demand power of the reducible load and the transferable load of area i at decision time t;

and

are respectively the maximum discrete levels of the upward and downward fluctuations relative to the predicted demand power of the reducible load and the transferable load of area i;

obtaining the actual demand power of the reducible load of area i at decision time t using formulas (15) to (16)

and the actual demand power of the transferable load

in formulas (15) to (16),

are the predicted demand power of the reducible load and the transferable load of area i at decision time t,

are respectively the power levels of the uncertain parts of the demand power of the reducible load and the transferable load of area i at decision time t;

are respectively the minimum unit powers obtained after discretizing the uncertain parts of the demand power of the reducible load and the transferable load of area i at decision time t;
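Steps 4.1 to 4.3 share one pattern: the actual value is the predicted value plus an integer fluctuation level multiplied by a minimum unit power. The sketch below illustrates that pattern only; the function names are hypothetical, and the use of a single unit power for both the upward and downward directions is an assumption.

```python
from typing import List


def fluctuation_levels(n_up: int, n_down: int) -> List[int]:
    """All discrete levels from the largest downward to the largest upward one."""
    return list(range(-n_down, n_up + 1))


def actual_power(predicted: float, level: int, unit_power: float) -> float:
    """Actual value = predicted value + level * minimum unit power."""
    return predicted + level * unit_power


def nearest_level(deviation: float, unit_power: float,
                  n_up: int, n_down: int) -> int:
    """Map a continuous deviation from the forecast to the closest allowed level."""
    level = round(deviation / unit_power)
    # Clip to the maximum discrete levels in either direction
    return max(-n_down, min(n_up, level))
```

For example, with a forecast of 10.0, a unit power of 0.5 and a level of -2, `actual_power` gives 9.0; the same helper applies to the peak regulation demand, the photovoltaic output and both load types.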

step 5, establishing a corresponding DTMDP model according to the peak regulation optimization problem of the multi-region active power distribution system considering the elasticity of power consumption requirements:

step 5.1, system state space and action set of the DTMDP model:

dividing a scheduling day into K decision periods, indexed by k ∈ {0, 1, ..., K-1}; the length of each decision period is Δt, the decision time of the kth decision period is t_k, and the end time of the scheduling day is t_{K-1};

obtaining the state of the dispatching center at decision time t_k using formulas (17) to (18)

in formulas (17) to (18),

is the real-time grid peak regulation demand state level at decision time t_k,

is the environmental information of area i at decision time t_k, composed of the photovoltaic output state level

the VRB energy storage unit charge/discharge state level

the multi-type load scheduling unit load demand state level

the elastic margin state level

and the area elastic amplitude state level

s_up is the state space of the dispatching center, and N is the number of areas;

let M be the number of load types contained in the multi-type load; if area i does not consider a certain load type j, the corresponding number of states

is 0; the total number of states N_{up,s} is obtained using formula (19):

in formula (19), N_peak is the maximum state level of the real-time grid peak regulation demand,

is the maximum photovoltaic output state level of area i,

is the maximum charge/discharge state level of the VRB energy storage unit of area i,

is the maximum elastic margin state level of area i,

is the maximum elastic amplitude state level of area i,

is the maximum load demand state level of area i;

the maximum interval of the random peak regulation demand power of the dispatching center at decision time t_k,

is discretized into N_ap levels, 0 to N_ap - 1, wherein

is the total peak regulation power demand of the dispatching center at decision time t_k, and N_ap is the maximum discrete level of the total peak regulation demand of the dispatching center;

obtaining the peak regulation task quantity allocated by the dispatching center to area i using formula (20)

in formula (20),

is the peak regulation task action allocated by the dispatching center to area i at decision time t_k;

establishing the peak regulation task action allocation constraint using formula (21):

in formula (21),

A_i is the set of all possible peak regulation task action vectors for area i;

obtaining the action vector of the dispatching center at decision time t_k using formula (22)

in formula (22), A_up is the set of all possible action vectors of the dispatching center, i.e. its action set; the total number of actions of the dispatching center is N_{up,a} = N_ap;

obtaining the state of area i at decision time t_k using formula (23)

in formula (23),

is the state space of area i;

obtaining the total number of states of area i using formula (24)

obtaining the action of area i at decision time t_k using formula (25)

in formula (25),

is the action of the VRB energy storage unit of area i at decision time t_k, whose three values denote the discharging, idle and charging actions respectively;

is the adjusting action of the load scheduling unit of area i at decision time t_k, comprising the load reduction action of the reducible load

and the load transfer action

which are the different incentive control actions of area i at decision time t_k;

is the set of all possible action vectors of area i, i.e. the action set of area i;

obtaining the total number of actions of area i using formula (26)

Step 5.2, defining the state transition process of the DTMDP model:

establishing the state-of-charge transition equation of the VRB energy storage unit using formula (27):

in formula (27), N is the number of single cells connected in series in the stack, I_d is the charge/discharge current,

is the total capacity of the VRB energy storage unit;

is the state of charge of the VRB energy storage unit of area i at the current decision time t_k,

is, after the VRB energy storage unit takes the charge/discharge action

the resulting state of charge;

establishing the reducible load state transition equation using formula (28):

in formula (28),

is, after area i takes the reduction action

at decision time t_k, the resulting reducible load demand,

is the predicted demand power of the reducible load of area i at decision time t_k,

is the uncertain part of the reducible load demand at decision time t_k,

is the maximum discrete level of the reducible load demand;

establishing the transferable load state transition equation using formula (29):

in formula (29),

is, after area i takes the transfer action

at decision time t_k, the resulting transferable load demand,

is the transfer action that area i takes at the final decision time t_{K-1},

is the predicted demand power of the transferable load of area i at decision time t_k,

is the uncertain part of the transferable load demand at decision time t_k,

is the maximum discrete level of the transferable load demand;

step 5.3, establishing an objective function of the DTMDP model:

obtaining the upper-layer cost in decision period k using formula (30)

in formula (30), c^{i,k} is the cost generated by area i during the state transition of decision period k;

establishing the initial/final state-of-charge consistency constraint of the VRB energy storage unit using formula (31):

in formula (31),

is the weight coefficient of the final state of the VRB energy storage unit,

and

are respectively the actual capacity level and the expected capacity level of the VRB energy storage unit at the last decision time;

step 5.4, establishing an optimization target of the DTMDP model:

obtaining, using formula (32), the finite-horizon optimization performance criterion of the dispatching center under policy π_up with initial state s₀

the upper-layer optimization objective is to find the optimal policy in the policy set Ω_up

obtaining, using formula (33), the finite-horizon optimization performance criterion of area i under policy π_{dow,i} with initial state s₀

the lower-layer optimization objective is to find the optimal policy in the policy set Ω_{dow,i}

step 6, solving the DTMDP model established in step 5 using Q-learning based on simulated annealing:

first, the parameters of the DTMDP model, the learning parameters, the upper-layer and lower-layer Q-value tables, the current learning step count and the decision period are initialized; then the upper and lower layers each randomly select, according to the policy, an action for the current state, generate the corresponding cost and update the Q-value tables; the Q-value tables are updated iteratively until the termination condition is met, yielding, for one scheduling day, the scheduling strategy of each schedulable resource in each decision period that meets the peak regulation demand of the dispatching center.
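The idea of step 6 can be sketched with a minimal, single-layer example of Q-learning that uses a simulated-annealing (Metropolis) rule for action selection. This is not the two-layer model of step 5: the toy storage environment, the cost, the parameter values and all names are assumptions made purely for illustration, and the patent's own update rule and annealing schedule are not reproduced here.

```python
import math
import random

ACTIONS = (-1, 0, 1)   # toy storage actions: discharge, idle, charge
N_LEVELS = 5           # toy discrete storage levels 0..4
TARGET = 2             # level at which the toy cost is zero


def step(level, action):
    """Toy transition: move the level, clip it, pay the distance to TARGET."""
    nxt = min(N_LEVELS - 1, max(0, level + action))
    return nxt, abs(nxt - TARGET)


def select_action(q_row, temperature, rng):
    """Metropolis selection for a cost-minimizing Q table.

    A random candidate replaces the greedy action with probability
    exp(-(Q[candidate] - Q[greedy]) / temperature).
    """
    greedy = min(range(len(q_row)), key=q_row.__getitem__)
    candidate = rng.randrange(len(q_row))
    delta = q_row[candidate] - q_row[greedy]   # always >= 0
    if rng.random() < math.exp(-delta / temperature):
        return candidate
    return greedy


def q_update(q, s, a, cost, s_next, alpha=0.5, gamma=0.9):
    """One-step Q-learning update for costs (min instead of max)."""
    target = cost + gamma * min(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train(episodes=500, horizon=10, t0=1.0, decay=0.99, t_min=0.01, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_LEVELS)]
    temperature = t0
    for _ in range(episodes):
        s = rng.randrange(N_LEVELS)
        for _ in range(horizon):
            a = select_action(q[s], temperature, rng)
            s_next, cost = step(s, ACTIONS[a])
            q_update(q, s, a, cost, s_next)
            s = s_next
        # Cool down: exploration becomes rarer as learning proceeds
        temperature = max(t_min, temperature * decay)
    return q
```

At a high temperature almost any candidate action is accepted, while at a low temperature the selection is nearly greedy; this is how the annealing schedule trades exploration against exploitation. In the hierarchical setting described above, one such table would be kept for the dispatching center and one for each area.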

Compared with the prior art, the invention has the beneficial effects that:

1. For the scheduling optimization problem of a multi-region active power distribution system, the invention takes grid peak regulation and electricity demand elasticity into account, uses multiple power elasticity resources for cooperative optimization, and solves for the strategy with a Q-learning algorithm, thereby improving, to a certain extent, the load utilization rate and the participation of each power-consuming entity in peak regulation.

2. The invention adopts a hierarchical learning algorithm: the active power distribution system is decomposed into several relatively independent subsystems by duplicating boundary nodes, and overall coordination is achieved only by exchanging boundary node data. The original huge knowledge matrix is thereby decomposed into several smaller knowledge matrices, which reduces the number of state-action pairs, alleviates the curse of dimensionality to a certain extent, and suits the cooperative scheduling of multiple regions.

3. For the scheduling problem of a multi-region active power distribution system, the simulated annealing algorithm is combined with Q-learning, which further enhances the exploration capability of the algorithm, helps avoid convergence to a local optimum, and makes it easier to obtain the optimal peak regulation strategy; meanwhile, the dispatching center allocates peak regulation tasks while considering the elastic amplitude of each region, which reduces the optimization time to a certain extent; considering electricity demand elasticity allows more of the potential scheduling information of the active power distribution system to be exploited, provides an intelligent solution for scheduling the active power distribution system, and promotes stable and safe operation of the system.

Drawings

FIG. 1 is a schematic diagram of a multi-zone active power distribution system according to the present invention;

FIG. 2 is a flow chart of hierarchical coordinated scheduling as employed by the present invention;

FIG. 3 is a flowchart of the algorithm for solving the dynamic scheduling problem of the multi-zone active power distribution system according to the present invention.

Detailed Description

In this embodiment, a dynamic scheduling optimization method for a multi-region active power distribution system is applied to the multi-region active power distribution system shown in FIG. 1, in which the power elasticity resources include the photovoltaic arrays, the VRB energy storage devices and the multi-type flexible loads in each region. When the dispatching center allocates peak regulation tasks to the regions, the day-ahead predictions of the power elasticity resources and of the peak regulation demand in each period are used as initial input, the scheduling plan is allocated while considering the elastic amplitude of each region, and the scheduling plan is then corrected in a rolling manner on the real-time scale.

Referring to FIG. 2, the scheduling optimization method for the multi-zone active power distribution system is performed according to the following steps:

step 1, constructing a multi-region active power distribution system comprising: a dispatching center, an industrial park dispatching area, a commercial park dispatching area, and a municipal and residential life park dispatching area; any one of the industrial park dispatching area, the commercial park dispatching area and the municipal and residential life park dispatching area is denoted as area i;

area i comprises: an ith PV generating unit, an ith VRB energy storage unit and an ith load scheduling unit; the load types of the ith load scheduling unit comprise: an ith reducible load and an ith rigid load; the load types of the load scheduling unit in the industrial park dispatching area further comprise: a transferable load;


in formula (1),

is the actual load reduction of area i at decision time t;

compared with a power environment that does not consider electricity demand elasticity, variable incentive control is introduced so that the reducible load can deviate upward, to a certain degree, beyond the originally declared reduction upper limit.

A transferable load is characterized in that the user only needs to change the time at which electric energy is used, without lowering living requirements or interrupting production tasks, and the user's total electricity consumption over a period remains unchanged, so the increases and decreases of a transferable load balance out. Considering that users have a certain uncontrollable base load during scheduling, i.e. the load increase or decrease in each period is limited, the output constraints of the transferable load are obtained using formulas (2) to (4):

in formulas (2) to (4),

is the load decrease of the transferable load of area i at decision time t;

Suppose a single scheduling day has t_{K-1} decision times; then

is the elastic margin of the transferable load over the remaining period t_{K-1} - t; the elastic margin reflects in real time how the range of the transferable load changes over the remaining period, and thereby influences the scheduling decision for the transferable load;

and the transfer direction constraint; the load change caused by the transfer action at the current decision time t is positive when the load is increased, negative when the load is reduced, and 0 when no action is taken;

to fully reflect the dynamic variation of the terminal voltage, terminal current, state of charge (SOC) and other quantities during the charging and discharging of the all-vanadium redox flow battery (VRB), the constraints of the VRB energy storage unit within one scheduling day are established using formulas (7) to (10), comprising: the terminal voltage constraint, the charge/discharge power constraint, the state-of-charge constraint, and the initial/final state-of-charge consistency constraint.

In actual operation, the SOC of the energy storage device is kept between 0.2 and 0.8 so that the device works within its safe region and the charge/discharge efficiency of the battery is improved.
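The four kinds of VRB constraints can be checked against a candidate daily schedule as in the following sketch. The function name and parameters are hypothetical, and the consistency check (first and last SOC both equal to an expected value within a tolerance) is an assumed reading of the initial/final state-of-charge constraint, since formulas (7) to (10) are not reproduced in this text.

```python
from typing import Sequence


def vrb_schedule_is_feasible(
    voltages: Sequence[float],
    powers: Sequence[float],
    socs: Sequence[float],
    *,
    v_min: float,
    v_max: float,
    p_min: float,
    p_max: float,
    soc_min: float = 0.2,       # lower SOC limit quoted in the text
    soc_max: float = 0.8,       # upper SOC limit quoted in the text
    soc_expected: float = 0.5,  # assumed expected start/end SOC
    tol: float = 1e-6,
) -> bool:
    """Return True if every constraint holds at every decision time."""
    voltage_ok = all(v_min <= v <= v_max for v in voltages)
    power_ok = all(p_min <= p <= p_max for p in powers)
    soc_ok = all(soc_min <= s <= soc_max for s in socs)
    # Assumed form of the initial/final consistency constraint
    ends_ok = (abs(socs[0] - soc_expected) <= tol
               and abs(socs[-1] - soc_expected) <= tol)
    return voltage_ok and power_ok and soc_ok and ends_ok
```

In a learning-based solver such a check would typically be used to mask infeasible actions or to add a penalty to the cost, rather than to reject whole schedules after the fact.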

step 3.3, establishing a mathematical model of the photovoltaic power generation output:

in formula (11), η_pv is the photoelectric conversion efficiency; n_pv is the number of photovoltaic panels; s_pv is the illuminated surface area of a photovoltaic panel, in m²;

is the solar radiation intensity of area i at decision time t, in kW/m²; α_pv is the temperature conversion coefficient, generally of the order of 10⁻³,

is the outdoor temperature of area i at decision time t;

When different areas are scheduled cooperatively, a time dimension is introduced to account for the differences in their characteristics. When the dispatching center issues peak regulation tasks and coordinates multiple elastic resources, the adjustment order or adjustment amount of the different regional entities may change compared with the case in which this longitudinal time difference is ignored, so that more of the potential scheduling information of the active power distribution system is exploited. Since the scheduling time attribute is related to an area's own electricity consumption level and consumption characteristics, the time attribute T^{i,t} of area i at decision time t is obtained using formula (12), based on the typical daily load curve of each area and taking as the independent variable the absolute value of the difference between the load in each period and the area's daily peak load:

In equation (12),

is the time magnitude parameter of region i; c is a constant whose order of magnitude depends on

when that quantity is expressed in MW, c may be taken as 10^2; P^i,t is the power of region i at decision time t,

is the maximum power of region i on the scheduling day;

Because transferable loads are subject to strong time constraints, the scheduling time attribute is for now applied only to reducible loads.

+ transferable increased load upper bound

Compared with random task allocation, the peak shaving task allocation decision takes into account the elastic amplitude state of each region at the current moment, which makes further use of the system's elastic power resource information and improves electricity utilization efficiency to a certain extent. On this basis, the scheduling plan is corrected on a rolling basis, and an intra-day real-time scheduling model is established that considers the multi-period response characteristics of demand response resources.

is discretized into

for a total of

levels, where

in equation (13),

is the predicted peak shaving demand power of the grid for region i at decision time t,

Step 4.2, establishing an uncertainty model of the photovoltaic output:

is discretized into

for a total of

levels, where

is the maximum discrete level of the upward and downward fluctuation around the predicted photovoltaic output power of region i;

In equation (14),

and

are respectively discretized into the corresponding state levels

and

for a total of

and

levels, where

and

and

and

and the actual demand power of the transferable load

In the formulae (15) to (16),

The evolution over time of the uncertain part of each variable is approximated by a first-order Markov process, in which the transition probability at each moment follows a discrete Gaussian distribution centered on the current state of that uncertain part.
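A first-order Markov chain with discrete Gaussian transitions can be sketched as follows. This is a minimal illustration, not the transition model of the invention: the standard deviation `sigma`, the renormalization of each row over the available levels, and all function names are assumptions, since the source does not state the variance or how the distribution is truncated at the boundary levels.

```python
import math
import random


def discrete_gaussian_transition_matrix(n_levels, sigma=1.0):
    """Build a row-stochastic matrix whose row `s` is a Gaussian centered on level `s`.

    Weights are evaluated only at the integer levels 0..n_levels-1 and then
    renormalized, so each row sums to 1 (assumed truncation rule).
    """
    matrix = []
    for current in range(n_levels):
        weights = [
            math.exp(-((nxt - current) ** 2) / (2.0 * sigma ** 2))
            for nxt in range(n_levels)
        ]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


def sample_next_level(matrix, current, rng):
    """Draw the next discrete level given the current one (first-order Markov step)."""
    levels = list(range(len(matrix)))
    return rng.choices(levels, weights=matrix[current], k=1)[0]


# Example: five fluctuation levels, trajectory of ten decision periods.
P = discrete_gaussian_transition_matrix(5, sigma=1.0)
rng = random.Random(0)
level = 2
trajectory = [level]
for _ in range(10):
    level = sample_next_level(P, level, rng)
    trajectory.append(level)
```

Because each row peaks at the current level, the sampled uncertain part tends to stay near its present value and only occasionally jumps to a distant level.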

Step 5.1, defining the system state space and action set of the DTMDP model:

In the formulae (17) to (18),

VRB energy storage unit charging and discharging state grade

Multi-type load scheduling unit load demand state grade

Elastic margin state rating

And zone elastic amplitude state level

is 0; the total number of states N_up,s is obtained using equation (19):

is the maximum photovoltaic output state level of region i,

is the maximum charge-discharge state level of the VRB energy storage unit of region i,

is the maximum elastic margin state level of region i,

is the maximum elastic amplitude state level of region i,

is the maximum load demand state level of region i;

is discretized into N_ap levels, from 0 to N_ap-1, where

In equation (20),

is the peak shaving task action allocated to region i by the scheduling center at decision time t_k;

In equation (21),

A_i is the set of all possible peak shaving task action vectors of region i;

The state of region i at decision time t_k is obtained using equation (23)

In equation (23),

is the state space of region i;

the total number of states of the region i is obtained by equation (24)

The action of region i at decision time t_k is obtained using equation (25)

In equation (25),

and the load shifting action

are the different control actions executed by region i at decision time t_k;

the total number of actions of region i is obtained using equation (26)

Step 5.2, defining the state transition process of the DTMDP model:

is the total capacity of the VRB energy storage unit;

is the state of charge of the VRB energy storage unit after it takes its charging or discharging action;

a reducible load state transition equation is established using equation (28):

In equation (28),

is the reducible load demand state of region i after it takes its curtailment action at decision time t_k,

is the uncertain part of the reducible load demand at decision time t_k,

is the maximum discrete level of the reducible load demand;

Because the transferable load must keep its total amount unchanged before and after the transfer, the state transition equation of the transferable load is established using equation (29):

In equation (29),

is the transferable load demand state of region i after it takes its transfer action at decision time t_k,

is the maximum discrete level of the transferable load demand; the constraint that the total amount is unchanged before and after the transfer is enforced by the adjustment action at the last moment:

when positive, zero, or negative, respectively,

takes the values -1, 0, and 1;

step 5.3, establishing an objective function of the DTMDP model:

The upper-layer cost

is the superposition of the costs c^i,k returned at each operating step by each region of the lower layer:

Because the scheduling of the active power distribution system is periodic, the initial and final states of charge of the energy storage device must also be consistent. The initial-final state-of-charge consistency constraint of the VRB energy storage unit is established using equation (31):

In equation (31),

is the weight coefficient of the final state of the VRB energy storage unit,

and

step 5.4, establishing an optimization target of the DTMDP model:

Step 6, referring to Fig. 3, solving the established DTMDP model using simulated-annealing-based Q-learning according to the following steps:

Step 6.1, initializing the system model parameters, including the number of decision periods K in a single sample trajectory, the maximum level N_peak of the scheduling center's task allocation, the maximum photovoltaic output level of region i

the maximum output level of the energy storage device

the maximum power adjustment level of the multi-type flexible loads

the time-of-use electricity price

the rebound load coefficients β₁, β₂, β₃, the operation weight coefficients γ_vrb and λ_pv, etc.;

Step 6.2, initializing the system learning parameters, including the total number of sample trajectories M; the learning rate α_up, discount factor γ_up, and learning rate update coefficient η_up of the scheduling center; the learning rate of region i

the discount factor

and the learning rate update coefficient

the simulated annealing temperature T_temp and the simulated annealing coefficient η_temp;

Step 6.3, initializing the Q-value table Q_up of the scheduling center and the Q-value table of region i

initializing the state data of the scheduling center and of each region to determine the state of the scheduling center

and initializing the current learning step m = 0 and the current decision period k = 0;

Step 6.4, according to Q_up and the greedy policy greedypolicy_up, selecting in the current state

the greedy action that allocates the peak shaving tasks to each region

and simultaneously selecting a valid action at random

If

the current scheduling center action is

otherwise

The peak shaving task allocation action

is transmitted to each region, and the state of each region is observed

Step 6.5, for region i, according to

and the greedy policy

selecting in the current state

the corresponding greedy action of region i

and simultaneously selecting a valid action at random

If

the action of the current region i is

otherwise

The next-period state of the scheduling center is observed

the cost incurred in the process is accumulated and fed back to the scheduling center, and the Q-value table of each region is updated using equation (34):

Step 6.6, updating Q_up using equation (35); if k < K, returning to step 6.4; otherwise, proceeding to step 6.7:

Step 6.7, executing the actions currently selected by the scheduling center and each region i

calculating the cost incurred by the state transition under these actions in decision period K

updating, using equation (36), the Q-value tables of the scheduling center and each region i, Q_up,

and setting m := m + 1:

Step 6.8, if m < M, updating the learning rate α_up := η_up α_up,

and updating the temperature T_temp := η_temp T_temp, then returning to step 6.4; otherwise, ending the learning optimization, which yields the scheduling strategy of each scheduling resource in each decision period that meets the peak shaving requirement of the scheduling center within one scheduling day.
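The action selection, table update, and parameter decay used in steps 6.4 to 6.8 can be sketched for a single Q-value table as follows. This is an illustrative sketch under stated assumptions, not the method of equations (34) to (36), which are not reproduced in this text: the acceptance test is assumed to be the standard Metropolis criterion, the update is assumed to be the standard one-step Q-learning rule for cost minimization, and all function and variable names are hypothetical.

```python
import math
import random


def select_action(q_row, temperature, rng):
    """Choose between the greedy action and a random action (assumed Metropolis rule).

    Q-values are treated as costs, so the greedy action has the lowest Q-value.
    A random action is accepted with probability exp((Q_greedy - Q_random) / T),
    which shrinks toward 0 as the temperature decreases.
    """
    greedy = min(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    accept_prob = math.exp((q_row[greedy] - q_row[candidate]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, cost, next_state, alpha, gamma):
    """One-step Q-learning update for a cost-minimizing agent (assumed form)."""
    target = cost + gamma * min(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def decay(alpha, temperature, eta_alpha, eta_temp):
    """Step 6.8: alpha := eta * alpha and T_temp := eta_temp * T_temp."""
    return eta_alpha * alpha, eta_temp * temperature


# Example: two states, two actions, one learning step.
rng = random.Random(0)
q_table = [[0.0, 0.0], [0.0, 0.0]]
alpha, temperature = 0.5, 10.0
action = select_action(q_table[0], temperature, rng)
q_update(q_table, state=0, action=action, cost=2.0, next_state=1, alpha=alpha, gamma=0.9)
alpha, temperature = decay(alpha, temperature, eta_alpha=0.9, eta_temp=0.5)
```

At a high temperature the random action is accepted often, which favors exploration; as T_temp decays, the selection converges to the greedy action. In the two-layer scheme of the text, the same pattern would apply to Q_up and to the table of each region i.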

In conclusion, the invention can effectively deal with the randomness of each elastic resource in a multi-region active power distribution system and ensure the stable and safe operation of the system.

Claims

1. A peak regulation optimization method of a multi-region active power distribution system considering power demand elasticity is characterized by comprising the following steps:

the photovoltaic output is denoted

the reducible load demand among the multiple types of load demand is denoted

and the transferable load demand is denoted

in equation (1),

is the actual load reduction of region i at decision time t;

in the formulae (2) to (4),

and

the load reduction quantity of the area i at the decision time t;

Suppose a single scheduling day has t_K-1 decision times; then

is the elastic margin of the transferable load over the remaining period t_K-1 - t;

and the transfer direction constraint

is positive; when the load is reduced,

is negative; and when no action is taken,

is 0;

in the formulae (7) to (10),

Step 3.3, establishing a mathematical model of the photovoltaic power generation output:

the outdoor temperature of the area i at the decision time t;

In equation (12),

maximum output power of the region i on the dispatching day;

+ transferable increased load upper bound

+ energy storage discharge margin; the lower bound of the elastic amplitude E^i,t = reducible load lower bound + transferable decreased load lower bound

+ energy storage charging margin; the elastic amplitude of region i is the span between the upper and lower bounds;

is discretized into

for a total of

levels, where

in equation (13),

Step 4.2, establishing an uncertainty model of the photovoltaic output:

is discretized into

for a total of

levels, where

In equation (14),

is the power level of the uncertain part of the photovoltaic output of region i at decision time t;

and

are respectively discretized into the corresponding state levels

and

for a total of

and

levels, where

and

and

and

and the actual demand power of the transferable load

In equations (15) to (16),

are the power levels of the uncertain parts of the demand power of the reducible load and the transferable load of region i at decision time t;

Step 5.1, defining the system state space and action set of the DTMDP model:

In the formulae (17) to (18),

VRB energy storage unit charging and discharging state grade

Multi-type load scheduling unit load demand state grade

Elastic margin state rating

And zone elastic amplitude state level

is 0; the total number of states N_up,s is obtained using equation (19):

is the maximum photovoltaic output state level of region i,

is the maximum charge-discharge state level of the VRB energy storage unit of region i,

is the maximum elastic margin state level of region i,

is the maximum elastic amplitude state level of region i,

is the maximum load demand state level of region i;

is discretized into N_ap levels, from 0 to N_ap-1, where

In equation (20),

In equation (21),

A_i is the set of all possible peak shaving task action vectors of region i;

The state of region i at decision time t_k is obtained using equation (23)

In equation (23),

is the state space of region i;

the total number of states of the region i is obtained by equation (24)

The action of region i at decision time t_k is obtained using equation (25)

In equation (25),

and the load shifting action

are the different control actions executed by region i at decision time t_k;

the total number of actions of region i is obtained using equation (26)

Step 5.2, defining the state transition process of the DTMDP model:

is the total capacity of the VRB energy storage unit;

is the state of charge of the VRB energy storage unit after it takes its charging or discharging action;

a reducible load state transition equation is established using equation (28):

In equation (28),

is the reducible load demand state of region i after it takes its curtailment action at decision time t_k,

is the uncertain part of the reducible load demand at decision time t_k,

is the maximum discrete level of the reducible load demand;

In equation (29),

is the transferable load demand state of region i after it takes its transfer action at decision time t_k,

is the maximum discrete level of the transferable load demand;

step 5.3, establishing an objective function of the DTMDP model:

In equation (31),

is the weight coefficient of the final state of the VRB energy storage unit,

and

step 5.4, establishing an optimization target of the DTMDP model:

obtaining, using equation (32), the finite-horizon optimization performance criterion of the scheduling center under policy π_up with initial state s₀