CN109449925B

CN109449925B - Self-adaptive dynamic planning method for multi-objective joint optimization scheduling

Info

Publication number: CN109449925B
Application number: CN201811265172.6A
Authority: CN
Inventors: 马明; 汪宁渤; 董海鹰; 马彦宏; 张宏; 何世恩; 贠韫韵; 吕清泉; 韩旭杉; 李晓虎; 韩自奋; 丁坤; 李津; 王定美; 周强; 张健美; 王明松; 陈钊; 赵龙; 周识远
Original assignee: STATE GRID GASU ELECTRIC POWER RESEARCH INSTITUTE; Wind Power Technology Center Of State Grid Gansu Provincial Electric Power Co; State Grid Gansu Electric Power Co Ltd; Lanzhou Jiaotong University
Current assignee: STATE GRID GASU ELECTRIC POWER RESEARCH INSTITUTE; Wind Power Technology Center Of State Grid Gansu Provincial Electric Power Co; State Grid Gansu Electric Power Co Ltd; Lanzhou Jiaotong University
Priority date: 2018-10-29
Filing date: 2018-10-29
Publication date: 2022-09-20
Anticipated expiration: 2038-10-29
Also published as: CN109449925A

Abstract

The invention discloses an adaptive dynamic programming method for multi-objective joint optimization scheduling. The waste heat in the exhaust gas of the gas turbine is stored, a power grid scheduling model of the composite system is established by mechanism analysis, and a multi-objective function is established with minimum power generation cost and minimum environmental cost as the objectives; finally, the principle of adaptive dynamic programming is given, together with the process by which a goal representation heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme. The invention reduces the impact that the randomness, intermittence and fluctuation of wind power and photovoltaic output, and low prediction accuracy, have on the stable operation and active power balance of the power system; reduces the likelihood of wind or light curtailment; smooths the integrated wind-light-storage-gas output; realizes peak clipping and valley filling; and lowers the system operating cost and the emission control cost of the power generation system, improving the operating benefit and thereby supporting safe, stable and economic operation of the power system.

Description

Self-adaptive dynamic planning method for multi-objective joint optimization scheduling

Technical Field

The invention relates to a self-adaptive dynamic planning method for multi-objective joint optimization scheduling, belongs to the technical field of wind power integration, and is applied to optimization scheduling of a distributed renewable energy power generation system.

Background

In recent years, the uncertainty and volatility of intermittent new energy output have brought new challenges to power dispatching. To guarantee safe and economic operation of the power grid and to promote new energy consumption, multiple types of power sources are dispatched jointly, so as to obtain the optimal unit output and the optimal dispatching strategy.

The dispatch center usually derives the optimal scheduling strategy by dynamic programming, but the huge state space and the large number of decision stages lead to the curse of dimensionality, and the resulting amount of computation is more than a computer can handle.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provides an adaptive dynamic programming method for multi-objective joint optimization scheduling. The method goes beyond coordinating energy storage or conventional energy with wind power in isolation: it constructs a coordinated multi-source operation mechanism and establishes a wind-light-storage-gas multi-objective joint optimization scheduling model.

To achieve this purpose, the technical scheme adopted by the invention is as follows: an adaptive dynamic programming method for multi-objective joint optimization scheduling, in which the waste heat in the exhaust gas of the gas turbine is stored, a power grid scheduling model of the composite system is established by mechanism analysis, and a multi-objective function is established with minimum power generation cost and minimum environmental cost as the objectives; finally, the principle of adaptive dynamic programming is given, together with the process by which a goal representation heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme.

Further, the establishing of the multi-objective function means:

the objective function is established with the minimum power generation cost as the objective 1 as follows:

In the formula, t in the first term denotes any period of the scheduling cycle; T is the total number of periods in the scheduling cycle; k^w is the wind curtailment penalty factor; ΔP_t^w is the curtailed wind power in period t; and Δt^h is the number of hours in one period. In the second term, N_g is the total number of gas units and k^gas is the gas price coefficient; the other two quantities are the gas consumption and the mode conversion cost of the i-th gas unit in period t. In the third term, N_p is the total number of pumped storage units; the two cost quantities are the start-up costs of the i-th pumped storage unit at time t under the generating condition and under the pumping condition. In the fourth term, N_m is the total number of photothermal units and k^opt is the unit price of photothermal generation; the remaining quantity is the generated power of the i-th photothermal unit in period t;
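The four terms described above can be illustrated with a minimal Python sketch. It is not the patent's formula (1): the function name, the argument names and the choice to multiply only the curtailment term by the period length are assumptions made for illustration, and the numbers are invented.

```python
def generation_cost(curtailed_wind, gas_use, mode_cost, start_gen, start_pump,
                    csp_power, k_w, k_gas, k_opt, hours=1.0):
    """Sum the four cost terms over all periods (hypothetical layout).

    curtailed_wind[t]            curtailed wind power in period t
    gas_use[t][i], mode_cost[t][i]   gas consumption and mode conversion cost
    start_gen[t][i], start_pump[t][i]  pumped storage start-up costs
    csp_power[t][i]              photothermal generated power
    """
    total = 0.0
    for t in range(len(curtailed_wind)):
        # Term 1: wind curtailment penalty.
        total += k_w * curtailed_wind[t] * hours
        # Term 2: gas cost plus mode conversion cost of each gas unit.
        total += sum(k_gas * g + c for g, c in zip(gas_use[t], mode_cost[t]))
        # Term 3: pumped storage start-up costs (generating and pumping).
        total += sum(a + b for a, b in zip(start_gen[t], start_pump[t]))
        # Term 4: photothermal generation priced at k_opt.
        total += sum(k_opt * p for p in csp_power[t])
    return total


# Two periods, one unit of each type, invented values.
f1 = generation_cost(
    curtailed_wind=[1.0, 0.0],
    gas_use=[[2.0], [3.0]],
    mode_cost=[[0.0], [4.0]],
    start_gen=[[1.0], [0.0]],
    start_pump=[[0.0], [2.0]],
    csp_power=[[4.0], [6.0]],
    k_w=10.0, k_gas=5.0, k_opt=0.5,
)
print(f1)
```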

the objective function is established with the minimum emission control cost of the power generation system as the objective 2 as follows:

In the formula, k_poll is the pollutant emission cost coefficient, and the remaining quantity is the total output power of the i-th gas unit in period t;

The overall multi-objective function of the system is constructed from the two objective functions of formulas (1) and (2) as follows:

Z = min(f_1, f_2) (3)
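Minimizing two objectives at once generally yields a set of trade-off solutions rather than a single optimum. The sketch below filters candidate (f_1, f_2) pairs down to the non-dominated ones; the function name and the sample points are assumptions for illustration and are not taken from the patent.

```python
def pareto_front(points):
    """Return the non-dominated (f1, f2) pairs when both are minimized."""
    front = []
    for p in points:
        # p is dominated if some other point is no worse in both objectives.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Invented (generation cost, emission cost) candidates.
candidates = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 8)]
front = pareto_front(candidates)
print(front)
```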

Further, system constraints are imposed on the overall multi-objective function; they comprise the real-time energy balance constraint, the positive and negative reserve constraints and the branch capacity constraint. The real-time energy balance constraint is expressed as formula (4):

In the formula, k_ps is a ±1 variable that equals 1 when the pumped storage unit is in the water-discharging condition and -1 otherwise; the associated power term is the output power of the i-th pumped storage power station in period t; N_n is the total number of photothermal heat storage units; k_cr is a ±1 variable that equals -1 when the heat storage unit is storing heat and 1 otherwise; the associated term is the heat stored by the i-th heat storage unit in period t; P_t^w is the predicted wind power output in period t; D_t is the total load of the power grid in period t;

The positive and negative reserve constraints are expressed as formulas (5) and (6).

In the formulas, the two bounds are the maximum and minimum output power of the i-th gas unit in period t, and R_t is the reserve power of the composite power generation system in period t;

The branch capacity constraint is expressed as formula (7).

In the formula, the bound is the maximum power that line l can deliver; a denotes any node in the power grid; N_a is the total number of nodes in the composite power generation system network; P_{a,t} is the power absorbed by node a from the composite power generation system in period t; and the coefficient is the element of the power transfer factor matrix associated with line l and node n.
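A branch capacity check of this kind can be sketched as follows. The two-sided bound on the weighted sum of nodal powers is an assumed reading of formula (7), and the matrix, nodal powers and limits are invented; the function names are hypothetical.

```python
def branch_flows(transfer_factors, node_power):
    """Flow on each line as the factor-weighted sum of nodal powers."""
    return [sum(g * p for g, p in zip(row, node_power))
            for row in transfer_factors]


def violated_lines(transfer_factors, node_power, limits):
    """Indices of lines whose absolute flow exceeds the line limit."""
    flows = branch_flows(transfer_factors, node_power)
    return [l for l, (f, cap) in enumerate(zip(flows, limits)) if abs(f) > cap]


# Two lines, three nodes (invented values).
factors = [[0.5, -0.5, 0.0],
           [0.25, 0.25, -0.5]]
power = [100.0, -20.0, -80.0]
print(branch_flows(factors, power))
print(violated_lines(factors, power, limits=[70.0, 50.0]))
```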

Further, pumped storage unit constraints are imposed on the overall multi-objective function:

The charge and discharge power constraints of the pumped storage unit are expressed as formulas (8) and (9).

In formula (8), the quantities are, in order: the discharge power of the i-th pumped storage unit in period t; the discharge state of the i-th pumped storage unit in period t, where 0 means the unit is charging or stopped and 1 means it is discharging; the minimum discharge power of the i-th pumped storage unit; and the maximum discharge power of the i-th pumped storage unit;

In formula (9), the quantities are the charging power of the i-th pumped storage unit in period t and its charging state in period t, where 0 means the unit is discharging or stopped and 1 means it is charging; P_i^{ps,c} is the constant charging power of the i-th unit. When economic factors are taken into account, the pumped storage unit generally charges at constant power;

The power equality constraint is expressed as formula (10).

In the formula, the defined quantity is the total generated power of the pumped storage units;

The charge-discharge state constraint is expressed as formula (11).

In the formula, the variables are the discharge state and the charge state of any two pumped storage units m and n in period t; the constraint ensures that all pumped storage units are in a consistent working state;

The energy state constraint of the pumped storage units is expressed as formula (12).

In the formula, the lower bound is the minimum stored energy of the i-th pumped storage unit; τ is any period among the past t periods; the power terms are the charging power and discharging power of the i-th pumped storage unit in period τ; η^ps is the conversion efficiency of the pumped storage unit during charging and discharging; the initial term is the initial energy of the i-th pumped storage unit; and the upper bound is the maximum stored energy of the i-th pumped storage unit;

The final energy constraint is expressed as formula (13).
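The energy state bookkeeping behind formula (12) can be sketched as a running sum. The exact form of the patent's formula is not recoverable from the text, so the update used here (initial energy plus efficiency-weighted charging minus discharging, accumulated over past periods) is an assumption, as are the function names and the sample data.

```python
def energy_trajectory(e0, charge, discharge, eta, hours=1.0):
    """Stored energy after each period (assumed accumulation rule)."""
    levels = []
    energy = e0
    for p_c, p_d in zip(charge, discharge):
        # Assumption: charging is scaled by eta, discharging is not.
        energy += (eta * p_c - p_d) * hours
        levels.append(energy)
    return levels


def within_limits(levels, e_min, e_max):
    """True if every stored-energy level stays inside [e_min, e_max]."""
    return all(e_min <= e <= e_max for e in levels)


levels = energy_trajectory(e0=100.0, charge=[40.0, 0.0, 0.0],
                           discharge=[0.0, 50.0, 60.0], eta=0.75)
print(levels)
print(within_limits(levels, e_min=20.0, e_max=150.0))
```

A final energy constraint such as formula (13) would additionally compare the last entry of `levels` with a required end-of-cycle value; that target is not specified in the text, so it is left out of the sketch.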

Further, photothermal unit constraints are imposed on the overall multi-objective function:

Equality constraints on the energy flow of the photothermal unit: the heat transfer fluid in the photothermal unit is regarded as a node, as in an electric network, and energy losses in the heat transfer fluid are neglected, so that the power balance equation of the photothermal unit is expressed as formula (14).

In the formula, S denotes the light field; H denotes the heat transfer fluid; T denotes the heat storage module; P denotes the thermodynamic cycle module; P_t^{th,S-H}, P_t^{th,H-P}, P_t^{th,T-H} and P_t^{th,H-T} are the heat exchange powers between the corresponding modules of the photothermal unit; the binary variable is the 0-1 start-up variable of the thermodynamic cycle module at time t, where 0 means the module is stopped; and the remaining term is the power consumed to start up the heat exchange module;

The output of the photothermal unit in the composite power generation system is expressed as formula (15).

In the formula, P_t^{th,opt} is the output power of the photothermal unit in period t; η_SF is the light-to-heat conversion efficiency; S_SF is the area of the light collecting field of the photothermal unit; and the remaining factor is the direct solar irradiance at time t.

The power supplied by the photothermal unit that the composite system can use depends on both the input value and the curtailed light, as expressed by formula (16):

P_t^{th,S-H} = P_t^{th,opt} - P_t^{th,cut} (16)

In the formula, P_t^{th,cut} is the curtailed light power in period t;
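Formula (16) and the quantities named for formula (15) can be combined in a short sketch. Treating the light field output as the product of efficiency, field area and direct irradiance is an assumption based on the listed symbols, and the unit conversion, function names and values are hypothetical.

```python
def light_field_power_mw(eta_sf, area_m2, irradiance_w_per_m2):
    """Assumed form of formula (15): efficiency x area x direct irradiance."""
    return eta_sf * area_m2 * irradiance_w_per_m2 / 1e6  # W -> MW


def field_to_fluid_power(p_opt, p_cut):
    """Formula (16): usable power is field output minus curtailed light."""
    if not 0.0 <= p_cut <= p_opt:
        raise ValueError("curtailed light must lie between 0 and the field output")
    return p_opt - p_cut


p_opt = light_field_power_mw(eta_sf=0.5, area_m2=1_000_000, irradiance_w_per_m2=500.0)
p_sh = field_to_fluid_power(p_opt, p_cut=30.0)
print(p_opt, p_sh)
```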

The charging and discharging of the heat storage system are characterized by charge and discharge efficiencies, as expressed by formulas (17) and (18):

P_t^{th,c} = η_c P_t^{th,H-T} (17)

In the formulas, P_t^{th,c} and P_t^{th,d} are the heat charging power and heat discharging power of the heat storage system in period t; η_c is the heat charging efficiency of the heat storage system and η_d is its heat release efficiency;

The heat storage state equation is expressed as formula (19).

In the formula, the state variables are the total energy in the heat storage device in periods t and t-1; the power terms are the charging and discharging power of the heat storage system in period t-1; γ is the dissipation coefficient; Δt is the time interval;

Linearization yields formula (20).
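A linear heat storage update can be sketched as below. The recursion used (previous level reduced by γΔt, plus net charging over Δt) is an assumed linear form built from the symbols just defined, not a transcription of formula (20); the charging power is derived from the fluid-to-storage power through formula (17). Function and argument names are hypothetical.

```python
def heat_storage_levels(e0, p_h_to_t, p_discharge, eta_c, gamma, dt=1.0):
    """Stored heat per period under an assumed linear state equation.

    p_h_to_t[t]     heat sent from the heat transfer fluid to storage
    p_discharge[t]  heat discharging power of the storage
    """
    levels = [e0]
    for p_ht, p_d in zip(p_h_to_t, p_discharge):
        p_c = eta_c * p_ht  # formula (17): charging power after losses
        previous = levels[-1]
        # Assumption: dissipation is proportional to the stored energy.
        levels.append((1.0 - gamma * dt) * previous + (p_c - p_d) * dt)
    return levels


levels_th = heat_storage_levels(e0=100.0, p_h_to_t=[40.0, 0.0],
                                p_discharge=[0.0, 10.0],
                                eta_c=0.5, gamma=0.25)
print(levels_th)
```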

The energy flow of the thermodynamic cycle module is expressed as formula (21):

P_t^{th,H-P} = g(P_t^e) (21)

In the formula, P_t^e is the electrical power of the thermodynamic cycle module;

The inequality constraints on the operation of the photothermal unit are expressed as formulas (22) to (28).

In the formulas, P_t^{opt,up} and P_t^{opt,down} are the upward and downward power reserves of the steam turbine set; the output bounds are the maximum and minimum power output values; the state variable is the working state of the steam turbine set in any period t, where 1 means the set is running; τ denotes any time within the prescribed time after period t; the two time limits are the minimum running time and minimum shutdown time of the unit; the start and stop variables of the steam turbine set equal 1 when the set starts or stops working at time t; and the ramp limits are the maximum ramp-up and ramp-down capabilities of the steam turbine set;

The minimum energy storage constraint is expressed as formula (29).

In the formula, the lower bound is the minimum reserve of the heat storage system; ρ^TES is the maximum storage capacity of the heat storage system, in full-load hours (FLH);

The heat storage charge and discharge power constraints are expressed as formulas (30), (31) and (32):

P_t^{th,c} P_t^{th,d} = 0 (32)

In the formulas, the two upper bounds are the maximum charging power and the maximum discharging power;
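Formula (32) forbids charging and discharging in the same period. A feasibility check for one period might look like the sketch below; the box bounds stand in for formulas (30) and (31), whose exact form is assumed, and the function name and tolerance are hypothetical.

```python
def storage_power_feasible(p_c, p_d, p_c_max, p_d_max, tol=1e-9):
    """Check assumed bounds and the product condition of formula (32)."""
    within_bounds = 0.0 <= p_c <= p_c_max and 0.0 <= p_d <= p_d_max
    # Formula (32): the product is zero, so at most one power is nonzero.
    exclusive = abs(p_c * p_d) <= tol
    return within_bounds and exclusive


print(storage_power_feasible(30.0, 0.0, p_c_max=50.0, p_d_max=40.0))
print(storage_power_feasible(30.0, 5.0, p_c_max=50.0, p_d_max=40.0))
```

In a mixed-integer model the product condition is often replaced by a binary charge/discharge indicator, which keeps the constraint set linear; whether the patent does this is not stated.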

Other constraints are expressed as formulas (33) and (34):

P_t^{th,cut} ≥ 0 (33)

In the formulas, P_t^up and P_t^down are the upward and downward power reserves of the conventional unit; formulas (33) and (34) require the curtailed light and the upward and downward reserves of the unit, respectively, to be non-negative.

Further, gas unit constraints are imposed on the overall multi-objective function:

The consumption curve is expressed as formula (35).

In the formula, the quantities are, for the i-th gas unit in mode n in period t: the output power; the lower limit of the output power; the upper limit of the output power; the fuel cost corresponding to the output power; the minimum fuel cost; the maximum fuel cost; the weights applied to the lower and upper output power limits and to the corresponding lower and upper fuel cost limits; and the on-off state, where 1 means the unit is in mode n and 0 means it is in another mode. M_g is the number of operation modes of the gas unit;
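The weights on the lower and upper limits suggest that, within one mode, output power and fuel cost are both written as the same weighted combination of their end points. The sketch below illustrates that reading for a single mode; it is an assumed interpretation of formula (35), and the function name and numbers are invented.

```python
def mode_fuel_cost(p, p_min, p_max, c_min, c_max, on=1):
    """Fuel cost of one mode by interpolating between its end points."""
    if on == 0:
        if p != 0.0:
            raise ValueError("a mode that is off cannot produce power")
        return 0.0
    if not (p_min <= p <= p_max and p_max > p_min):
        raise ValueError("output must lie within the mode's limits")
    # Weights on the upper and lower limits; they sum to the on-state (1).
    w_hi = (p - p_min) / (p_max - p_min)
    w_lo = 1.0 - w_hi
    return w_lo * c_min + w_hi * c_max


cost = mode_fuel_cost(p=150.0, p_min=100.0, p_max=200.0,
                      c_min=1000.0, c_max=1800.0)
print(cost)
```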

The power equality constraint is expressed as formula (36).

In the formula, P_t^gas is the output power of the i-th gas unit in mode n in period t;

The mode transition constraint is expressed as formula (37).

In the formula, A_{m,n} is the feasibility coefficient for switching from mode m to mode n, and the state variable is the on-off state of the i-th gas unit in mode n in period t-1; if the unit is in mode m in period t-1, then in period t it must be in a mode n for which A_{m,n} equals 1, which enforces the mode transition constraint;

The relaxed conversion cost is expressed as formula (38).

In the formula, the cost variable is the mode conversion cost of the i-th gas unit in period t, and the cost coefficient is the start-up and shut-down cost of the i-th gas unit when switching from mode m to mode n.
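The mode transition rule and the conversion cost can be illustrated on a fixed mode sequence. The three-mode feasibility matrix and cost matrix below are invented, and the function names are hypothetical; the sketch checks a given schedule rather than reproducing formulas (37) and (38).

```python
def transitions_feasible(modes, feasibility):
    """True if every consecutive mode change has feasibility coefficient 1."""
    return all(feasibility[m][n] == 1 for m, n in zip(modes, modes[1:]))


def total_conversion_cost(modes, switch_cost):
    """Sum of switching costs along the mode sequence."""
    return sum(switch_cost[m][n] for m, n in zip(modes, modes[1:]))


# Invented example: mode 0 cannot jump directly to mode 2.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
C = [[0.0, 5.0, 0.0],
     [3.0, 0.0, 4.0],
     [0.0, 2.0, 0.0]]

print(transitions_feasible([0, 1, 2, 2, 1], A))
print(transitions_feasible([0, 2], A))
print(total_conversion_cost([0, 1, 2, 2, 1], C))
```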

Further, the process by which the heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme is as follows:

First, the reference neural network (RNN) is trained. The reference network has n+1 input neurons, N_h hidden-layer neurons and 1 output neuron; the n+1 inputs are the state vector and the control vector of each scheduling period, the output is an internal signal, and both the hidden layer and the output layer of the reference network use the Sigmoid function;

Training of the reference neural network comprises a forward calculation process and an error back-propagation process that updates the reference network weight matrix; the error back-propagation process minimizes the error by gradient descent;

Second, the evaluation network (CNN) is trained. The evaluation network has n+2 input neurons, N_h hidden-layer neurons and 1 output neuron; the n+2 inputs are the state vector, the control vector and the internal signal of the k-th scheduling period, the output is the optimal performance index, the hidden layer uses the Sigmoid function, and the output layer uses the linear purelin function;

Training of the evaluation network comprises a forward calculation process and an error back-propagation process that updates the evaluation network weight matrix; the error back-propagation process minimizes the error by gradient descent;

Finally, the execution network (ANN) is trained. The execution network has n input neurons, N_h hidden-layer neurons and 1 output neuron. The n inputs are the state vector of the k-th scheduling period, the output is the optimal scheduling decision, and both the hidden layer and the output layer use the Sigmoid function. Training of the execution network comprises a forward calculation process and an error back-propagation process that updates the execution network weight matrix; the error back-propagation process minimizes the error by gradient descent;

By training the evaluation network, the execution network and the reference neural network online, the goal representation heuristic dynamic programming structure can estimate the total operating cost and the emission control cost of the power generation system during scheduling; the optimal value of the objective function is computed through repeated iteration, yielding the optimal solution set.
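The three networks share one shape: a single Sigmoid hidden layer and one output neuron, trained by gradient descent. The sketch below is a generic pure-Python network of that shape plus one forward pass through the three of them; it is not the patent's training procedure. The class and function names, the absence of bias terms, the squared-error loss, the learning rate and the training target are all assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """n_in inputs -> n_h Sigmoid hidden neurons -> 1 output neuron."""

    def __init__(self, n_in, n_h, linear_output=False, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_h)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_h)]
        self.linear_output = linear_output

    def forward(self, x):
        self.x = list(x)
        self.h = [sigmoid(sum(w * xi for w, xi in zip(row, self.x)))
                  for row in self.w1]
        z = sum(w * h for w, h in zip(self.w2, self.h))
        self.y = z if self.linear_output else sigmoid(z)
        return self.y

    def train_step(self, x, target, lr=0.1):
        """One gradient-descent step on 0.5 * (output - target)^2."""
        y = self.forward(x)
        err = y - target
        dz = err if self.linear_output else err * y * (1.0 - y)
        for j, h in enumerate(self.h):
            # Back-propagate through the old output weight before updating it.
            dh = dz * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * dz * h
            for i, xi in enumerate(self.x):
                self.w1[j][i] -= lr * dh * xi
        return 0.5 * err * err


def forward_chain(execution, reference, evaluation, state):
    """State -> decision -> internal signal -> performance estimate."""
    u = execution.forward(state)
    s = reference.forward(state + [u])
    j = evaluation.forward(state + [u, s])
    return u, s, j


n, n_h = 3, 4
execution = TinyNet(n, n_h, seed=1)
reference = TinyNet(n + 1, n_h, seed=2)
evaluation = TinyNet(n + 2, n_h, linear_output=True, seed=3)

state = [0.2, 0.5, 0.1]
u, s, j = forward_chain(execution, reference, evaluation, state)

# Fit the evaluation network to an invented target to show the update rule.
losses = [evaluation.train_step(state + [u, s], target=2.0) for _ in range(200)]
print(losses[0], losses[-1])
```

In the full structure the three networks are updated online against each other's outputs over repeated iterations; the single-target fit above only demonstrates the forward and back-propagation steps that each of them uses.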

Furthermore, to derive the required best compromise solution from the optimal solution set, fuzzy logic is adopted: a fuzzy membership function is defined to represent the degree of satisfaction of each Pareto solution with respect to each objective function, expressed as formula (39).

In the formula, the membership value is that of objective function f_i; a membership value of 1 indicates complete satisfaction with the objective, and a value of 0 indicates complete dissatisfaction; f_i is the value of the i-th objective function; f_i^min and f_i^max are the minimum and maximum values of the i-th objective function;

The average satisfaction of the k-th Pareto optimal solution is expressed as formula (40).

In the formula, μ_k is the average satisfaction of the k-th Pareto optimal solution and N is the number of objective functions. The Pareto optimal solution with the maximum average satisfaction is taken as the final compromise solution.
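The selection rule can be sketched with the linear membership function that is commonly used for minimization objectives. Whether formula (39) has exactly this shape is an assumption, as is taking f_i^min and f_i^max over the solution set itself; the function names and sample points are invented.

```python
def membership(f, f_min, f_max):
    """Assumed linear membership: 1 at or below f_min, 0 at or above f_max."""
    if f <= f_min:
        return 1.0
    if f >= f_max:
        return 0.0
    return (f_max - f) / (f_max - f_min)


def best_compromise(front):
    """Return the solution with the highest average membership and its score."""
    n_obj = len(front[0])
    mins = [min(p[i] for p in front) for i in range(n_obj)]
    maxs = [max(p[i] for p in front) for i in range(n_obj)]

    def average_satisfaction(p):
        return sum(membership(p[i], mins[i], maxs[i])
                   for i in range(n_obj)) / n_obj

    scores = [average_satisfaction(p) for p in front]
    k = max(range(len(front)), key=scores.__getitem__)
    return front[k], scores[k]


solution, score = best_compromise([(10, 5), (8, 7), (12, 4)])
print(solution, score)
```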

The beneficial technical effects of the invention are as follows. To address the difficulty of accommodating new energy caused by insufficient regulation capacity of the power network, the invention goes beyond coordinating energy storage or conventional energy with wind power in isolation and provides a coordinated multi-source operation mechanism. A wind-light-storage-gas multi-objective joint optimization scheduling model is established with minimum power generation cost and minimum emission control cost of the power generation system as the optimization objectives; the model jointly represents the flexible gas-fired power source and the energy storage system. A heuristic dynamic programming algorithm is used to solve the multi-objective problem, and the training process of the algorithm is given. Through wind-light-storage-gas multi-objective joint optimization scheduling based on adaptive dynamic programming, the impact on stable operation and active power balance of the power system caused by the randomness, intermittence and fluctuation of wind power and photovoltaic output and by low prediction accuracy is reduced, the likelihood of wind or light curtailment is reduced, the integrated wind-light-storage-gas output is smoothed, and peak clipping and valley filling are realized. At the same time, the system operating cost and the emission control cost of the power generation system are reduced and the operating benefit is improved, supporting safe, stable and economic operation of the power system.

Drawings

The invention is further elucidated with reference to the drawings and the embodiments.

FIG. 1 is a schematic diagram of the GrHDP structure of the present invention;

FIG. 2 is a schematic diagram of the RNN structure of the present invention;

FIG. 3 is a schematic diagram of a CNN structure according to the present invention;

FIG. 4 is a schematic diagram of the ANN of the present invention.

Detailed Description

Example 1

An adaptive dynamic programming method for multi-objective joint optimization scheduling, in which the waste heat in the exhaust gas of the gas turbine is stored, a power grid scheduling model of the composite system is established by mechanism analysis, and a multi-objective function is established with minimum power generation cost and minimum environmental cost as the objectives; finally, the principle of adaptive dynamic programming is given, together with the process by which a goal representation heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme.

When wind power and a photothermal power station form a combined generation system, the photothermal unit can reduce the uncertainty of the wind power. However, both wind and photothermal output are highly random and volatile, and the energy storage system of the photothermal unit alone is not sufficient for the whole system to supply power to the grid stably, so a stable and highly controllable backup power source is needed. A wind-light-storage-gas multi-objective joint optimization scheduling strategy based on adaptive dynamic programming is therefore provided. In analysing the operating mechanism of each part, the waste heat in the exhaust gas of the gas turbine is taken to be storable; a power grid dispatching model of the composite system is established by mechanism analysis, and a multi-objective function is established with minimum power generation cost and minimum environmental cost as the objectives; finally, the principle of adaptive dynamic programming and the process by which the goal representation heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme are given.

Example 2

A wind-light-storage-gas multi-objective joint optimization scheduling method based on self-adaptive dynamic programming comprises the following specific implementation processes:

1. Establishment of the optimal scheduling model

1.1. Objective function

1.1.1. Target 1: minimum cost of power generation

To build an economic dispatch model that gives priority to new energy consumption, a curtailment penalty term for new energy is introduced into the objective function: the product of the curtailment penalty coefficient and the amount of new energy that is not consumed. The value of this term directly affects the relative economy of the gas units and thereby how strongly the schedule favours new energy consumption. The objective function is as follows:

In the formula, t in the first term denotes any period of the scheduling cycle; T is the total number of periods in the scheduling cycle; k^w is the wind curtailment penalty factor; ΔP_t^w is the curtailed wind power in period t; and Δt^h is the number of hours in one period. In the second term, N_g is the total number of gas units and k^gas is the gas price coefficient; the other two quantities are the gas consumption and the mode conversion cost of the i-th gas unit in period t. In the third term, N_p is the total number of pumped storage units; the cost quantities are the start-up costs of pumped storage unit i at time t under the generating condition and under the pumping condition. In the fourth term, N_m is the total number of photothermal units and k^opt is the unit price of photothermal generation; the remaining quantity is the generated power of the i-th photothermal unit in period t.

1.1.2. Target 2: minimum emission control cost of the power generation system

In the formula, the power term is the total output power of the i-th gas unit in period t. The pollutants are mainly of three types: NO_x, SO_2 and CO_2. The treatment costs for these three pollutants are shown in Table 1, where α is the treatment cost of the pollutant and β is the amount of pollutant produced per megawatt-hour.

Table 1. Pollutant treatment costs
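The roles of α and β can be illustrated with a short sketch in which a per-megawatt-hour pollution cost is formed as the sum of α times β over the three pollutants and then applied to the gas unit output. This aggregation is an assumption about how the two coefficients combine, the function names are hypothetical, and the numeric values are placeholders rather than the entries of Table 1.

```python
def pollution_cost_per_mwh(pollutants):
    """Sum of treatment cost (alpha) times emission per MWh (beta)."""
    return sum(alpha * beta for alpha, beta in pollutants.values())


def emission_cost(gas_power, k_poll, hours=1.0):
    """Emission control cost over all periods and gas units.

    gas_power[t][i] is the total output of gas unit i in period t (MW).
    """
    return sum(k_poll * p * hours for period in gas_power for p in period)


# Placeholder (alpha, beta) pairs, not the patent's data.
pollutants = {"NOx": (2.0, 0.5), "SO2": (4.0, 0.25), "CO2": (0.5, 4.0)}
k_poll = pollution_cost_per_mwh(pollutants)
f2 = emission_cost([[10.0, 20.0], [30.0, 0.0]], k_poll)
print(k_poll, f2)
```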

The overall multi-objective function of the system is constructed from the two objective functions as follows:

Z = min(f_1, f_2) (3)

1.2 constraint conditions

1.2.1. System constraints

1) Real-time energy balance constraints

In the formula, k_ps is a ±1 variable that equals 1 when the pumped storage unit is in the water-discharging condition and -1 otherwise; the associated power term is the output power of the i-th pumped storage power station in period t; N_n is the total number of heat storage units of the photothermal unit; k_cr is a ±1 variable that equals -1 when the heat storage unit is storing heat and 1 otherwise; the associated term is the heat stored by the i-th heat storage unit in period t; P_t^w is the predicted wind power output in period t; ΔP_t^w is the curtailed wind power in period t; D_t is the total load of the power grid in period t;

2) Positive and negative reserve constraints

In the formulas, the two bounds are the maximum and minimum output power of the i-th gas unit in period t, and R_t is the reserve power of the composite power generation system in period t.

3) Branch capacity constraint

In the formula, the bound is the maximum power that line l can deliver; a denotes any node in the power grid; N_a is the total number of nodes in the composite power generation system network; P_{a,t} is the power absorbed by node a from the composite power generation system in period t;

1.2.2. Pumped storage unit constraints

4) Charge and discharge power constraints of the pumped storage unit

In the formula, the state variable is the discharge state of the i-th pumped storage unit in period t, where 0 means the unit is charging or stopped and 1 means it is discharging; the two bounds are the minimum and maximum discharge power of the i-th pumped storage unit.

In the formula, the quantities are the charging power of the i-th pumped storage unit in period t and its charging state in period t, where 0 means the unit is discharging or stopped and 1 means it is charging; P_i^{ps,c} is the constant charging power of the i-th unit.

When economic factors are taken into account, the pumped storage unit generally charges at constant power;

1) Power equality constraint

In formula (10), the defined quantity is the total generated power of the pumped storage units.

2) Charge and discharge state constraints

In formula (11), the variables are the discharge state and the charge state of pumped storage units m and n in period t; the constraint ensures that the pumped storage units are all in the same working state, so that different units cannot be in different operating states.

3) Energy state constraints of the pumped storage units

In the formula, the lower bound is the minimum stored energy of the i-th pumped storage unit; τ is any period among the past t periods; the initial term is the initial energy of the i-th pumped storage unit; and the upper bound is the maximum stored energy of the i-th pumped storage unit;

4) Final energy constraint

1.2.3. Photo-thermal unit constraint

1) Equality constraint of photothermal unit energy flow

The heat transfer fluid in the photothermal unit is regarded as a node, as in an electric network, and energy losses in the heat transfer fluid are neglected, so that the power balance equation can be derived as follows:

In the formula, the binary variable is the 0-1 start-up variable of the thermodynamic cycle module at time t, where 0 means the module is stopped; the remaining term is the power consumed when the heat exchange module starts up.

The output of the photothermal unit in the composite power generation system is as follows:

In the formula, P_t^{th,opt} is the output power of the photothermal unit in period t; η_SF is the light-to-heat conversion efficiency; S_SF is the area of the light collecting field of the photothermal unit; and the remaining factor is the direct solar irradiance at time t;

The power supplied by the photothermal unit that the composite system can use depends on both the input value and the curtailed light:

P_t^{th,S-H} = P_t^{th,opt} - P_t^{th,cut} (16)

In the formula, P_t^{th,cut} is the curtailed light power in period t;

Charging and discharging the heat storage system cause different heat losses, which can be characterized by charge and discharge efficiencies:

P_t^{th,c} = η_c P_t^{th,H-T} (17)

In the formulas, P_t^{th,c} and P_t^{th,d} are the heat charging power and heat discharging power of the heat storage system in period t; η_c is the heat charging efficiency of the heat storage system and η_d is its heat release efficiency.

The heat storage state equation is:

In the formula, the defined quantities are, in order: the total energy in the heat storage device in periods t and t-1; and the charging and discharging power of the heat storage system in period t-1. γ is the dissipation coefficient and Δt is the time interval.
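
A minimal sketch of how such a storage state could be stepped forward in time. The update rule below (stored energy decays by a factor of 1 - γΔt, then gains the net charging power times Δt) is an assumption built from the quantities just defined; the patent's exact equation is not reproduced in the text. All names and numbers are hypothetical.

```python
def next_storage_energy(e_prev, p_c_prev, p_d_prev, gamma, dt):
    """One step of an ASSUMED storage update:

        E_t = (1 - gamma * dt) * E_{t-1} + (P^c_{t-1} - P^d_{t-1}) * dt
    """
    # Equation (32) forbids charging and discharging in the same period.
    if p_c_prev * p_d_prev != 0:
        raise ValueError("charging and discharging cannot both be nonzero")
    return (1.0 - gamma * dt) * e_prev + (p_c_prev - p_d_prev) * dt


def simulate(e0, schedule, gamma, dt):
    """Return the stored-energy trajectory for a list of (P^c, P^d) pairs."""
    energies = [e0]
    for p_c, p_d in schedule:
        energies.append(next_storage_energy(energies[-1], p_c, p_d, gamma, dt))
    return energies


# Hypothetical data: charge at 50 for one period, then discharge at 30.
trajectory = simulate(100.0, [(50.0, 0.0), (0.0, 30.0)], gamma=0.01, dt=1.0)
print(trajectory)
```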

After linearization, the following results are obtained:

Energy flow of the thermodynamic cycle module:

$P_t^{th,H-P} = g(P_t^e)$ (21)

In formula (21), $P_t^e$ is the electrical power of the thermodynamic cycle module.

2) Inequality constraints on the operation of the photothermal unit

In these constraints, the defined quantities are, in order: the maximum and minimum power output values; the operating state of the steam turbine unit in any period t, where 1 indicates that the unit is running; τ, an arbitrary time within the prescribed interval after period t; the start-up and shut-down variables of the steam turbine unit, where 1 indicates that the unit starts or stops at time t; and the maximum ramp-up and ramp-down capacities of the steam turbine unit.

Minimum energy storage constraint:

In the formula, the first quantity is the minimum reserve of the heat storage system, and $\rho^{TES}$ is the maximum storage capacity of the heat storage system, expressed in full-load hours (FLH).

The charging and discharging power constraints of the heat storage system are:

$P_t^{th,c} P_t^{th,d} = 0$ (32)

In the formula, the two bounds are the maximum charging power and the maximum discharging power, respectively. Equation (32) prevents the heat storage system from charging and discharging in the same period.

Other constraints are as follows:

$P_t^{th,cut} \ge 0$ (33)

In the formula, $P_t^{up}$ and $P_t^{down}$ are the upward and downward power reserve values of the conventional unit, respectively.

Equations (33) and (34) ensure that the curtailed solar power and the upward and downward reserves of the unit, respectively, are non-negative.

1.2.4. Independent gas unit constraints

Combined cycle is the most common operating mode for gas units. Here, a gas unit refers to a combined-cycle unit in which several gas turbines drive one steam turbine, and the hot exhaust gas produced by the gas turbines powers the steam turbine. The gas model adopts the common configuration in which two gas turbines drive one steam turbine, which gives five operating modes (GT denotes a gas turbine, ST denotes a steam turbine): 1GT, 1GT+1ST, 2GT, 2GT+1ST, and the shutdown mode 0GT+0ST. The combined-cycle unit has the following characteristics:

① Transitions between modes are constrained: because the ST runs on the exhaust gas of the GT, the GT and the ST cannot be started up or shut down at the same time;

② Mode transitions incur a cost: because a mode change involves starting or stopping equipment, the mode transition cost is essentially a start-stop cost.

1) Consumption curve

In the formula, the defined quantities are, in order: the output power of the i-th gas unit in mode n in period t; the maximum value of the fuel cost corresponding to the output power of the i-th gas unit in mode n in period t; and the on-off state of the i-th gas unit in mode n in period t, where 1 indicates that the unit is in mode n and 0 that it is in another mode. $M_g$ is the number of operating modes of the gas unit.

2) Power equality constraint

In the formula, $P_t^{gas}$ is the output power of the i-th gas unit in mode n in period t.

3) Mode transition constraints

The state variable in this constraint is the on-off state of the i-th gas unit in mode n in period (t-1). If the unit is in mode m in period t-1, then in period t it must be in a mode n for which $A_{m,n} = 1$; this implements the mode transition constraint.

4) Conversion cost relaxation

In the formula, the defined quantity is the mode conversion cost of the i-th gas unit in period t.
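
The mode transition constraint can be illustrated with a small feasibility check. The entries of $A_{m,n}$ are not listed in the text, so the matrix below is an assumption derived from the stated rule that the GT and the ST cannot start up or shut down at the same time; the mode 2GT+1ST is likewise assumed as the fifth mode. All names are hypothetical.

```python
# Each mode maps to (number of gas turbines on, number of steam turbines on).
# ASSUMPTION: "2GT+1ST" is taken as the fifth operating mode.
MODES = {
    "0GT+0ST": (0, 0),
    "1GT": (1, 0),
    "1GT+1ST": (1, 1),
    "2GT": (2, 0),
    "2GT+1ST": (2, 1),
}


def build_transition_matrix(modes):
    """Return A[(m, n)] = 1 if the transition m -> n is allowed, else 0.

    ASSUMED rule: a transition is forbidden only when the GT count and the
    ST count both change in the same period.
    """
    a = {}
    for m, (gt_m, st_m) in modes.items():
        for n, (gt_n, st_n) in modes.items():
            both_change = (gt_m != gt_n) and (st_m != st_n)
            a[(m, n)] = 0 if both_change else 1
    return a


def is_feasible(sequence, a):
    """Check that every consecutive pair of modes has A[m, n] = 1."""
    return all(a[(m, n)] == 1 for m, n in zip(sequence, sequence[1:]))


A = build_transition_matrix(MODES)
print(is_feasible(["0GT+0ST", "1GT", "1GT+1ST", "2GT+1ST"], A))
print(is_feasible(["0GT+0ST", "1GT+1ST"], A))
```

In an optimization model the same matrix would enter as a linear constraint on the binary mode variables of consecutive periods rather than as an after-the-fact check.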

GrHDP algorithm and implementation process

The goal representation heuristic dynamic programming (GrHDP) adopted here is developed on the basis of action-dependent heuristic dynamic programming. It does not require a model network and is therefore suitable when a system model is difficult to obtain. GrHDP adds a goal network (GN) to the three network types used previously (model network, evaluation network and execution network); the specific structure is shown in Fig. 1. The idea of GrHDP is to generate, automatically and adaptively, the signals that guide the whole system through online learning, optimization and control.

GrHDP implementation process:

1. Initialize, then set Lfts = 1;

2. Apply the control u(k-1) from the previous moment to the controlled object to obtain the current state x(k), and compute u(k) from x(k) using the ANN;

3. Compute sr(k) from x(k) and u(k) using the RNN, compute J(k) from x(k), u(k) and sr(k) using the CNN, and compute the reinforcement signal renif(k);

4. Update the RNN weights, and recompute sr(k) from x(k) and u(k) using the RNN;

5. Recompute J(k) from x(k), u(k) and sr(k) using the CNN, and check whether $E_r < T_r$ or $cyc > N_{ref}$ holds; if so, go to step 6, otherwise return to step 4;

6. Update the CNN weights, and recompute J(k) from x(k), u(k) and sr(k) using the CNN; if $E_c < T_c$ or $cyc > N_{crit}$ holds, go to step 7, otherwise repeat step 6;

7. Update the ANN weights, and recompute u(k) from x(k) using the ANN;

8. Recompute J(k) from x(k), u(k) and sr(k) using the CNN; if $E_a < T_a$ or $cyc > N_{act}$ holds, proceed to the next step, otherwise return to step 7;

10. If Lfts reaches its maximum value, output the result; otherwise return to step 2.
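
The three inner training loops in the steps above share one stopping rule: keep updating until the error falls below a threshold or the cycle count exceeds a limit. The sketch below shows only that rule. The helper name `train_until` and the scalar stand-in for a network are hypothetical and are not the networks of the method.

```python
def train_until(update_step, error_fn, tolerance, max_cycles):
    """Repeat update_step until error < tolerance or cyc > max_cycles.

    Mirrors the checks "E < T or cyc > N" applied to the RNN, CNN and ANN.
    Returns the number of update cycles performed.
    """
    cyc = 0
    while True:
        update_step()          # e.g. one weight update of the network
        cyc += 1
        if error_fn() < tolerance or cyc > max_cycles:
            return cyc


# Toy stand-in for a network: one weight w trained toward the value 1.0.
state = {"w": 0.0}


def update():
    # Gradient step on 0.5 * (w - 1)^2 with learning rate 0.5.
    state["w"] -= 0.5 * (state["w"] - 1.0)


def error():
    return 0.5 * (state["w"] - 1.0) ** 2


cycles = train_until(update, error, tolerance=1e-6, max_cycles=100)

# With an error that never improves, the cycle limit ends the loop instead.
capped = train_until(lambda: None, lambda: 1.0, tolerance=0.0, max_cycles=3)
print(cycles, capped)
```

In the full procedure this helper would be called three times per outer iteration, once for each network, with the thresholds and cycle limits named in steps 5, 6 and 8.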

In goal representation heuristic dynamic programming (GrHDP), the evaluation network, the execution network and the reference neural network are all BP (back-propagation) neural networks. Their input is the system state and their output is the power distribution scheme of the scheduling process. Through online training and iterative scheduling, the network output tends toward the optimal system power allocation. The online training process is as follows:

2.2.1. Reference neural network (RNN) training

As shown in FIG. 2, the reference neural network has n+1 input neurons, $N_h$ hidden-layer neurons and 1 output neuron. The n+1 inputs are the state vector and the control vector of each scheduling period, the output is the internal signal, and both the hidden layer and the output layer of the reference network use Sigmoid functions.

Training of the reference neural network comprises two parts: a forward calculation process, and an error back-propagation process that updates the reference network weight matrix. Error back-propagation is implemented by minimizing the error with the gradient descent method.

2.2.2. Evaluation network (CNN) training

As shown in FIG. 3, the evaluation network has n+2 input neurons, $N_h$ hidden-layer neurons and 1 output neuron. The n+2 inputs are the state vector, the control vector and the internal signal of the k-th scheduling period, the output is the optimal performance index, the hidden layer uses a Sigmoid function, and the output layer uses a linear (purelin) function.

Training of the evaluation network comprises two parts: a forward calculation process, and an error back-propagation process that updates the evaluation network weight matrix. Error back-propagation is implemented by minimizing the error with the gradient descent method.

2.2.3. Execution network (ANN) training

Training of the execution network comprises two parts: a forward calculation process, and an error back-propagation process that updates the execution network weight matrix.

As shown in FIG. 4, the execution network has n input neurons, $N_h$ hidden-layer neurons and 1 output neuron. The n inputs are the state vector of the k-th scheduling period, the output is the optimal scheduling decision, and both the hidden layer and the output layer use Sigmoid functions. Error back-propagation is implemented by minimizing the error with the gradient descent method.
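
A minimal sketch of the two training parts (forward calculation, then error back-propagation by gradient descent) for a network shaped like the execution network: n inputs, $N_h$ Sigmoid hidden neurons and one Sigmoid output. Bias terms are omitted, the squared-error target is a stand-in for the method's actual error signal, and all weights and numbers are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward calculation: inputs -> Sigmoid hidden layer -> Sigmoid output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr):
    """One gradient-descent update on the error 0.5 * (output - target)^2."""
    hidden, output = forward(x, w_hidden, w_out)
    error = 0.5 * (output - target) ** 2
    # Back-propagate through the output Sigmoid, then through the hidden layer.
    delta_out = (output - target) * output * (1.0 - output)
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w - lr * dh * xi for w, xi in zip(row, x)]
                    for row, dh in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out, error


# Hypothetical example: n = 3 state inputs, N_h = 4 hidden neurons.
x = [0.2, 0.5, 0.1]
target = 0.8  # stand-in training target, not taken from the patent
w_hidden = [[0.1, -0.2, 0.3], [0.0, 0.2, -0.1], [0.3, 0.1, 0.2], [-0.1, 0.4, 0.0]]
w_out = [0.1, -0.1, 0.2, 0.05]

errors = []
for _ in range(300):
    w_hidden, w_out, err = train_step(x, target, w_hidden, w_out, lr=0.5)
    errors.append(err)
_, final_output = forward(x, w_hidden, w_out)
print(errors[0], errors[-1], final_output)
```

The reference and evaluation networks would follow the same pattern with n+1 and n+2 inputs; for the evaluation network the output Sigmoid would be replaced by a linear output.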

Because the algorithm yields a set of optimal solutions whereas dispatchers need a single optimal scheduling scheme, an optimal compromise solution must be selected. The idea of fuzzy logic is adopted here: a fuzzy membership function is defined to represent the degree of satisfaction of each Pareto solution with respect to each objective function, as shown in equation (39).

In formula (39), the defined quantity is the membership value of the objective function $f_i$: a value of 1 indicates that the objective is completely satisfied, and a value of 0 indicates that it is completely unsatisfied. $f_i$ is the i-th objective function value; $f_i^{min}$ and $f_i^{max}$ are the minimum and maximum values of the i-th objective function, respectively.

Average satisfaction degree:

In formula (40), $\mu_k$ is the average satisfaction of the k-th Pareto optimal solution and N is the number of objective functions. The Pareto optimal solution with the largest average satisfaction is taken as the final compromise solution.
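
The selection of the compromise solution can be sketched as follows. The linear membership shape between $f_i^{min}$ and $f_i^{max}$, and taking those bounds from the Pareto set itself, are assumptions, since formulas (39) and (40) are not reproduced in the text. The function names and the sample data are hypothetical.

```python
def membership(f, f_min, f_max):
    """ASSUMED linear membership: 1 at or below f_min, 0 at or above f_max."""
    if f_max == f_min or f <= f_min:
        return 1.0
    if f >= f_max:
        return 0.0
    return (f_max - f) / (f_max - f_min)


def best_compromise(solutions):
    """Return (index of best solution, average satisfaction of each solution).

    `solutions` holds one tuple of objective values (to be minimized) per
    Pareto solution.
    """
    n_obj = len(solutions[0])
    mins = [min(s[i] for s in solutions) for i in range(n_obj)]
    maxs = [max(s[i] for s in solutions) for i in range(n_obj)]
    scores = [
        sum(membership(s[i], mins[i], maxs[i]) for i in range(n_obj)) / n_obj
        for s in solutions
    ]
    best = max(range(len(solutions)), key=scores.__getitem__)
    return best, scores


# Hypothetical Pareto set: (generation cost, environmental cost).
pareto = [(100.0, 50.0), (120.0, 30.0), (150.0, 20.0)]
best, scores = best_compromise(pareto)
print(best, scores)
```

Here the middle solution wins because it is moderately good on both objectives, whereas each extreme solution scores 1 on one objective and 0 on the other.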

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims

1. A self-adaptive dynamic programming method for multi-objective joint optimization scheduling, characterized in that: waste heat in the exhaust gas of a gas turbine is stored; a power grid scheduling model of the composite system is established by mechanism analysis; a multi-objective function is established to minimize the power generation cost and the environmental cost; finally, the principle and objective of adaptive dynamic programming are given, and the process by which the heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme is described; the multi-objective function is established as follows:

wherein t denotes any period of the scheduling cycle; T is the total number of periods in the scheduling cycle; in the first term, $k^w$ is the wind curtailment penalty factor, $\Delta P_t^w$ is the curtailed wind power in period t, and $\Delta t^h$ is the number of hours in any one period; in the second term, $N_g$ is the total number of gas units and $k^{gas}$ is the gas price coefficient, and the remaining quantities are the gas consumption of the i-th gas unit in period t and the mode conversion cost of the i-th gas unit in period t; in the third term, $N_p$ is the total number of pumped storage units, and the remaining quantities are the start-up cost of the i-th pumped storage unit at time t under the generating condition and the start-up cost of the i-th pumped storage unit at time t under the pumping condition; in the fourth term, $N_m$ is the total number of photothermal units and $k^{opt}$ is the unit price of photothermal generation, and the remaining quantity is the generated power of the i-th photothermal unit in period t;

the total output power of the i-th gas unit in period t;

$Z = \min(f_1, f_2)$ (3);

system constraints are imposed on the multi-objective function of the whole system, comprising the real-time energy balance constraint, the positive and negative reserve constraints and the branch capacity constraint, wherein the real-time energy balance constraint is expressed as formula (4):

in which the first quantity is the output power of the i-th pumped storage power station in period t; $N_n$ is the total number of photothermal heat storage units; $k_{cr}$ takes the value ±1, being -1 when the heat storage unit is storing heat and 1 otherwise; the next quantity is the heat stored by the i-th heat storage unit in period t; $P_t^w$ is the predicted wind power output in period t; $D_t$ is the total load of the power grid in period t;

the positive and negative reserve constraints are expressed as formulas (5) and (6),

in which the bounds are the maximum and minimum output power of the i-th gas unit in period t, respectively, and $R_t$ is the reserve power of the composite power generation system in period t;

the branch capacity constraint is expressed as formula (7),

in which the limit is the maximum power that line l can transmit; a denotes any node in the power grid; $N_a$ is the total number of nodes in the network of the composite power generation system; $P_{a,t}$ is the power absorbed by node a from the composite power generation system in period t; and the coefficient is the element of the power transfer factor matrix associated with line l and node n;

pumped storage unit constraints are imposed on the multi-objective function of the whole system:

in the formula, the bound is the minimum discharging power of the i-th pumped storage unit;

in the formula, the defined quantities are, in order: the charging power of the i-th pumped storage unit in period t; and the charging state of the i-th pumped storage unit in period t, where 0 indicates that the unit is discharging or stopped and 1 indicates that it is charging; $P_i^{ps,c}$ is the constant charging power of the i-th unit; when economic factors are taken into account, the charging process of a pumped storage unit is usually carried out at constant power;

the power equality constraint is expressed as formula (10)

in the formula, the defined quantity is the total generated power of the pumped storage units;

the charge-discharge state constraint is expressed as formula (11)

the energy state constraint of the pumped storage units is expressed as formula (12)

in the formula, the defined quantities are, in order: the initial energy of the i-th pumped storage unit; and the maximum stored energy of the i-th pumped storage unit;

the final energy constraint is expressed as formula (13)

photothermal unit constraints are imposed on the multi-objective function of the whole system: for the equality constraint on the energy flow of the photothermal unit, the heat transfer fluid in the photothermal unit is treated as a node in an electrical network and the energy loss in the heat transfer fluid is neglected, so that the power balance equation of the photothermal unit is expressed as formula (14)

wherein S denotes the solar field; H denotes the heat transfer fluid; T denotes the heat storage module; P denotes the thermodynamic cycle module; $P_t^{th,S-H}$, $P_t^{th,H-P}$, $P_t^{th,T-H}$ and $P_t^{th,H-T}$ are the heat exchange powers between the different modules of the photothermal unit; the remaining quantities are, in order: the 0-1 start-up variable of the thermodynamic cycle module at time t, where 0 indicates that the module is stopped; the power consumed when the heat exchange module starts up; and the direct solar irradiance at time t;

the relationship between the power that the photothermal unit provides to the composite system, the input power and the curtailed solar power is expressed as formula (16)

$P_t^{th,S-H} = P_t^{th,opt} - P_t^{th,cut}$ (16)

in the formula, $P_t^{th,cut}$ is the curtailed solar power in period t;

$P_t^{th,c} = \eta_c P_t^{th,H-T}$ (17)

the heat storage state equation is expressed as formula (19)

in the formula, the defined quantities are the total energy in the heat storage device in periods t and t-1;

after linearization, formula (20) is obtained

$P_t^{th,H-P} = g(P_t^e)$ (21)

in the inequality constraints, the defined quantities are, in order: the maximum and minimum power output values; and the maximum ramp-up and ramp-down capacities of the steam turbine unit;

the minimum energy storage constraint is expressed as formula (29)

$P_t^{th,c} P_t^{th,d} = 0$ (32)

in the formula, the two bounds are the maximum charging power and the maximum discharging power, respectively;

other constraints are expressed as formulas (33) and (34)

$P_t^{th,cut} \ge 0$ (33)

in the formula, $P_t^{up}$ and $P_t^{down}$ are the upward and downward power reserve values of the conventional unit, respectively; formulas (33) and (34) ensure that the curtailed solar power and the upward and downward reserves of the unit, respectively, are non-negative;

independent gas unit constraints are imposed on the multi-objective function of the whole system, wherein the consumption curve is expressed as formula (35)

in the formula, the defined quantities are, in order: the output power of the i-th gas unit in mode n in period t; the minimum value of the fuel cost corresponding to the output power of the i-th gas unit in mode n in period t; and the on-off state of the i-th gas unit in mode n in period t, where 1 indicates that the unit is in mode n and 0 that it is in another mode; $M_g$ is the number of operating modes of the gas unit;

the power equality constraint is expressed as formula (36)

the mode transition constraint is expressed as formula (37)

in which the state variable is the on-off state of the i-th gas unit in mode n in period (t-1); if the unit is in mode m in period t-1, then in period t it must be in a mode n for which $A_{m,n} = 1$, which implements the mode transition constraint;

the conversion cost relaxation is expressed as formula (38).

2. The adaptive dynamic programming method for multi-objective joint optimization scheduling of claim 1, characterized in that: the process by which the heuristic adaptive dynamic programming algorithm obtains the optimal scheduling scheme is as follows:

secondly, the evaluation network (CNN) is trained, wherein the evaluation network has n+2 input neurons, $N_h$ hidden-layer neurons and 1 output neuron; the n+2 inputs are the state vector, the control vector and the internal signal of the k-th scheduling period, the output is the optimal performance index, the hidden layer uses a Sigmoid function, and the output layer uses a linear (purelin) function;

finally, the execution network (ANN) is trained, wherein the execution network has n input neurons, $N_h$ hidden-layer neurons and 1 output neuron; the n inputs are the state vector of the k-th scheduling period, the output is the optimal scheduling decision, and both the hidden layer and the output layer use Sigmoid functions; training of the execution network comprises two parts: a forward calculation process, and an error back-propagation process that updates the execution network weight matrix; error back-propagation is implemented by minimizing the error with the gradient descent method;

3. The adaptive dynamic programming method for multi-objective joint optimization scheduling of claim 2, characterized in that: in order to derive the required optimal compromise solution from the optimal solution set, the idea of fuzzy logic is adopted, and a fuzzy membership function is defined to represent the degree of satisfaction of each Pareto solution with respect to each objective function, expressed as formula (39)

in the formula, the defined quantity is the membership value of the objective function $f_i$, wherein a value of 1 indicates that the objective is completely satisfied and a value of 0 indicates that it is completely unsatisfied; $f_i$ is the i-th objective function value; $f_i^{min}$ and $f_i^{max}$ are the minimum and maximum values of the i-th objective function, respectively;

the average satisfaction of the k-th Pareto optimal solution is expressed as formula (40)

in the formula, $\mu_k$ is the average satisfaction of the k-th Pareto optimal solution and N is the number of objective functions; the Pareto optimal solution with the largest average satisfaction is taken as the final compromise solution.