CN114239372A - Multi-target unit maintenance double-layer optimization method and system considering unit combination - Google Patents

Publication number
CN114239372A
Authority
CN
China
Prior art keywords
unit
power system
target
model
maintenance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111536810.5A
Other languages
Chinese (zh)
Inventor
李远征
郭恒元
黄成�
何尚洋
赵勇
俞耀文
周前
田嘉晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
Huazhong University of Science and Technology
Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology, Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd filed Critical Huazhong University of Science and Technology
Priority to CN202111536810.5A priority Critical patent/CN114239372A/en
Publication of CN114239372A publication Critical patent/CN114239372A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/25 Design optimisation, verification or simulation using particle-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Economics (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Mathematical Physics (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a multi-target unit maintenance double-layer optimization method and system considering unit combination, and belongs to the technical field of power system optimization. The method comprises the following steps: acquiring unit power generation cost data, unit maintenance data and daily power load data of a power system to be optimized; substituting the data into a multi-target unit maintenance double-layer optimization model considering unit combination to obtain the model instance corresponding to the power system to be optimized; and solving that model to obtain the optimal unit combination, unit maintenance variables, unit output and node electricity prices of the power system to be optimized. By establishing a multi-target unit maintenance double-layer optimization model considering unit combination, the method jointly considers the total cost of the power system, the reliability of the power system and the fluctuation of node electricity prices, thereby achieving economic and reliable operation of the power system and reducing the risk that node electricity price fluctuation poses to the power market.

Description

Multi-target unit maintenance double-layer optimization method and system considering unit combination
Technical Field
The invention belongs to the technical field of power system optimization, and particularly relates to a multi-target unit maintenance double-layer optimization method and system considering unit combination.
Background
In the actual operation of a power system, the power generation plan and the maintenance plan influence each other and are closely related; both essentially pursue safe and economic operation of the power grid by optimally arranging the on/off states of the units. In conventional power system scheduling, the generation plan is made on the basis of the maintenance plan, which acts as a constraint on the generation plan. This approach limits the optimization space of the generation plan. Because the main part of the generation plan is unit combination scheduling, scholars have proposed the cooperative optimization of unit combination and unit maintenance. Cooperative optimization makes full use of unit resources and helps improve the accuracy and intelligence of power grid dispatching. Designing a reasonable and efficient cooperative optimization method for unit combination and unit maintenance is therefore both a difficulty and an active research topic in the field.
For the cooperative optimization of unit combination and unit maintenance, the prior art constructs single-target models, such as minimizing the total cost of the power system or maximizing social welfare, to achieve economic operation of the power system. However, safety during actual operation, stability of users' electricity consumption, and reliability of the power system are also critical. In particular, node electricity prices in the power market are closely tied to the scheduling result of the power system, and a maintenance schedule may cause the electricity prices of some nodes to fluctuate greatly within a month, which affects the stability of the power market. Therefore, how to establish a multi-target joint cooperative scheduling model that balances economy, reliability and stability, and how to design a reasonable and efficient solution method for it, are problems that urgently need to be solved.
To solve multi-objective optimization models, scholars at home and abroad have proposed various multi-objective algorithms, among which the multi-objective particle swarm algorithm is one of the most widely applied. Because the particle swarm algorithm cannot guarantee global convergence, Sun Jun and other scholars proposed the quantum particle swarm algorithm, which builds on the particle swarm algorithm and achieves a better solving effect. However, the performance of quantum particle swarm optimization depends heavily on the choice of the "contraction-expansion" coefficient. Scholars have so far proposed a multi-target quantum particle swarm algorithm with a fixed contraction-expansion coefficient and one with a gradually decreasing contraction-expansion coefficient, but neither can flexibly select the best coefficient according to the state of the current-generation particle swarm. Deep Q-learning, proposed in recent years, could improve the performance of the multi-target quantum particle swarm algorithm by flexibly selecting the contraction-expansion coefficient. However, how to improve the multi-target quantum particle swarm algorithm with deep Q-learning and use it to solve the multi-objective optimization model remains a problem worth in-depth research.
Disclosure of Invention
In view of the defects of the prior art and the need to improve it, the invention provides a multi-target unit maintenance double-layer optimization method and system considering unit combination, aiming to achieve cooperative optimal scheduling of unit maintenance and unit output in a power market environment. The method treats the minimization of node electricity price fluctuation as a separate optimization target and analyzes the relation between this target and the targets of optimal reliability and minimum total cost of the power system. To obtain a better Pareto solution set, the invention further provides a multi-target quantum particle swarm algorithm combined with reinforcement learning to solve the multi-target unit maintenance double-layer optimization model.
To achieve the above object, according to a first aspect of the present invention, there is provided a multi-objective unit overhaul double-layer optimization method considering unit combinations, the method including:
S1, acquiring unit power generation cost data, unit maintenance data and daily power load data of a power system to be optimized;
S2, substituting the data into the multi-target unit maintenance double-layer optimization model considering unit combination to obtain the model instance corresponding to the power system to be optimized;
S3, solving the multi-target unit maintenance double-layer optimization model corresponding to the power system to be optimized to obtain the optimal unit combination, unit maintenance variables, unit output and node electricity prices of the power system to be optimized;
wherein the multi-target unit maintenance double-layer optimization model considering unit combination comprises an upper-layer model and a lower-layer model;
the decision variables of the upper-layer model are the unit combination, unit maintenance variables, unit output and node electricity prices; the upper-layer model has three targets (minimizing the total cost of the power system, optimizing the reliability of the power system, and minimizing node electricity price fluctuation) and two constraints (a minimum reserve constraint and a constraint on the maximum number of units under maintenance);
the decision variable of the lower-layer model is the unit output; the target of the lower-layer model is to minimize the power generation cost of the power system, and the lower-layer model has four constraints: a power balance constraint, a ramp-rate constraint, upper and lower limits on unit output, and a line power flow constraint.
Preferably, the variables are encoded as follows:
Figure BDA0003412853560000031
Figure BDA0003412853560000032
$y_i(t+1) = \max\{v_i(t+1) - v_i(t),\ 0\}$
where $x_i$ denotes the scheduling period in which maintenance of the $i$-th unit begins; $M_i(t) = 1$ indicates that the $i$-th unit is under maintenance in scheduling period $t$, and $M_i(t) = 0$ indicates that it is not; $d_i$ is the number of consecutive scheduling periods that maintenance of the $i$-th unit lasts; the symbol ∨ denotes merging (logical OR); Figure BDA0003412853560000033 denotes the output of the $i$-th unit in scheduling period $t$; Figure BDA0003412853560000034 denotes the maximum output of the $i$-th unit; and Figure BDA0003412853560000035 denotes the minimum output of the $i$-th unit.
Beneficial effects: for the multi-target unit maintenance double-layer optimization model considering unit combination, the invention converts the practical model into a mathematical model by encoding the maintenance variable, the unit state variable and the unit start-up variable, so that it can be solved quickly and accurately by a computer program. In the program, $x_i$ denotes the period in which maintenance of the $i$-th unit begins, which avoids introducing the large set of maintenance variables $M_i(t)$ as decision variables and thereby simplifies the model.
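The encoding described above can be illustrated with a minimal Python sketch. The equation for $M_i(t)$ survives only as an image in this text, so the rule used here (the unit is under maintenance for $d_i$ consecutive periods starting at $x_i$) is an assumption inferred from the symbol definitions. The helper names `maintenance_states` and `startup_states` are hypothetical, and the working state $v_i(t)$ is simply assumed to be 1 whenever the unit is not under maintenance.

```python
def maintenance_states(x_i, d_i, n_periods):
    """Expand the start period x_i (1-based) into M_i(1..N_t).

    Assumption: maintenance occupies periods x_i, ..., x_i + d_i - 1.
    """
    return [1 if x_i <= t < x_i + d_i else 0 for t in range(1, n_periods + 1)]


def startup_states(v):
    """y_i(t+1) = max(v_i(t+1) - v_i(t), 0) for consecutive periods."""
    return [max(v[t + 1] - v[t], 0) for t in range(len(v) - 1)]


# One unit, 7 periods, maintenance starts in period 3 and lasts 2 periods.
m = maintenance_states(x_i=3, d_i=2, n_periods=7)
# Illustrative assumption: the unit runs whenever it is not under maintenance.
v = [1 - state for state in m]
y = startup_states(v)
print(m)  # maintenance flags per period
print(v)  # working-state flags per period
print(y)  # start-up flags for periods 2..7
```

Storing only `x_i` per unit and expanding it on demand keeps the search space at one integer per unit instead of one binary per unit and period, which is the simplification the paragraph above refers to.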
Preferably, the objective function of the upper model includes:
Figure BDA0003412853560000041
Figure BDA0003412853560000042
Minimize: $F_2 = \mathrm{std}(I(t)), \quad t = 1, 2, 3, \ldots, N_t$
Figure BDA0003412853560000043
Figure BDA0003412853560000044
the constraints of the upper layer model include:
Figure BDA0003412853560000045
Figure BDA0003412853560000046
where $F_1$, $F_2$ and $F_3$ denote the total cost of the power system, the reliability of the power system and the fluctuation of node electricity prices, respectively; $G$ is the set of all units; $N_t$ is the number of scheduling periods; Figure BDA00034128535600000410 denotes the power generation cost of the $i$-th unit; $T(t)$ is the number of hours in a scheduling period; Figure BDA0003412853560000047 denotes the start-up cost of the $i$-th unit; $G_m$ is the set of units that require maintenance; Figure BDA0003412853560000048 denotes the maintenance cost of the $i$-th unit; $C_{0i}$, $C_{1i}$ and $C_{2i}$ are the power generation cost coefficients of the $i$-th unit; $\mathrm{std}(I(t))$ is the standard deviation of $I(t)$; $I(t)$ is the reliability index of the power system; $P_D(t)$ is the electricity demand in scheduling period $t$; $N_j$ is the number of nodes; $S_j(t)$ is the node electricity price of the $j$-th node in scheduling period $t$; Figure BDA0003412853560000049 denotes the average node electricity price of the $j$-th node over all scheduling periods; $R^{Min}(t)$ is the minimum net reserve capacity in scheduling period $t$; and $K(t)$ is the maximum number of units that may be under maintenance simultaneously in scheduling period $t$.
Beneficial effects: for the multi-target unit maintenance double-layer optimization model considering unit combination, the invention adopts the above objective functions and constraints as the upper-layer model. The objective functions cover the minimum total system cost, the optimal system reliability and the minimum node electricity price fluctuation, so the optimized scheduling solution balances all three targets and the power system operates more safely and stably. The constraints cover the net reserve capacity and the coordination of unit maintenance with unit operation, so that the optimized scheduling solution meets the system's net reserve capacity requirement and the operating state of a unit does not conflict with its maintenance state.
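Two pieces of the upper-layer model are stated explicitly in the text and can be sketched directly: the reliability target $F_2 = \mathrm{std}(I(t))$ and the limit $K(t)$ on the number of units under maintenance at the same time. The sketch below is illustrative only. It takes the reliability index $I(t)$ as given input, because its defining equation is available only as an image, and it assumes the population standard deviation, since the text does not say which variant is meant. The function names are hypothetical.

```python
from statistics import pstdev


def reliability_objective(reliability_index):
    """F2 = std(I(t)) over all periods (population std is an assumption)."""
    return pstdev(reliability_index)


def max_maintenance_ok(M, K):
    """Check that at most K[t] units are under maintenance in every period t.

    M[i][t] is the maintenance flag of unit i in period t (0 or 1).
    """
    return all(sum(row[t] for row in M) <= K[t] for t in range(len(K)))


I_t = [0.2, 0.4, 0.4, 0.2, 0.3]   # illustrative reliability index values
F2 = reliability_objective(I_t)

M = [[0, 1, 1, 0],                # unit 1 under maintenance in periods 2-3
     [0, 0, 1, 1],                # unit 2 under maintenance in periods 3-4
     [1, 0, 0, 0]]                # unit 3 under maintenance in period 1
print(round(F2, 4))
print(max_maintenance_ok(M, K=[1, 1, 2, 1]))
print(max_maintenance_ok(M, K=[1, 1, 1, 1]))
```

A flatter reliability index across periods gives a smaller $F_2$, which is why minimizing the standard deviation is used as the reliability target.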
Preferably, the objective function of the lower layer model is:
Figure BDA0003412853560000051
the constraints of the lower-layer model include:
Figure BDA0003412853560000052
Figure BDA0003412853560000053
Figure BDA0003412853560000054
Figure BDA0003412853560000055
where $F$ denotes the power generation cost of the power system; $G$ is the set of all units; $N_t$ is the number of scheduling periods; Figure BDA0003412853560000056 denotes the power generation cost of the $i$-th unit; $T(t)$ is the number of hours in a scheduling period; $P_D(t)$ is the electricity demand in scheduling period $t$; Figure BDA0003412853560000057 denotes the maximum ramp-down rate of the $i$-th unit; Figure BDA0003412853560000058 denotes the maximum ramp-up rate of the $i$-th unit; $P_l^{Min}$ and $P_l^{Max}$ are the lower and upper power flow limits of line $l$; $D_j(t)$ is the electricity demand of the $j$-th node in scheduling period $t$; $G_{l-i}$ is the power transfer distribution factor of line $l$ with respect to the node where unit $i$ is located; $G_{l-j}$ is the power transfer distribution factor of line $l$ with respect to node $j$; and $N_j$ is the number of nodes.
Beneficial effects: for the multi-target unit maintenance double-layer optimization model considering unit combination, the invention adopts the above objective function and constraints as the lower-layer model. The objective function minimizes the system power generation cost, and the constraints cover system power balance, unit operation and line power flow. In addition, the lower-layer model only needs to optimize the unit output and yield the node electricity price of each node for the whole model to be solvable.
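To make the role of the lower-layer model concrete, the following sketch solves a heavily simplified, single-period version of it: quadratic generation costs, a power balance constraint and output limits, with ramp-rate and line power flow constraints left out. It uses plain bisection on the marginal price instead of the Gurobi solver named in the method, and the quadratic cost form $C_{0i} + C_{1i}P + C_{2i}P^2$ is an assumption based on the three cost coefficients. The function name `economic_dispatch` and all numbers are hypothetical.

```python
def economic_dispatch(units, demand, tol=1e-9):
    """Single-period dispatch by bisection on the marginal price lam.

    units: list of (c1, c2, p_min, p_max); the cost of a committed unit is
    assumed to be c0 + c1*p + c2*p**2 with c2 > 0 (c0 does not affect dispatch).
    Returns (outputs, lam), where lam is the multiplier of the power balance.
    """
    if not sum(u[2] for u in units) <= demand <= sum(u[3] for u in units):
        raise ValueError("demand outside the total output range")

    def outputs(lam):
        # Each unit runs where its marginal cost c1 + 2*c2*p equals lam,
        # clipped to its output limits.
        return [min(max((lam - c1) / (2 * c2), p_min), p_max)
                for c1, c2, p_min, p_max in units]

    lo = min(c1 + 2 * c2 * p_min for c1, c2, p_min, _ in units)
    hi = max(c1 + 2 * c2 * p_max for c1, c2, _, p_max in units)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(outputs(mid)) < demand:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return outputs(lam), lam


units = [(10.0, 0.05, 20.0, 100.0), (12.0, 0.04, 20.0, 100.0)]
p, lam = economic_dispatch(units, demand=150.0)
print([round(x, 2) for x in p], round(lam, 4))
```

In this reduced setting the multiplier `lam` plays the part of a uniform electricity price; with line power flow constraints added, the price would differ by node, which is what the full lower-layer model provides.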
Preferably, step S3 includes:
S31, taking the unit combination and the unit maintenance variables as particle positions, and solving the upper-layer model of the unit maintenance double-layer optimization scheduling model with an improved multi-target quantum particle swarm algorithm to obtain the unit combination and the unit maintenance variables;
S32, substituting the unit combination and the unit maintenance variables into the lower-layer model, solving it with the Gurobi solver to obtain the unit output, and then calculating the node electricity prices;
S33, substituting the unit output and the node electricity prices back into the upper-layer model, and solving for the Pareto optimal solution set with the improved multi-target quantum particle swarm algorithm, each solution in the set consisting of a unit combination, unit maintenance variables, unit output and node electricity prices;
S34, selecting a final scheduling solution from the Pareto optimal solution set;
wherein the improved multi-target quantum particle swarm algorithm uses a deep Q-learning method to select the optimal contraction-expansion coefficient required for each generation of evolution in the multi-target quantum particle swarm algorithm.
Beneficial effects: for the multi-target unit maintenance double-layer optimization model considering unit combination, the invention solves the model with the improved multi-target quantum particle swarm algorithm. The algorithm introduces reinforcement learning into the traditional multi-target quantum particle swarm algorithm so that the key parameter of the algorithm is controlled flexibly, which further improves its performance and finds a better scheduling solution for the model of the invention.
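Step S33 relies on the notion of a Pareto optimal solution set. As a small illustration, the sketch below filters a list of candidate objective vectors $(F_1, F_2, F_3)$, all to be minimized, down to the non-dominated ones. It shows only the dominance test, not the particle swarm search itself; the function names and the sample numbers are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (total cost, reliability std, price fluctuation) triples.
candidates = [
    (100.0, 0.20, 5.0),
    (90.0, 0.30, 6.0),
    (110.0, 0.25, 5.0),   # dominated by the first candidate
    (95.0, 0.30, 7.0),    # dominated by the second candidate
]
front = pareto_front(candidates)
print(front)
```

The quadratic-time filter is adequate for a swarm-sized archive; larger archives would normally use a faster non-dominated sorting routine.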
Preferably, in the deep Q-learning method, the state space S has 5 dimensions: the first 3 dimensions are the average objective function values of the quantum particle swarm during the iteration, the 4th dimension is the average constraint violation of the quantum particle swarm during the iteration, and the 5th dimension is the iteration number. The action space A has 10 actions: the range of the contraction-expansion coefficient is divided into 10 equal intervals, each interval corresponds to one action in deep Q-learning, and a random number drawn from the interval of the action selected by the agent is used as the contraction-expansion coefficient for that particle swarm iteration. The reward R of the agent is 1 if the global optimal solution of the quantum particle swarm changes in the current iteration, and -1 otherwise. The evaluation network is a deep neural network.
Beneficial effects: the invention introduces a deep Q-learning method into the traditional multi-target quantum particle swarm algorithm. Each action of deep reinforcement learning corresponds to one interval of the contraction-expansion coefficient, and the best contraction-expansion coefficient for each iteration is chosen by selecting an action, so that the improved multi-target quantum particle swarm algorithm achieves a better solving effect.
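The interface between the deep Q-learning agent and the particle swarm described above can be sketched without any neural network: a 5-dimensional state, 10 discrete actions that each map to one interval of the contraction-expansion coefficient, and a reward of 1 or -1. The coefficient range [0.0, 1.0] used below is an assumption, since the text does not state the range, and all function names are hypothetical.

```python
import random


def build_state(mean_objectives, mean_violation, iteration):
    """5-dimensional state: three mean objective values, mean violation, iteration."""
    f1, f2, f3 = mean_objectives
    return [f1, f2, f3, mean_violation, iteration]


def coefficient_from_action(action, beta_min=0.0, beta_max=1.0,
                            n_actions=10, rng=random):
    """Draw the contraction-expansion coefficient from the chosen interval.

    The range [beta_min, beta_max] is an assumed placeholder.
    """
    if not 0 <= action < n_actions:
        raise ValueError("action out of range")
    width = (beta_max - beta_min) / n_actions
    low = beta_min + action * width
    return rng.uniform(low, low + width)


def reward(previous_best, current_best):
    """+1 if the global best solution changed in this iteration, otherwise -1."""
    return 1 if current_best != previous_best else -1


rng = random.Random(0)
state = build_state((120.0, 0.3, 4.5), mean_violation=0.1, iteration=7)
beta = coefficient_from_action(3, rng=rng)   # action 3 covers [0.3, 0.4]
print(state, round(beta, 3), reward((1.0, 2.0), (1.0, 1.5)))
```

In a full implementation the action would come from the evaluation network's Q-values (for example with an epsilon-greedy rule), and `beta` would be passed to the quantum particle swarm position update for that generation.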
Preferably, step S34 includes:
S341, calculating the weights:
Figure BDA0003412853560000071
S342, calculating the standardized matrix $v_{ij} = \omega_i (f_i^+ - f_{ij})(f_i^+ - f_i^-)$;
S343, calculating the positive ideal solution and the negative ideal solution from the standardized matrix:
Figure BDA0003412853560000072
S344, calculating the Mahalanobis distance from each solution to the positive ideal solution and the negative ideal solution:
Figure BDA0003412853560000073
S345, calculating the optimal distance ratio of each solution:
Figure BDA0003412853560000074
S346, selecting the solution with the largest optimal distance ratio $R_j$ as the final scheduling solution;
where $f_{ij}$ denotes the $i$-th objective function value of the $j$-th Pareto solution,
Figure BDA0003412853560000075
and j represents the number of Pareto solutions.
Beneficial effects: the invention preferably selects the final scheduling solution from the Pareto solution set in the above manner, which weighs multiple indicators, satisfies multiple target requirements, and finally yields a reasonable scheduling solution.
To achieve the above object, according to a second aspect of the present invention, there is provided a multi-objective unit overhaul double-layer optimization system considering unit combinations, comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium, and execute the multi-objective unit overhaul double-layer optimization method considering unit combination according to the first aspect.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
For the joint optimization of unit combination and unit maintenance in the prior art, the invention establishes a multi-target unit maintenance double-layer optimization model considering unit combination and jointly considers important factors such as the total cost of the power system, the reliability of the power system and the fluctuation of node electricity prices, thereby achieving economic and reliable operation of the power system while reducing the risk that node electricity price fluctuation poses to the power market.
Drawings
Fig. 1 is a diagram of a multi-objective unit overhaul double-layer optimization model structure considering unit combination provided by the invention.
FIG. 2 is a main loop flow chart of the improved multi-target quantum particle swarm algorithm provided by the invention.
Fig. 3 is a schematic diagram of the daily required electric power of the power system for 30 days according to the embodiment of the present invention.
Fig. 4 is a performance comparison graph of the improved multi-target quantum particle swarm algorithm and the traditional multi-target quantum particle swarm algorithm provided by the embodiment of the present invention, wherein hollow dots represent the hypervolume index values of the improved multi-target quantum particle swarm algorithm, and solid dots represent the hypervolume index values of the traditional multi-target quantum particle swarm algorithm.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides a multi-target unit maintenance double-layer optimization method considering unit combination, which comprises the following steps:
S1, initializing the multi-target unit maintenance double-layer optimization model considering unit combination, including the unit power generation cost parameters, unit maintenance parameters and daily power load data of the double-layer optimization scheduling model.
S2, encoding the variables used in the double-layer optimization scheduling model, namely the unit output variable, the unit maintenance variable, the unit start-up state variable and the unit working state variable.
The multi-target double-layer optimization model is encoded using the unit output variable Figure BDA0003412853560000091, the original unit maintenance variable $x_i$, the unit maintenance variable $M_i(t)$, the unit working state variable $v_i(t)$ and the unit start-up state variable $y_i(t)$, where the unit maintenance variable, the unit working state variable and the unit start-up state variable are calculated as follows:
Figure BDA0003412853560000092
Figure BDA0003412853560000093
$y_i(t+1) = \max\{v_i(t+1) - v_i(t),\ 0\}$ (3)
where $x_i$ denotes the period in which maintenance of the $i$-th unit begins; for example, $x_i = 3$ means that maintenance of the $i$-th unit begins on day 3. $M_i(t) = 1$ indicates that the $i$-th unit is under maintenance in period $t$, and $M_i(t) = 0$ indicates that it is not; $d_i$ is the number of periods that maintenance of the $i$-th unit lasts; the symbol ∨ denotes merging (logical OR); Figure BDA0003412853560000094 denotes the maximum output of the $i$-th unit; and Figure BDA0003412853560000095 denotes the minimum output of the $i$-th unit.
S3, in the power market environment, constructing the multi-target unit maintenance double-layer optimization model considering unit combination subject to constraints such as maintenance constraints, unit operation constraints and power system constraints, wherein the targets of the upper-layer optimization model are the minimum total cost of the power system, the optimal reliability of the power system and the minimum node electricity price fluctuation, and the target of the lower-layer optimization model is the minimum power generation cost of the power system.
As shown in FIG. 1, the multi-objective two-layer optimization model comprises an upper layer and a lower layer.
The upper-layer model has three targets: the first is to minimize the total cost of the power system, defined in equation (4); the second is optimal power system reliability, defined in equation (5), and the standard deviation of the reliability index $I(t)$ is defined in equation (6); the third is to minimize node electricity price fluctuation, defined in equation (7). The upper-layer model has two constraints: the minimum reserve constraint and the constraint on the maximum number of units under maintenance at the same time. Equation (8) ensures that the reserve in each period is higher than the minimum reserve of the power system, and equation (9) ensures that the number of units under maintenance in the same period does not exceed the upper limit.
Figure BDA0003412853560000101
Minimize: $F_2 = \mathrm{std}(I(t)), \quad t = 1, 2, 3, \ldots, N_t$ (5)
Figure BDA0003412853560000102
Figure BDA0003412853560000103
Figure BDA0003412853560000104
Figure BDA0003412853560000105
where $P_D(t)$ is the electricity demand in period $t$; Figure BDA0003412853560000106 denotes the power generation cost of the $i$-th unit, where
Figure BDA0003412853560000107
and $C_{0i}$, $C_{1i}$ and $C_{2i}$ are the power generation cost coefficients of the $i$-th unit; $T(t)$ is the number of hours in a period; Figure BDA0003412853560000108 denotes the start-up cost of the $i$-th unit; Figure BDA0003412853560000109 denotes the maintenance cost of the $i$-th unit; $G$ is the set of all units; $G_m$ is the set of units that require maintenance; $N_t$ is the number of periods; $N_i$ is the number of units; $N_j$ is the number of nodes; $S_j(t)$ is the node electricity price of the $j$-th node in period $t$; Figure BDA00034128535600001010 denotes the average node electricity price of the $j$-th node over all periods; $R^{Min}(t)$ is the minimum net reserve capacity in period $t$; $R^{Max}(t)$ is the maximum net reserve capacity in period $t$; and $K(t)$ is the maximum number of units that may be under maintenance simultaneously in period $t$.
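As a rough illustration of how the cost terms listed above could combine into the total-cost target $F_1$, the sketch below adds generation cost (weighted by the hours $T(t)$), start-up cost and maintenance cost. Equation (4) is available only as an image, so the exact form used here is an assumption: a quadratic generation cost $C_{0i} + C_{1i}P + C_{2i}P^2$ charged only while the unit is working, a start-up cost charged whenever $y_i(t) = 1$, and a fixed maintenance cost per unit in $G_m$. The function names are hypothetical.

```python
def generation_cost(p, c0, c1, c2):
    """Assumed quadratic hourly generation cost of one unit at output p."""
    return c0 + c1 * p + c2 * p * p


def total_cost(P, v, y, coeffs, hours, startup_cost, maintenance_cost):
    """Assumed form of F1: generation + start-up + maintenance cost.

    P[i][t], v[i][t], y[i][t]: output, working state and start-up flag of
    unit i in period t; maintenance_cost lists the cost of each unit in G_m.
    """
    cost = sum(maintenance_cost)
    for i, (c0, c1, c2) in enumerate(coeffs):
        for t, h in enumerate(hours):
            if v[i][t]:
                cost += generation_cost(P[i][t], c0, c1, c2) * h
            cost += startup_cost[i] * y[i][t]
    return cost


# One unit, two daily periods: it starts and runs at 50 MW on day 1 only.
F1 = total_cost(
    P=[[50.0, 0.0]], v=[[1, 0]], y=[[1, 0]],
    coeffs=[(100.0, 10.0, 0.05)], hours=[24, 24],
    startup_cost=[500.0], maintenance_cost=[2000.0],
)
print(F1)
```

Here the hourly generation cost at 50 MW is 100 + 500 + 125 = 725, so one 24-hour period costs 17400, and adding the start-up and maintenance costs gives the printed total.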
The lower layer model is a security-constrained economic dispatch model whose objective, defined in equation (10), is to minimize the power generation cost of the power system. It includes the power balance constraint, the ramping constraint, the upper and lower limits on unit output, and the line power flow constraint. Equation (11) ensures that the output power in each period equals the demand; equation (12) ensures that the change in a unit's output between adjacent periods does not exceed its ramp rate limit; equation (13) keeps the output of each operating unit between its minimum and maximum output; and equation (14) keeps the power flow on each line within the corresponding limit. The equations are as follows:
Figure BDA00034128535600001011
Figure BDA0003412853560000111
Figure BDA0003412853560000112
Figure BDA0003412853560000113
Figure BDA0003412853560000114
where
Figure BDA0003412853560000115
represents the maximum ramp-down rate of the ith unit,
Figure BDA0003412853560000116
represents the maximum ramp-up rate of the ith unit, PlMin represents the lower power flow limit of the l-th line, PlMax represents the upper power flow limit of the l-th line, Dj(t) represents the demand of the jth node in the t-th period, Gl-i represents the power transfer distribution factor of line l to the node where unit i is located, and Gl-j represents the power transfer distribution factor of line l to node j.
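The four constraints of the lower layer model can be illustrated with a small feasibility check for a single period. This is only a sketch under stated assumptions: the function name, argument names and toy numbers are hypothetical, and the line flow is computed in the usual DC form as the distribution-factor-weighted unit outputs minus the distribution-factor-weighted nodal demands, which may differ in detail from equation (14). In the method itself the dispatch is obtained by solving the model with a solver rather than by checking a candidate.

```python
def dispatch_feasible(P, P_prev, demand, p_min, p_max, ramp_up, ramp_down,
                      ptdf_unit, ptdf_node, flow_min, flow_max, tol=1e-6):
    """Check one period of a candidate dispatch against constraints (11)-(14)."""
    # (11) power balance: total output equals total nodal demand
    if abs(sum(P) - sum(demand)) > tol:
        return False
    for i, p in enumerate(P):
        # (12) ramping limit between adjacent periods
        if not (-ramp_down[i] - tol <= p - P_prev[i] <= ramp_up[i] + tol):
            return False
        # (13) upper and lower output limits of an operating unit
        if not (p_min[i] - tol <= p <= p_max[i] + tol):
            return False
    for l in range(len(flow_max)):
        # (14) line flow from distribution factors: injections minus withdrawals
        flow = (sum(ptdf_unit[l][i] * P[i] for i in range(len(P)))
                - sum(ptdf_node[l][j] * demand[j] for j in range(len(demand))))
        if not (flow_min[l] - tol <= flow <= flow_max[l] + tol):
            return False
    return True


# Toy system (hypothetical): unit 0 at node 0, unit 1 at node 1, one line 0 -> 1.
feasible = dispatch_feasible(
    P=[60.0, 40.0], P_prev=[50.0, 45.0], demand=[20.0, 80.0],
    p_min=[10.0, 10.0], p_max=[100.0, 100.0],
    ramp_up=[15.0, 15.0], ramp_down=[15.0, 15.0],
    ptdf_unit=[[1.0, 0.0]], ptdf_node=[[1.0, 0.0]],
    flow_min=[-50.0], flow_max=[50.0])
```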
The unit outputs are obtained directly by solving the lower layer model, whereas the node electricity prices must be derived from Lagrange multipliers. Solving the security-constrained economic dispatch model yields the Lagrange multipliers of the system power balance constraint and of the line power flow constraints in each period, and the node electricity price of node j in period t is then:
Figure BDA0003412853560000117
where λt is the Lagrange multiplier of the system power balance constraint in period t, L is the number of lines in the system,
Figure BDA0003412853560000118
is the Lagrange multiplier of the upper power flow limit constraint of line l in period t, and
Figure BDA0003412853560000119
is the Lagrange multiplier of the lower power flow limit constraint of line l in period t.
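The node price calculation can be sketched as follows. The function name and the toy numbers are hypothetical, and the sign convention (energy price minus a congestion component, with nonnegative line multipliers) is one common form assumed here for illustration; the authoritative expression is equation (15).

```python
def node_price(lam, mu_max, mu_min, ptdf_node_j):
    """Node electricity price of one node in one period.

    lam            multiplier of the system power balance constraint
    mu_max[l]      multiplier of the upper flow limit of line l (assumed >= 0)
    mu_min[l]      multiplier of the lower flow limit of line l (assumed >= 0)
    ptdf_node_j[l] power transfer distribution factor of line l to this node
    """
    # Assumed convention: price = energy component - congestion component.
    congestion = sum(g * (up - lo)
                     for g, up, lo in zip(ptdf_node_j, mu_max, mu_min))
    return lam - congestion


# Toy values (hypothetical): two lines, only the first one is congested.
price = node_price(lam=30.0, mu_max=[5.0, 0.0], mu_min=[0.0, 0.0],
                   ptdf_node_j=[0.4, -0.2])
uncongested = node_price(lam=30.0, mu_max=[0.0, 0.0], mu_min=[0.0, 0.0],
                         ptdf_node_j=[0.4, -0.2])
```

When no line limit is binding, all line multipliers are zero and every node has the same price, equal to the multiplier of the power balance constraint.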
S4. Solve the upper layer model of the unit maintenance double-layer optimization scheduling model with the improved multi-target quantum particle swarm algorithm shown in figure 2, and solve the lower layer model with the Gurobi solver, to obtain the Pareto optimal solution set.
In the multi-target particle swarm algorithm, the encoded variables of each particle determine its position, and the velocity of a particle determines the direction and distance of its movement. Non-dominated solutions are selected according to the fitness values and constraint violation degrees of the particles, and the global optimal solution of the multi-target particle swarm is chosen from the non-dominated solutions by an adaptive grid method.
After an in-depth study of the evolution process of swarm intelligence, Sun Jun et al. proposed the quantum particle swarm algorithm. It introduces quantum particle behavior into the particle swarm algorithm and simplifies the control parameters, so that only the contraction-expansion coefficient needs to be set manually.
The invention introduces reinforcement learning into the multi-target quantum particle swarm algorithm: reinforcement learning selects the contraction-expansion coefficient required in each generation of evolution, and the resulting Pareto solution set is markedly better than that of the original multi-target quantum particle swarm algorithm.
The invention introduces a deep Q learning method into a multi-target quantum particle swarm algorithm.
Reinforcement learning is a learning method in which an agent interacts with an environment and seeks to maximize the return it obtains from it. The interaction is modeled as a Markov decision process, described by the quadruple (S, A, R, P): S is the state space, A is the action space, R: S × A → ℝ is the reward function, and P is the transition probability, where for each state s ∈ S and each action a ∈ A, P(s' | s, a) is the probability of moving from state s to state s' after taking action a. The basic flow of reinforcement learning is as follows: the agent first obtains an initial state from the environment, selects an action according to an initial policy, and receives a reward that reflects how good the action was; it then uses the rewards obtained to improve the actions it takes later. Throughout the process, the goal of the agent is to find an optimal policy that maximizes the cumulative reward.
In deep Q learning, the action-value function Q(s, a) represents the cumulative future reward that the agent receives after performing action a in state s; taking the action with the largest Q value in each state maximizes the reward. To cope with complex environments and continuous state spaces, Q(s, a) is usually approximated by a deep neural network (Q-network). To improve the training of the deep neural network, deep Q learning also uses two key techniques. The first is a target network with the same structure as the Q-network; during training, the parameters of the Q-network are copied to the target network every fixed number of iterations. The second is an experience pool: the experience data generated during training is stored in an experience pool D, which breaks the strong correlation among the data and helps the algorithm converge stably.
The invention uses the deep Q learning method to create an agent that automatically searches for the optimal policy by interacting with the algorithm. The state space S has 5 dimensions: the first 3 are the average objective function values of the quantum particle swarm during the iteration, the 4th is the average constraint violation value of the quantum particle swarm during the iteration, and the 5th is the iteration number. The constraint violation degree G(x) is calculated as shown in formula (16). The action space A has 10 dimensions, that is, 10 discrete actions: the range of the contraction-expansion coefficient is divided into 10 equal sub-ranges, and each sub-range corresponds to one action a in deep Q learning. Whichever sub-range the action selected by the agent corresponds to, a random number within that sub-range is used as the contraction-expansion coefficient in the particle swarm iteration. In the invention the contraction-expansion coefficient ranges from 0.4 to 0.6, so each sub-range has width 0.02; if the agent selects action 3, for example, the coefficient in that iteration is a random number between 0.44 and 0.46. The reward R of the agent is 1 if the global optimal solution of the quantum particle swarm changes in the current iteration, and -1 otherwise.
Figure BDA0003412853560000131
where p is the number of equality constraints, q is the number of inequality constraints, gm(x) = 0 is the mth equality constraint, and hn(x) ≤ 0 is the nth inequality constraint.
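The state, action and reward definitions above can be sketched in a few lines of Python. The constant and function names are hypothetical; the numbers (range 0.4 to 0.6, 10 actions numbered from 1, rewards of 1 and -1) are taken from the description.

```python
import random

BETA_MIN, BETA_MAX, N_ACTIONS = 0.4, 0.6, 10


def action_interval(action):
    """Return the coefficient sub-range for a 1-based action index."""
    if not 1 <= action <= N_ACTIONS:
        raise ValueError("action must be in 1..10")
    width = (BETA_MAX - BETA_MIN) / N_ACTIONS
    low = BETA_MIN + (action - 1) * width
    return low, low + width


def sample_coefficient(action, rng=random):
    """Draw a random contraction-expansion coefficient inside the sub-range."""
    low, high = action_interval(action)
    return rng.uniform(low, high)


def reward(global_best_changed):
    """+1 if the global optimal solution changed in this iteration, else -1."""
    return 1 if global_best_changed else -1


def build_state(mean_objectives, mean_violation, iteration):
    """5-dimensional state: 3 mean objective values, mean violation, iteration."""
    f1, f2, f3 = mean_objectives
    return [f1, f2, f3, mean_violation, iteration]
```

For example, `action_interval(3)` yields a sub-range of approximately 0.44 to 0.46, which matches the example given for action 3.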
The deep reinforcement learning training process adopted by the invention is as follows:
1) Initialize the maximum number of training runs TRmax and set the current training run Train = 1;
2) Initialize the deep neural network (Q-network), i.e. the evaluation network;
3) Initialize the target network, whose parameters are copied from the Q-network;
4) Initialize the experience pool D;
5) Initialize the environment (quantum particle swarm) Xi, calculate the objective function values and constraint violation value of each particle to obtain the initial optimal position, and set the maximum number of iterations Tmax and the current iteration t = 1;
6) for Train = 1 : TRmax
7) Obtain the initial state s_t;
8) for t = 1 : Tmax
9) Input the current state into the evaluation network, output the estimated value of each action, and select the action a_t with the largest value;
10) Execute the corresponding action in the environment, i.e. the quantum particle swarm performs one population iteration with the contraction-expansion coefficient corresponding to the action, and obtain the reward r_t and the next state s_{t+1};
11) Store the quadruple (s_t, a_t, r_t, s_{t+1}) in the experience pool D;
12) End
13) Every T1 training iterations (counted by Train), select a mini-batch of quadruples from D to train the evaluation network and update its parameters; every T2 (= 4T1) training iterations, copy the evaluation network parameters to the target network;
14) End
15) Save the trained deep neural network.
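The experience pool and the update schedule of step 13 can be sketched as follows. This is a minimal illustration, not the training code of the invention: the class and function names, the pool capacity, the batch size and the value T1 = 4 are assumptions, the neural networks are omitted, and a plain counter stands in for the training counter.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity pool D of (s_t, a_t, r_t, s_next) quadruples."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)  # oldest entries are dropped first

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))


def update_flags(step, t1):
    """Return (train_now, sync_now) for a 1-based counter."""
    t2 = 4 * t1  # the target network is synchronised every T2 = 4*T1
    return step % t1 == 0, step % t2 == 0


pool = ExperiencePool(capacity=1000)
events = []
for step in range(1, 17):
    # Dummy transition (hypothetical values) in place of a real swarm iteration.
    pool.store(s=[0.0] * 5, a=step % 10 + 1, r=1, s_next=[0.0] * 5)
    train_now, sync_now = update_flags(step, t1=4)
    if train_now:
        batch = pool.sample(batch_size=8)  # would be fed to the evaluation network
        events.append((step, len(batch), sync_now))
```

Copying the parameters less often than they are trained keeps the learning target fixed for several updates, which is the purpose of the separate target network.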
The neural network in deep Q learning is trained until a satisfactory result is obtained (the reward value is positive most of the time). The current iteration number, the average objective function values and the average constraint violation value are then input into the trained deep neural network to select the optimal contraction-expansion coefficient in each iteration of the multi-target quantum particle swarm algorithm. The main loop flow chart of the improved multi-target quantum particle swarm algorithm is shown in figure 3.
S5. Select the final scheduling solution from the Pareto solution set, as follows:
(1) calculating weights
Figure BDA0003412853560000141
(2) Calculating a normalized matrix vij = ωi (fi+ − fij) / (fi+ − fi−),
(3) Calculating a positive ideal solution and a negative ideal solution from the normalized matrix:
Figure BDA0003412853560000142
(4) Calculating the Mahalanobis distance from each solution to the positive ideal solution and the negative ideal solution:
Figure BDA0003412853560000143
(5) calculating an optimal distance ratio for each solution
Figure BDA0003412853560000144
(6) Selecting an optimal distance ratio RjThe maximum solution is used as a final scheduling solution;
where fij represents the ith objective function value of the jth Pareto solution,
Figure BDA0003412853560000151
j represents the number of pareto solutions.
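The selection procedure can be sketched in Python under several stated assumptions. The weights are passed in rather than computed, because the weight formula of step (1) is not reproduced here; fi+ and fi− are assumed to be the largest and smallest values of objective i over the Pareto set; the distance ratio is assumed to be the distance to the negative ideal solution divided by the sum of both distances; and the Euclidean distance is used as a stand-in for the Mahalanobis distance of step (4) to keep the example dependency-free. The function name and toy data are hypothetical.

```python
import math


def select_solution(F, weights):
    """F[j][i]: value of objective i (to be minimised) for Pareto solution j."""
    n_obj = len(weights)
    f_plus = [max(row[i] for row in F) for i in range(n_obj)]   # assumed f_i^+
    f_minus = [min(row[i] for row in F) for i in range(n_obj)]  # assumed f_i^-
    V = []
    for row in F:
        v = []
        for i in range(n_obj):
            span = f_plus[i] - f_minus[i]
            # v_ij = w_i (f_i^+ - f_ij) / (f_i^+ - f_i^-); constant objectives score 0
            v.append(weights[i] * (f_plus[i] - row[i]) / span if span else 0.0)
        V.append(v)
    ideal_pos = [max(v[i] for v in V) for i in range(n_obj)]
    ideal_neg = [min(v[i] for v in V) for i in range(n_obj)]
    ratios = []
    for v in V:
        d_pos = math.dist(v, ideal_pos)  # Euclidean stand-in for Mahalanobis
        d_neg = math.dist(v, ideal_neg)
        total = d_pos + d_neg
        ratios.append(d_neg / total if total else 0.0)
    best = max(range(len(F)), key=lambda j: ratios[j])
    return best, ratios


# Toy Pareto set (hypothetical) with three objectives and equal weights.
F = [[1.0, 1.0, 5.0], [2.0, 2.0, 2.0], [5.0, 5.0, 1.0]]
best, ratios = select_solution(F, [1 / 3, 1 / 3, 1 / 3])
```

A Mahalanobis version would replace `math.dist` with a distance that uses the inverse covariance matrix of the normalized objective values, so that correlated objectives are not counted twice.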
Examples
In this embodiment, the power system has 118 nodes and 32 units, 10 of which need to be overhauled. The scheduling period (time interval) is set to one day, and maintenance scheduling and economic dispatch are carried out over 30 days. The population size and the maximum number of iterations of the improved multi-target quantum particle swarm algorithm are both set to 100. The unit parameters used in the power system are shown in table 1, the maintenance time of each unit is shown in table 2, and the load demand in the power system is shown in figure 3.
TABLE 1
Figure BDA0003412853560000152
Figure BDA0003412853560000161
TABLE 2
Figure BDA0003412853560000162
To compare the improved multi-target quantum particle swarm algorithm with the traditional multi-target quantum particle swarm algorithm, the Hypervolume index is introduced. The Hypervolume index, first proposed by Zitzler et al., is the volume of the region of the objective space enclosed by the individuals of a Pareto solution set and a reference point. It gives a direct measure of the quality of the Pareto solution set obtained by an algorithm: if one solution set S is better than another solution set S', the Hypervolume index of S is also greater than that of S'. The Hypervolume indexes of the two algorithms are plotted against the number of generations of the particle swarm in figure 4. The hollow dots are the Hypervolume index obtained by the improved multi-target quantum particle swarm algorithm, and the solid dots are the Hypervolume index obtained by the traditional multi-target quantum particle swarm algorithm.
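To make the Hypervolume index concrete, the following sketch computes it for the two-objective case, where it reduces to an area. The embodiment has three objectives, so this is a simplified illustration only; the function name, the reference point and the sample points are hypothetical, and both objectives are assumed to be minimized.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimised)."""
    # Keep only points that are strictly better than the reference point.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:        # sweep in order of increasing first objective
        if f2 < best_f2:      # point adds a new strip not covered so far
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
hv_front = hypervolume_2d(front, ref=(4.0, 4.0))
hv_single = hypervolume_2d([(2.0, 2.0)], ref=(4.0, 4.0))
```

In this toy case the three-point front covers a larger area than the single point, which is consistent with the statement that a better solution set has a greater Hypervolume index.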
The strengths and weaknesses of the multi-target unit maintenance double-layer optimization model are analyzed on the basis of the final scheduling solution, by comparing the single-target joint optimization model of unit combination and unit maintenance proposed in other literature with the multi-target unit maintenance double-layer optimization model considering unit combination proposed by the invention. The three objective function values of the final scheduling solutions calculated by the two models are given in table 3, where the objective function F1 corresponds to minimizing the total cost, F2 to optimizing the reliability of the power system, and F3 to minimizing the fluctuation of node electricity prices.
The results show that, although the total cost of the power system obtained by the proposed model is slightly higher than that of the single-target optimization model, its reliability objective value and its node electricity price fluctuation objective value are much better than those of the single-target optimization model. The double-layer optimization model provided by the invention therefore gives up some economic benefit but effectively improves the reliability of the power system and reduces node electricity price fluctuation, and so has practical application value.
TABLE 3
Figure BDA0003412853560000171
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A multi-target unit maintenance double-layer optimization method considering unit combination is characterized by comprising the following steps:
s1, acquiring unit power generation cost data, unit maintenance data and daily power load data in a power system to be optimized;
s2, substituting the data into the multi-target unit maintenance double-layer optimization model considering the unit combination to obtain the multi-target unit maintenance double-layer optimization model considering the unit combination corresponding to the power system to be optimized;
s3, solving a multi-target unit maintenance double-layer optimization model considering the unit combination corresponding to the electric power system to be optimized to obtain the optimal unit combination, unit maintenance variable, unit output and node electricity price of the electric power system to be optimized;
the multi-target unit maintenance double-layer optimization model considering unit combination comprises the following steps: an upper layer model and a lower layer model;
the decision variables of the upper layer model are the unit combination, unit maintenance variables, unit output and node electricity price, and the upper layer model comprises three objectives: minimizing the total cost of the power system, optimizing the reliability of the power system and minimizing the fluctuation of node electricity prices; and two constraints: the minimum reserve constraint and the constraint on the maximum number of units under maintenance;
the decision variable of the lower layer model is the unit output, the objective of the lower layer model is to minimize the power generation cost of the power system, and the lower layer model comprises four constraints: the power balance constraint, the ramping constraint, the upper and lower unit output limit constraint and the line power flow constraint.
2. The method of claim 1, wherein each data is encoded by:
Figure FDA0003412853550000011
Figure FDA0003412853550000012
yi(t+1)=max{vi(t+1)-vi(t),0}
wherein xi denotes the scheduling period in which the maintenance of the ith unit begins, Mi(t) = 1 indicates that the ith unit is under maintenance in the t-th scheduling period, Mi(t) = 0 indicates that the ith unit is not under maintenance in the t-th scheduling period, di denotes the number of consecutive scheduling periods that the maintenance of the ith unit lasts, ∨ denotes the union (OR) operation,
Figure FDA0003412853550000021
indicating the output of the ith unit in the t scheduling period,
Figure FDA0003412853550000022
the maximum output of the ith unit is shown,
Figure FDA0003412853550000023
and the minimum output of the ith unit is shown.
3. The method of claim 2, wherein the objective function of the upper layer model comprises:
Minimize:
Figure FDA0003412853550000024
Figure FDA0003412853550000025
Minimize: F2 = std(I(t)), t = 1, 2, 3, ..., Nt
Figure FDA0003412853550000026
Figure FDA0003412853550000027
the constraints of the upper layer model include:
Figure FDA0003412853550000028
Figure FDA0003412853550000029
wherein F1, F2, F3 respectively represent the total cost of the power system, the reliability of the power system and the fluctuation of node electricity prices, G represents the set of all units, Nt represents the number of scheduling periods,
Figure FDA00034128535500000213
represents the power generation cost of the ith unit, T (t) represents the number of hours of a scheduling period,
Figure FDA00034128535500000210
represents the start-up cost of the ith unit, Gm represents the set of units that need to be overhauled,
Figure FDA00034128535500000211
represents the maintenance cost of the ith unit, C0i, C1i, C2i represent the power generation cost coefficients of the ith unit, std(I(t)) represents the standard deviation of I(t), I(t) is the reliability index of the power system, PD(t) represents the power demand in the t-th scheduling period, Nj represents the number of nodes, Sj(t) represents the node electricity price of the jth node in the t-th scheduling period,
Figure FDA00034128535500000212
represents the average node electricity price of the jth node over all scheduling periods, RMin(t) represents the minimum net reserve capacity in the t-th scheduling period, and K(t) represents the maximum number of units that can be overhauled simultaneously in the t-th scheduling period.
4. The method of claim 2, wherein the objective function of the underlying model is:
Figure FDA0003412853550000031
the constraints of the underlying model include:
Figure FDA0003412853550000032
Figure FDA0003412853550000033
Figure FDA0003412853550000034
Figure FDA0003412853550000035
wherein F represents the power generation cost of the power system, G represents the set of all units, Nt represents the number of scheduling periods,
Figure FDA00034128535500000310
represents the power generation cost of the ith unit, T (t) represents the number of hours of a scheduling period, PD(t) represents a required power amount of the t-th scheduling period,
Figure FDA0003412853550000036
represents the maximum ramp-down rate of the ith unit,
Figure FDA0003412853550000037
represents the maximum ramp-up rate of the ith unit,
Figure FDA0003412853550000038
represents the lower power flow limit of the l-th line,
Figure FDA0003412853550000039
represents the upper power flow limit of the l-th line, Dj(t) represents the demand of the jth node in the t-th scheduling period, Gl-i represents the power transfer distribution factor of line l to the node where unit i is located, Gl-j represents the power transfer distribution factor of line l to node j, and Nj represents the number of nodes.
5. The method of claim 1, wherein step S3 includes:
s31, taking the unit combination and the unit overhaul variables as positions of particles, and solving an upper layer model in a unit overhaul double-layer optimization scheduling model by adopting an improved multi-target quantum particle swarm algorithm to obtain the unit combination and the unit overhaul variables;
s32, substituting the unit combination and the unit maintenance variables into the lower-layer model, solving by using a Gurobi solver to obtain the unit output, and further calculating the node electricity price;
s33, the output of the unit and the node electricity price are substituted back to an upper layer model, a pareto optimal solution set is solved by using an improved multi-target quantum particle swarm algorithm, and each solution in the solution set consists of a unit combination, a unit maintenance variable, the output of the unit and the node electricity price;
s34, selecting a final scheduling solution from the pareto optimal solution set;
the improved multi-target quantum particle swarm algorithm adopts a deep Q learning method to select the optimal contraction-expansion coefficient required by each generation of evolution in the multi-target quantum particle swarm algorithm.
6. The method of claim 5, wherein in the deep Q learning method, the state space S has 5 dimensions, the first 3 dimensions being the average objective function values of the quantum particle swarm during the iteration, the 4th dimension being the average constraint violation value of the quantum particle swarm during the iteration, and the 5th dimension being the iteration number; the action space A has 10 dimensions, the range of the contraction-expansion coefficient in the algorithm being divided into 10 equal sub-ranges, each of which represents one action in deep Q learning, and a random number within the sub-range corresponding to the action selected by the agent being used as the contraction-expansion coefficient in the particle swarm iteration; the reward value R of the agent is 1 if the global optimal solution of the quantum particle swarm changes in the current iteration, and -1 otherwise; and the evaluation network is a deep neural network.
7. The method of claim 5, wherein step S34 includes:
s341, calculating the weight
Figure FDA0003412853550000041
S342, calculating a standardized matrix vij = ωi (fi+ − fij) / (fi+ − fi−),
S343, calculating a positive ideal solution and a negative ideal solution according to the standardized matrix:
Figure FDA0003412853550000042
s344, calculating the Mahalanobis distance from each solution to the positive ideal solution and the negative ideal solution:
Figure FDA0003412853550000043
s345, calculating the optimal distance ratio of each solution
Figure FDA0003412853550000044
S346, selecting an optimal distance ratio RjThe maximum solution is used as a final scheduling solution;
wherein fij represents the ith objective function value of the jth Pareto solution,
Figure FDA0003412853550000045
j represents the number of pareto solutions.
8. A multi-objective unit overhaul double-layer optimization system considering unit combination is characterized by comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is used for reading executable instructions stored in the computer readable storage medium and executing the multi-target unit overhaul double-layer optimization method considering unit combination according to any one of claims 1 to 7.
CN202111536810.5A 2021-12-15 2021-12-15 Multi-target unit maintenance double-layer optimization method and system considering unit combination Pending CN114239372A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111536810.5A CN114239372A (en) 2021-12-15 2021-12-15 Multi-target unit maintenance double-layer optimization method and system considering unit combination

Publications (1)

Publication Number Publication Date
CN114239372A true CN114239372A (en) 2022-03-25

Family

ID=80756518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111536810.5A Pending CN114239372A (en) 2021-12-15 2021-12-15 Multi-target unit maintenance double-layer optimization method and system considering unit combination

Country Status (1)

Country Link
CN (1) CN114239372A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116451876A (en) * 2023-06-15 2023-07-18 国网江西省电力有限公司信息通信分公司 Power distribution network fault prediction and active overhaul system based on artificial intelligence
CN116451876B (en) * 2023-06-15 2023-09-22 国网江西省电力有限公司信息通信分公司 Power distribution network fault prediction and active overhaul system based on artificial intelligence
CN117077368A (en) * 2023-07-07 2023-11-17 华中科技大学 Comprehensive energy system crowd target planning method considering industrial comprehensive demand response
CN117077368B (en) * 2023-07-07 2024-02-06 华中科技大学 Comprehensive energy system crowd target planning method considering industrial comprehensive demand response

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination