CN110635476B - Knowledge migration-based cross-regional interconnected power grid dynamic scheduling rapid optimization method - Google Patents

Publication number
CN110635476B
Authority
CN
China
Prior art keywords
power
region
time
knowledge
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910932990.5A
Other languages
Chinese (zh)
Other versions
CN110635476A (en)
Inventor
唐昊
金国平
吕凯
王珂
王刚
杨胜春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN201910932990.5A
Publication of CN110635476A
Application granted
Publication of CN110635476B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/04 Circuit arrangements for ac mains or ac distribution networks for connecting networks of the same frequency but supplied from different sources
    • H02J3/06 Controlling transfer of power between connected networks; Controlling sharing of load between connected networks
    • H02J3/38 Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/46 Controlling of the sharing of output between the generators, converters, or transformers

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a knowledge migration-based method for rapidly optimizing the dynamic scheduling of a cross-regional interconnected power grid. First, in a source task pre-learning stage, the optimal knowledge matrix obtained by optimizing each source task is stored in a knowledge base as experience knowledge. Then, in a target task learning stage, the source task with the highest similarity to the target task is identified, and its optimal knowledge matrix is migrated to serve as the initial knowledge matrix of the target task, so that the target task is optimized quickly. Finally, the optimal knowledge matrix of the target task is itself stored in the knowledge base as experience knowledge. Under the resulting strategy, the scheduling mechanism can select a reasonable action according to the actual operating state of the power grid at each scheduling moment, thereby realizing dynamic scheduling of the cross-regional interconnected power grid. The hierarchical learning and knowledge migration mechanisms of the invention mitigate, to a certain extent, the curse of dimensionality in reinforcement learning, accelerate the convergence of the algorithm, and speed up the solution of the scheduling strategy.

Description

Knowledge migration-based cross-regional interconnected power grid dynamic scheduling rapid optimization method
Technical Field
The invention belongs to the field of cross-region interconnected power grid dispatching, and particularly relates to a knowledge migration-based cross-region interconnected power grid dynamic dispatching rapid optimization method.
Background
Cross-regional power grid interconnection is an important means of optimally allocating resources nationwide and improving their utilization efficiency. Building cross-provincial and cross-regional interconnected power grids brings out the benefits of a large power grid, such as mutual balancing of surplus and shortage, optimal resource allocation, reserve sharing, and emergency support, and can greatly increase the level of new energy consumption.
There has been little research on jointly optimizing the inter-regional tie lines and the generating units within each region of a cross-regional interconnected power grid. Although some studies apply reinforcement learning to solve the tie-line transmission plan of such a grid, they do not address the curse of dimensionality that arises as the problem scale keeps growing. In addition, traditional reinforcement learning methods treat different learning tasks as independent of each other and require each task to be modeled and solved from scratch, whereas in fact different learning tasks are often related.
Disclosure of Invention
To address these shortcomings of the prior art, the invention provides a knowledge migration-based method for rapidly optimizing the dynamic scheduling of a cross-regional interconnected power grid. A hierarchical Q-learning algorithm decomposes the large knowledge matrix required by the original cooperative scheduling into several smaller knowledge matrices, which reduces the number of state-action pairs and mitigates the curse of dimensionality to a certain extent. In addition, by exploring the relationship between tasks, the invention provides a method for measuring the similarity between a source scheduling task and a target scheduling task and, on that basis, a knowledge migration mechanism, so that past learning experience can be used to accelerate learning of the target task, speeding up convergence of the algorithm and reducing the learning cost.
The invention adopts the following technical scheme to solve the technical problem:
A knowledge migration-based method for rapidly optimizing the dynamic scheduling of a cross-regional interconnected power grid is carried out according to the following steps:
Step 1: in a multi-region interconnected power grid connected by direct-current tie lines, assume that the predicted wind power output of region z at any time t in the dispatching day is
Figure GDA0002792886730000011
the predicted photovoltaic output power is
Figure GDA0002792886730000012
and the predicted total load demand is
Figure GDA0002792886730000021
Step 2: determining, for the actual wind power output of region z
Figure GDA0002792886730000022
at time t relative to its predicted value
Figure GDA0002792886730000023
the state level of the output deviation power
Figure GDA0002792886730000024
Step 3: determining, for the actual photovoltaic output of region z
Figure GDA0002792886730000025
at time t relative to its predicted value
Figure GDA0002792886730000026
the state level of the output deviation power
Figure GDA0002792886730000027
Step 4: determining, for the actual load demand power P_l^{z,t} of region z at time t before direct load control (DLC) is implemented, relative to its predicted value
Figure GDA0002792886730000028
the state level of the load demand deviation power
Figure GDA0002792886730000029
and, when DLC is implemented, the load curtailment level within decision period k
Figure GDA00027928867300000210
The load demand power of region z at any time t within decision period k after DLC is implemented,
Figure GDA00027928867300000211
can be characterized by formula (1):
Figure GDA00027928867300000212
where
Figure GDA00027928867300000213
is the DLC load curtailment power of region z in decision period k, and
Figure GDA00027928867300000214
is the DLC load rebound power of region z in decision period k;
Step 5: determining the power adjustment levels of the class I and class II thermal generating units in region z at time t,
Figure GDA00027928867300000215
Figure GDA00027928867300000216
and their real-time generation power levels
Figure GDA00027928867300000217
the real-time generation power of the class III thermal generating units being obtained from the regional power balance equation;
Step 6: determining the power adjustment level of inter-regional tie line l of the cross-regional interconnected power grid at time t
Figure GDA00027928867300000218
and its transmission power level
Figure GDA00027928867300000219
Step 7: determining the upper-layer and lower-layer states and actions of the cross-regional interconnected power grid system at decision time t_k. The upper-layer state can be characterized by formula (2):
Figure GDA00027928867300000220
where
Figure GDA00027928867300000221
is the state information of region z at decision time t_k,
Figure GDA00027928867300000222
Figure GDA00027928867300000223
is the power level of the direct-current tie line at decision time t_k; Z is the total number of regions in the cross-regional interconnected power grid system, and L is the total number of inter-regional tie lines. The upper-layer action can be characterized by formula (3):
Figure GDA0002792886730000031
The lower-layer state can be characterized by formula (4):
Figure GDA0002792886730000032
where L_z is the total number of tie lines connected to region z in the cross-regional interconnected power grid system, and
Figure GDA0002792886730000033
is the transmission power level of the direct-current tie line l_z connected to region z. The lower-layer action can be characterized by formula (5):
Figure GDA0002792886730000034
Step 8: determining the total costs incurred by the upper layer and the lower layer of the cross-regional interconnected power grid system in decision period k
Figure GDA0002792886730000035
and determining the optimization objectives of the upper and lower layers of the system;
Step 9: pre-learning the source task with a hierarchical Q-learning algorithm, specifically:
Step 9.1: initializing the upper-layer knowledge matrix Q_up of the cross-regional interconnected power grid system and the lower-layer knowledge matrix of each region
Figure GDA0002792886730000036
Step 9.2: initializing the system model parameters and learning parameters;
Step 9.3: initializing the current learning step m = 0 and the current decision period k = 0;
Step 9.4: determining the upper-layer state of the system at the current decision time
Figure GDA0002792886730000037
Step 9.5: the upper layer selects, according to Q_up and a greedy strategy, the action at decision time t_k
Figure GDA0002792886730000038
Step 9.6: lower-layer region z receives the upper-layer action
Figure GDA0002792886730000039
and determines its state at decision time t_k
Figure GDA00027928867300000310
Step 9.7: lower-layer region z selects, according to
Figure GDA00027928867300000311
and a greedy strategy, the action at decision time t_k
Figure GDA00027928867300000312
Step 9.8: calculating the cost of lower-layer region z in decision period k
Figure GDA00027928867300000313
and updating the knowledge matrix of lower-layer region z
Figure GDA00027928867300000314
Step 9.9: feeding the cost of each lower-layer region in decision period k
Figure GDA00027928867300000315
back to the upper layer, calculating the total upper-layer cost
Figure GDA00027928867300000316
and updating the upper-layer knowledge matrix Q_up;
Step 9.10: let k := k + 1; if k is less than the total number of decision periods K, return to step 9.4; otherwise, let k := 0;
Step 9.11: let m := m + 1; if m is less than the total number of learning steps M, update the learning rate and return to step 9.4; otherwise, end the procedure and store the source-load power prediction information of the source task, together with the optimal knowledge matrices obtained in step 9, in the knowledge base as experience knowledge;
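The loop of steps 9.3 to 9.11 can be sketched as follows. This is a minimal toy illustration under stated assumptions, not the patented implementation: the two-region environment, the quadratic cost, the ε-greedy reading of the "greedy strategy", the learning-rate decay, and the names `choose`, `update`, and `train` are all hypothetical. Because the next state is only known one period later, each table update is deferred until that state has been observed.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1, 2]  # hypothetical discrete levels shared by both layers


def choose(q, state, eps, rng):
    """Epsilon-greedy choice; costs are minimised, so 'greedy' means lowest Q."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return min(ACTIONS, key=lambda a: q[(state, a)])


def update(q, state, action, cost, next_state, alpha, gamma):
    """Tabular Q-learning update for a cost-minimisation problem."""
    future = 0.0 if next_state is None else min(q[(next_state, b)] for b in ACTIONS)
    q[(state, action)] += alpha * (cost + gamma * future - q[(state, action)])


def train(demand, M=200, alpha0=0.5, gamma=0.9, eps=0.2, seed=0):
    """demand[z][k]: toy demand level of region z in decision period k."""
    rng = random.Random(seed)
    Z, K = len(demand), len(demand[0])
    q_up = defaultdict(float)                        # step 9.1
    q_low = [defaultdict(float) for _ in range(Z)]
    for m in range(M):                               # learning steps (9.11)
        alpha = alpha0 / (1.0 + m / 50.0)            # assumed learning-rate decay
        prev_up, prev_low = None, [None] * Z
        for k in range(K):                           # decision periods (9.10)
            s_up = k                                 # 9.4: toy upper-layer state
            if prev_up is not None:
                update(q_up, *prev_up, s_up, alpha, gamma)
            a_up = choose(q_up, s_up, eps, rng)      # 9.5
            total = 0.0
            for z in range(Z):
                s_low = (k, a_up)                    # 9.6: depends on upper action
                if prev_low[z] is not None:
                    update(q_low[z], *prev_low[z], s_low, alpha, gamma)
                a_low = choose(q_low[z], s_low, eps, rng)   # 9.7
                imported = a_up if z == 0 else -a_up        # toy tie-line flow
                cost = float((demand[z][k] - a_low - imported) ** 2)  # 9.8
                prev_low[z] = (s_low, a_low, cost)
                total += cost
            prev_up = (s_up, a_up, total)            # 9.9: summed lower-layer costs
        update(q_up, *prev_up, None, alpha, gamma)   # end of the scheduling day
        for z in range(Z):
            update(q_low[z], *prev_low[z], None, alpha, gamma)
    return q_up, q_low


q_up, q_low = train([[1, 2, 1], [1, 0, 1]], M=50)
```

The point of the layering is visible in the table shapes: the upper table is indexed only by tie-line actions, and each region keeps its own small table instead of sharing one joint table over all regional actions.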
Step 10: rapidly optimizing the target task with a knowledge migration-based hierarchical Q-learning algorithm:
Step 10.1: defining the predicted net load power as the similarity element, calculating the similarity distance between the predicted net load power of the target task and that of each source task in every region, and measuring the similarity between the target task and the source task by this distance;
Step 10.2: migrating from the source task with the minimum similarity distance to the target task in order to initialize each knowledge matrix of the target task, so that the upper-layer knowledge matrix Q_up of the target task and the knowledge matrix of lower-layer region z
Figure GDA0002792886730000041
can be characterized by formulas (6) and (7):
Figure GDA0002792886730000042
Figure GDA0002792886730000043
where
Figure GDA0002792886730000044
are, respectively, the upper-layer optimal knowledge matrix and the optimal knowledge matrix of lower-layer region z of the source task with the minimum similarity distance;
Step 10.3: initializing the model parameters and learning parameters of the target task of cross-regional interconnected power grid scheduling optimization and rapidly optimizing the target task; the procedure is the same as steps 9.3 to 9.11 and is not repeated here.
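The migration in step 10.2 can be illustrated with a short sketch. It assumes that formulas (6) and (7) simply set the target task's initial matrices equal to the stored optimal matrices of the nearest source task; the knowledge-base layout (a dictionary with `q_up` and `q_low` entries) and the function name are hypothetical.

```python
import copy


def init_from_nearest_source(knowledge_base, distances):
    """Pick the source task with the smallest similarity distance and copy its
    stored knowledge matrices as the target task's initial matrices."""
    nearest = min(distances, key=distances.get)
    entry = knowledge_base[nearest]
    q_up = copy.deepcopy(entry["q_up"])
    q_low = copy.deepcopy(entry["q_low"])   # one matrix per region
    return nearest, q_up, q_low


# Hypothetical knowledge base with two pre-learned source tasks.
kb = {
    "source_A": {"q_up": [[1.0, 2.0]], "q_low": [[[0.5]], [[0.7]]]},
    "source_B": {"q_up": [[3.0, 4.0]], "q_low": [[[0.1]], [[0.2]]]},
}
nearest, q_up, q_low = init_from_nearest_source(
    kb, {"source_A": 4.2, "source_B": 1.3}
)
```

Deep copies are used so that further learning on the target task does not overwrite the experience stored for the source task.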
The knowledge migration-based method for rapidly optimizing the dynamic scheduling of a cross-regional interconnected power grid is further characterized in that, in step 10.1, the similarity distance between the source task and the target task is calculated according to the following steps:
Step 1: using the Euclidean distance to reflect the differences between the specific values of the time series:
In power grid operation, to account for the uncertainty of the load within a region and the intermittency and randomness of new energy generation, the concept of net load is introduced: intermittent new energy generation is treated as negative load, so the net load of a region is its total load minus its total new energy generation output. In source task ψ and target task φ, the predicted net load power of region z at time t can be expressed by formulas (8) and (9), respectively:
Figure GDA0002792886730000045
Figure GDA0002792886730000046
where
Figure GDA0002792886730000047
are the predicted net load power of region z at time t in source task ψ and target task φ, respectively;
Figure GDA0002792886730000048
are, respectively, the predicted load demand power, predicted wind power, and predicted photovoltaic power of region z at time t in source task ψ;
Figure GDA0002792886730000049
are, respectively, the predicted load demand power, predicted wind power, and predicted photovoltaic power of region z at time t in target task φ;
Respectively sampling
Figure GDA0002792886730000051
and assuming a time series length of N_s, so that the sampling interval is Δt = T/N_s, two time series
Figure GDA0002792886730000052
are obtained, characterized by formulas (10) and (11), respectively:
Figure GDA0002792886730000053
Figure GDA0002792886730000054
For the time series
Figure GDA0002792886730000055
and
Figure GDA0002792886730000056
the Euclidean distance between them can be characterized by formula (12):
Figure GDA0002792886730000057
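As a small worked illustration of the net load definition and the Euclidean distance between two sampled net load series, consider the sketch below. The function names and the sample values are hypothetical, and the exact form of formulas (8) to (12) is not reproduced here.

```python
import math


def net_load(load, wind, pv):
    """Predicted net load: total load minus wind and photovoltaic output."""
    return [l - w - p for l, w, p in zip(load, wind, pv)]


def euclidean_distance(x, y):
    """Euclidean distance between two equally long sampled series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Three samples of one region; only the load forecast differs between tasks.
source = net_load([100.0, 120.0, 110.0], [20.0, 30.0, 10.0], [0.0, 10.0, 20.0])
target = net_load([103.0, 124.0, 110.0], [20.0, 30.0, 10.0], [0.0, 10.0, 20.0])
d_e = euclidean_distance(source, target)   # sqrt(3**2 + 4**2 + 0**2)
```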
Step 2: using the dynamic time warping distance to reflect the trend and fluctuation information of the time series:
With sampling interval Δt, the net load power derivative functions of region z in source task ψ and target task φ
Figure GDA0002792886730000058
are each sampled to obtain two time series
Figure GDA0002792886730000059
which can be characterized by formulas (13) and (14), respectively:
Figure GDA00027928867300000510
Figure GDA00027928867300000511
An N_s × N_s matrix Γ is constructed, whose elements are characterized by formula (15):
Figure GDA00027928867300000512
A set of adjacent elements in matrix Γ is called a warping path, denoted H = {h_1, …, h_s, …, h_m}, where m is the total number of elements in the path and element h_s is the coordinates of the s-th point on the path. The objective of the dynamic time warping algorithm is to find an optimal warping path such that, for the sequences
Figure GDA00027928867300000513
and
Figure GDA00027928867300000514
the total warping cost is minimized, which can be characterized by formula (16):
Figure GDA00027928867300000515
where
Figure GDA00027928867300000516
is the minimum total warping cost; that is, for the time series
Figure GDA00027928867300000517
and
Figure GDA00027928867300000518
it is the dynamic time warping distance between them;
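The dynamic time warping distance can be computed with the standard dynamic-programming recursion, sketched below. Two points are assumptions rather than content of the source: the element of matrix Γ is taken to be the absolute difference of two samples, and the derivative is approximated by a forward finite difference (which yields one sample fewer than the input). The function names are hypothetical.

```python
def derivative(series, dt):
    """Forward finite-difference approximation of the derivative."""
    return [(b - a) / dt for a, b in zip(series, series[1:])]


def dtw_distance(x, y):
    """Minimum total warping cost between sequences x and y."""
    n, m = len(x), len(y)
    inf = float("inf")
    # acc[i][j]: cheapest cost of aligning x[:i] with y[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])      # assumed element of the matrix
            acc[i][j] = local + min(acc[i - 1][j],      # step in x only
                                    acc[i][j - 1],      # step in y only
                                    acc[i - 1][j - 1])  # step in both
    return acc[n][m]


# Two net load curves with the same ramp, shifted by one sample.
dx = derivative([80.0, 82.0, 86.0, 86.0], 1.0)
dy = derivative([80.0, 80.0, 82.0, 86.0], 1.0)
d_d = dtw_distance(dx, dy)
```

Unlike the Euclidean distance, the warping path may repeat samples, so sequences that differ only by a locally stretched segment can still have distance zero.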
Step 3: based on the Euclidean distance and the dynamic time warping distance between the predicted net load power of the target task and that of the source task in each region, calculating the similarity distance between the target task and the source task
Figure GDA00027928867300000519
which can be characterized by formula (17):
Figure GDA0002792886730000061
where λ_e and λ_d are the weighting coefficients of the Euclidean distance and the dynamic time warping distance, respectively.
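A possible reading of the combination in step 3 is sketched below. It assumes that formula (17) is a weighted sum of the two distances, accumulated over all regions; the exact form of the formula is not reproduced in the text, so this aggregation, the default weights, and the function name are assumptions.

```python
def similarity_distance(euclid_by_region, dtw_by_region, lam_e=0.5, lam_d=0.5):
    """Weighted combination of the two distances, summed over regions (assumed)."""
    if len(euclid_by_region) != len(dtw_by_region):
        raise ValueError("need one Euclidean and one DTW distance per region")
    return sum(lam_e * e + lam_d * d
               for e, d in zip(euclid_by_region, dtw_by_region))


# Two regions: Euclidean distances 2.0 and 4.0, DTW distances 1.0 and 3.0.
distance = similarity_distance([2.0, 4.0], [1.0, 3.0])
```

A larger λ_e emphasizes differences in magnitude, while a larger λ_d emphasizes differences in trend and fluctuation.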
For the cross-regional interconnected power grid scheduling problem, the invention applies a machine learning algorithm to power scheduling optimization, providing an intelligent solution for power scheduling and enabling economical and environmentally friendly grid operation. Compared with the prior art, the invention has the following beneficial effects:
1. For the cross-regional interconnected power grid scheduling problem, the invention considers randomness on both the source (generation) side and the load side, treats flexible load as a schedulable resource for collaborative optimization, and solves for the strategy with a Q-learning algorithm;
2. The invention adopts a hierarchical Q-learning algorithm, which reduces the scale of the knowledge matrices and mitigates the curse of dimensionality to a certain extent;
3. The invention adopts a knowledge migration mechanism that uses past learning experience to accelerate the learning and optimization of the target task, speeding up the convergence of the algorithm and reducing the learning cost.
Drawings
Fig. 1 is a schematic diagram of a cross-regional interconnected power grid system architecture according to the present invention;
Fig. 2 is a flowchart of the algorithm for solving the dynamic scheduling problem of the cross-regional interconnected power grid according to the present invention.
Detailed Description
The dynamic scheduling optimization method of this embodiment is applied to the cross-regional interconnected power grid system shown in Fig. 1, which comprises, in each region, conventional generator sets, photovoltaic generator sets, wind turbine sets, rigid loads, and flexible loads, together with the direct-current tie lines connecting the regions. At each decision time, the scheduling mechanism obtains the output and power demand of each unit of the cross-regional interconnected power grid through detection and communication equipment, and selects the optimal action according to the strategy obtained by the dynamic scheduling optimization method: adjusting the output power of the conventional generator sets, adjusting the transmission power of the direct-current tie lines, and curtailing flexible load demand, thereby improving the operating benefit of the cross-regional interconnected power grid system.
Referring to Fig. 2, the dynamic scheduling optimization method of this embodiment is carried out according to the following steps:
Step 1: in a multi-region interconnected power grid connected by direct-current tie lines, assume that the predicted wind power output of region z at any time t in the dispatching day is
Figure GDA0002792886730000062
the predicted photovoltaic output power is
Figure GDA0002792886730000063
and the predicted total load demand is
Figure GDA0002792886730000064
Step 2: for the actual wind power output of region z
Figure GDA0002792886730000065
the deviation at time t from its predicted value
Figure GDA0002792886730000066
is discretized into
Figure GDA0002792886730000067
giving a total of
Figure GDA0002792886730000068
state levels; the state level of the wind power output deviation power of region z at time t is
Figure GDA0002792886730000071
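The discretization of a deviation into state levels can be sketched as follows. The source does not state how the levels are spaced, so uniform bins over a symmetric range and clipping of out-of-range deviations are assumptions, as are the function and parameter names.

```python
def deviation_level(actual, predicted, max_dev, n_levels):
    """Map the deviation (actual - predicted) to a level index 0..n_levels-1.

    Assumes uniform bins over [-max_dev, +max_dev]; deviations outside the
    range are clipped to the outermost levels.
    """
    dev = max(-max_dev, min(max_dev, actual - predicted))
    width = 2.0 * max_dev / n_levels
    level = int((dev + max_dev) / width)
    return min(level, n_levels - 1)   # dev == +max_dev falls in the top level


# Five levels over a deviation range of +/- 10 (units as in the forecast).
mid = deviation_level(100.0, 100.0, 10.0, 5)    # no deviation
high = deviation_level(120.0, 100.0, 10.0, 5)   # clipped to the top level
low = deviation_level(85.0, 100.0, 10.0, 5)     # clipped to the bottom level
```

The same mapping applies in principle to the photovoltaic and load deviations and to the unit and tie-line ranges discretized in the following steps.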
Step 3: for the actual photovoltaic output of region z
Figure GDA0002792886730000072
the deviation at time t from its predicted value
Figure GDA0002792886730000073
is discretized into
Figure GDA0002792886730000074
giving a total of
Figure GDA0002792886730000075
state levels; the state level of the photovoltaic output deviation power of region z at time t is
Figure GDA0002792886730000076
Step 4: before direct load control (DLC) is implemented, for the actual load demand power P_l^{z,t} of region z at time t, the deviation from its predicted value
Figure GDA0002792886730000077
is discretized into
Figure GDA0002792886730000078
giving a total of
Figure GDA0002792886730000079
state levels; the state level of the load demand deviation power of region z at time t is
Figure GDA00027928867300000710
When DLC is implemented, the DLC load demand adjustment of region z in decision period k
Figure GDA00027928867300000711
is discretized into
Figure GDA00027928867300000712
giving a total of
Figure GDA00027928867300000713
state levels; the load curtailment level of region z in decision period k is
Figure GDA00027928867300000714
The load demand power of region z at any time t within decision period k after DLC is implemented,
Figure GDA00027928867300000715
can be characterized by formula (1):
Figure GDA00027928867300000716
where
Figure GDA00027928867300000717
is the DLC load curtailment power of region z in decision period k, and
Figure GDA00027928867300000718
is the DLC load rebound power of region z in decision period k;
Step 5: the power change interval of the class I thermal generating units in region z, within the ramp-rate constraint limits, is discretized into
Figure GDA00027928867300000719
giving a total of
Figure GDA00027928867300000720
state levels; the power adjustment level of the class I thermal generating units at time t is
Figure GDA00027928867300000721
The allowable output power range of the class I thermal generating units is discretized into
Figure GDA00027928867300000722
giving a total of
Figure GDA00027928867300000723
state levels; the generation power level of the class I thermal generating units at time t is
Figure GDA00027928867300000724
Similarly, the power change interval of the class II thermal generating units in region z, within the ramp-rate constraint limits, is discretized into
Figure GDA00027928867300000725
giving a total of
Figure GDA00027928867300000726
state levels; the power adjustment level of the class II thermal generating units at time t is
Figure GDA00027928867300000727
The allowable output power range of the class II thermal generating units is discretized into
Figure GDA00027928867300000728
giving a total of
Figure GDA00027928867300000729
state levels; the generation power level of the class II thermal generating units at time t is
Figure GDA00027928867300000730
The real-time generation power of the class III thermal generating units is obtained from the regional power balance equation;
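How the class III output follows from a regional power balance can be illustrated with a simple sketch. The source does not give the balance equation, so the form used here is an assumption: load after DLC equals actual load minus curtailed power plus rebound power, wind and photovoltaic output are fully absorbed, and tie-line power is counted as positive when imported. All names are hypothetical.

```python
def class3_power(load_actual, dlc_cut, dlc_rebound, wind, pv,
                 p_class1, p_class2, tie_import):
    """Class III output needed to close the regional power balance (assumed form)."""
    load_after_dlc = load_actual - dlc_cut + dlc_rebound
    return load_after_dlc - wind - pv - p_class1 - p_class2 - tie_import


# Toy numbers in MW: 500 load, 30 curtailed, 10 rebound, 80 wind, 40 PV,
# 150 and 100 from class I and II units, 50 imported over tie lines.
p3 = class3_power(500.0, 30.0, 10.0, 80.0, 40.0, 150.0, 100.0, 50.0)
```

In such a scheme a result outside the class III units' feasible range would indicate a balance violation, which is presumably what the power balance constraint cost in step 8 penalizes.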
Step 6: the power change interval of an inter-regional tie line of the cross-regional interconnected power grid within one period is discretized into
Figure GDA0002792886730000081
giving a total of
Figure GDA0002792886730000082
state levels; the power adjustment level of tie line l at time t is
Figure GDA0002792886730000083
The power range that the direct-current tie line is allowed to transmit is discretized into
Figure GDA0002792886730000084
giving a total of
Figure GDA0002792886730000085
state levels; the transmission power level of tie line l at time t is
Figure GDA0002792886730000086
Step 7: determining the upper-layer and lower-layer states and actions of the cross-regional interconnected power grid system at decision time t_k. The upper-layer state can be characterized by formula (2):
Figure GDA0002792886730000087
where
Figure GDA0002792886730000088
is the state information of region z at decision time t_k,
Figure GDA0002792886730000089
Figure GDA00027928867300000810
is the power level of the direct-current tie line at decision time t_k; Z is the total number of regions in the cross-regional interconnected power grid system, and L is the total number of inter-regional tie lines. The upper-layer action can be characterized by formula (3):
Figure GDA00027928867300000811
The lower-layer state can be characterized by formula (4):
Figure GDA00027928867300000812
where L_z is the total number of tie lines connected to region z in the cross-regional interconnected power grid system, and
Figure GDA00027928867300000813
is the transmission power level of the direct-current tie line l_z connected to region z. The lower-layer action can be characterized by formula (5):
Figure GDA00027928867300000814
Step 8: determining the total costs incurred by the upper layer and the lower layer of the cross-regional interconnected power grid system in decision period k
Figure GDA00027928867300000815
and determining the optimization objectives of the upper and lower layers of the system:
The total cost incurred by lower-layer region z of the system in decision period k,
Figure GDA00027928867300000816
comprises the DLC load compensation cost
Figure GDA00027928867300000817
the operating cost of the thermal generating units
Figure GDA00027928867300000818
the wind and photovoltaic curtailment cost
Figure GDA00027928867300000819
the peak-valley difference cost
Figure GDA00027928867300000820
and the power balance constraint cost
Figure GDA00027928867300000821
The optimization objective of lower-layer region z is to find, for a given tie-line transmission plan, an optimal strategy
Figure GDA00027928867300000822
that minimizes the daily operating cost of region z;
The cost incurred by the upper layer of the system in decision period k,
Figure GDA00027928867300000823
is the sum of the costs incurred by the lower-layer regions; its optimization objective is to find an optimal strategy
Figure GDA00027928867300000824
that minimizes the daily operating cost of the upper layer of the system.
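The cost structure of step 8 can be summarized in a brief sketch: each region's cost in a decision period is the sum of five components, and the upper-layer cost is the sum over regions. The class, field, and function names and the numbers are hypothetical; the individual cost formulas are not given in the text and are not modeled here.

```python
from dataclasses import dataclass, astuple


@dataclass
class RegionCost:
    """Cost components of one lower-layer region in one decision period."""
    dlc_compensation: float
    thermal_operation: float
    curtailment: float          # abandoned wind and photovoltaic output
    peak_valley: float
    balance_penalty: float      # power balance constraint cost

    def total(self) -> float:
        return sum(astuple(self))


def upper_layer_cost(region_costs):
    """Upper-layer cost: sum of the costs fed back by the lower-layer regions."""
    return sum(c.total() for c in region_costs)


costs = [RegionCost(1.0, 10.0, 0.5, 2.0, 0.0),
         RegionCost(0.0, 8.0, 1.5, 1.0, 3.0)]
total = upper_layer_cost(costs)
```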
Step 9: pre-learning the source task with a hierarchical Q-learning algorithm, specifically:
Step 9.1: initializing the upper-layer knowledge matrix Q_up of the cross-regional interconnected power grid system and the lower-layer knowledge matrix of each region
Figure GDA0002792886730000091
Step 9.2: initializing the system model parameters and learning parameters;
Step 9.3: initializing the current learning step m = 0 and the current decision period k = 0;
Step 9.4: determining the upper-layer state of the system at the current decision time
Figure GDA0002792886730000092
Step 9.5: the upper layer selects, according to Q_up and a greedy strategy, the action at decision time t_k
Figure GDA0002792886730000093
Step 9.6: lower-layer region z receives the upper-layer action
Figure GDA0002792886730000094
and determines its state at decision time t_k
Figure GDA0002792886730000095
Step 9.7: lower-layer region z selects, according to
Figure GDA0002792886730000096
and a greedy strategy, the action at decision time t_k
Figure GDA0002792886730000097
Step 9.8: calculating the cost of lower-layer region z in decision period k
Figure GDA0002792886730000098
and updating the knowledge matrix of lower-layer region z
Figure GDA0002792886730000099
Step 9.9: feeding the cost of each lower-layer region in decision period k
Figure GDA00027928867300000910
back to the upper layer, calculating the total upper-layer cost
Figure GDA00027928867300000911
and updating the upper-layer knowledge matrix Q_up;
Step 9.10: let k := k + 1; if k is less than the total number of decision periods K, return to step 9.4; otherwise, let k := 0;
Step 9.11: let m := m + 1; if m is less than the total number of learning steps M, update the learning rate and return to step 9.4; otherwise, end the procedure and store the source-load power prediction information of the source task, together with the optimal knowledge matrices obtained in step 9, in the knowledge base as experience knowledge;
Step 10: rapidly optimizing the target task with a knowledge migration-based hierarchical Q-learning algorithm:
Step 10.1: defining the predicted net load power as the similarity element, calculating the similarity distance between the predicted net load power of the target task and that of each source task in every region, and measuring the similarity between the target task and the source task by this distance;
Step 10.2: migrating from the source task with the minimum similarity distance to the target task in order to initialize each knowledge matrix of the target task, so that the upper-layer knowledge matrix Q_up of the target task and the knowledge matrix of lower-layer region z
Figure GDA00027928867300000912
can be characterized by formulas (6) and (7):
Figure GDA00027928867300000913
Figure GDA00027928867300000914
where
Figure GDA00027928867300000915
are, respectively, the upper-layer optimal knowledge matrix and the optimal knowledge matrix of lower-layer region z of the source task with the minimum similarity distance;
Step 10.3: initializing the model parameters and learning parameters of the target task of cross-regional interconnected power grid scheduling optimization and rapidly optimizing the target task; the procedure is the same as steps 9.3 to 9.11 and is not repeated here.
In a specific implementation, the calculation of the similarity distance in the step 10.1 is performed according to the following steps:
step 1, reflecting difference information between specific numerical values of a time sequence by using Euclidean distance:
in the operation of a power grid, in consideration of uncertainty of loads in a region and intermittent randomness of new energy power generation, a concept of net load is introduced, intermittent new power generation is considered as reversed load, namely the net load in the region is the total load minus the total new energy power generation output, and in a source task psi and a target task phi, the net load predicted power of a region z at a time t can be respectively represented as an equation (8) and an equation (9):
Figure GDA0002792886730000101
Figure GDA0002792886730000102
wherein
Figure GDA0002792886730000103
are, respectively, the predicted net load power of region z at time t in the source task ψ and in the target task φ;
Figure GDA0002792886730000104
are, respectively, the predicted load demand power, the predicted wind power, and the predicted photovoltaic power of region z at time t in the source task ψ;
Figure GDA0002792886730000105
are, respectively, the predicted load demand power, the predicted wind power, and the predicted photovoltaic power of region z at time t in the target task φ;
Each of
Figure GDA0002792886730000106
is sampled. Assuming the time-series length is N_s, the sampling interval is Δt = T/N_s, which yields two time series
Figure GDA0002792886730000107
characterized by formulas (10) and (11), respectively:
Figure GDA0002792886730000108
Figure GDA0002792886730000109
The Euclidean distance between the time series
Figure GDA00027928867300001010
and
Figure GDA00027928867300001011
can be characterized by equation (12):
Figure GDA00027928867300001012
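As an illustration of step 1, the following sketch computes the net load series as load minus wind minus photovoltaic forecasts, as described for equations (8) and (9), and then the distance between two sampled series. It assumes that equation (12) is the standard, unnormalized Euclidean distance; the function names and the sample values in MW are hypothetical.

```python
import math

def net_load(load, wind, pv):
    """Predicted net load: load forecast minus wind and PV forecasts."""
    return [l - w - p for l, w, p in zip(load, wind, pv)]

def euclidean_distance(a, b):
    """Standard Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length N_s")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical forecasts for one region, sampled at N_s = 4 points.
wind = [10.0, 15.0, 20.0, 10.0]
pv = [0.0, 5.0, 10.0, 0.0]
source = net_load([100.0, 120.0, 130.0, 110.0], wind, pv)
target = net_load([103.0, 124.0, 130.0, 110.0], wind, pv)

d_e = euclidean_distance(source, target)  # 5.0 for these sample values
```

The Euclidean distance compares the two series point by point, so it captures differences in magnitude but not differences in shape or timing, which is why step 2 adds a second measure.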
Step 2: use the dynamic time warping (DTW) distance to reflect the trend and fluctuation information of the time series:
The derivative functions of the net load power in region z for the source task ψ and the target task φ,
Figure GDA0002792886730000111
are each sampled at the sampling interval Δt to obtain two time series
Figure GDA0002792886730000112
which can be characterized by formulas (13) and (14), respectively:
Figure GDA0002792886730000113
Figure GDA0002792886730000114
An N_s × N_s matrix Γ is constructed, whose elements are characterized by equation (15):
Figure GDA0002792886730000115
A set of adjacent elements in matrix Γ is called a warping path, denoted H = {h_1, …, h_s, …, h_m}, where m is the total number of elements in the path and element h_s is the coordinates of the s-th point on the path. The objective of the dynamic time warping algorithm is to find an optimal warping path such that the total warping cost between the sequences
Figure GDA0002792886730000116
and
Figure GDA0002792886730000117
is minimized, which can be characterized by equation (16):
Figure GDA0002792886730000118
wherein
Figure GDA0002792886730000119
is the minimum total warping cost, i.e. the dynamic time warping distance between the time series
Figure GDA00027928867300001110
and
Figure GDA00027928867300001111
;
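Step 2 can be sketched with the classic dynamic-programming form of dynamic time warping. Two assumptions are made here because the corresponding formulas are not reproduced in the text: the derivative of the net load is approximated by a forward difference, and the local cost of equation (15) is taken to be the absolute difference between two samples. The function names are hypothetical.

```python
def finite_difference(series, dt):
    """Forward-difference approximation of the net load derivative."""
    return [(series[i + 1] - series[i]) / dt for i in range(len(series) - 1)]

def dtw_distance(a, b):
    """Minimum total warping cost between series a and b.

    cost[i][j] holds the cheapest cumulative cost of any warping path
    that aligns the first i samples of a with the first j samples of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Hypothetical net load series and their derivative series.
slope_source = finite_difference([90.0, 100.0, 100.0, 100.0], 1.0)
slope_target = finite_difference([93.0, 104.0, 100.0, 100.0], 1.0)
d_d = dtw_distance(slope_source, slope_target)
```

Because the path may repeat samples of either series, two curves with the same shape but shifted in time obtain a small distance, which the point-by-point Euclidean distance would not reflect.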
Step 3: based on the Euclidean distance and the dynamic time warping distance of the predicted net load power of the target task and the source task in each region, the similarity distance between the target task and the source task,
Figure GDA00027928867300001112
can be characterized by formula (17):
Figure GDA00027928867300001113
wherein λ_e and λ_d are the weighting coefficients of the Euclidean distance and the dynamic time warping distance, respectively.
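The combination in step 3 and the selection in step 10.2 can be sketched as follows. The exact form of formula (17) is not reproduced in the text, so this sketch assumes a weighted sum λ_e·D_e + λ_d·D_d accumulated over all regions; the default weights, task names, and distance values are hypothetical.

```python
def similarity_distance(region_distances, lambda_e=0.5, lambda_d=0.5):
    """Assumed form of the similarity distance between two tasks.

    region_distances: list of (euclidean, dtw) pairs, one per region.
    """
    return sum(lambda_e * d_e + lambda_d * d_d
               for d_e, d_d in region_distances)

def closest_source_task(distances_by_task, lambda_e=0.5, lambda_d=0.5):
    """Return the source task with the minimum similarity distance."""
    return min(
        distances_by_task,
        key=lambda name: similarity_distance(
            distances_by_task[name], lambda_e, lambda_d),
    )

# Hypothetical per-region (Euclidean, DTW) distances to the target task.
distances_by_task = {
    "source_day_A": [(5.0, 2.0), (3.0, 1.0)],
    "source_day_B": [(1.0, 4.0), (2.0, 2.0)],
}
best = closest_source_task(distances_by_task)
```

With equal weights the second task is selected, since its summed distance (4.5) is smaller than that of the first (5.5); changing λ_e and λ_d shifts the emphasis between magnitude differences and shape differences.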
The method can effectively cope with the randomness of new energy generation and load demand in a cross-regional interconnected power grid and ensures its safe and economic operation. Through the mechanisms of hierarchical learning and knowledge migration, it mitigates, to some extent, the curse of dimensionality in reinforcement learning, accelerates the convergence of the algorithm, and speeds up the solution of the scheduling strategy.

Claims (2)

1. A knowledge migration-based method for rapid optimization of dynamic scheduling of a cross-regional interconnected power grid, characterized by comprising the following steps:
Step 1: in a multi-region cross-regional interconnected power grid connected by direct-current tie lines, assume that at any time t in the dispatching day the predicted wind power output of region z is
Figure FDA0002792886720000011
the predicted photovoltaic output power is
Figure FDA0002792886720000012
and the predicted total load demand power is
Figure FDA0002792886720000013
Step 2: for the actual wind power output of region z
Figure FDA0002792886720000014
at time t, determine, relative to the predicted value
Figure FDA0002792886720000015
the state level of the output deviation power
Figure FDA0002792886720000016
Step 3: for the actual photovoltaic generation output of region z
Figure FDA0002792886720000017
at time t, determine, relative to the predicted value
Figure FDA0002792886720000018
the state level of the output deviation power
Figure FDA0002792886720000019
Step 4: for the actual load demand power P_l^{z,t} of region z at time t before DLC implementation, determine, relative to the predicted value
Figure FDA00027928867200000110
the state level of the load demand deviation power
Figure FDA00027928867200000111
and the load-shedding power level within decision period k when DLC is implemented
Figure FDA00027928867200000112
The load demand power of region z at any time t within decision period k after DLC is implemented,
Figure FDA00027928867200000113
can be characterized by formula (1):
Figure FDA00027928867200000114
wherein
Figure FDA00027928867200000115
is the DLC load-shedding power of region z within decision period k, and
Figure FDA00027928867200000116
is the DLC load rebound power of region z within decision period k;
Step 5: determine the power adjustment levels of the class I and class II thermal generating units in region z at time t,
Figure FDA00027928867200000117
Figure FDA00027928867200000118
and their real-time generating power levels
Figure FDA00027928867200000119
and obtain the real-time generating power of the class III thermal generating units from the regional power balance formula
Figure FDA00027928867200000120
Step 6: determine the power adjustment level of inter-regional tie line l of the cross-regional interconnected power grid at time t
Figure FDA00027928867200000121
and its transmission power level
Figure FDA00027928867200000122
Step 7: determine the upper-layer and lower-layer states and actions of the cross-regional interconnected power grid system at decision time t_k; the upper-layer state can be characterized by formula (2):
Figure FDA0002792886720000021
wherein
Figure FDA0002792886720000022
is the state information of region z at decision time t_k,
Figure FDA0002792886720000023
Figure FDA0002792886720000024
is the power level of the DC tie line at decision time t_k; Z is the total number of regions in the cross-regional interconnected power grid system, and L is the total number of inter-regional tie lines; the upper-layer action can be characterized by formula (3):
Figure FDA0002792886720000025
The lower-layer state can be characterized by formula (4):
Figure FDA0002792886720000026
wherein L_z is the total number of tie lines connected to region z in the cross-regional interconnected power grid system,
Figure FDA0002792886720000027
is the transmission power level of the DC tie line l_z connected to region z; the lower-layer action can be characterized by formula (5):
Figure FDA0002792886720000028
Step 8: determine the total costs incurred by the upper and lower layers of the cross-regional interconnected power grid system in decision period k,
Figure FDA0002792886720000029
and determine the optimization objectives of the upper and lower layers of the system;
Step 9: pre-learn the source task using a hierarchical Q-learning algorithm, with the following specific steps:
Step 9.1: initialize the upper-layer knowledge matrix Q_up of the cross-regional interconnected power grid system and the lower-layer knowledge matrix of each region
Figure FDA00027928867200000210
Step 9.2: initialize the system model parameters and learning parameters;
Step 9.3: initialize the current learning step m = 0 and the current decision period k = 0;
Step 9.4: determine the upper-layer state of the system at the current decision time
Figure FDA00027928867200000211
Step 9.5: the upper layer selects, according to Q_up and the greedy strategy, the action at decision time t_k
Figure FDA00027928867200000212
Step 9.6: lower-layer region z receives the upper-layer action
Figure FDA00027928867200000213
and determines its state at decision time t_k
Figure FDA00027928867200000214
Step 9.7: lower-layer region z selects, according to
Figure FDA00027928867200000215
and the greedy strategy, the action at decision time t_k
Figure FDA00027928867200000216
Step 9.8: calculate the cost of lower-layer region z in decision period k
Figure FDA00027928867200000217
and simultaneously update the knowledge matrix of lower-layer region z
Figure FDA00027928867200000218
Step 9.9: the costs of the lower-layer regions in decision period k
Figure FDA00027928867200000219
are fed back to the upper layer, and the upper-layer total cost
Figure FDA0002792886720000031
is calculated to update the upper-layer knowledge matrix Q_up;
Step 9.10: let k := k + 1; if k is less than the total number K of decision periods, return to step 9.4; otherwise, let k = 0;
Step 9.11: let m := m + 1; if m is less than the total number of learning steps M, update the learning rate and return to step 9.4; otherwise, end the procedure and store the source-load power prediction information of the source task and the optimal knowledge matrices obtained in step 9 in the knowledge base as experience knowledge;
Step 10: rapidly optimize the target task using a knowledge migration-based hierarchical Q-learning algorithm:
Step 10.1: define the predicted net load power as the similarity element, calculate the similarity distance between the predicted net load power of the target task and that of each source task in each region, and use this distance to measure the similarity between the target task and the source task;
Step 10.2: the source task with the minimum similarity distance to the target task is used for migration to initialize each knowledge matrix of the target task; the upper-layer knowledge matrix Q_up of the target task and the knowledge matrix of lower-layer region z
Figure FDA0002792886720000032
can be characterized by formulas (6) and (7):
Figure FDA0002792886720000033
Figure FDA0002792886720000034
wherein
Figure FDA0002792886720000035
are, respectively, the upper-layer optimal knowledge matrix and the lower-layer region z optimal knowledge matrix of the source task with the minimum similarity distance;
Step 10.3: initialize the model parameters and learning parameters of the cross-regional interconnected power grid scheduling optimization target task and rapidly optimize the target task; the procedure is the same as steps 9.3 to 9.11.
2. The knowledge migration-based method for rapid optimization of dynamic scheduling of a cross-regional interconnected power grid according to claim 1, characterized in that:
the similarity distance in step 10.1 is calculated according to the following steps:
Step 1: use the Euclidean distance to reflect the differences between the specific values of the time series:
In power grid operation, to account for the uncertainty of the load within a region and the intermittency and randomness of new energy generation, the concept of net load is introduced: intermittent new energy generation is treated as a negative load, so the net load of a region is the total load minus the total new energy generation output; in the source task ψ and the target task φ, the predicted net load power of region z at time t can be expressed as equations (8) and (9), respectively:
Figure FDA0002792886720000036
Figure FDA0002792886720000041
wherein
Figure FDA0002792886720000042
are, respectively, the predicted net load power of region z at time t in the source task ψ and in the target task φ;
Figure FDA0002792886720000043
are, respectively, the predicted load demand power, the predicted wind power, and the predicted photovoltaic power of region z at time t in the source task ψ;
Figure FDA0002792886720000044
are, respectively, the predicted load demand power, the predicted wind power, and the predicted photovoltaic power of region z at time t in the target task φ;
Each of
Figure FDA0002792886720000045
is sampled; assuming the time-series length is N_s, the sampling interval is Δt = T/N_s, which yields two time series
Figure FDA0002792886720000046
characterized by formulas (10) and (11), respectively:
Figure FDA0002792886720000047
Figure FDA0002792886720000048
The Euclidean distance between the time series
Figure FDA0002792886720000049
and
Figure FDA00027928867200000410
can be characterized by equation (12):
Figure FDA00027928867200000411
Step 2: use the dynamic time warping (DTW) distance to reflect the trend and fluctuation information of the time series:
The derivative functions of the net load power in region z for the source task ψ and the target task φ,
Figure FDA00027928867200000412
are each sampled at the sampling interval Δt to obtain two time series
Figure FDA00027928867200000413
which can be characterized by formulas (13) and (14), respectively:
Figure FDA00027928867200000414
Figure FDA00027928867200000415
An N_s × N_s matrix Γ is constructed, whose elements are characterized by equation (15):
Figure FDA00027928867200000416
A set of adjacent elements in matrix Γ is called a warping path, denoted H = {h_1, …, h_s, …, h_m}, where m is the total number of elements in the path and element h_s is the coordinates of the s-th point on the path; the objective of the dynamic time warping algorithm is to find an optimal warping path such that the total warping cost between the sequences
Figure FDA00027928867200000417
and
Figure FDA0002792886720000051
is minimized, which can be characterized by equation (16):
Figure FDA0002792886720000052
wherein
Figure FDA0002792886720000053
is the minimum total warping cost, i.e. the dynamic time warping distance between the time series
Figure FDA0002792886720000054
and
Figure FDA0002792886720000055
;
Step 3: based on the Euclidean distance and the dynamic time warping distance of the predicted net load power of the target task and the source task in each region, the similarity distance between the target task and the source task,
Figure FDA0002792886720000056
can be characterized by formula (17):
Figure FDA0002792886720000057
wherein λ_e and λ_d are the weighting coefficients of the Euclidean distance and the dynamic time warping distance, respectively.
CN201910932990.5A 2019-09-29 2019-09-29 Knowledge migration-based cross-regional interconnected power grid dynamic scheduling rapid optimization method Active CN110635476B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910932990.5A CN110635476B (en) 2019-09-29 2019-09-29 Knowledge migration-based cross-regional interconnected power grid dynamic scheduling rapid optimization method

Publications (2)

Publication Number Publication Date
CN110635476A CN110635476A (en) 2019-12-31
CN110635476B true CN110635476B (en) 2021-01-15

Family

ID=68973697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910932990.5A Active CN110635476B (en) 2019-09-29 2019-09-29 Knowledge migration-based cross-regional interconnected power grid dynamic scheduling rapid optimization method

Country Status (1)

Country Link
CN (1) CN110635476B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant