CN107045655A - Wolf pack clan strategy process based on the random consistent game of multiple agent and virtual generating clan - Google Patents


Info

Publication number
CN107045655A
CN107045655A (application CN201611117291.8A)
Authority
CN
China
Prior art keywords: formula, clan, power, strategy, game
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611117291.8A
Other languages
Chinese (zh)
Inventor
席磊
李玉丹
杨苹
许志荣
柳浪
陈建峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Three Gorges University CTGU
Original Assignee
China Three Gorges University CTGU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Three Gorges University CTGU filed Critical China Three Gorges University CTGU
Priority to CN201611117291.8A priority Critical patent/CN107045655A/en
Publication of CN107045655A publication Critical patent/CN107045655A/en
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Electricity, gas or water supply
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E: REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00: Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70: Smart grids as climate change mitigation technology in the energy generation sector
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

A wolf pack clan strategy based on a multi-agent stochastic consensus game and virtual generation clans, comprising the steps: S1, determine the discrete state set S; S2, determine the discrete joint-action set A; S3, at the start of each control cycle, collect the real-time operating data of each power grid, including the frequency deviation Δf and the power deviation, and calculate the instantaneous values of the area control error ACEi(k) and the control performance standard CPSi(k); S4, in the current state S, regional grid i obtains a short-term reward signal Ri(k); S5, obtain the value-function errors pk and δk by calculation and estimation; S6, derive the optimal value function and strategy. The method combines the two frameworks of multi-agent stochastic games (MAS-SG) and multi-agent collaborative consistency (MAS-CC) to solve the coordinated optimization of virtual generation clans, with the advantages of improving the closed-loop system, raising the utilization rate of new energy, reducing carbon emissions, and fast convergence.

Description

Wolf pack clan strategy method based on a multi-agent stochastic consensus game and virtual generation clans
Technical field
The present invention relates to the technical field of power-system economic dispatch, and more particularly to a wolf pack clan strategy method based on a multi-agent stochastic consensus game and virtual generation clans, applicable to the dynamic multi-objective optimal allocation of distributed economic dispatch.
Background technology
AGC can generally be divided into two steps: a) tracking of the total AGC generation command, and b) distributing the total generation power to each AGC unit through an optimization algorithm. In practice, PI controllers have been widely used for coordinated control of the total AGC power of interconnected networks. To further improve AGC adaptability and control performance, the literature has proposed a fuzzy evolutionary method for AC microgrids based on online particle swarm optimization (PSO). Bacteria foraging optimization (BFO), PSO, genetic algorithms (GA) and traditional gradient algorithms have all been applied to optimize the control parameters within microgrids. On the other hand, "Dynamic optimal CPS control of interconnected power grids based on Q-learning" studied the use of reinforcement learning to realize smart generation control (SGC) of interconnected grids, improving the dynamic control performance of AGC. However, the methods of the above literature are all centralized controls requiring a large amount of remote information; their dynamic response is therefore slow and their control performance is not ideal.
The decentralized correlated-equilibrium Q(λ) method based on multiple agents (DCEQ(λ)) proposed in the existing literature solves, through stochastic optimal policies, the complex dynamic characteristics and optimal coordinated control problem of SGC, and has better control performance than Q-learning, Q(λ)-learning, R(λ)-learning and PI control algorithms.
However, its control performance can be further improved, and as the number of agents increases, the time DCEQ(λ) spends searching for the multi-agent equilibrium solution grows geometrically, which limits the method's wide application in larger network systems. In 2002 Bowling & Veloso developed the "win or learn fast" policy hill-climbing algorithm (WoLF-PHC); during learning, each agent uses a mixed strategy and keeps only its own Q-value table. On the one hand this avoids the exploration-versus-exploitation contradiction that ordinary Q-learning must resolve; on the other hand it can solve the asynchronous decision problem of multi-agent systems. Therefore, based on WoLF-PHC, eligibility traces and SARSA(λ), a variant of Q(λ)-learning is proposed: the decentralized win-or-learn-fast policy hill-climbing method (DWoLF-PHC(λ), hereinafter "wolf hill-climbing"). The algorithm uses a variable learning rate φ to perceive environmental changes in the multi-agent system and adapt its own strategy, which encourages convergence to the optimal solution and guarantees the reasonableness of the algorithm. It has the WoLF property: win, or learn fast. In the algorithm the average mixed strategy replaces the equilibrium. However, the above methods study only the tracking of the total power command and do not dynamically optimize the allocation of AGC power commands. Moreover, when the number of agents keeps increasing, multiple solutions may appear, making the system unstable. New methods are therefore needed to obtain decentralized optimal coordinated control.
The content of the invention
To overcome the shortcomings and defects of existing Q-learning algorithms and to solve the coordinated-consistency problem of decentralized control systems, the present invention proposes a wolf pack clan strategy method based on a multi-agent stochastic consensus game and virtual generation clans. The method combines the two frameworks of multi-agent stochastic games and multi-agent collaborative consistency; it considers the influence on system convergence of the trace decay factor λ, the discount factor γ, the Q-learning rate α, and the variable learning rate φ, as well as the influence of communication delay, noise and topology changes on decentralized dispatch, further extending the strategy's scope of application. It is better suited to the non-ideal communication environments of engineering practice and yields better optimization results.
The technical solution adopted in the present invention is:
A wolf pack clan strategy method based on a multi-agent stochastic consensus game and virtual generation clans, comprising the following steps:
Step S1: determine the discrete state set S.
Step S2: determine the discrete joint-action set A.
Step S3: at the start of each control cycle, collect the real-time operating data of each grid, including the frequency deviation Δf and the power deviation ΔP, and calculate the instantaneous values of the area control error ACEi(k) and the control performance standard CPSi(k).
Step S4: in the current state S, regional grid i obtains a short-term reward signal Ri(k).
Step S5: obtain the value-function errors pk and δk by calculation and estimation.
Step S6: derive the optimal value function and strategy.
Step S7: for every regional grid j, update the Q-function table over all state-action pairs (s, a) and the eligibility-trace matrix ej(s, a); use the updated Q values to update the mixed strategy Uk(sk, ak) in the current state S, then update the value function Qk+1(sk, ak), the eligibility-trace element e(s, a), the variable learning rate φ and the average mixed-strategy table.
Step S8: determine the Laplacian matrix L, obtain the clan power ΔPi, and obtain the ramp rate from the clan power.
Step S9: update the leader and follower virtual consistency variables.
Step S10: solve each unit power ΔPiw; if a unit power is out of limits, jump to step S9.
Step S11: when the boundary condition is reached, calculate the generation power ΔPiw and tiw and update the row-stochastic matrix elements.
Step S12: calculate the power deviation ΔPerror-i and judge whether the condition is met; if so, proceed to the next calculation, otherwise jump to step S9.
Step S13: return to step S3.
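Taken together, steps S1-S13 form a repeating control loop. Below is a minimal runnable skeleton of that loop; the function name and the stub state/reward values are illustrative assumptions, with the patent's formulas (detailed later) standing in as comments.

```python
# Minimal runnable sketch of the S1-S13 control loop.  All values are
# illustrative stubs, not the patent's actual computations.

def run_control_cycles(n_cycles):
    log = []
    for k in range(n_cycles):               # S3: start of a control cycle
        state = "CPS1-band-3/ACE-positive"  # S1/S3: observed discrete state
        reward = -0.1 * k                   # S4: short-term reward R_i(k)
        # S5-S7: value-function errors, Q/eligibility-trace, policy updates
        # S8-S12: Laplacian, consensus variables, per-unit power allocation
        log.append((k, state, reward))
    return log                              # S13: the loop repeats each cycle

trace = run_control_cycles(3)
```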
The discrete state set S of step S1 is determined by dividing the values of the control performance standard CPS and the area control error ACE.
The expression for the discrete joint-action set A of step S2 is:
A = A1 × A2 × … × Ai × … × An, where Ai is the discrete output action set of agent i and n is the number of agents.
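The joint-action set is the Cartesian product of the per-agent action sets. A short sketch, using the per-agent action set given later in the embodiment and an assumed two-agent case:

```python
from itertools import product

# Joint action set A = A1 x A2 x ... x An (step S2).  Each Ai is the
# discrete set of AGC power-adjustment commands (MW) available to agent i;
# the values come from the embodiment, n = 2 is an assumption for brevity.
A_i = [-50, -20, -10, -5, 0, 5, 10, 20, 50]
n = 2
A = list(product(A_i, repeat=n))   # |A| = |A_i| ** n joint actions
```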
The real-time operating data of step S3 are collected by the computer and monitoring system.
In step S3, the instantaneous value of the area control error ACEi(k) of region i is calculated as follows:
ACE = Ta − Ts − 10B(Fa − Fs),
where Ta is the actual tie-line power flow, Ts is the scheduled tie-line power flow, B is the frequency bias coefficient, Fa is the actual system frequency, and Fs is the scheduled system frequency.
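The ACE formula above translates directly into code; the function name and argument names are illustrative:

```python
def ace(tie_actual, tie_sched, bias_B, freq_actual, freq_sched):
    """Area control error: ACE = (Ta - Ts) - 10*B*(Fa - Fs)."""
    return (tie_actual - tie_sched) - 10.0 * bias_B * (freq_actual - freq_sched)
```

With the usual negative bias coefficient B, an over-frequency event adds to a positive tie-line export error, e.g. `ace(100.0, 80.0, -1.0, 50.05, 50.0)` gives 20.5.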
The instantaneous value of control performance standard 1, CPSi(k), of region i is calculated as follows:
CPS1 = (2 − CF1) × 100%,
where CF1 = ACEAVE-1min · ΔfAVE / (−10·Bi·ε1²), averaged over the n minutes of the assessment period; Bi is the frequency bias coefficient of control area i; ε1 is the interconnected grid's control target for the root mean square of the annual one-minute average frequency deviation; n is the number of minutes in the assessment period; ACEAVE-1min is the one-minute average of the area control error ACE; ΔfAVE is the one-minute average of the frequency deviation Δf.
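A sketch of CPS1 for a single clock-minute sample. The CF1 expression here is a reconstruction from the variable definitions above (the original formula image is not reproduced in the text), so treat it as an assumption:

```python
def cps1(ace_avg_1min, df_avg_1min, B_i, eps1):
    """CPS1 = (2 - CF1) * 100%, with the reconstructed compliance factor
    CF1 = ACE_avg * df_avg / (-10 * B_i * eps1**2) for one clock minute."""
    cf1 = (ace_avg_1min * df_avg_1min) / (-10.0 * B_i * eps1 ** 2)
    return (2.0 - cf1) * 100.0
```

Perfect regulation (zero ACE and zero frequency deviation) gives the maximum score of 200%.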
The instantaneous value of control performance standard 2, CPSi(k), of region i is calculated as follows:
CPS2 = (1 − R) × 100%,
where R is the proportion of assessment periods in which the ten-minute average of ACE violates its limit,
and in the formula, ε10 is the interconnected grid's control target for the root mean square of the annual ten-minute average frequency deviation; Bnet is the frequency bias coefficient of the whole interconnected grid; ACEAVE-10min is the ten-minute average of the area control error ACE.
The short-term reward signal Ri(k) of step S4 is obtained from the reward function, subject to ΔPiw min ≤ ΔPiw ≤ ΔPiw max, where ACE(k) and ΔPiw(k) denote, respectively, the instantaneous area control error at iteration k and the actual output power of the w-th unit at iteration k; μ and (1 − μ) are the weights of the area control error and of carbon emissions; μ is identical for every region and is set here to 0.5; Diw is the carbon-intensity coefficient of unit w, in kg/kWh; ΔPiw min and ΔPiw max are the lower and upper bounds of unit w's capacity. Considering thermal-unit generation efficiency, Dj = 0.87 when the adjustable unit capacity exceeds 600 MW; Dj = 0.89 when the rated unit capacity is at most 600 MW but above 300 MW; and Dj = 0.99 when the unit capacity is at most 300 MW. The Dj of oil-fired, gas-fired and hydro units in each VGT are set to 0.7, 0.5 and 0 respectively.
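The exact reward expression is not reproduced in the text; only its ingredients are described (μ weighting ACE against a carbon term built from Diw and ΔPiw). A hedged sketch consistent with that description, with the squared-ACE penalty an explicit assumption:

```python
def reward(ace_k, dP_units, D_units, mu=0.5):
    """Hedged sketch of R_i(k): penalize ACE (weight mu) and carbon
    emissions sum(D_iw * |dP_iw|) (weight 1 - mu).  The squared-ACE
    form is an assumption, not the patent's stated formula."""
    carbon = sum(d * abs(dp) for d, dp in zip(D_units, dP_units))
    return -mu * ace_k ** 2 - (1.0 - mu) * carbon
```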
The value-function errors pk and δk of step S5 are obtained from the formulas pk = R(sk, sk+1, ak) + γQk(sk+1, ag) − Qk(sk, ak)
and δk = R(sk, sk+1, ak) + γQk(sk+1, ag) − Qk(sk, ag),
where R(sk, sk+1, ak) is the agent's reward for moving from state sk to sk+1 under the selected action ak, γ is the discount factor with 0 < γ < 1, and ag is the greedy action strategy.
In step S6, the optimal value function is Q*(s) = max(a∈A) Q(s, a) and the greedy strategy is ag = argmax(a∈A) Q(s, a),
where A is the action set.
In step S7, the eligibility-trace matrix is updated by the formula:
ek+1(s, a) ← γλ·ek(s, a), and the Q-function table by the formula Qk+1(s, a) = Qk(s, a) + αδk·ek(s, a),
where ek(s, a) is the eligibility trace of action a in state s at iteration k, γ is the discount factor with 0 < γ < 1, λ is the trace decay factor with 0 < λ < 1, and α is the Q-learning rate, set in 0 < α < 1.
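The trace decay and Q update above form one tabular Q(λ) step. A dictionary-based sketch (the state/action encoding and the greedy backup toward ag are illustrative assumptions):

```python
GAMMA, LAM, ALPHA = 0.9, 0.9, 0.5   # gamma, lambda, alpha from the embodiment

def q_lambda_step(Q, e, s, a, r, s_next):
    """One tabular Q(lambda) step: compute the TD error toward the greedy
    action, decay all traces by gamma*lambda, bump the visited pair's
    trace, then move every Q(s,a) along the shared error."""
    a_g = max(Q[s_next], key=Q[s_next].get)        # greedy action a_g
    delta = r + GAMMA * Q[s_next][a_g] - Q[s][a]   # TD error delta_k
    for sa in e:
        e[sa] *= GAMMA * LAM                       # e_{k+1} <- gamma*lambda*e_k
    e[(s, a)] = e.get((s, a), 0.0) + 1.0           # accumulate: e(s,a) <- e(s,a)+1
    for (si, ai), tr in e.items():
        Q[si][ai] += ALPHA * delta * tr            # Q_{k+1} = Q_k + alpha*delta*e
    return delta

Q = {'s0': {'a0': 0.0, 'a1': 0.0}, 's1': {'a0': 0.0, 'a1': 0.0}}
e = {}
delta = q_lambda_step(Q, e, 's0', 'a0', 1.0, 's1')
```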
The mixed strategy Uk(sk, ak) in step S7 is updated by hill-climbing with the variable learning rate φi: the probability of the greedy action is increased by φi, and that of every other action is decreased by φi/(|Ai| − 1).
In step S7, the value function is updated according to the formula:
Qk+1(sk, ak) = Qk+1(sk, ak) + αpk.
The eligibility-trace element is updated by e(sk, ak) ← e(sk, ak) + 1.
The variable learning rate φ is selected between two learning parameters according to whether the agent is winning or losing, and the average mixed-strategy table is updated toward the current mixed strategy in proportion to 1/visit(sk).
In the formulas, the two learning parameters represent the agent winning and losing respectively, and visit(sk) is the number of times state sk has been experienced from the initial state to the current state.
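The mixed-strategy, variable-learning-rate and average-strategy updates above are the WoLF-PHC step. A single-state sketch; the learning-parameter names (`d_win`, `d_lose`) and the clip-then-renormalize step are assumptions, since the patent's formula images are not reproduced:

```python
def wolf_phc_update(Q_s, U_s, U_avg_s, visits, d_win=0.05, d_lose=0.1):
    """One WoLF-PHC policy step for a single state s (hedged sketch)."""
    n = len(Q_s)
    # average mixed strategy: U_avg += (U - U_avg) / visit(s)
    for a in U_s:
        U_avg_s[a] += (U_s[a] - U_avg_s[a]) / visits
    # "winning" if the current policy outperforms the average policy
    winning = (sum(U_s[a] * Q_s[a] for a in Q_s)
               > sum(U_avg_s[a] * Q_s[a] for a in Q_s))
    phi = d_win if winning else d_lose        # variable learning rate phi
    best = max(Q_s, key=Q_s.get)
    for a in U_s:                             # hill-climb toward greedy action
        U_s[a] += phi if a == best else -phi / (n - 1)
        U_s[a] = min(1.0, max(0.0, U_s[a]))
    total = sum(U_s.values())                 # keep U_s a valid distribution
    for a in U_s:
        U_s[a] /= total
    return phi

Q_s = {'a0': 1.0, 'a1': 0.0}
U_s = {'a0': 0.5, 'a1': 0.5}
U_avg_s = {'a0': 0.5, 'a1': 0.5}
phi = wolf_phc_update(Q_s, U_s, U_avg_s, visits=1)
```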
Step S8 determines the Laplacian matrix L = [lij] ∈ Rn×n according to the formula lij = −bij for i ≠ j and lii = Σ(j≠i) bij,
where the constants bij (bij ≥ 0) are the weight factors between agents.
The ramp rate is then calculated subject to its bounds,
where URiw and DRiw are the upper and lower ramp-rate limits respectively.
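A sketch of the Laplacian construction from the weight factors, plus a ramp-rate clamp; the lii reconstruction (row sums of bij) follows the standard graph Laplacian and is an assumption where the patent's image is missing:

```python
def laplacian(B):
    """L = [l_ij]: l_ij = -b_ij for i != j, l_ii = sum_j b_ij, from the
    non-negative weight factors b_ij between agents (rows sum to zero)."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                L[i][j] = -B[i][j]
                L[i][i] += B[i][j]
    return L

def clamp_ramp(dP, DR, UR):
    """Enforce the ramp-rate bounds DR_iw <= dP_iw <= UR_iw."""
    return max(DR, min(UR, dP))

L = laplacian([[0.0, 1.0], [1.0, 0.0]])
```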
Step S9 updates the leader and follower virtual consistency variables according to the formulas:
the former updates the leader consistency variable, and the latter updates the follower virtual consistency variables. In the formulas, for the i-th VGT, mi is the total number of units, dij is an element of the row-stochastic matrix, μi > 0 is the adjustment factor for the power deviation of the i-th VGT, and ΔPerror-i is the deviation between the total power command of the i-th VGT and the sum of all unit outputs.
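A sketch of one synchronous leader-follower consensus iteration. The update form (leader adds μ·ΔPerror-i on top of the row-stochastic mixing; followers only mix) is a reconstruction from the variable definitions above, since the formula images are not reproduced:

```python
def consensus_step(lam, D, leader, mu, dP_error):
    """One iteration: every agent mixes neighbours' consistency variables
    through the row-stochastic matrix D; the leader additionally adds the
    power-deviation correction mu * dP_error (hedged reconstruction)."""
    n = len(lam)
    new = [sum(D[i][j] * lam[j] for j in range(n)) for i in range(n)]
    new[leader] += mu * dP_error
    return new

# With zero power deviation, a doubly-stochastic D drives the variables
# to the average of the initial values (1/3 here).
lam = [1.0, 0.0, 0.0]
D = [[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]
for _ in range(50):
    lam = consensus_step(lam, D, leader=0, mu=0.01, dP_error=0.0)
```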
Further, calculate the unit powers ΔPiw; if a unit power is out of limits, calculate the power deviation and judge whether the condition is met: if so, obtain the unit powers and perform the next calculation with k = k + 1; if the deviation does not meet the condition, repeat the above steps from the consistency calculation. When the boundary condition is reached, calculate the generation powers ΔPiw and tiw according to the formulas,
and update the row-stochastic matrix elements according to the formula,
where L = [lij] ∈ Rn×n is the Laplacian matrix, the constants bij (bij ≥ 0) are the weight factors between agents,
D = [dij] ∈ Rn×n is the row-stochastic matrix, built from the weighted adjacency matrix of the i-th VGT.
Further, the calculation formula of the total-output deviation ΔPerror-i distinguishes the cases ΔPi > 0 and ΔPi ≤ 0.
Further, judge whether the power deviation meets the condition: if so, perform the next iteration with k = k + 1; if not, jump back to the consistency-calculation step.
The wolf pack clan strategy method of the present invention, based on a multi-agent stochastic consensus game and virtual generation clans, has the following beneficial effects:
1) The present invention combines the two major frameworks of multi-agent stochastic games (MAS-SG) and multi-agent collaborative consistency (MAS-CC), solving the coordinated-optimization problem of virtual generation clans.
2) The present invention overcomes the limitation of the decentralized correlated-equilibrium Q(λ) method, whose applicability to larger network systems shrinks as the number of agents increases, by improving on the existing hill-climbing algorithm: the algorithm uses a variable learning rate φ to perceive environmental changes in the multi-agent system and adapt its own strategy, encouraging convergence to the optimal solution and guaranteeing the algorithm's reasonableness.
3) The present invention overcomes traditional centralized AGC's inability to meet the continuous integration of new energy and the "plug-and-play" demand of the smart grid; using virtual consistency variables, the algorithm solves the topology changes caused by power limit violations and the plug-and-play problem of AGC units.
Brief description of the drawings
Fig. 1 is the AGC multi-agent (MAS) control framework.
Fig. 2 is the VGT frame diagram.
Fig. 3 is the flowchart of the wolf-pack hunting strategy.
Fig. 4 is the load frequency control model described in the embodiment.
Embodiment
The wolf pack clan strategy method based on a multi-agent stochastic consensus game and virtual generation clans is fully described as follows:
1) Analyze the system behaviour to determine the discrete state set S; concretely, S can be determined by dividing the values of CPS1 and ACE.
2) Determine the discrete joint-action set A, where A = A1 × A2 × … × Ai × … × An, Ai is the discrete output action set of agent i, and n is the number of agents.
3) At the start of each control cycle, collect the real-time operating data Δf and ΔP of each regional grid, and calculate the instantaneous values ACEi(k) and CPSi(k); the reward Ri(k) of regional grid i at iteration k is designed as a linear combination of the ACE and CPS1 difference values and the power adjustment value.
4) Using the agent reward function R(sk, sk+1, ak) for moving from state sk to sk+1 under the selected action ak, and the discount factor γ with 0 < γ < 1, obtain the Q-function error pk and the Q-function error estimate δk of the agent at iteration k from ρk = R(sk, sk+1, ak) + γQk(sk+1, ag) − Qk(sk, ak) and δk = R(sk, sk+1, ak) + γQk(sk+1, ag) − Qk(sk, ag) respectively.
5) For each state-action pair (s, a), perform:
① update the eligibility-trace matrix ek+1(s, a) = λ × γ·ek(s, a);
② update the Q function Qk+1(s, a) = Qk(s, a) + αδk·ek(s, a);
where λ, γ and α are respectively the eligibility decay factor, discount factor and Q-learning rate of the control system, each valued in [0, 1]; the first step updates the eligibility-trace matrix ek(s, a) with λ and γ, and the second updates the Q function with the required error estimate δk, the learning rate α and the eligibility trace ek(s, a).
6) For region j, perform:
① obtain the value function Qk+1(s, a) = Qk(s, a) + αδk·ek(s, a) from step 5;
② solve the mixed strategy using the formula with the variable learning rate φ, where |Ai| is the number of elements of the action set;
③ with the satisfactory Q-learning parameter α and the Q-value error ρk from step 4, update the value function Qk+1(sk, ak) = Qk+1(sk, ak) + αpk;
④ update the eligibility-trace element e(sk, ak) = e(sk, ak) + 1;
⑤ select the variable learning rate φ, where the two learning parameters represent the agent winning and losing;
⑥ using the average mixed strategy U(sk, ak) from step 6, update the average mixed-strategy table;
⑦ update the number of visits from the initial state to the current state: visit(sk) = visit(sk) + 1.
For the i-th VGT, first obtain the clan powers ΔPiw (i = 1, 2, 3, …, n).
7) Solve the ramp rate,
where URiw and DRiw are respectively the upper and lower ramp-rate bounds.
8) Update the leader and follower virtual consistency variables according to the formulas:
the former updates the leader consistency variable, and the latter updates the follower virtual consistency variables. In the formulas, for the i-th VGT, mi is the total number of units, dij is an element of the row-stochastic matrix, μi > 0 is the adjustment factor of the i-th VGT's power deviation, and ΔPerror-i is the deviation between the i-th VGT's total power command and the sum of all unit outputs.
9) When the boundary condition is reached, calculate the generation powers ΔPiw and tiw according to the formulas,
where URiw and DRiw are respectively the upper and lower ramp-rate bounds.
10) Update the row-stochastic matrix according to the formula,
where L = [lij] ∈ Rn×n is the Laplacian matrix, the constants bij (bij ≥ 0) are the weight factors between agents,
D = [dij] ∈ Rn×n is the row-stochastic matrix, built from the weighted adjacency matrix of the i-th VGT.
11) Calculate the power deviation and judge it against |ΔPerror-i| < εi:
with the unit powers ΔPiw obtained in step 10, calculate the power deviation ΔPerror-i and judge whether it meets the condition; if so, the unit powers are obtained; if not, jump to step 8.
12) Obtain the unit powers ΔPiw, perform the next iteration k = k + 1, and jump to step 1.
The application of the wolf-pack hunting strategy is not limited by centralized computation or by power-command distribution from a single centralized controller. In fact, if some agent fails, the other agents can continue exchanging information and reach a new consensus. Since there is generally more than one communication channel between agents, AGC performance can remain optimal when one channel fails. This relies on information sharing among the agents, as shown in Fig. 3. Some related concepts are as follows:
① Territory: the regional grid within one independent cut set, generally referring to the islanded transmission and distribution region matched by the provincial grid and the third line of defence's active splitting system. A territory grid usually contains large-scale plant integration, distinguishing it from microgrids and active distribution networks.
② Clan: there is exactly one clan in a territory; the clan comprises all real generating units and virtual synchronous generator groups (such as energy-storage systems and interruptible-load systems) participating in frequency regulation in the territory grid.
③ Head: there is exactly one head in a clan, namely the dispatch terminal of the whole clan. The head is responsible for communicating, liaising and cooperating with the provincial dispatch terminal (its superior) and the dispatch terminals of other clans (the other clan heads), and issues commands to the parent of each family in its own clan.
④ Family: a group of generating units within the clan with similar generation control characteristics, such as a thermal-unit group or a gas-unit group. A clan is composed of multiple families.
⑤ Parent: the leading generation-control unit with stronger dispatch capability in a family. A parent can search actively and autonomously execute complex commands.
⑥ Member: an independent generation-control unit that can only imitate the parent's behaviour and execute simple commands.
⑦ Reserves: the reserve force that sets out only at the critical moment when the prey must be encircled (here, pumped-storage stations); that is, if the load disturbance exceeds 50% of the preset value, the pumped-storage station starts operating. It appears in the form of an "energy-storage wolf-pack family".
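The territory/clan/family/unit roles above form a strict hierarchy. A hypothetical data model (this structure and its names are illustrative, not part of the patent):

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    name: str
    role: str          # "parent" (leads the family) or "member" (imitates)

@dataclass
class Family:          # units sharing a generation control characteristic
    kind: str          # e.g. "thermal", "gas", "energy-storage reserves"
    units: list = field(default_factory=list)

@dataclass
class Clan:            # all frequency-regulation units in one territory
    head: str          # the clan's single dispatch terminal
    families: list = field(default_factory=list)

clan = Clan(head="dispatch-A", families=[
    Family(kind="thermal", units=[Unit("G1", "parent"), Unit("G2", "member")]),
    Family(kind="gas", units=[Unit("G3", "parent")]),
])
```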
Embodiment:
This embodiment works under the overall framework of the China Southern Power Grid, with the Guangdong grid as the main study object. The simulation model is the detailed full-dynamic simulation model built for a practical engineering project of the Guangdong Electric Dispatching and Control Center; for detailed model parameters and simulation design principles, see "Dynamic optimal CPS control of interconnected power grids based on Q-learning" by Yu Tao, Zhou Bin and Chen Jiarong (Proceedings of the CSEE). In the simulation model the Southern Grid is divided into the four regional grids of Guangdong, Guangxi, Guizhou and Yunnan. Band-limited white-noise load disturbances with a 15-minute sampling time and an amplitude not exceeding 1500 MW (corresponding to the Guangdong grid's largest single contingency, a DC monopole block) are applied to the Guangdong grid and the other provincial grids; white-noise parameter perturbations are added to each province's load frequency response coefficients; and the simulation study is modelled with Simulink. Each regional grid's AGC controller seeks its optimal joint action strategy given the other regional grids' instantaneous ACE values and adopted strategies.
The wolf pack clan strategy method based on a multi-agent stochastic consensus game and virtual generation clans comprises the following steps:
Step 1) Analyze the system behaviour and discretize the state set S: following the CPS index classification criteria of the Guangdong power dispatch center, this example divides the CPS1 value into 6 states: (−∞, 0), [0, 100%), [100%, 150%), [150%, 180%), [180%, 200%), [200%, +∞), and divides ACE into 2 states by sign, so each agent can determine 12 states. The ACE states serve mainly to distinguish the cause of CPS index fluctuations.
Step 2) Determine the discrete joint-action set A. The action set of the i-th regional grid is Ai = [−50, −20, −10, −5, 0, 5, 10, 20, 50] MW, and the joint strategy set is A = A1 × A2 × … × Ai × … × An; A is the controller's output action, i.e. the AGC power-adjustment command. The control step uses the AGC control cycle, taken as 4 s.
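The 12-state discretization of step 1) can be sketched as a lookup from the CPS1 band and the ACE sign; the CPS1 band edges come from the embodiment, while the particular state numbering is an assumption (the patent does not give one):

```python
import bisect

CPS1_EDGES = [0.0, 100.0, 150.0, 180.0, 200.0]   # band boundaries in %

def state_index(cps1_pct, ace):
    """Map (CPS1, ACE) to one of the 12 discrete states:
    6 CPS1 bands x 2 ACE signs (numbering is an illustrative choice)."""
    band = bisect.bisect_right(CPS1_EDGES, cps1_pct)   # 0..5
    sign = 0 if ace >= 0 else 1                        # ACE positive/negative
    return band * 2 + sign                             # 0..11
```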
Step 3) At the start of each control cycle, collect the real-time operating data Δf and ΔP of each regional grid, where Δf is the system frequency deviation and ΔP the tie-line power deviation. Using the international assessment formulas ACE = Ta − Ts − 10B(Fa − Fs) (Ta and Ts are respectively the actual and scheduled tie-line flows; B is the frequency bias coefficient; Fa and Fs are respectively the actual and scheduled system frequencies), CPS1 = (2 − CF1) × 100% (Bi is the frequency bias coefficient of control area i; ε1 is the interconnected grid's control target for the root mean square of the annual one-minute average frequency deviation; n is the number of minutes in the assessment period), and CPS2 = (1 − R) × 100% (ε10 is the interconnected grid's control target for the root mean square of the annual ten-minute average frequency deviation; Bnet is the frequency bias coefficient of the whole interconnected grid; ACEAVE-10min is the ten-minute average of the area control error ACE), calculate the instantaneous values ACEi(k) and CPSi(k) of each region.
Step 4) Determine the current state s from the instantaneous values ACEi(k) and CPSi(k); the state s and the reward function then give each regional grid's immediate reward Ri(k). The reward function is designed subject to
s.t. ΔPiw min ≤ ΔPiw ≤ ΔPiw max,
where ACE(k) and ΔPiw(k) denote, respectively, the instantaneous area control error at iteration k and the actual output of the w-th unit at iteration k; μ and (1 − μ) are the weights of the area control error and of carbon emissions, identical for every region and set here to μ = 0.5; Diw is the carbon-intensity coefficient of unit w, in kg/kWh; ΔPiw min and ΔPiw max are respectively the bounds of unit w's capacity. Considering thermal-unit generation efficiency, Dj = 0.87 when the adjustable unit capacity exceeds 600 MW; Dj = 0.89 when the rated unit capacity is at most 600 MW but above 300 MW; and Dj = 0.99 when the unit capacity is at most 300 MW. The Dj of oil-fired, gas-fired and hydro units in each VGT are set to 0.7, 0.5 and 0 respectively.
5) For each state-action pair (s, a), perform:
① update the eligibility-trace matrix ek+1(s, a) = 0.9 × 0.9·ek(s, a);
② update the Q function Qk+1(s, a) = Qk(s, a) + 0.5·δk·ek(s, a);
where λ, γ and α, the eligibility decay factor, discount factor and Q-learning rate of the control system, take the values 0.9, 0.9 and 0.5; the first step updates the eligibility-trace matrix ek(s, a) with λ and γ, and the second updates the Q function with the required error estimate δk, the learning rate α and the eligibility trace ek(s, a).
Step 6), for each region j, perform:
① take the value function obtained in step 5: Qk+1(s, a) = Qk(s, a) + 0.5δkek(s, a);
② solve the mixed strategy Uk(sk, ak) from the strategy-update formula, wherein φ is the variable learning rate and |Ai| is the number of elements in the action set, taken as 11 here;
③ update the value function with the Q learning rate α chosen to satisfaction and the Q-function value error pk obtained in step 4: Qk+1(sk, ak) = Qk+1(sk, ak) + 0.5pk;
④ update the eligibility trace element: e(sk, ak) = e(sk, ak) + 1;
⑤ select the variable learning rate φ: the two learning parameters δwin and δlose represent the winning and losing of the agent, and φ takes δwin when the agent is winning and δlose when it is losing;
⑥ using the average mixed strategy U(sk, ak) obtained in step 6, update the average mixed strategy table;
⑦ update the number of times the current state has been visited since the initial state:
visit(sk) = visit(sk) + 1.
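Sub-steps ② through ⑦ follow the WoLF-PHC pattern. A compact single-state sketch; the δwin and δlose values are illustrative, and the standard hill-climbing strategy update (shift probability mass toward the greedy action by φ) is assumed, since the patent's update formulas are not reproduced in the text:

```python
import numpy as np

def wolf_phc_update(Q_s, pi_s, avg_pi_s, visits,
                    d_win=0.05, d_lose=0.2):
    """One WoLF-PHC strategy update for a single state (sketch).
    The agent 'wins' when its current mixed strategy pi outperforms
    its average strategy under the current Q values."""
    n = len(pi_s)
    visits += 1
    avg_pi_s += (pi_s - avg_pi_s) / visits   # average mixed strategy table
    winning = pi_s @ Q_s >= avg_pi_s @ Q_s
    phi = d_win if winning else d_lose       # win -> learn cautiously
    best = int(np.argmax(Q_s))
    step = phi / (n - 1)
    for a in range(n):                       # move probability mass
        pi_s[a] = pi_s[a] + phi if a == best else pi_s[a] - step
    np.clip(pi_s, 0.0, None, out=pi_s)
    pi_s /= pi_s.sum()                       # keep a valid distribution
    return visits
```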
For the i-th VGT, first obtain the tribe power ΔPi (i = 1, 2, 3, …, n).
Step 7), solve the ramp rate:
wherein URiw and DRiw are respectively the upper and lower bounds of the ramp rate.
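A minimal sketch of the ramp constraint, assuming a unit's power change ΔPiw over a dispatch interval Δt is limited to the band [-DRiw·Δt, URiw·Δt] (the interval form is an assumption; the patent only names the bounds):

```python
def clip_ramp(dP, UR, DR, dt):
    """Limit a unit's power change to its ramp-rate band:
    -DR*dt <= dP <= UR*dt (UR, DR in MW/min, dt in minutes)."""
    return max(-DR * dt, min(UR * dt, dP))
```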
Step 8), update the leader and virtual follower consistency variables according to the formulas:
the former updates the leader's consistency variable and the latter updates the virtual followers' consistency variable. In the formulas, mi is the total number of units in the i-th VGT, D = [dij] is a row-stochastic matrix, μi > 0 is the adjustment factor for the power deviation of the i-th VGT, and ΔPerror-i is the deviation between the total power command of the i-th VGT and the sum of all unit outputs.
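One leader-follower consensus iteration of Step 8 can be sketched as follows. The exact weighting of the mismatch term is an assumption based on the description (the leader additionally absorbs the tribe's power deviation scaled by μi):

```python
import numpy as np

def consensus_step(x, D, leader, mu, dP_error):
    """One leader-follower consensus iteration on the consistency
    variable x: all agents average their neighbours' values through
    the row-stochastic matrix D; the leader also absorbs the tribe's
    power mismatch scaled by mu."""
    x_next = D @ x                      # row-stochastic averaging
    x_next[leader] += mu * dP_error     # leader tracks the total demand
    return x_next
```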
Step 9), when the boundary condition is reached, calculate the generated output ΔPiw and tiw according to the formulas:
wherein URiw and DRiw are respectively the upper and lower bounds of the ramp rate.
Step 10), update the row-stochastic matrix according to the formula:
in the formula, L = [lij] ∈ Rn×n is the Laplacian matrix, the constants bij (bij ≥ 0) represent the weight factors between agents, D = [dij] ∈ Rn×n is the row-stochastic matrix, and A is the weighted adjacency matrix of the i-th VGT.
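One common way to obtain a row-stochastic matrix D from the weighted adjacency matrix is to add self-loops and normalise each row. The patent does not fully specify the weights bij, so this construction is an assumption:

```python
import numpy as np

def row_stochastic(A):
    """Build a row-stochastic matrix D = [d_ij] from a weighted
    adjacency matrix A, adding self-loops so each agent keeps its
    own state, then normalising every row to sum to 1."""
    W = A + np.eye(A.shape[0])          # self-loops
    return W / W.sum(axis=1, keepdims=True)
```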
Step 11), calculate the power deviation and judge it: from the unit powers ΔPiw obtained in step 10, calculate the power deviation ΔPerror-i according to the formula and judge whether the deviation satisfies the condition |ΔPerror-i| < εi; if it does, the unit powers are obtained; if not, jump to step 8.
Step 12), obtain the unit powers ΔPiw, start the next iteration k = k + 1, and jump to step 1.
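Steps 8 through 12 form a loop that can be sketched self-contained as follows, assuming (as above) that the leader absorbs the tribe's power mismatch at each consensus iteration; tolerance and iteration limit are illustrative:

```python
import numpy as np

def allocate_tribe_power(dP_cmd, x0, D, leader, mu=0.05,
                         eps=1e-3, max_iter=1000):
    """Steps 8-12 as one loop (sketch): run leader-follower consensus
    on the consistency variable x until the tribe's total dispatched
    power matches the AGC command dP_cmd within eps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        error = dP_cmd - x.sum()        # tribe power mismatch
        if abs(error) < eps:
            break                       # Step 12: deviation satisfied
        x = D @ x                       # Step 8: consensus averaging
        x[leader] += mu * error         # leader absorbs the mismatch
    return x
```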
The core of the present invention lies in the design of the reward function, the solution of the optimal average mixed strategy and the variable learning rate, the update of each agent's Q values, the proposal of the virtual generation tribe (VGT), and its combination with the consensus algorithm. The combination of the virtual generation tribe with consensus computation is the key innovation. Implementing this method and its related techniques achieves optimal control among the provincial grid, the distribution network and the microgrid, solves the solution difficulties that arise when the number of agents is large, and achieves dynamically optimised distribution of AGC power commands, thereby also addressing the basic scientific problem of the stochastic consistent game for mixed homogeneous and heterogeneous multi-agent systems.
The present invention proposes the concept of the virtual generation tribe (VGT) and combines two frameworks, multi-agent system stochastic game theory (MAS-SG) and multi-agent system collaborative consensus (MAS-CC), to achieve dynamically optimised control and distribution of the total power command. Under the MAS-SG framework, the method used is decentralised win-or-learn-fast policy hill-climbing based on multiple agents (decentralized win or learn fast policy hill-climbing (λ), DWoLF-PHC(λ)), which addresses the complex dynamic game and decision problems of heterogeneous multiple agents and achieves optimal control of AGC. Under the MAS-CC framework, the method used is the collaborative consensus algorithm (CCA), which achieves fast power distribution and optimised cooperative control.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by it; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and shall fall within the protection scope of the present invention.

Claims (10)

1. A wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe, characterised in that it comprises the following steps:
Step S1, determine the discrete state set S;
Step S2, determine the joint-action discrete set A;
Step S3, at the start of each control cycle, collect the real-time operating data of each power grid, the real-time operating data including the frequency deviation Δf and the power deviation ΔP, and calculate the instantaneous value of the area control error ACEi(k) and the instantaneous value of the control performance standard CPSi(k);
Step S4, in the current state S, a regional power grid i obtains a short-term reward function signal Ri(k);
Step S5, obtain the value function errors pk and δk by calculation and estimation;
Step S6, obtain the optimal objective value function and strategy;
Step S7, for all regional power grids j, update the Q-function table and the eligibility trace matrix ej(s, a) for all state-action pairs (s, a), update the mixed strategy Uk(sk, ak) under the current state S with the updated Q values, then update the value function Qk+1(sk, ak), the eligibility trace element e(s, a), the variable learning rate φ and the average mixed strategy table from the mixed strategy Uk(sk, ak);
Step S8, determine the Laplacian matrix L, obtain the tribe power ΔPi, and obtain the ramp rate from the tribe power;
Step S9, update the leader and virtual follower consistency variables;
Step S10, solve each unit power ΔPiw; if a unit power is out of limits, jump to step S9;
Step S11, when the boundary condition is reached, calculate the generated output ΔPiw and tiw and update the row-stochastic matrix elements;
Step S12, calculate the power deviation ΔPerror-i and judge whether the condition is satisfied; if so, proceed to the next calculation; if not, jump to step S9;
Step S13, return to step S3.
2. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that the joint-action discrete set A of step S2 is expressed as:
A = A1 × A2 × … × Ai × … × An, wherein Ai is the discrete output action set of agent i and n is the number of agents.
3. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that the real-time operating data of step S3 are collected using a computer and a monitoring system;
the instantaneous value of the area control error ACEi(k) of region i is calculated as follows:
ACE = Ta - Ts - 10B(Fa - Fs),
wherein Ta is the actual tie-line power flow, Ts is the scheduled tie-line power flow, B is the frequency bias coefficient, Fa is the actual system frequency, and Fs is the scheduled system frequency;
the instantaneous value CPSi(k) of control performance standard 1 of region i is calculated as follows:
CPS1 = (2 - CF1) × 100%,
wherein Bi is the frequency bias coefficient of control area i; ε1 is the control target of the interconnected grid for the root mean square of the one-minute average frequency deviation over a whole year; n is the number of minutes in the assessment period; ACEAVE-1min is the average of the area control error ACE over one minute; ΔfAVE is the average of the frequency deviation Δf over one minute;
the instantaneous value CPSi(k) of control performance standard 2 of region i is calculated as follows:
CPS2 = (1 - R) × 100%,
wherein R is determined from ACEAVE-10min, ε10 and Bnet;
in the formula, ε10 is the control target of the interconnected grid for the root mean square of the ten-minute average frequency deviation over a whole year; Bnet is the frequency bias coefficient of the whole interconnected grid; ACEAVE-10min is the average of the area control error ACE over ten minutes.
4. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that the short-term reward function signal Ri(k) of step S4 is obtained by the following formula:
wherein ACE(k) and ΔPiw(k) are, respectively, the instantaneous value of the area control error at iteration k and the actual output power of the w-th unit at iteration k; μ and (1 - μ) are the weights of the area control error and of carbon emissions; μ takes the same value in every region and is set to 0.5 here; Diw is the carbon intensity coefficient of unit w, in kg/kWh; ΔPiw min and ΔPiw max are respectively the lower and upper bounds of the capacity of unit w; accounting for thermal generating-unit efficiency, Diw = 0.87 when the rated capacity of a unit exceeds 600 MW, Diw = 0.89 when it is greater than 300 MW and at most 600 MW, and Diw = 0.99 when it is at most 300 MW; the Diw of oil-fired units, gas-fired units and hydropower units in each VGT is set to 0.7, 0.5 and 0 respectively.
5. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that the value function errors pk and δk of step S5 are given by the formulas:
pk = R(sk, sk+1, ak) + γQk(sk+1, ag) - Qk(sk, ak) and δk = R(sk, sk+1, ak) + γQk(sk+1, ag) - Qk(sk, ak),
wherein R(sk, sk+1, ak) is the agent's reward function for the transition from state sk to sk+1 under the selected action ak, γ is the discount factor with 0 < γ < 1, and ag is the greedy action strategy.
6. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that in step S6 the optimal objective value function and strategy π*(s) are:
in the formula, A is the action set.
7. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that in step S7 the eligibility trace matrix is updated by the formula:
ek+1(s, a) ← γλek(s, a), and the Q-function table is updated according to the formula Qk+1(s, a) = Qk(s, a) + αδkek(s, a);
wherein ek(s, a) is the eligibility trace at iteration k under action a in state s, γ is the discount factor with 0 < γ < 1, λ is the trace decay factor with 0 < λ < 1, and α is the Q learning rate with 0 < α < 1.
8. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that the mixed strategy Uk(sk, ak) in step S7 is updated according to the following formula:
in the formula, φi is the variable learning rate;
in step S7, the value function Qk+1(sk, ak) is updated according to the formula:
Qk+1(sk, ak) = Qk+1(sk, ak) + αpk;
the eligibility trace element is updated according to the formula e(sk, ak) ← e(sk, ak) + 1;
the variable learning rate φ is updated according to the formula:
the average mixed strategy table is updated according to the formula:
in the formula, the two learning parameters δwin and δlose represent the winning and losing of the agent, and visit(sk) is the number of times state sk has been visited from the initial state to the current state.
9. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that in step S8 the Laplacian matrix L = [lij] ∈ Rn×n is determined according to the formula:
in the formula, the constants bij (bij ≥ 0) represent the weight factors between agents; the ramp rate is then calculated according to the formula:
in the formula, ΔPiw is the ramp power of the unit, and URiw and DRiw are respectively the upper and lower bounds of the ramp rate.
10. The wolf pack clan strategy method based on the multi-agent stochastic consistent game and the virtual generation tribe according to claim 1, characterised in that in step S9 the leader and virtual follower consistency variables are updated according to the formulas:
the former updates the leader's consistency variable and the latter updates the virtual followers' consistency variable; in the formulas, in the i-th VGT, mi is the total number of units, D = [dij] is a row-stochastic matrix, μi > 0 is the adjustment factor for the power deviation of the i-th VGT, and ΔPerror-i is the deviation between the total power command of the i-th VGT and the sum of all unit outputs;
calculate the unit power ΔPiw; if a unit power is out of limits, calculate the power deviation and judge whether it satisfies the condition; if it does, the unit powers are obtained and the next iteration k = k + 1 is started; if the deviation does not satisfy the condition, the consensus calculation above is repeated; when the boundary condition is reached, the generated output ΔPiw and tiw are calculated according to the formulas:
the row-stochastic matrix elements are updated according to the formula:
in the formula, L = [lij] ∈ Rn×n is the Laplacian matrix, the constants bij (bij ≥ 0) represent the weight factors between agents, D = [dij] ∈ Rn×n is the row-stochastic matrix, and A is the weighted adjacency matrix of the i-th VGT;
the deviation of the aggregate capacity ΔPerror-i is calculated by the formula:
if ΔPi > 0, the first expression is used; otherwise the second;
judge whether the power deviation satisfies the condition; if it does, start the next iteration with k = k + 1; if not, jump to the consensus calculation step.
CN201611117291.8A 2016-12-07 2016-12-07 Wolf pack clan strategy process based on the random consistent game of multiple agent and virtual generating clan Pending CN107045655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611117291.8A CN107045655A (en) 2016-12-07 2016-12-07 Wolf pack clan strategy process based on the random consistent game of multiple agent and virtual generating clan


Publications (1)

Publication Number Publication Date
CN107045655A true CN107045655A (en) 2017-08-15

Family

ID=59543466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611117291.8A Pending CN107045655A (en) 2016-12-07 2016-12-07 Wolf pack clan strategy process based on the random consistent game of multiple agent and virtual generating clan

Country Status (1)

Country Link
CN (1) CN107045655A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372366A (en) * 2016-09-30 2017-02-01 三峡大学 Intelligent power generation control method based on hill-climbing algorithm
CN107589672A (en) * 2017-09-27 2018-01-16 三峡大学 The intelligent power generation control method of isolated island intelligent power distribution virtual wolf pack control strategy off the net
CN108092307A (en) * 2017-12-15 2018-05-29 三峡大学 Layered distribution type intelligent power generation control method based on virtual wolf pack strategy
CN108449212A (en) * 2018-03-23 2018-08-24 大连大学 MAS message delivery methods based on event correlation
CN108737266A (en) * 2018-04-28 2018-11-02 国网江苏省电力有限公司苏州供电分公司 Dynamics route selection method based on double estimators
CN108898221A (en) * 2018-06-12 2018-11-27 中国科学技术大学 The combination learning method of feature and strategy based on state feature and subsequent feature
CN109034563A (en) * 2018-07-09 2018-12-18 国家电网公司 A kind of increment power distribution network source net lotus collaborative planning method of multi-agent Game
CN109217306A (en) * 2018-10-19 2019-01-15 三峡大学 A kind of intelligent power generation control method based on the deeply study with movement from optimizing ability
CN109656140A (en) * 2018-12-28 2019-04-19 三峡大学 A kind of fractional order differential offset-type VSG control method
CN109784545A (en) * 2018-12-24 2019-05-21 深圳供电局有限公司 A kind of dispatching method of the distributed energy hinge based on multiple agent
CN111934364A (en) * 2020-07-30 2020-11-13 国网甘肃省电力公司电力科学研究院 Emergency source network coordination peak regulation control method in fault state of transmitting-end power grid
CN112714165A (en) * 2020-12-22 2021-04-27 声耕智能科技(西安)研究院有限公司 Distributed network cooperation strategy optimization method and device based on combination mechanism
CN113128705A (en) * 2021-03-24 2021-07-16 北京科技大学顺德研究生院 Intelligent agent optimal strategy obtaining method and device
CN113269297A (en) * 2021-07-19 2021-08-17 东禾软件(江苏)有限责任公司 Multi-agent scheduling method facing time constraint
CN114280931A (en) * 2021-12-14 2022-04-05 广东工业大学 Method for solving consistency of multiple intelligent agents based on intermittent random noise
CN116706997A (en) * 2023-06-12 2023-09-05 国网湖北省电力有限公司电力科学研究院 Cooperative control method, device and system for micro-grid group and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2913592A (en) * 1958-10-30 1959-11-17 Westinghouse Electric Corp Automatic generation control
GB866271A (en) * 1956-07-31 1961-04-26 Gen Electric Improvements in electric power control system
CN103490413A (en) * 2013-09-27 2014-01-01 华南理工大学 Intelligent electricity generation control method based on intelligent body equalization algorithm
CN106026084A (en) * 2016-06-24 2016-10-12 华南理工大学 AGC power dynamic distribution method based on virtual generation tribe


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XI Lei et al.: "A wolf pack hunting strategy based virtual tribes control for automatic generation control of smart grid", 《APPLIED ENERGY》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372366A (en) * 2016-09-30 2017-02-01 三峡大学 Intelligent power generation control method based on hill-climbing algorithm
CN107589672A (en) * 2017-09-27 2018-01-16 三峡大学 The intelligent power generation control method of isolated island intelligent power distribution virtual wolf pack control strategy off the net
CN108092307A (en) * 2017-12-15 2018-05-29 三峡大学 Layered distribution type intelligent power generation control method based on virtual wolf pack strategy
CN108449212B (en) * 2018-03-23 2021-01-12 大连大学 MAS message transmission method based on event association
CN108449212A (en) * 2018-03-23 2018-08-24 大连大学 MAS message delivery methods based on event correlation
CN108737266A (en) * 2018-04-28 2018-11-02 国网江苏省电力有限公司苏州供电分公司 Dynamics route selection method based on double estimators
CN108898221A (en) * 2018-06-12 2018-11-27 中国科学技术大学 The combination learning method of feature and strategy based on state feature and subsequent feature
CN109034563B (en) * 2018-07-09 2020-06-23 国家电网有限公司 Multi-subject game incremental power distribution network source-network-load collaborative planning method
CN109034563A (en) * 2018-07-09 2018-12-18 国家电网公司 A kind of increment power distribution network source net lotus collaborative planning method of multi-agent Game
CN109217306A (en) * 2018-10-19 2019-01-15 三峡大学 A kind of intelligent power generation control method based on the deeply study with movement from optimizing ability
CN109784545A (en) * 2018-12-24 2019-05-21 深圳供电局有限公司 A kind of dispatching method of the distributed energy hinge based on multiple agent
CN109656140A (en) * 2018-12-28 2019-04-19 三峡大学 A kind of fractional order differential offset-type VSG control method
CN111934364A (en) * 2020-07-30 2020-11-13 国网甘肃省电力公司电力科学研究院 Emergency source network coordination peak regulation control method in fault state of transmitting-end power grid
CN112714165A (en) * 2020-12-22 2021-04-27 声耕智能科技(西安)研究院有限公司 Distributed network cooperation strategy optimization method and device based on combination mechanism
CN113128705A (en) * 2021-03-24 2021-07-16 北京科技大学顺德研究生院 Intelligent agent optimal strategy obtaining method and device
CN113128705B (en) * 2021-03-24 2024-02-09 北京科技大学顺德研究生院 Method and device for acquiring intelligent agent optimal strategy
CN113269297A (en) * 2021-07-19 2021-08-17 东禾软件(江苏)有限责任公司 Multi-agent scheduling method facing time constraint
CN114280931A (en) * 2021-12-14 2022-04-05 广东工业大学 Method for solving consistency of multiple intelligent agents based on intermittent random noise
CN116706997A (en) * 2023-06-12 2023-09-05 国网湖北省电力有限公司电力科学研究院 Cooperative control method, device and system for micro-grid group and storage medium

Similar Documents

Publication Publication Date Title
CN107045655A (en) Wolf pack clan strategy process based on the random consistent game of multiple agent and virtual generating clan
CN112615379B (en) Power grid multi-section power control method based on distributed multi-agent reinforcement learning
Khan et al. Multi-agent based distributed control architecture for microgrid energy management and optimization
CN106549394B (en) Electric power idle work optimization system and method based on double fish-swarm algorithms
CN104377826B (en) A kind of active distribution network control strategy and method
CN109936133B (en) Power system vulnerability analysis method considering information and physics combined attack
CN105006846B (en) A kind of wind energy turbine set station level active power optimization method
CN109217306A (en) A kind of intelligent power generation control method based on the deeply study with movement from optimizing ability
CN109103893A (en) A kind of cluster temperature control load participates in the auxiliary frequency modulation method of power grid AGC
CN103490413A (en) Intelligent electricity generation control method based on intelligent body equalization algorithm
Wang et al. Multiobjective reinforcement learning-based intelligent approach for optimization of activation rules in automatic generation control
CN104269873A (en) CSMA/CD-mechanism-referred micro-grid autonomous control method based on system health status evaluation
CN106712075A (en) Peaking strategy optimization method considering safety constraints of wind power integration system
CN105703355A (en) Diverse load grading self-discipline collaboration demand response method
Jordehi et al. Heuristic methods for solution of FACTS optimization problem in power systems
CN108767866A (en) Energy management method, apparatus and system
CN106340890B (en) For coordinating the distributed control method of power distribution network energy-storage system efficiency for charge-discharge
CN116169776A (en) Cloud edge cooperative artificial intelligent regulation and control method, system, medium and equipment for electric power system
Ebell et al. Reinforcement learning control algorithm for a pv-battery-system providing frequency containment reserve power
CN104836227B (en) The power distribution network active voltage control method of case-based reasioning
Beheshtikhoo et al. Design of type-2 fuzzy logic controller in a smart home energy management system with a combination of renewable energy and an electric vehicle
Sun et al. Hybrid reinforcement learning for power transmission network self-healing considering wind power
CN107589672A (en) The intelligent power generation control method of isolated island intelligent power distribution virtual wolf pack control strategy off the net
CN106372366A (en) Intelligent power generation control method based on hill-climbing algorithm
Ghasemi et al. Optimal placement and tuning of robust multimachine PSS via HBMO

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170815