CN115392766A - Demand side resource collaborative optimization scheduling method based on local power market - Google Patents


Info

Publication number
CN115392766A
Authority
CN
China
Prior art keywords
demand side
market
local
electricity
price
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211110254.XA
Other languages
Chinese (zh)
Inventor
赵博超
许彪
栾文鹏
刘博�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202211110254.XA
Publication of CN115392766A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315 Needs-based resource requirements planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Artificial Intelligence (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a demand side resource collaborative optimization scheduling method based on a local power market, comprising the following steps: establishing a demand side resource aggregation model for electric vehicle charging stations, heating, ventilation and air conditioning (HVAC) aggregators and microgrids; on the basis of the mid-market price, apportioning the network loss balancing cost according to network loss sensitivity, proposing a local power market pricing strategy that satisfies the revenue-expenditure balance requirement, and establishing a local power market that guides the cooperative operation of demand side resources; constructing the basic elements of a multi-agent reinforcement learning method based on the demand side resource aggregation model and the local power market; and training agents that represent different demand side resource aggregators with the MACSAC algorithm, then using each trained agent to make scheduling decisions for the aggregator it represents, thereby realizing distributed cooperative optimal scheduling of demand side resources. The method addresses the coordination of demand side resources owned by multiple stakeholders, and guarantees the revenue-expenditure balance of the local power market, the privacy of users and the safe and stable operation of the power distribution network.

Description

Demand side resource collaborative optimization scheduling method based on local power market
Technical Field
The invention relates to the field of power systems and automation thereof, in particular to a demand side resource collaborative optimization scheduling method based on a local power market.
Background
Demand side resources, including renewable energy generation, flexible demand and energy storage, are growing rapidly in today's power distribution networks and have great potential for voltage regulation, congestion management and carbon abatement. Activating this potential requires coordinated management of demand side resources. As a market-based mechanism, the local power market uses price signals to encourage users to trade power directly with one another, which promotes local supply-demand balance and achieves coordinated operation of demand side resources. Specifically, according to local supply and demand conditions, the local power market sets a selling price higher than the feed-in tariff and a buying price lower than the time-of-use price. This encourages users to adjust their power consumption curves and to trade their surplus or shortfall on the local power market, reducing their dependence on the power grid. Compared with centralized control, the local power market guides users indirectly: control decisions over equipment remain with the user, which preserves user autonomy and avoids privacy and security problems. However, existing work still has several shortcomings, as follows:
Because the influence of local power trading on network loss is neglected, the pricing mechanisms of local power markets suffer from unbalanced revenue and expenditure. Existing pricing mechanisms for the local power market often determine the local transaction price directly from the quantities reported by users and ignore the changes in net load and net generation caused by network loss. As a result, the local power market cannot recover through pricing the cost of balancing the network loss, and revenue and expenditure do not balance. In addition, how to allocate the network loss balancing cost fairly among users according to their contribution to the network loss is a key problem that must be solved.
The trading decisions of users in the local power market do not consider network security constraints, which may threaten the safe and stable operation of the power grid. Most studies assume that the generation and consumption of users in the local power market are small in scale and that their influence on the safe and stable operation of the power grid is negligible. However, because users are self-interested, electricity consumption plans may concentrate at the times of lowest price and generation plans at the times of highest price. This causes demand rebound or new generation peaks, which may in turn lead to problems such as node voltage violations and line congestion.
Existing distributed optimization algorithms struggle to meet the solving-efficiency requirement of users' optimal trading decisions in the local power market. The alternating direction method of multipliers (ADMM), a commonly used distributed optimization algorithm, helps users make trading decisions independently, but it needs a coordination center to coordinate all users. Consensus-based methods improve on ADMM and achieve fully decentralized optimization without a coordination center. However, both ADMM and its improved variants involve an iterative solution process, and their convergence on large-scale optimization problems remains to be verified.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a demand side resource collaborative optimization scheduling method based on a local power market, which mainly comprises the following steps:
S1, modeling three types of demand side resource aggregators, namely electric vehicle charging stations, heating, ventilation and air conditioning (HVAC) aggregators and microgrids, to obtain a demand side resource aggregation model;
S2, on the basis of the mid-market price, apportioning the network loss balancing cost according to network loss sensitivity, proposing a local power market pricing strategy that satisfies the revenue-expenditure balance requirement, and establishing a local power market that guides the cooperative operation of demand side resources;
S3, constructing the basic elements of the multi-agent reinforcement learning method based on the demand side resource aggregation model and the local power market, the basic elements comprising an agent, an environment, an observation value, an action value, a reward function and a cost function;
S4, training the agents representing different demand side resource aggregators with the multi-agent reinforcement learning algorithm MACSAC (multi-agent constrained soft actor-critic), and using each trained agent to make scheduling decisions for the demand side resource aggregator it represents, thereby realizing distributed cooperative optimal scheduling of demand side resources.
Further, the specific steps of step S1 include:
S1-1) The electric vehicle charging station is modeled as a virtual energy storage, and the virtual energy storage model of the electric vehicle charging station is expressed as:
Figure BDA0003843780860000021
In formula (1),
Figure BDA0003843780860000022
is the total power at time t of the virtual energy storage i that represents an electric vehicle charging station,
Figure BDA0003843780860000023
and
Figure BDA0003843780860000024
are its upper and lower power limits, respectively;
Figure BDA0003843780860000025
is the stored energy of the virtual energy storage that represents the electric vehicle charging station,
Figure BDA0003843780860000026
and
Figure BDA0003843780860000027
are its upper and lower energy limits, respectively; Δt is the time interval;
Figure BDA0003843780860000028
is the change in virtual stored energy caused by electric vehicles leaving or entering the charging station;
S1-2) The HVAC aggregator is modeled as a virtual energy storage, and the virtual energy storage model of the HVAC aggregator is expressed as:
Figure BDA0003843780860000029
In formula (2),
Figure BDA00038437808600000210
is the total power at time t of the virtual energy storage i that represents the HVAC aggregator,
Figure BDA00038437808600000211
and
Figure BDA00038437808600000212
are its upper and lower power limits, respectively;
Figure BDA00038437808600000213
is the stored energy of the virtual energy storage that represents the HVAC aggregator,
Figure BDA00038437808600000214
and
Figure BDA00038437808600000215
are its upper and lower energy limits, respectively; α is the energy decay rate of the virtual energy storage; in addition, the virtual energy storage that represents the HVAC aggregator has
Figure BDA00038437808600000216
as its reference load parameter;
S1-3) The microgrid comprises a photovoltaic system, an energy storage system and an inflexible load, wherein the photovoltaic system is modeled as:
Figure BDA0003843780860000031
Figure BDA0003843780860000032
the energy storage system is modeled as:
Figure BDA0003843780860000033
and the inflexible load is modeled as:
Figure BDA0003843780860000034
Figure BDA0003843780860000035
In formulae (4) to (8),
Figure BDA0003843780860000036
is the predicted photovoltaic active power output;
Figure BDA0003843780860000037
is the reactive power output of the photovoltaic inverter;
Figure BDA0003843780860000038
is the active power output of the photovoltaic system; σ is the power factor;
Figure BDA0003843780860000039
is the output power of energy storage i at time t;
Figure BDA00038437808600000310
and
Figure BDA00038437808600000311
are the upper and lower limits of the energy storage power, respectively;
Figure BDA00038437808600000312
is the stored energy of energy storage i at time t;
Figure BDA00038437808600000313
and
Figure BDA00038437808600000314
are the upper and lower limits of the stored energy, respectively; η+ and η- are the charging and discharging efficiencies of the energy storage, respectively;
Figure BDA00038437808600000315
and
Figure BDA00038437808600000316
are the active power of the inflexible load and its predicted value, respectively;
Figure BDA00038437808600000317
and
Figure BDA00038437808600000318
are the reactive power of the inflexible load and its predicted value, respectively.
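The virtual energy storage models of S1-1) and S1-2) can be illustrated with a minimal Python sketch. Formulas (1) and (2) are not reproduced in this text, so the update rule below is an assumption: stored energy evolves as retention × energy + power × Δt + an exogenous change, subject to power and energy limits. The class name `VirtualStorage`, its fields, and the way the decay rate α enters (here as a multiplicative `retention` factor) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualStorage:
    """Illustrative virtual energy storage (assumed form, not the patent's formula)."""
    energy: float           # current stored energy, e.g. kWh
    e_min: float            # lower energy limit
    e_max: float            # upper energy limit
    p_min: float            # lower power limit (negative = discharging)
    p_max: float            # upper power limit
    retention: float = 1.0  # 1.0 = no decay; an HVAC model would use a value < 1

    def step(self, power: float, dt: float, delta_e: float = 0.0) -> float:
        """Advance one interval and return the power actually applied.

        delta_e models the exogenous energy change, e.g. electric vehicles
        arriving at or leaving a charging station.
        """
        base = self.retention * self.energy + delta_e
        # Tightest power range that respects both power and energy limits.
        lo = max(self.p_min, (self.e_min - base) / dt)
        hi = min(self.p_max, (self.e_max - base) / dt)
        if lo > hi:
            raise ValueError("no feasible power for this interval")
        applied = min(max(power, lo), hi)
        self.energy = base + applied * dt
        return applied
```

For example, a station with 50 kWh stored, limits of 0 to 100 kWh and ±20 kW, asked to charge at 30 kW for one hour, is clipped to 20 kW and ends the interval at 70 kWh.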
In step S2 of the method, the local power market is formed by the demand side resource aggregators, acting as market members, together with a trading platform. The local power market pricing strategy that satisfies the revenue-expenditure balance requirement comprises: determining the base electricity price; determining the network loss allocation price; and combining the two to determine the local electricity purchase price and the local electricity selling price. The local power market that guides the cooperative operation of demand side resources is built on this pricing strategy: the local purchase and selling prices are determined adaptively from the supply-demand balance on the local power market, and each demand side resource aggregator makes its optimal scheduling decision according to these prices, thereby realizing cooperative operation of demand side resources. The specific content of step S2 is as follows:
S2-1) Determining the base electricity price: first, each market member determines its electricity purchase or sale quantity on the local power market according to the scheduling decision for the demand side resources it manages:
Figure BDA00038437808600000319
In formula (9),
Figure BDA00038437808600000320
is the electricity purchase or sale quantity submitted by demand side resource aggregator i to the local power market at time t; Ω_EV, Ω_HVAC and Ω_MG are the sets of electric vehicle charging stations, HVAC aggregators and microgrids in the market, respectively; the local power market then calculates the total demand on the market
Figure BDA0003843780860000041
the total generation
Figure BDA0003843780860000042
and the net demand
Figure BDA0003843780860000043
Figure BDA0003843780860000044
In formula (10),
Figure BDA0003843780860000045
is the increment in network loss caused by local power market trading; Ω is the set of all market members; according to the definition of the mid-market price, the base electricity purchase price
Figure BDA0003843780860000046
and the base electricity selling price
Figure BDA0003843780860000047
are calculated as:
Figure BDA0003843780860000048
Figure BDA0003843780860000049
In formula (12),
Figure BDA00038437808600000410
and
Figure BDA00038437808600000411
are the time-of-use price and the feed-in tariff of the grid electricity price, respectively;
S2-2) Determining the network loss allocation price: based on the network loss sensitivity, the network loss allocation price of each node is calculated:
Figure BDA00038437808600000412
In formula (13),
Figure BDA00038437808600000413
is the network loss sensitivity coefficient of node i;
S2-3) Determining the local electricity purchase price and the local electricity selling price by combining the base electricity price and the network loss allocation price:
Figure BDA00038437808600000414
Figure BDA00038437808600000415
In formulas (14) and (15),
Figure BDA00038437808600000416
is the local electricity purchase price of a market member located at node i;
Figure BDA00038437808600000417
is the local electricity selling price of a market member located at node j;
S2-4) The local power market operates as follows: at the start of each trading round, the trading platform first publishes to the market members the current grid electricity price together with the local purchase and selling prices of the previous round; each market member makes a demand side resource scheduling decision, determines its electricity purchase and sale quantities accordingly, and submits them to the trading platform; the trading platform clears the local power market using the local power market pricing strategy to obtain the local purchase and selling prices; the next trading round then begins. Cooperative operation of demand side resources is thereby realized.
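The base price step of S2-1) can be sketched in Python. Formulas (11) and (12) are not reproduced in this text, so the sketch assumes the commonly used mid-market-rate rule: the mid price is the average of the time-of-use price and the feed-in tariff, and whichever side of the market is in surplus absorbs the cost of trading the imbalance with the grid. The network loss increment and the loss allocation price of S2-2) are deliberately left out. The function name `mid_market_prices` and its parameters are hypothetical.

```python
def mid_market_prices(demand, generation, tou_price, fit_price):
    """Return (buy_price, sell_price) under an assumed mid-market-rate rule.

    demand, generation: total local demand and generation for the interval
    tou_price: grid time-of-use price; fit_price: grid feed-in tariff
    Network loss is ignored here (a simplification, not the patent's method).
    """
    mid = 0.5 * (tou_price + fit_price)
    if generation > demand:
        # Surplus is exported at the feed-in tariff; sellers share the lower revenue.
        sell = (demand * mid + (generation - demand) * fit_price) / generation
        return mid, sell
    if demand > generation:
        # Shortfall is imported at the time-of-use price; buyers share the higher cost.
        buy = (generation * mid + (demand - generation) * tou_price) / demand
        return buy, mid
    return mid, mid


def platform_balance(demand, generation, tou_price, fit_price):
    """Net cash position of the trading platform; zero means revenue equals expenditure."""
    buy, sell = mid_market_prices(demand, generation, tou_price, fit_price)
    grid_import = max(0.0, demand - generation)
    grid_export = max(0.0, generation - demand)
    income = demand * buy + grid_export * fit_price
    outlay = generation * sell + grid_import * tou_price
    return income - outlay
```

With a time-of-use price of 0.6 and a feed-in tariff of 0.2, a market with demand 100 and generation 50 gives a buy price of 0.5 and a sell price of 0.4, and the platform's balance is zero. Once network loss is included, this simple rule no longer balances on its own, which is the gap the loss allocation price in S2-2) is meant to close.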
In step S3 of the method, constructing the basic elements of the multi-agent reinforcement learning method includes:
S3-1) Agent: an agent is initialized for each demand side resource aggregator; the agent comprises a reward critic network, a cost critic network and an actor network, wherein the reward critic network and the cost critic network guide the training of the actor network, and the actor network outputs the aggregator's scheduling decision for the demand side resources it manages;
S3-2) Environment: the local power market and the power distribution network together form the environment that interacts with the agents; the interaction proceeds as follows: after receiving the electricity purchase and sale quantities submitted by the demand side resource aggregators, the local power market clears the market using the proposed local power market pricing strategy to obtain the local purchase and selling prices, and publishes them to each demand side resource aggregator; meanwhile, the distribution network operation platform feeds back to each demand side resource aggregator the operating state of the distribution network at the current moment, including node voltages and branch power;
S3-3) Observation value: the observation value of each agent is defined from the monitored equipment operating states in the demand side resource aggregation model established in step S1 and the price information that the environment sends to the agent, the price information comprising the grid electricity price, the local electricity purchase price and the local electricity selling price;
for agent i representing a microgrid, the set of observation values at time t is:
Figure BDA0003843780860000051
Figure BDA0003843780860000052
for agent j representing an electric vehicle charging station, the set of observation values at time t is:
Figure BDA0003843780860000053
Figure BDA0003843780860000054
for agent k representing an HVAC aggregator, the set of observation values at time t is:
Figure BDA0003843780860000055
Figure BDA0003843780860000056
S3-4) Action value: the action value of each agent is defined from the control variables of the equipment in the demand side resource aggregation model established in step S1; the action values of an agent constitute the demand side resource aggregator's scheduling decision for the demand side resources it manages; for agent i representing a microgrid, the set of action values is
Figure BDA0003843780860000057
for agent j representing an electric vehicle charging station, the action value is
Figure BDA0003843780860000058
and for agent k representing an HVAC aggregator, the action value is
Figure BDA0003843780860000059
S3-5) Reward function: the reward function of each agent is defined as its electricity sales revenue minus its electricity purchase expenditure on the local power market; its value is calculated by the local power market in the environment and fed back to the agent:
Figure BDA00038437808600000510
The operators in formula (16) are defined as [x]+ = max(0, x) and [x]- = min(0, x);
S3-6) Cost function: the cost function is defined from the safe operation constraints of the power distribution network; its value is calculated by the distribution network operation platform in the environment and fed back to the agent:
Figure BDA00038437808600000511
In formula (17), V_n is the voltage of node n;
Figure BDA00038437808600000512
and V (underlined) are the upper and lower limits of the node voltage, respectively; N_i is the set of nodes in the distribution network partition to which agent i belongs.
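The reward and cost functions of S3-5) and S3-6) can be sketched in Python. Formulas (16) and (17) are not reproduced in this text, so the sketch rests on stated assumptions: a positive traded quantity means a purchase and a negative one a sale, the reward is sales revenue minus purchase expenditure, and the cost is the summed magnitude of voltage limit violations over the agent's nodes. The function names and the sign convention are hypothetical.

```python
def pos(x):
    """[x]+ operator: max(0, x)."""
    return max(0.0, x)


def neg(x):
    """[x]- operator: min(0, x)."""
    return min(0.0, x)


def trading_reward(quantity, buy_price, sell_price):
    """Sales revenue minus purchase expenditure for one interval.

    Assumed convention: quantity > 0 is energy bought, quantity < 0 is energy sold.
    """
    expenditure = buy_price * pos(quantity)
    revenue = -sell_price * neg(quantity)
    return revenue - expenditure


def voltage_cost(voltages, v_min, v_max):
    """Sum of voltage limit violations over the nodes in an agent's partition."""
    return sum(pos(v - v_max) + pos(v_min - v) for v in voltages)
```

Under these assumptions, buying 10 units at a purchase price of 0.5 yields a reward of -5, and selling 10 units at a selling price of 0.3 yields a reward of 3. A cost of zero means every node voltage in the partition is within limits.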
In step S4 of the method, the agents representing different demand side resource aggregators are trained with the multi-agent reinforcement learning algorithm MACSAC, which specifically comprises the following steps:
S4-1) for each agent, initializing the parameters of the actor network, the reward critic network and the cost critic network, together with an empty experience replay buffer;
S4-2) inputting the observation value of each agent in the environment into its actor network to obtain an action value;
S4-3) each market member schedules the demand side resources it manages according to the action value given by its agent, calculates its electricity purchase and sale quantities and submits them to the local power market; the local power market clears the market according to the pricing mechanism of step S2 and feeds back the reward function value and price information to the market members, and the distribution network operation platform feeds back the cost function value to the market members;
S4-4) storing the action value, observation value, reward function value and cost function value generated by the interaction between the agent and the environment as a sample in the experience replay buffer;
S4-5) randomly drawing a batch of samples from the experience replay buffer and updating the parameters of the reward critic network and the cost critic network of each agent;
S4-6) estimating the expected future cumulative reward and expected cumulative cost of each agent with the reward critic network and the cost critic network, and using these estimates to guide the update of each agent's actor network;
S4-7) repeating steps S4-2) to S4-6) until the number of parameter updates reaches a set value, thereby obtaining the trained agents.
When the method of the invention is used for demand side resource collaborative optimization scheduling, each agent, after receiving the observation value at the current moment, outputs an action value through its actor network, and the corresponding demand side resource aggregator schedules the demand side resources it manages according to that action value, thereby realizing distributed collaborative optimal scheduling of demand side resources.
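The data flow of steps S4-1) to S4-7) can be sketched as a Python skeleton. This is not an implementation of MACSAC: the critic and actor updates are replaced by a placeholder, and `RandomAgent` and `ToyEnv` are hypothetical stand-ins for the trained networks and for the local power market plus distribution network. Only the replay buffer and the loop structure are meant to be illustrative.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (obs, action, reward, cost, next_obs) samples."""

    def __init__(self, capacity):
        self.samples = deque(maxlen=capacity)

    def add(self, obs, action, reward, cost, next_obs):
        self.samples.append((obs, action, reward, cost, next_obs))

    def sample(self, batch_size):
        k = min(batch_size, len(self.samples))
        return random.sample(list(self.samples), k)

    def __len__(self):
        return len(self.samples)


class RandomAgent:
    """Hypothetical stand-in: random actions, update() only counts calls."""

    def __init__(self):
        self.updates = 0

    def act(self, obs):
        return random.uniform(-1.0, 1.0)

    def update(self, batch):
        # A real agent would update its reward critic, cost critic and actor here.
        self.updates += 1


class ToyEnv:
    """Hypothetical stand-in for the local market plus distribution network."""

    def __init__(self, names):
        self.names = names

    def reset(self):
        return {n: 0.0 for n in self.names}

    def step(self, actions):
        next_obs = dict(actions)
        rewards = {n: -abs(a) for n, a in actions.items()}
        costs = {n: max(0.0, abs(a) - 0.9) for n, a in actions.items()}
        return next_obs, rewards, costs


def run_training(agents, env, buffers, n_updates, batch_size):
    obs = env.reset()
    for _ in range(n_updates):                                            # S4-7
        actions = {n: agent.act(obs[n]) for n, agent in agents.items()}  # S4-2
        next_obs, rewards, costs = env.step(actions)                      # S4-3
        for n, agent in agents.items():
            buffers[n].add(obs[n], actions[n], rewards[n], costs[n], next_obs[n])  # S4-4
            agent.update(buffers[n].sample(batch_size))                   # S4-5, S4-6
        obs = next_obs
```

Each agent keeps its own buffer and only sees its own observations, rewards and costs, which mirrors the decentralized decision-making described above.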
Compared with the prior art, the invention has the beneficial effects that:
The invention provides a demand side resource collaborative optimization scheduling method based on a local power market. A market-based mechanism guides demand side resources to actively adjust their power curves so as to achieve local supply-demand balance, which addresses the coordination of demand side resources owned by multiple stakeholders. A local power market pricing mechanism that allocates network loss fairly is designed, which guarantees the revenue-expenditure balance of the local power market. The demand side resource collaborative optimization scheduling problem is modeled as a constrained Markov game and solved in a decentralized manner with the MACSAC algorithm, which protects user privacy and ensures the safe and stable operation of the power distribution network.
Drawings
FIG. 1 is a flow chart of the scheduling method of the present invention;
FIG. 2 is the 33-node power distribution network used in an exemplary embodiment of the present invention;
FIG. 3 shows the reward convergence curves and the voltage violation cost curves of the MACSAC and MASAC algorithms;
FIG. 4 is a box plot of node voltages in an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the following figures and specific examples, which are not intended to limit the invention in any way.
The design concept of the invention is as follows. A demand side resource collaborative optimization scheduling framework based on a local power market is constructed: the demand side resource aggregators and the microgrids act as market members and, together with the trading platform, form the local power market. Before each trading round, the trading platform first publishes the grid electricity price to the market members. Each market member makes a scheduling plan for the demand side resources it manages according to the grid electricity price and submits its electricity purchase and sale plan to the trading platform. The trading platform clears the local power market using the pricing mechanism and feeds the resulting local purchase and selling prices back to the market members. The market members continuously improve their trading decision models according to the local supply and demand conditions reflected in these prices and optimize their future purchase and sale plans, thereby achieving coordinated operation among demand side resources. Within this framework, the original problem of demand side resource collaborative optimization scheduling is converted into the optimal trading decision problem of the demand side resource aggregators in the local power market. The method of the invention, as shown in FIG. 1, comprises the following steps:
S1, first constructing a demand side resource collaborative optimization scheduling framework based on a local power market, so that the demand side resource collaborative optimization scheduling problem is converted into the optimal trading decision problem of each demand side resource aggregator, and modeling three types of demand side resource aggregators, namely electric vehicle charging stations, heating, ventilation and air conditioning (HVAC) aggregators and microgrids, to obtain a demand side resource aggregation model;
S2, on the basis of the mid-market price, apportioning the network loss balancing cost according to network loss sensitivity, proposing a local power market pricing strategy that satisfies the revenue-expenditure balance requirement, and establishing a local power market that guides the cooperative operation of demand side resources through price signals;
S3, constructing the basic elements of the multi-agent reinforcement learning method based on the demand side resource aggregation model and the local power market, the basic elements comprising agents, an environment, observation values, action values, reward functions and cost functions; each demand side resource aggregator in the market is modeled by one of a plurality of agents; specifically, the state variables, action variables and reward function of each agent are defined from the operating characteristics, control characteristics and economic parameters of the demand side resources, the cost function is defined from the safe operation constraints of the power distribution network, and the trading process of the demand side resource aggregators on the local power market is described as the interaction between the agents and the environment, so that the optimal trading decision problem of the demand side resource aggregators on the local power market is modeled as a constrained Markov game;
S4, training the agents representing different demand side resource aggregators with the multi-agent reinforcement learning algorithm MACSAC (multi-agent constrained soft actor-critic), and using each trained agent to make scheduling decisions for the demand side resource aggregator it represents, wherein each demand side resource aggregator makes its trading decisions independently through its own agent, thereby realizing distributed cooperative optimal scheduling of demand side resources.
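The Markov game with a separate cost function described in S3 and S4 is commonly written, for each agent i, as a constrained optimization of the policy, with a Lagrangian relaxation often used to train it. The following is a sketch of that standard form, not a formula taken from the patent; the discount factor γ, horizon T, cost threshold d_i and multiplier λ_i are assumed symbols introduced here for illustration.

```latex
\max_{\pi_i}\ \mathbb{E}\!\left[\sum_{t=0}^{T} \gamma^{t}\, r_{i,t}\right]
\quad \text{s.t.} \quad
\mathbb{E}\!\left[\sum_{t=0}^{T} \gamma^{t}\, c_{i,t}\right] \le d_i ,
\qquad
L(\pi_i, \lambda_i) = \mathbb{E}\!\left[\sum_{t=0}^{T} \gamma^{t}\left(r_{i,t} - \lambda_i\, c_{i,t}\right)\right] + \lambda_i d_i ,
\quad \lambda_i \ge 0 .
```

Here r_{i,t} corresponds to the trading reward of S3-5) and c_{i,t} to the voltage violation cost of S3-6); maximizing L over the policy while minimizing it over λ_i penalizes policies whose expected cumulative cost exceeds d_i.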
In the invention, three types of demand side resource aggregators, namely an electric vehicle charging station, a heating ventilation air conditioner aggregator and a microgrid, are modeled, wherein the electric vehicle charging station and the heating ventilation air conditioner aggregator are modeled by adopting a virtual energy storage model, and the specific formula is as follows:
(1) The electric vehicle charging station is modeled using a virtual energy storage model, which can be expressed as:
Figure BDA0003843780860000081
where
Figure BDA0003843780860000082
denotes the total power of virtual energy storage i, representing the electric vehicle charging station, at time t;
Figure BDA0003843780860000083
and
Figure BDA0003843780860000084
are the upper and lower power limits, respectively;
Figure BDA0003843780860000085
denotes the energy of the virtual energy storage representing the electric vehicle charging station;
Figure BDA0003843780860000086
and
Figure BDA0003843780860000087
are the upper and lower energy limits, respectively; Δt denotes the time interval;
Figure BDA0003843780860000088
denotes the change in virtual stored energy caused by electric vehicles leaving or entering the charging station.
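The way such a virtual energy storage is typically stepped forward in a simulation can be sketched as follows. The formula itself is an image that is not reproduced in this text, so the update rule E(t+1) = E(t) + P(t)·Δt + ΔE(t), the clipping of the requested power, and the names `VirtualStorage` and `step` are all assumptions made for illustration, not the exact formulation of the invention.

```python
from dataclasses import dataclass


@dataclass
class VirtualStorage:
    """Illustrative virtual energy storage standing in for an EV charging station."""
    p_min: float   # lower power limit, kW
    p_max: float   # upper power limit, kW
    e_min: float   # lower energy limit, kWh
    e_max: float   # upper energy limit, kWh
    energy: float  # current stored energy, kWh

    def step(self, power: float, dt: float, delta_e: float = 0.0) -> float:
        """Apply a charging power for one interval and return the power actually used.

        delta_e is the assumed energy change caused by vehicles arriving or leaving.
        The requested power is clipped so that the power and energy limits both hold.
        """
        base = self.energy + delta_e
        # Power range that keeps the next energy level inside [e_min, e_max].
        p_lo = max(self.p_min, (self.e_min - base) / dt)
        p_hi = min(self.p_max, (self.e_max - base) / dt)
        applied = min(max(power, p_lo), p_hi)
        self.energy = base + applied * dt
        return applied
```

In this sketch a nearly full station that is asked to charge at full power receives only as much power as fits under the energy limit, which is how the power and energy bounds interact in one time step.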
(2) The heating, ventilation and air conditioning (HVAC) aggregator is likewise modeled using a virtual energy storage model, which can be expressed as:
Figure BDA0003843780860000089
where
Figure BDA00038437808600000810
denotes the total power of virtual energy storage i, representing the HVAC aggregator, at time t;
Figure BDA00038437808600000811
and
Figure BDA00038437808600000812
are the upper and lower power limits, respectively;
Figure BDA00038437808600000813
denotes the energy of the virtual energy storage representing the HVAC aggregator;
Figure BDA00038437808600000814
and
Figure BDA00038437808600000815
are the upper and lower energy limits, respectively; α is the energy decay rate of the virtual energy storage; in addition, the virtual energy storage representing the HVAC aggregator has a reference load parameter
Figure BDA00038437808600000816
.
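A decaying virtual energy storage with a reference load can be sketched as below. Because the formula is an image that is not reproduced here, the recursion E(t+1) = (1 − α)·E(t) + (P(t) − P_ref(t))·Δt is an assumed common form, and the function names are hypothetical.

```python
def hvac_next_energy(energy, power, reference_load, alpha, dt):
    """Assumed update: stored energy decays at rate alpha and changes with
    the deviation of the actual power from the reference load."""
    return (1.0 - alpha) * energy + (power - reference_load) * dt


def hvac_is_feasible(energy, power, e_min, e_max, p_min, p_max):
    """Check the power and energy limits of the virtual energy storage."""
    return p_min <= power <= p_max and e_min <= energy <= e_max
```

Under this assumed form, running the air conditioning above its reference load stores energy (for example as a cooler building), and the decay term models that this stored energy dissipates over time.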
(3) The microgrid comprises a photovoltaic system, an energy storage system and an inflexible load, wherein the photovoltaic system can be modeled as:
Figure BDA00038437808600000817
Figure BDA00038437808600000818
the energy storage system can be modeled as:
Figure BDA00038437808600000819
the inflexible load can be modeled as:
Figure BDA00038437808600000820
Figure BDA00038437808600000821
where
Figure BDA0003843780860000091
is the predicted active power output of the photovoltaic system;
Figure BDA0003843780860000092
is the reactive power output of the photovoltaic inverter;
Figure BDA0003843780860000093
is the active power output of the photovoltaic system; σ is the power factor;
Figure BDA0003843780860000094
is the output power of energy storage i at time t;
Figure BDA0003843780860000095
and
Figure BDA0003843780860000096
are the upper and lower limits of the energy storage power, respectively;
Figure BDA0003843780860000097
is the energy of energy storage i at time t;
Figure BDA0003843780860000098
and
Figure BDA0003843780860000099
are the upper and lower limits of the stored energy, respectively; $\eta^{+}$ and $\eta^{-}$ are the charging and discharging efficiencies of the energy storage, respectively;
Figure BDA00038437808600000910
and
Figure BDA00038437808600000911
are the active power of the inflexible load and its predicted value;
Figure BDA00038437808600000912
and
Figure BDA00038437808600000913
are the reactive power of the inflexible load and its predicted value.
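The roles of the charging and discharging efficiencies and of the power factor can be illustrated with a short sketch. The formulas are images that are not reproduced here, so the following are assumptions: positive storage power means charging, the energy update is E(t+1) = E(t) + (η⁺·[P]⁺ + [P]⁻/η⁻)·Δt, and the inverter reactive power is bounded by P·tan(arccos σ). The function names are hypothetical.

```python
import math


def storage_next_energy(energy, power, eta_ch, eta_dis, dt):
    """Assumed battery update: charging is scaled down by eta_ch,
    discharging drains more than is delivered (divided by eta_dis)."""
    charge = max(0.0, power)      # [P]^+
    discharge = min(0.0, power)   # [P]^-
    return energy + (eta_ch * charge + discharge / eta_dis) * dt


def pv_reactive_limit(p_pv, power_factor):
    """Assumed bound on inverter reactive power for a minimum power factor."""
    return p_pv * math.tan(math.acos(power_factor))
```

With these conventions, charging at 10 kW for one hour at 90 % efficiency adds 9 kWh, while delivering 9 kW for one hour at 90 % discharging efficiency removes 10 kWh.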
In the invention, the local power market consists of the demand side resource aggregators, acting as market members, and a trading platform. The local power market pricing strategy that satisfies the revenue-expenditure balance requirement comprises: determining the basic electricity price; determining the network loss allocation price; and determining the local electricity purchase price and local electricity selling price by combining the basic electricity price and the network loss allocation price. The local power market that guides the cooperative operation of demand side resources is established on the basis of this pricing strategy: the local electricity purchase and selling prices are determined adaptively according to the supply-demand balance on the local power market, and each demand side resource aggregator makes its optimal scheduling decision according to these prices, so that the demand side resources operate cooperatively.
The invention designs a local power market pricing mechanism that satisfies the revenue-expenditure balance requirement, in which the price comprises two parts, the basic electricity price and the network loss allocation price. The specific calculation is as follows:
(1) Determining the basic electricity price: first, the bid quantity of each type of market member is defined. Each market member determines its electricity purchase or sale quantity on the local power market according to the scheduling decisions for the demand side resources it manages:
Figure BDA00038437808600000914
where
Figure BDA00038437808600000915
represents the electricity purchase or sale quantity submitted by demand side resource aggregator i on the local power market at time t; $\Omega_{EV}$, $\Omega_{HVAC}$ and $\Omega_{MG}$ denote the sets of electric vehicle charging stations, HVAC aggregators and microgrids in the market, respectively.
Then three quantities are calculated for the market: the total demand
Figure BDA00038437808600000916
the total generation
Figure BDA00038437808600000917
and the net demand
Figure BDA00038437808600000918
as follows:
Figure BDA00038437808600000919
where
Figure BDA00038437808600000920
represents the increase in network loss caused by local power market trading, and Ω represents the set of all market members.
The basic electricity purchase price
Figure BDA00038437808600000921
and the basic electricity selling price
Figure BDA00038437808600000922
are then calculated according to the definition of the intermediate market price (mid-market rate):
Figure BDA00038437808600000923
Figure BDA0003843780860000101
In formula (12),
Figure BDA0003843780860000102
and
Figure BDA0003843780860000103
are the time-of-use electricity price and the feed-in tariff in the grid electricity price, respectively.
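For orientation, the widely used textbook form of the mid-market rate can be sketched as follows. This is not the formula of the invention, which is an image not reproduced here and which additionally accounts for the network loss increment in the net demand; the function name and argument names are hypothetical.

```python
def mid_market_prices(total_demand, total_generation, tou_price, feed_in_tariff):
    """Return (basic purchase price, basic selling price) under the
    standard mid-market rate, ignoring network losses."""
    mid = (tou_price + feed_in_tariff) / 2.0
    if total_generation > total_demand:
        # Surplus: buyers pay the mid price; the excess generation is sold to
        # the grid at the feed-in tariff, lowering the average selling price.
        surplus = total_generation - total_demand
        sell = (total_demand * mid + surplus * feed_in_tariff) / total_generation
        return mid, sell
    if total_demand > total_generation:
        # Shortage: sellers receive the mid price; the deficit is bought from
        # the grid at the time-of-use price, raising the average purchase price.
        shortage = total_demand - total_generation
        buy = (total_generation * mid + shortage * tou_price) / total_demand
        return buy, mid
    return mid, mid
```

When losses are ignored, what buyers pay equals what sellers receive plus the net payment to the grid, so the platform breaks even. Once trading changes the network loss this no longer holds, which is the gap that the network loss allocation price described next is intended to close.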
(2) Determining the network loss allocation price:
Based on the network loss sensitivity, the network loss allocation price of each node is calculated:
Figure BDA0003843780860000104
where
Figure BDA0003843780860000105
is the network loss sensitivity coefficient corresponding to node i.
(3) Determining the local electricity purchase price and the local electricity selling price by combining the basic electricity price and the network loss allocation price:
Figure BDA0003843780860000106
Figure BDA0003843780860000107
where
Figure BDA0003843780860000108
represents the local electricity purchase price of a market member located at node i, and
Figure BDA0003843780860000109
represents the local electricity selling price of a market member located at node j.
The operating mechanism of the local power market is as follows: at the start of each trading round, the trading platform first publishes the current grid electricity price and the local electricity purchase and selling prices of the previous round to the market members; each market member makes a demand side resource scheduling decision, determines its electricity purchase and sale quantities accordingly, and submits them to the trading platform; the trading platform clears the local power market using the local power market pricing strategy to obtain the local electricity purchase and selling prices; the market then proceeds to the next round. In this way the cooperative operation of the demand side resources is realized.
In the invention, the optimal trading decision problem of the demand side resource aggregators in the local power market is modeled as a constrained Markov game. In this game, each agent corresponds to one demand side resource aggregator, and the environment interacting with the agents consists of the power distribution network and the local power market. The complete process is as follows: at each time step t, each agent receives its own observation $o_{i,t}$ and selects an action $a_{i,t}$ according to its policy function $\pi_i(\cdot \mid o_{i,t})$; after all agents have taken their actions, each agent receives from the environment a reward $r_{i,t}$ and a cost
Figure BDA00038437808600001010
The environment then transitions to the next state, each agent receives a new observation $o_{i,t+1}$, and the process continues. The basic elements of the multi-agent reinforcement learning method are defined as follows:
(1) Agent:
An agent is initialized for each demand side resource aggregator. Each agent consists of a reward evaluator network, a cost evaluator network and an actor network; the reward evaluator network and the cost evaluator network guide the training of the actor network, and the actor network outputs the demand side resource aggregator's scheduling decisions for the demand side resources it manages.
(2) Environment:
The local power market and the power distribution network together form the environment that interacts with the agents. The interaction proceeds as follows: after receiving the electricity purchase and sale quantities submitted by the demand side resource aggregators, the local power market clears the market using the proposed pricing strategy to obtain the local electricity purchase and selling prices, and publishes them to each aggregator; meanwhile, the power distribution network operation platform feeds back the current operating state of the distribution network, including node voltages and branch powers, to each aggregator.
(3) Observation variables:
The observation of each agent is defined from the monitored equipment operating states in the established demand side resource aggregation model and the price information sent to the agent by the environment, where the price information comprises the grid electricity price, the local electricity purchase price and the local electricity selling price.
For agent i representing the microgrid, the set of observations at time t is:
Figure BDA0003843780860000111
Figure BDA0003843780860000112
For agent j representing the electric vehicle charging station, the set of observations at time t is:
Figure BDA0003843780860000113
Figure BDA0003843780860000114
For agent k representing the HVAC aggregator, the set of observations at time t is:
Figure BDA0003843780860000115
(4) Action variables:
The action value of each agent is defined from the control variables of the equipment in the established demand side resource aggregation model; the action values of an agent constitute the demand side resource aggregator's scheduling decision for the demand side resources it manages.
For agent i representing the microgrid, the set of action variables is
Figure BDA0003843780860000116
For agent j representing the electric vehicle charging station, the action variable is
Figure BDA0003843780860000117
For agent k representing the aggregator of heating, ventilating and air conditioning, the action variable is
Figure BDA0003843780860000118
(5) The reward function:
The reward function of each agent is defined as its electricity sales revenue minus its electricity purchase expenditure on the local power market; the reward function value is calculated by the local power market in the environment and fed back to the agent. The reward function is expressed as:
Figure BDA0003843780860000119
The operators in the formula are defined as $[x]^{+} = \max(0, x)$ and $[x]^{-} = \min(0, x)$.
(6) Cost function:
The cost function is defined according to the safe operation constraints of the power distribution network, and its value is calculated by the power distribution network operation platform in the environment and fed back to the agents. The cost function of each agent is defined as the voltage out-of-limit penalty within the distribution network partition to which the agent belongs, expressed as:
Figure BDA00038437808600001110
where $V_n$ is the voltage of node n;
Figure BDA00038437808600001111
and $\underline{V}$ are the upper and lower limits of the node voltage, respectively; $N_i$ is the set of nodes in the power distribution network partition to which agent i belongs.
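The reward and cost definitions above can be sketched in a few lines. The exact expressions are images that are not reproduced here, so the following are assumptions: positive net power means a purchase from the local market, the cost is the summed magnitude of the voltage violations, and the default limits 0.95 and 1.05 p.u. are taken from the safety range quoted in the case study. All function names are hypothetical.

```python
def pos(x):
    """[x]^+ = max(0, x)"""
    return max(0.0, x)


def neg(x):
    """[x]^- = min(0, x)"""
    return min(0.0, x)


def trading_reward(net_power, buy_price, sell_price, dt=1.0):
    """Sales revenue minus purchase expenditure for one interval.

    Assumed sign convention: net_power > 0 is a purchase, net_power < 0 a sale.
    """
    expenditure = buy_price * pos(net_power) * dt
    revenue = -sell_price * neg(net_power) * dt
    return revenue - expenditure


def voltage_cost(voltages, v_min=0.95, v_max=1.05):
    """Assumed penalty: total amount by which node voltages in the agent's
    partition exceed the upper limit or fall below the lower limit."""
    return sum(pos(v - v_max) + pos(v_min - v) for v in voltages)
```

Keeping the voltage penalty in a separate cost signal, rather than subtracting it from the reward, is what allows a constrained algorithm to drive the cost toward zero without hand-tuning a penalty weight.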
In the invention, the multi-agent constrained soft actor-critic (MACSAC) algorithm is used to train the agents representing different demand side resource aggregators. The specific steps are as follows:
1) For each agent, initialize the parameters of the actor network, the reward evaluator network and the cost evaluator network, together with an empty experience replay pool.
2) Input each agent's observation of the local power market into its actor network to obtain an action value.
3) Each market member schedules the demand side resources it manages according to the action value given by its agent, calculates its electricity purchase and sale quantities, and submits them to the local power market (the environment). The local power market clears the market according to the proposed pricing mechanism and feeds back the reward function values, price information and new observations to the market members, while the power distribution network operation platform feeds back the cost function values.
4) Store the action values, observation values, reward function values and cost function values generated by the interaction between the agents and the environment in the experience replay pool as samples.
5) Randomly draw a batch of samples from the experience replay pool and update the reward evaluator and cost evaluator network parameters of each agent.
6) Use the reward evaluator network and the cost evaluator network to estimate each agent's expected future cumulative reward and expected cumulative cost, and use these estimates to guide the update of each agent's actor network.
7) Repeat steps 2) to 6) until the number of parameter updates reaches a set value, which yields the trained agents.
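The control flow of steps 1) to 7) can be sketched as a skeleton. It deliberately leaves out the neural networks and the actual MACSAC update rules, which the text does not spell out. The `env` and `agents` objects and their methods (`reset`, `step`, `act`, `update_critics`, `update_actor`) are hypothetical interfaces assumed for illustration, and episode boundaries are ignored for brevity.

```python
import random
from collections import deque


class ReplayBuffer:
    """Experience replay pool holding joint transitions of all agents."""

    def __init__(self, capacity):
        self.samples = deque(maxlen=capacity)

    def store(self, obs, actions, rewards, costs, next_obs):
        self.samples.append((obs, actions, rewards, costs, next_obs))

    def sample(self, batch_size):
        return random.sample(list(self.samples), batch_size)

    def __len__(self):
        return len(self.samples)


def train(env, agents, buffer, max_updates, batch_size):
    """Run the interaction and update loop; return the number of updates done."""
    updates = 0
    obs = env.reset()
    while updates < max_updates:
        # Step 2: each actor maps its own observation to an action.
        actions = [agent.act(o) for agent, o in zip(agents, obs)]
        # Step 3: market clearing and grid feedback happen inside the environment.
        next_obs, rewards, costs = env.step(actions)
        # Step 4: store the joint transition.
        buffer.store(obs, actions, rewards, costs, next_obs)
        obs = next_obs
        if len(buffer) >= batch_size:
            # Step 5: update reward and cost evaluators from a random batch.
            batch = buffer.sample(batch_size)
            for agent in agents:
                agent.update_critics(batch)
                # Step 6: evaluators guide the actor update.
                agent.update_actor(batch)
            updates += 1
    return updates
```

After training, only `act` is needed at run time, so each aggregator can decide from its own observation without exchanging data with the others.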
In the application stage of demand side resource collaborative optimization scheduling, each agent receives the observation for the current time and uses its actor network to output an action value; the corresponding demand side resource aggregator schedules the demand side resources it manages according to this action value and trades electricity on the local power market, so that distributed collaborative optimization scheduling of the demand side resources is realized.
Case study:
The effectiveness of the method is verified on a 33-node power distribution system. As shown in fig. 2, the system comprises two microgrids (MG1, MG2), two electric vehicle charging stations (EV1, EV2) and two heating, ventilation and air conditioning aggregators (HVAC1, HVAC2), whose node numbers are 7, 15, 11, 23, 18 and 27, respectively. The photovoltaic power data, electric vehicle battery parameters and other test data used in the experiment all come from openly available real-world data. The methods compared in the experiment include:
MACSAC: the multi-agent reinforcement learning method adopted by the invention, which takes network security constraints into account;
MASAC: a multi-agent reinforcement learning method that does not take network security constraints into account;
LMMR: the pricing method proposed by the invention, which satisfies revenue-expenditure balance;
MMR: the intermediate market price (mid-market rate) method.
Figure 3 shows the reward (average daily electricity cost) convergence curves and the voltage out-of-limit cost curves of the MACSAC and MASAC algorithms. As can be seen from fig. 3 (a), both algorithms converge after a finite number of training iterations. Although MASAC achieves a higher reward value, fig. 3 (b) shows that the MACSAC algorithm reduces the voltage out-of-limit cost to nearly 0, which is significantly better than the MASAC algorithm.
To further illustrate the advantages of the MACSAC algorithm in terms of satisfying the grid safety constraints, fig. 4 shows the voltage statistics of some nodes. It can be seen from FIG. 4 that, when the MASAC algorithm is adopted, the voltages of the nodes where MG1, MG2 and EV1 are located exceed the safety constraint range [0.95,1.05] p.u. in some situations, which is not acceptable in practical operation. However, when the MACSAC algorithm is employed, all node voltages are within safety constraints.
Table 1 shows the economic indicators under the two pricing strategies, LMMR and MMR, where the total income is the daily average total amount paid by the local power market members to the local power market operation platform, and the total expenditure is the daily average total amount that the local power market operation platform must pay to the power grid.
TABLE 1 comparison of economic indicators
Figure BDA0003843780860000131
As can be seen from table 1, under the MMR method there is a difference of 28.9 dollars between the total income and the total expenditure of the local power market operation platform, whereas no such difference exists under the LMMR method. In addition, the electric vehicle charging stations pay less under the LMMR method, because the proposed LMMR method provides a price incentive when the trading decisions made by market members help reduce network loss.
While the present invention has been described with reference to the accompanying drawings, it is not limited to the above-described embodiments, which are illustrative rather than restrictive. Those skilled in the art may make various modifications without departing from the spirit of the present invention, and all such modifications fall within the scope of protection of the claims.

Claims (6)

1. A demand side resource collaborative optimization scheduling method based on a local power market is characterized by comprising the following steps:
S1, modeling three types of demand side resource aggregators, namely an electric vehicle charging station, a heating, ventilation and air conditioning (HVAC) aggregator and a microgrid, so as to obtain a demand side resource aggregation model;
S2, on the basis of the intermediate market price, allocating the network loss balancing cost according to network loss sensitivity, thereby providing a local power market pricing strategy that satisfies revenue-expenditure balance, and establishing a local power market that guides the cooperative operation of demand side resources;
S3, constructing the basic elements of the multi-agent reinforcement learning method based on the demand side resource aggregation model and the local power market, wherein the basic elements comprise an agent, an environment, an observation value, an action value, a reward function and a cost function;
S4, training the agents representing different demand side resource aggregators by using the multi-agent reinforcement learning algorithm MACSAC (multi-agent constrained soft actor-critic), and using the trained agents to make scheduling decisions for the demand side resource aggregators they represent, thereby realizing distributed cooperative optimal scheduling of demand side resources.
2. The demand side resource cooperative optimization scheduling method of claim 1, wherein the specific step of step S1 includes:
S1-1) the electric vehicle charging station is modeled using a virtual energy storage model, which is expressed as:
Figure FDA0003843780850000011
in formula (1),
Figure FDA0003843780850000012
representing the total power of the virtual energy storage i representing the electric vehicle charging station at time t,
Figure FDA0003843780850000013
and
Figure FDA0003843780850000014
respectively an upper power limit and a lower power limit;
Figure FDA0003843780850000015
representing an amount of power representing a virtual stored energy of an electric vehicle charging station,
Figure FDA0003843780850000016
and
Figure FDA0003843780850000017
respectively an electric quantity upper limit and an electric quantity lower limit; Δ t represents a time interval;
Figure FDA0003843780850000018
representing the virtual energy storage capacity change caused by the fact that the electric vehicle leaves or enters the charging station;
S1-2) the heating, ventilation and air conditioning (HVAC) aggregator is modeled using a virtual energy storage model, which is expressed as:
Figure FDA0003843780850000019
in formula (2),
Figure FDA00038437808500000110
to represent the total power of the virtual energy storage i of the hvac aggregator at time t,
Figure FDA00038437808500000111
and
Figure FDA00038437808500000112
respectively an upper power limit and a lower power limit;
Figure FDA00038437808500000113
to represent the amount of electricity of the virtual stored energy of the hvac aggregator,
Figure FDA00038437808500000114
and
Figure FDA00038437808500000115
are the upper and lower energy limits, respectively; α is the energy decay rate of the virtual energy storage; in addition, the virtual energy storage representing the heating, ventilation and air conditioning aggregator has a reference load parameter
Figure FDA00038437808500000116
;
S1-3) the microgrid comprises a photovoltaic system, an energy storage system and an inflexible load; wherein:
the photovoltaic system is modeled as:
Figure FDA0003843780850000021
Figure FDA0003843780850000022
the energy storage system is modeled as follows:
Figure FDA0003843780850000023
the inflexible load modeling is as follows:
Figure FDA0003843780850000024
Figure FDA0003843780850000025
in the formulae (4) to (8),
Figure FDA0003843780850000026
outputting an active power predicted value for the photovoltaic;
Figure FDA0003843780850000027
outputting reactive power for the photovoltaic inverter;
Figure FDA0003843780850000028
outputting active power for the photovoltaic system; sigma is a power factor;
Figure FDA0003843780850000029
the output power of the energy storage i at the moment t is obtained;
Figure FDA00038437808500000210
and
Figure FDA00038437808500000211
respectively an upper limit and a lower limit of the energy storage power;
Figure FDA00038437808500000212
the electric quantity of the stored energy i at the moment t is obtained;
Figure FDA00038437808500000213
and
Figure FDA00038437808500000214
are the upper and lower limits of the stored energy, respectively; $\eta^{+}$ and $\eta^{-}$ are the charging and discharging efficiencies of the energy storage, respectively;
Figure FDA00038437808500000215
and
Figure FDA00038437808500000216
the active power of the inflexible load and a predicted value thereof;
Figure FDA00038437808500000217
and
Figure FDA00038437808500000218
the reactive power of the inflexible load and the predicted value of the reactive power are obtained.
3. The demand side resource collaborative optimization scheduling method according to claim 1, wherein in step S2, the local power market consists of the demand side resource aggregators, acting as market members, and a trading platform; the local power market pricing strategy that satisfies the revenue-expenditure balance requirement comprises: determining the basic electricity price; determining the network loss allocation price; and determining the local electricity purchase price and local electricity selling price by combining the basic electricity price and the network loss allocation price; the local power market that guides the cooperative operation of demand side resources is established on the basis of the local power market pricing strategy, the local electricity purchase and selling prices are determined adaptively according to the supply-demand balance on the local power market, and each demand side resource aggregator makes its optimal demand side resource scheduling decision according to the local electricity purchase and selling prices, so that the demand side resources operate cooperatively.
4. The demand side resource collaborative optimization scheduling method according to claim 1, wherein the specific content of step S2 is as follows:
S2-1) determining the basic electricity price:
firstly, each market member determines the electricity purchasing quantity or electricity selling quantity on the local electricity market according to the scheduling decision value of the managed demand side resource:
Figure FDA0003843780850000031
in formula (9),
Figure FDA0003843780850000032
represents the electricity purchase or sale quantity submitted by demand side resource aggregator i on the local power market at time t; $\Omega_{EV}$, $\Omega_{HVAC}$ and $\Omega_{MG}$ denote the sets of electric vehicle charging stations, heating, ventilation and air conditioning aggregators and microgrids in the market, respectively;
the local power market then calculates three quantities for the market: the total demand
Figure FDA0003843780850000033
the total generation
Figure FDA0003843780850000034
and the net demand
Figure FDA00038437808500000321
Figure FDA0003843780850000035
Figure FDA0003843780850000036
Figure FDA0003843780850000037
in formula (10),
Figure FDA0003843780850000038
representing the increment of loss of the network caused by local electric power market trading; Ω represents a set of all market members;
according to the definition of the intermediate market price (mid-market rate), the basic electricity purchase price
Figure FDA0003843780850000039
and the basic electricity selling price
Figure FDA00038437808500000310
are calculated:
Figure FDA00038437808500000311
Figure FDA00038437808500000312
in formula (12),
Figure FDA00038437808500000313
and
Figure FDA00038437808500000314
are the time-of-use electricity price and the feed-in tariff in the grid electricity price, respectively;
S2-2) determining the network loss allocation price:
based on the network loss sensitivity, the network loss allocation price of each node is calculated:
Figure FDA00038437808500000315
in formula (13),
Figure FDA00038437808500000316
the network loss sensitivity coefficient corresponding to the node i is obtained;
S2-3) determining the local electricity purchase price and the local electricity selling price by combining the basic electricity price and the network loss allocation price:
Figure FDA00038437808500000317
Figure FDA00038437808500000318
in formulae (14) and (15):
Figure FDA00038437808500000319
representing local electricity purchase prices of market members located at node i;
Figure FDA00038437808500000320
a local electricity selling price representing a market member located at the node j;
S2-4) the operating mechanism of the local power market is as follows: at the start of each trading round, the trading platform first publishes the current grid electricity price and the local electricity purchase and selling prices of the previous round to the market members; each market member makes a demand side resource scheduling decision, determines its electricity purchase and sale quantities accordingly, and submits them to the trading platform; the trading platform clears the local power market using the local power market pricing strategy to obtain the local electricity purchase and selling prices; the market then proceeds to the next round; in this way the cooperative operation of the demand side resources is realized.
5. The demand side resource collaborative optimization scheduling method according to claim 1, wherein in step S3, the constructing basic elements of the multi-agent reinforcement learning method includes:
S3-1) agent: an agent is initialized for each demand side resource aggregator, wherein the agent comprises a reward evaluator network, a cost evaluator network and an actor network, the reward evaluator network and the cost evaluator network are used to guide the training of the actor network, and the actor network is used to output the scheduling decision of the demand side resource aggregator for the demand side resources it manages;
S3-2) environment: the local power market and the power distribution network together form the environment that interacts with the agents;
the interaction process of the environment and the intelligent agent is as follows: after receiving the electricity purchasing quantity and the electricity selling quantity submitted by the demand side resource aggregator, the local electricity market clears the market by using the proposed local electricity market pricing strategy to obtain a local electricity purchasing price and an electricity selling price, and distributes the local electricity purchasing price and the electricity selling price to each demand side resource aggregator; meanwhile, the power distribution network operation platform feeds back the running state information of the power distribution network at the current moment, including node voltage and branch power, to each demand side resource aggregator;
S3-3) observation value: the observation value of each agent is defined according to the monitored equipment operating states in the demand side resource aggregation model established in step S1 and the price information sent to the agent by the environment, wherein the price information comprises the grid electricity price, the local electricity purchase price and the local electricity selling price;
for agent i representing the microgrid, the set of observations at time t is:
Figure FDA0003843780850000041
Figure FDA0003843780850000042
for agent j representing the electric vehicle charging station, the observation set at time t is:
Figure FDA0003843780850000043
Figure FDA0003843780850000044
for agent k representing the heating, ventilation and air conditioning aggregator, the observation set at time t is:
Figure FDA0003843780850000045
Figure FDA0003843780850000046
S3-4) Action: the action of each agent is defined according to the control variables of the equipment in the demand side resource aggregation model established in step S1, the actions of an agent constituting the scheduling decision of the demand side resource aggregator for the demand side resources it manages;
for agent i representing the microgrid, the action set is:
Figure FDA0003843780850000047
for agent j representing the electric vehicle charging station, the action is:
Figure FDA0003843780850000048
for agent k representing the heating, ventilation and air conditioning aggregator, the action is:
Figure FDA0003843780850000049
S3-5) Reward function: the reward function of each agent is defined as its electricity sales revenue minus its electricity purchase expenditure in the local power market; the reward value is calculated by the local power market in the environment and fed back to the agent:
Figure FDA0003843780850000051
the operators in equation (16) are defined as $[x]^{+} = \max(0, x)$ and $[x]^{-} = \min(0, x)$;
S3-6) Cost function: the cost function is defined according to the safe operation constraints of the power distribution network, and the cost function value is calculated by the power distribution network operation platform in the environment and fed back to the agent:
Figure FDA0003843780850000052
in equation (17), $V_n$ is the voltage of node $n$; $\overline{V}$ and $\underline{V}$ are the upper and lower limits of the node voltage, respectively; $N_i$ is the set of nodes in the power distribution network partition to which agent $i$ belongs.
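As an illustration of steps S3-5) and S3-6), the following minimal Python sketch computes a reward and a cost of the kind described in the claim. It is not a reproduction of equations (16) and (17), whose exact form is contained in the figures. The function names, the sign convention (positive net power means electricity sold to the local market), the time step `dt_h`, and the per-unit voltage limits 0.95 and 1.05 are all assumptions made for the example.

```python
def pos(x: float) -> float:
    """[x]^+ = max(0, x), as defined for equation (16)."""
    return max(0.0, x)


def neg(x: float) -> float:
    """[x]^- = min(0, x), as defined for equation (16)."""
    return min(0.0, x)


def reward(net_power_kw: float, sell_price: float, buy_price: float,
           dt_h: float = 1.0) -> float:
    """Sales revenue minus purchase expenditure on the local market.

    Assumption: net_power_kw > 0 means the aggregator sells its surplus,
    net_power_kw < 0 means it purchases the shortfall.
    """
    revenue = sell_price * pos(net_power_kw) * dt_h
    expenditure = buy_price * (-neg(net_power_kw)) * dt_h
    return revenue - expenditure


def voltage_cost(voltages_pu, v_min: float = 0.95, v_max: float = 1.05) -> float:
    """Sum of voltage-limit violations over the nodes of one partition.

    Assumption: the cost is the total amount by which node voltages leave
    the band [v_min, v_max]; it is zero when every node is within limits.
    """
    return sum(pos(v - v_max) + pos(v_min - v) for v in voltages_pu)
```

With these assumptions, an agent that sells 10 kW for one hour at a selling price of 0.5 receives a reward of 5.0, and a partition whose node voltages all lie within the limits incurs a cost of zero.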
6. The demand side resource collaborative optimization scheduling method according to claim 1, wherein in step S4, the agents representing the different demand side resource aggregators are trained using the multi-agent reinforcement learning algorithm MACSAC, with the following specific steps:
S4-1) for each agent, initializing the parameters of the actor network, the reward evaluator network and the cost evaluator network, and an empty experience replay pool;
S4-2) inputting the observation of each agent in the environment into its actor network to obtain an action;
S4-3) scheduling the demand side resources managed by each market member according to the actions given by the agents, calculating the electricity purchase and sale quantities and submitting them to the local power market; the local power market clears according to the pricing mechanism provided in step S2 and feeds the reward function values and price information back to the market members, and the power distribution network operation platform feeds the cost function values back to the market members;
S4-4) storing the action, observation, reward function value and cost function value generated by the interaction between each agent and the environment as a sample in the experience replay pool;
S4-5) randomly drawing a batch of samples from the experience replay pool, and updating the parameters of the reward evaluator network and the cost evaluator network of each agent;
S4-6) estimating the expected future cumulative reward and the expected future cumulative cost of each agent using the reward evaluator network and the cost evaluator network, and using these estimates to guide the update of each agent's actor network;
S4-7) repeating steps S4-2) to S4-6) until the number of parameter updates reaches a set value, thereby obtaining the trained agents; when demand side resource collaborative optimization scheduling is carried out, each agent, after receiving the observation at the current moment, outputs an action using its actor network, and the corresponding demand side resource aggregator schedules the demand side resources it manages according to that action, thereby realizing distributed collaborative optimization scheduling of the demand side resources.
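The control flow of steps S4-1) to S4-7) can be sketched as the loop below. This is only a skeleton of the collect, store, sample and update cycle, not an implementation of MACSAC: `StubAgent` and `StubEnv` are hypothetical placeholders, the network updates are replaced by counters, the environment returns dummy rewards and costs, and storing the next observation in each sample is an added assumption that the claim does not state.

```python
import random
from collections import deque


class StubAgent:
    """Placeholder for one aggregator's agent; no networks are implemented."""

    def __init__(self):
        self.critic_updates = 0
        self.actor_updates = 0

    def act(self, obs):
        # A real actor network would map the observation to a scheduling decision.
        return random.uniform(-1.0, 1.0)

    def update_critics(self, batch):
        self.critic_updates += 1  # S4-5: reward and cost evaluator update

    def update_actor(self, batch):
        self.actor_updates += 1  # S4-6: actor update guided by both evaluators


class StubEnv:
    """Hypothetical stand-in for the local market plus distribution network."""

    def __init__(self, n_agents):
        self.n_agents = n_agents

    def observe(self):
        return [[0.0] for _ in range(self.n_agents)]

    def step(self, actions):
        rewards = [-abs(a) for a in actions]               # dummy market settlement
        costs = [max(0.0, abs(a) - 0.9) for a in actions]  # dummy constraint violation
        return self.observe(), rewards, costs


def train(n_agents=3, max_updates=5, batch_size=4, seed=0):
    random.seed(seed)
    agents = [StubAgent() for _ in range(n_agents)]  # S4-1: initialize agents
    replay_pool = deque(maxlen=10_000)               # S4-1: empty replay pool
    env = StubEnv(n_agents)
    obs = env.observe()
    updates = 0
    while updates < max_updates:                     # S4-7: stop at the set value
        actions = [ag.act(o) for ag, o in zip(agents, obs)]  # S4-2
        next_obs, rewards, costs = env.step(actions)         # S4-3
        # S4-4: store one sample (next_obs is an assumption, see lead-in)
        replay_pool.append((obs, actions, rewards, costs, next_obs))
        obs = next_obs
        if len(replay_pool) >= batch_size:
            batch = random.sample(list(replay_pool), batch_size)  # S4-5
            for ag in agents:
                ag.update_critics(batch)
                ag.update_actor(batch)               # S4-6
            updates += 1
    return agents, replay_pool
```

Calling `train()` returns the agents and the filled replay pool; after training, only `act` would be needed for online scheduling, matching the deployment described in S4-7).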
CN202211110254.XA 2022-09-13 2022-09-13 Demand side resource collaborative optimization scheduling method based on local power market Pending CN115392766A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211110254.XA CN115392766A (en) 2022-09-13 2022-09-13 Demand side resource collaborative optimization scheduling method based on local power market

Publications (1)

Publication Number Publication Date
CN115392766A true CN115392766A (en) 2022-11-25

Family

ID=84127219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211110254.XA Pending CN115392766A (en) 2022-09-13 2022-09-13 Demand side resource collaborative optimization scheduling method based on local power market

Country Status (1)

Country Link
CN (1) CN115392766A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116451880A (en) * 2023-06-16 2023-07-18 华北电力大学 Distributed energy optimization scheduling method and device based on hybrid learning
CN116451880B (en) * 2023-06-16 2023-09-12 华北电力大学 Distributed energy optimization scheduling method and device based on hybrid learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination