CN113052638A - Price demand response-based determination method and system - Google Patents
- Publication number
- CN113052638A (application CN202110366209.XA)
- Authority
- CN
- China
- Prior art keywords
- load
- value function
- time
- price
- retail
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0206—Price or cost determination based on market factors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention discloses a determination method and system based on price demand response. The dynamic retail electricity price pricing problem of an electric power company is modeled as a Markov decision process. According to the states of all load units at the current moment, a retail electricity price making action is determined by using an electricity price selection probability (ε-greedy) strategy, and the immediate revenue return and the states of all load units at the next moment are obtained. The reference action value function obtained in the previous iteration is then updated to a target action value function. When the current moment reaches the terminal moment and the absolute value of the difference between the target action value function and the reference action value function is not greater than a difference threshold, the target action value function is taken as the optimal action value function, the optimal retail electricity price strategy is determined from the optimal action value function, and the optimal energy consumption of the dispatchable loads is then calculated. When the action value function is determined, the influence of the current electricity price on both the immediate response of the load and its response over a future period of time is considered, so that the accuracy of price-based demand response is improved.
Description
Technical Field
The invention relates to the technical field of power grids, in particular to a price demand response-based determination method and system.
Background
The smart grid is a typical cyber-physical system that integrates advanced detection, control and communication technologies into a physical power system to provide reliable energy supply, promote active participation of loads, and ensure stable operation of the grid system. Based on the cyber-physical integration characteristic of the smart grid, power demand response (demand response) has become a research hotspot in the field of energy management, with the aim of changing the energy use mode of a load according to time-varying electricity prices or reward/penalty incentives, so as to reduce energy cost on the demand side, among other goals. In other words, power demand response is a means of reshaping load energy usage by price or incentive mechanisms to achieve more efficient energy management.
Currently, existing research focuses mainly on two branches of demand response: price-based demand response and incentive-based demand response. As a commonly used kind of demand response, price-based demand response is expected to change the energy use pattern of end users by setting time-dependent electricity prices, such as time-of-use pricing and real-time pricing.
The existing price-based demand response is mostly based on a deterministic price mechanism, such as a time-of-use electricity price pricing mechanism, a day-ahead electricity price pricing mechanism or a linear price model. However, the deterministic price mechanism does not truly characterize the uncertainty and flexibility of the dynamic electricity market, and thus the accuracy of the existing price demand response-based method is not high.
Disclosure of Invention
In view of the above, the invention discloses a determination method and system based on price demand response, so as to truly depict the uncertainty and flexibility of the dynamic power market and improve the accuracy of price-based demand response.
A price demand response based determination method, comprising:
modeling a dynamic retail electricity price pricing problem of an electric power company as a Markov decision process;
monitoring the states of all load units at the current moment, recording the states as a first state, and selecting a target retail price making action at the current moment by using a price selection probability-greedy strategy within an allowable retail price range;
calculating the immediate revenue return after the target retail electricity price making action is executed, monitoring the states of all load units at the next moment of the current moment, and recording them as a second state;
updating a reference action value function into a target action value function based on the first state, the second state, the target retail electricity price making action and the immediate return of income, wherein the reference action value function is an action value function obtained by last iteration;
judging whether the current time reaches the terminal time;
if so, judging whether the absolute value of the difference between the target action value function and the reference action value function is not greater than a difference threshold value;
if so, taking the target action value function as an optimal action value function, and determining an optimal retail electricity price strategy according to the optimal action value function;
and calculating the optimal energy consumption of the dispatchable load according to the optimal retail electricity price strategy.
Optionally, the specific meaning of the electricity price selection probability-greedy strategy is as follows: with probability ε, randomly select one retail electricity price from the action set; or, with probability 1 − ε, select the retail electricity price corresponding to the maximum action value function, where ε represents the electricity price selection probability.
Optionally, the expression of the immediate revenue return r_t is as follows:
r_t = ρ·U_t − (1 − ρ)·C_t
where ρ ∈ [0, 1] is a weight parameter representing the relative social value of the utility's revenue and the comprehensive cost of the load units, U_t represents the net profit of the utility at time t, and C_t represents the comprehensive cost of the load side at time t.
Optionally, the expression of the net profit U_t of the utility at time t is as follows:
U_t = Σ_{n∈N^n} λ_{n,t}^n·p_{n,t}^n + Σ_{n∈N^d} λ_{n,t}^d·p_{n,t}^d − η_t·p_t^tot
where N^n is the non-dispatchable load set; λ_{n,t}^n denotes the retail electricity price received by non-dispatchable load n at time t, and p_{n,t}^n denotes its energy consumption at time t, with the superscript n denoting the non-dispatchable load identifier, the subscript n the load unit index, and the subscript t the time index; N^d is the dispatchable load set; λ_{n,t}^d denotes the retail electricity price received by dispatchable load n at time t, and p_{n,t}^d denotes its energy consumption at time t, with the superscript d denoting the dispatchable load identifier; η_t denotes the wholesale electricity price at time t and satisfies p_t^tot = Σ_{n∈N^n} p_{n,t}^n + Σ_{n∈N^d} p_{n,t}^d, where p_t^tot denotes the total electric energy purchased by the power company from the grid operator at time t, the superscript tot denoting the total electric energy identifier.
The expression of the comprehensive cost C_t of the load side at time t is as follows:
C_t = Σ_{n∈N^d} φ_{n,t}^d
where φ_{n,t}^d denotes the dissatisfaction level caused to dispatchable load n by reducing its energy consumption demand at time t, given by:
φ_{n,t}^d = (α_n/2)·(Δe_{n,t}^d)² + β_n·Δe_{n,t}^d, with Δe_{n,t}^d = e_{n,t}^d − p_{n,t}^d
where α_n and β_n represent the two dissatisfaction coefficients of dispatchable load n, Δe_{n,t}^d denotes its demand reduction amount, e_{n,t}^d denotes its energy demand at time t, and p_{n,t}^d denotes its energy consumption at time t. The demand reduction satisfies:
Δe_{n,min}^d ≤ Δe_{n,t}^d ≤ Δe_{n,max}^d
where Δe_{n,min}^d and Δe_{n,max}^d respectively represent the minimum and maximum demand reduction amounts of dispatchable load n, both of which are known quantities.
Optionally, the expression of the target action value function is as follows:
Q_k(s_t, a_t) = (1 − κ)·Q_{k−1}(s_t, a_t) + κ·[r_t + γ·max_{a_{t+1}∈A} Q_{k−1}(s_{t+1}, a_{t+1})]
where Q_k(s_t, a_t), the target action value function, represents the cumulative future discounted return at the k-th iteration when starting from the state s_t of all load units and executing the target retail electricity price making action a_t, defined as Q(s_t, a_t) = E[Σ_i γ^i·r_{t+i}], where γ represents the discount factor; κ ∈ [0, 1] is the learning rate and represents the degree to which the newly acquired Q_k value overrides the Q_{k−1} value; Q_{k−1}(s_t, a_t) represents the reference action value function; s_{t+1} represents the state of all load units at time t+1; a_{t+1} represents a retail electricity price making action at time t+1; and Q_{k−1}(s_{t+1}, a_{t+1}) represents the cumulative future discounted return at the (k−1)-th iteration when starting from the state s_{t+1} of all loads and executing a_{t+1}.
Optionally, the expression of the optimal retail electricity price strategy is as follows:
π*(s_t) = argmax_{a_t∈A} Q*(s_t, a_t)
where π*(s_t) is the optimal retail electricity price strategy, Q*(s_t, a_t) is the optimal action value function, and A is the action set, A = {a_1, a_2, …, a_T}; the value range of the time t is t = 1, 2, …, T, where T represents the total number of time intervals; s_t represents the state of all load units at time t, and a_t represents the retail electricity price making action at time t.
Optionally, the expression of the optimal energy consumption is as follows:
p_{n,t}^{d*} = e_{n,t}^d·[1 + μ_t·(λ_{n,t}^d − η_t)/η_t]
where p_{n,t}^{d*} denotes the optimal energy consumption, with the subscript n denoting the load unit index, the subscript t the time index, and the superscript d the dispatchable load identifier; e_{n,t}^d denotes the energy demand of dispatchable load n at time t; λ_{n,t}^d denotes the retail electricity price it receives at time t; η_t denotes the wholesale electricity price at time t; and μ_t, the electricity price elasticity coefficient, indicates the rate at which the energy demand changes with the retail electricity price at time t.
Optionally, the determining method further includes: initializing the action value function, specifically including:
obtaining known prior parameter data, substituting the prior parameter data into the predetermined action value function, and initializing the action value function, wherein the initial value of the action value function is 0.
A price demand response based determination system comprising:
the modeling unit is used for modeling the dynamic retail electricity price pricing problem of the power company into a Markov decision process;
the action selection unit is used for monitoring the states of all load units at the current moment, recording the states as first states, and selecting a target retail price at the current moment to make an action by using a price selection probability-greedy strategy within an allowable retail price range;
the return calculating unit is used for calculating the return immediately after the target retail price making action is executed, monitoring the states of all load units at the next moment of the current moment and recording the states as a second state;
a function updating unit, configured to update a reference action value function to a target action value function based on the first state, the second state, the target retail electricity price making action, and the immediate return of revenue, where the reference action value function is an action value function obtained through last iteration;
the first judgment unit is used for judging whether the current moment reaches the terminal moment;
a second judgment unit configured to judge whether or not an absolute value of a difference between the target motion value function and the reference motion value function is not greater than a difference threshold value, in a case where the first judgment unit judges yes;
the electricity price strategy determining unit is used for taking the target action value function as an optimal action value function under the condition that the second judging unit judges that the electricity price strategy is positive, and determining an optimal retail electricity price strategy according to the optimal action value function;
and the energy consumption calculating unit is used for calculating the optimal energy consumption of the dispatchable load according to the optimal retail electricity price strategy.
From the above technical solutions, it can be seen that the present invention discloses a determination method and system based on price demand response. The dynamic retail electricity price pricing problem of an electric power company is modeled as a Markov decision process. The states of all load units at the current moment are monitored as a first state, and a target retail electricity price making action at the current moment is selected within the allowable retail electricity price range by using the electricity price selection probability-greedy strategy. The immediate revenue return after executing the target retail electricity price making action is calculated, and the states of all load units at the next moment are monitored and recorded as a second state. Based on the first state, the second state, the target retail electricity price making action and the immediate revenue return, the reference action value function obtained in the previous iteration is updated to a target action value function. When the current moment reaches the terminal moment and the absolute value of the difference between the target action value function and the reference action value function is not greater than the difference threshold, the target action value function is taken as the optimal action value function, the optimal retail electricity price strategy is determined from the optimal action value function by means of the Markov decision process, and the optimal energy consumption of the dispatchable loads is then calculated according to the optimal retail electricity price strategy, thereby realizing determination based on price demand response.
The dynamic retail electricity price pricing problem of the power company is modeled into a Markov decision process, and when the optimal retail electricity price strategy is determined according to the optimal action value function, the influence of the current electricity price on the instant response of the load is considered, and the influence of the current electricity price on the response of the load in a period of time in the future is also considered, so that the uncertainty and the flexibility of the dynamic power market can be truly depicted, and the accuracy based on price demand response is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the disclosed drawings without creative efforts.
FIG. 1 is a flow chart of a method for determining a response based on price demand according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a price demand response-based determination system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a determination method and system based on price demand response. The dynamic retail electricity price pricing problem of an electric power company is modeled as a Markov decision process. The states of all load units at the current moment are monitored as a first state, and a target retail electricity price making action at the current moment is selected within the allowable retail electricity price range by using the electricity price selection probability-greedy strategy. The immediate revenue return after executing the target retail electricity price making action is calculated, and the states of all load units at the next moment are monitored and recorded as a second state. Based on the first state, the second state, the target retail electricity price making action and the immediate revenue return, the reference action value function obtained in the previous iteration is updated to a target action value function. When the current moment reaches the terminal moment and the absolute value of the difference between the target action value function and the reference action value function is not greater than the difference threshold, the target action value function is taken as the optimal action value function, the optimal retail electricity price strategy is determined from the optimal action value function by means of the Markov decision process, and the optimal energy consumption of the dispatchable loads is then calculated according to the optimal retail electricity price strategy, thereby realizing determination based on price demand response.
The dynamic retail electricity price pricing problem of the power company is modeled into a Markov decision process, and when the optimal retail electricity price strategy is determined according to the optimal action value function, the influence of the current electricity price on the instant response of the load is considered, and the influence of the current electricity price on the response of the load in a period of time in the future is also considered, so that the uncertainty and the flexibility of the dynamic power market can be truly depicted, and the accuracy based on price demand response is improved.
It should be particularly noted that the price-based demand response to be protected by the present invention is specifically a price-based demand response problem for the residential retail power market. The retail power market comprises a power company and a finite set of load units N = {1, 2, …, N}, where N refers to the total number of load units in the retail power market. In practical application, the upper-layer power company makes retail electricity prices for all the served lower-layer load units; when a lower-layer load unit receives the retail electricity price signal, it responds to the electricity price in real time, thereby determining its own energy consumption strategy and transmitting it to the power company. Thus, in the residential retail power market framework, the goal of price-based demand response is, within a finite time horizon T = {1, 2, …, T}, to coordinate a dynamic finite set of retail electricity prices so as to maximize the social benefit of the power system (including the utility's benefit and the comprehensive cost on the load side), where T represents the total number of time intervals.
Referring to fig. 1, a flowchart of a determination method based on price demand response according to an embodiment of the present invention is applied to a processor in an electric power company, and the determination method includes:
step S101, modeling a dynamic retail electricity price pricing problem of an electric power company into a Markov decision process;
step S102, monitoring the states of all load units at the current moment, recording the states as a first state, and selecting a target retail price making action at the current moment by using a price selection probability-greedy strategy within an allowable retail price range;
Since the initial state s_t refers to the time index and the energy demand e_t of all load units at time t, and this information is stored in the computer as prior parameter data, the initial state can be obtained by randomly selecting an initial time t and querying the prior parameter data.
In this embodiment, the initial state s_t is monitored, and a target retail electricity price making action a_t at the current moment is selected within the allowable retail electricity price range by using the electricity price selection probability-greedy strategy. The electricity price selection probability is expressed by ε; the electricity price selection probability-greedy strategy is the criterion for electricity price selection, and its specific meaning is: with probability ε, randomly select a retail electricity price θ_t from the action set A; or, with probability 1 − ε, select the retail electricity price θ_t corresponding to the maximum action value function.
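The selection criterion above can be sketched in Python as follows; this is a minimal illustration, and the function and variable names (select_price_action, q_row) are not taken from the patent:

```python
import random

def select_price_action(q_row, actions, eps, rng=None):
    """Epsilon-greedy retail electricity price selection: with probability
    eps pick a random price from the action set; with probability 1 - eps
    pick the price whose action value in q_row is largest."""
    rng = rng or random.Random()
    if rng.random() < eps:
        return rng.choice(actions)                        # explore
    return max(actions, key=lambda a: q_row.get(a, 0.0))  # exploit
```

With eps = 0 the rule is purely greedy; with eps = 1 it is purely random, so eps trades off exploration of the price set against exploitation of the current value estimates.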
Step S103, calculating the profit return immediately after the target retail price making action is executed, monitoring the states of all load units at the next moment of the current moment, and recording as a second state;
In this embodiment, after the immediate revenue return is calculated, the states s_{t+1} of all load units at the next moment of the current time t, that is, at time t+1, are monitored.
Wherein the immediate revenue return is calculated according to formula (1), which is as follows:
r_t = ρ·U_t − (1 − ρ)·C_t (1)
where ρ ∈ [0, 1] is a weight parameter representing the relative social value of the utility's revenue and the comprehensive cost of the load units, U_t represents the net profit of the utility at time t, and C_t represents the comprehensive cost of the load side at time t.
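Formula (1) translates directly into a small helper (a hedged sketch; names are illustrative, not from the patent):

```python
def immediate_return(rho, utility_profit, load_cost):
    """Immediate revenue return of formula (1): r_t = rho*U_t - (1-rho)*C_t,
    with rho in [0, 1] weighting the utility's profit against the
    comprehensive load-side cost."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return rho * utility_profit - (1.0 - rho) * load_cost
```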
U_t is shown in equation (2), which is as follows:
U_t = Σ_{n∈N^n} λ_{n,t}^n·p_{n,t}^n + Σ_{n∈N^d} λ_{n,t}^d·p_{n,t}^d − η_t·p_t^tot (2)
where N^n is the non-dispatchable load set; λ_{n,t}^n denotes the retail electricity price received by non-dispatchable load n at time t, and p_{n,t}^n denotes its energy consumption at time t, with the superscript n denoting the non-dispatchable load identifier, the subscript n the load unit index, and the subscript t the time index; N^d is the dispatchable load set; λ_{n,t}^d denotes the retail electricity price received by dispatchable load n at time t, and p_{n,t}^d denotes its energy consumption at time t, with the superscript d denoting the dispatchable load identifier; η_t denotes the wholesale electricity price at time t and satisfies p_t^tot = Σ_{n∈N^n} p_{n,t}^n + Σ_{n∈N^d} p_{n,t}^d, where p_t^tot denotes the total electric energy purchased by the power company from the grid operator at time t, the superscript tot denoting the total electric energy identifier.
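A minimal sketch of equation (2), assuming per-load lists of retail prices and consumptions (all names are illustrative assumptions, not the patent's):

```python
def utility_net_profit(retail_n, energy_n, retail_d, energy_d, wholesale):
    """Net profit U_t of the utility: retail revenue collected from the
    non-dispatchable and dispatchable loads minus the wholesale cost of
    the total energy purchased from the grid operator."""
    revenue = sum(l * p for l, p in zip(retail_n, energy_n))
    revenue += sum(l * p for l, p in zip(retail_d, energy_d))
    total_energy = sum(energy_n) + sum(energy_d)
    return revenue - wholesale * total_energy
```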
C_t is shown in equation (3), which is as follows:
C_t = Σ_{n∈N^d} φ_{n,t}^d (3)
where φ_{n,t}^d denotes the dissatisfaction level caused to dispatchable load n by reducing its energy consumption demand at time t.
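A hedged sketch of equation (3); the quadratic dissatisfaction form used here is an assumption reconstructed from the patent's variable glossary, since the original formula image is not available:

```python
def load_side_cost(alphas, betas, reductions):
    """Comprehensive load-side cost C_t: sum of per-load dissatisfaction
    levels. The quadratic form alpha/2 * dr**2 + beta * dr over the demand
    reduction dr is an assumption, not the patent's verbatim formula."""
    return sum(a / 2.0 * dr ** 2 + b * dr
               for a, b, dr in zip(alphas, betas, reductions))
```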
Step S104, updating a reference action value function into a target action value function based on the first state, the second state, the target retail price making action and the income immediate return;
the reference action value function is an action value function obtained by last iteration;
The target action value function Q_k(s_t, a_t) is shown in equation (4), which is as follows:
Q_k(s_t, a_t) = (1 − κ)·Q_{k−1}(s_t, a_t) + κ·[r_t + γ·max_{a_{t+1}∈A} Q_{k−1}(s_{t+1}, a_{t+1})] (4)
where Q_k(s_t, a_t), the target action value function, represents the cumulative future discounted return at the k-th iteration when starting from the state s_t of all load units and executing the target retail electricity price making action a_t, defined as Q(s_t, a_t) = E[Σ_i γ^i·r_{t+i}], where γ represents the discount factor; κ ∈ [0, 1] is the learning rate and represents the degree to which the newly acquired Q_k value overrides the Q_{k−1} value; Q_{k−1}(s_t, a_t) represents the reference action value function; s_{t+1} represents the state of all load units at time t+1; a_{t+1} represents a retail electricity price making action at time t+1; and Q_{k−1}(s_{t+1}, a_{t+1}) represents the cumulative future discounted return at the (k−1)-th iteration when starting from the state s_{t+1} of all loads and executing a_{t+1}.
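The update of equation (4) can be sketched as a standard Q-learning backup over a dictionary-based table (a hedged illustration; function and parameter names are not from the patent):

```python
def q_update(q, state, action, reward, next_state, actions, lr, gamma):
    """One Q-learning backup: blend the old estimate with the bootstrapped
    target reward + gamma * max_a' Q(next_state, a'), weighted by the
    learning rate lr. Missing table entries default to 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = (1.0 - lr) * old + lr * (reward + gamma * best_next)
    return q[(state, action)]
```

Repeated backups converge toward the fixed point of the Bellman optimality equation when the learning rate and exploration are chosen appropriately.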
Step S105, judging whether the current time reaches the terminal time, if so, executing step S106;
if the current time T does not reach the terminal time T, the process returns to step S102.
Step S106, judging whether the absolute value of the difference between the target action value function and the reference action value function is not greater than a difference threshold value, if so, continuing to execute step S107;
In this embodiment, when |Q_k − Q_{k−1}| ≤ δ, step S107 is continued; otherwise, the process returns to step S102, where Q_{k−1} is the reference action value function, Q_k is the target action value function, and δ is the difference threshold.
Step S107, taking the target action value function as an optimal action value function, and determining an optimal retail electricity price strategy according to the optimal action value function by adopting a Markov decision process;
Specifically, the optimal retail electricity price strategy is shown in formula (5), which is as follows:
π*(s_t) = argmax_{a_t∈A} Q*(s_t, a_t) (5)
where π*(s_t) is the optimal retail electricity price strategy, Q*(s_t, a_t) is the optimal action value function, and A is the action set, A = {a_1, a_2, …, a_T}.
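Formula (5) reduces to an argmax over the action set, sketched as follows (names illustrative):

```python
def optimal_policy(q, state, actions):
    """pi*(s_t): the retail price a_t in the action set that maximizes the
    converged action value Q*(s_t, a_t); unseen pairs default to 0."""
    return max(actions, key=lambda a: q.get((state, a), 0.0))
```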
And S108, calculating the optimal energy consumption of the dispatchable load according to the optimal retail electricity price strategy.
The expression of the optimal energy consumption is shown in formula (6), which is as follows:
p_{n,t}^{d*} = e_{n,t}^d·[1 + μ_t·(λ_{n,t}^d − η_t)/η_t] (6)
where p_{n,t}^{d*} denotes the optimal energy consumption, with the subscript n denoting the load unit index, the subscript t the time index, and the superscript d the dispatchable load identifier; e_{n,t}^d denotes the energy demand of dispatchable load n at time t; λ_{n,t}^d denotes the retail electricity price it receives at time t; η_t denotes the wholesale electricity price at time t; and μ_t, the electricity price elasticity coefficient, indicates the rate at which the energy demand changes with the retail electricity price at time t.
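A hedged sketch of formula (6); the closed form below is reconstructed from the stated elasticity definition and may differ in detail from the patent's original image-borne formula:

```python
def optimal_energy(demand, retail_price, wholesale_price, elasticity):
    """Dispatchable-load energy under price elasticity: the energy demand
    scaled by 1 + mu * (retail - wholesale) / wholesale. With a negative
    elasticity mu, consumption drops as retail price rises above wholesale."""
    if wholesale_price <= 0.0:
        raise ValueError("wholesale price must be positive")
    return demand * (1.0 + elasticity *
                     (retail_price - wholesale_price) / wholesale_price)
```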
To sum up, the invention discloses a determination method based on price demand response. The dynamic retail electricity price pricing problem of an electric power company is modeled as a Markov decision process. The states of all load units at the current moment are monitored as a first state, and a target retail electricity price making action at the current moment is selected within the allowable retail electricity price range by using the electricity price selection probability-greedy strategy. The immediate revenue return after executing the target retail electricity price making action is calculated, and the states of all load units at the next moment are monitored and recorded as a second state. Based on the first state, the second state, the target retail electricity price making action and the immediate revenue return, the reference action value function obtained in the previous iteration is updated to a target action value function. When the current moment reaches the terminal moment and the absolute value of the difference between the target action value function and the reference action value function is not greater than the difference threshold, the target action value function is taken as the optimal action value function, the optimal retail electricity price strategy is determined from the optimal action value function by means of the Markov decision process, and the optimal energy consumption of the dispatchable loads is then calculated according to the optimal retail electricity price strategy, thereby realizing determination based on price demand response.
Because the dynamic retail pricing problem of the power company is modeled as a Markov decision process, determining the optimal retail electricity price strategy from the optimal action value function accounts not only for the influence of the current electricity price on the load's immediate response but also for its influence on the load's response over a future period, so the uncertainty and flexibility of a dynamic electricity market are faithfully captured and the accuracy of price-based demand response is improved.
In addition, the present invention utilizes a reinforcement learning algorithm to solve the price-based demand response problem in an unknown electricity market environment (i.e., retail electricity prices and load energy consumption are uncertain and random).
To further optimize the above embodiment, before step S102, the action value function needs to be initialized, and the process of initializing the action value function includes:
obtaining known prior parameter data, substituting the prior parameter data into a predetermined action value function, and initializing the action value function.
The prior parameter data include: the energy demand e_t of each load unit, the dissatisfaction coefficients, the electricity price elasticity coefficient μ_t, the wholesale electricity price η_t, the weight parameter ρ, and so on, where t denotes the time.
The initial value of the action value function is 0, the iteration count k starts at 1, and the time t starts at 1; that is, Q_0(s, a) = 0, k = 1, t = 1.
In this embodiment, the time t takes the values t = 1, 2, …, T, where T denotes the total number of time intervals.
The variable parameters in the action value function are s_t and a_t: s_t denotes the state of all load units at time t, i.e. the energy demand e_t of all load units at time t, their energy consumption p_t, and the time index t; a_t denotes the retail-price-setting action at time t, i.e. the retail electricity price θ_t set by the electric power company for all load units at time t.
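For illustration only, the initialization described above can be sketched as follows; the tuple encoding of the state and the sample demand and price values are assumptions made for the sketch, not values prescribed by the embodiment:

```python
from collections import defaultdict

def init_action_value_function():
    """Initialize Q_0(s, a) = 0 for every state-action pair.

    A defaultdict returning 0.0 for any unseen (state, action) key
    matches the embodiment's initial value Q_0(s, a) = 0.
    """
    return defaultdict(float)

Q = init_action_value_function()
k, t = 1, 1                       # iteration counter and time index both start at 1
state = (2.0, 0.0, t)             # s_t = (e_t, p_t, t): demand, consumption, time (assumed encoding)
action = 0.5                      # a_t = theta_t: a candidate retail price (assumed value)
assert Q[(state, action)] == 0.0  # every entry starts at zero
```

The defaultdict avoids enumerating the full state-action space up front, which suits the model-free setting where visited states are not known in advance.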
It should be noted that, since the power system model relates to information interaction between the power company and the load unit, in order to facilitate understanding of the technical solution to be protected by the present invention, a mathematical model between the power company and the load unit is described below.
According to user preferences and the energy consumption characteristics of the loads, loads are generally classified into two categories: schedulable loads and non-schedulable loads; that is, the load set is the union of the schedulable load set and the non-schedulable load set.
(I) Schedulable loads: the energy consumption of a typical schedulable load is expressed as equation (7):
where p_{n,t}^{d} and e_{n,t}^{d} respectively denote the energy consumption and energy demand of schedulable load n at time t. The energy demand is the electric energy the load unit expects to consume before receiving the retail price signal; the energy consumption is the electric energy actually consumed after receiving the retail price signal. The subscript n denotes the load unit index, the subscript t the time index, and the superscript d the schedulable load identifier. μ_t is the electricity price elasticity coefficient, indicating the rate at which the energy demand changes with the retail electricity price at time t. θ_{n,t}^{d} and η_t respectively denote the retail electricity price received by schedulable load n at time t and the wholesale electricity price at time t, and satisfy the relation given in the specification.
Equation (7) shows that the actual energy consumption of a schedulable load depends not only on the energy demand information but also on the demand reduction caused by changes in the retail electricity price. When the actual energy consumption p_{n,t}^{d} of schedulable load n is less than its energy demand e_{n,t}^{d}, the remaining demand is not satisfied, which may cause dissatisfaction for the load user.
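Equation (7) itself is rendered as an image in this text and is not reproduced here; purely as an illustrative sketch, the following assumes a simple linear price-elasticity form in which demand is curtailed in proportion to the gap between the retail and wholesale prices. The function name and the exact functional form are assumptions, not the claimed equation:

```python
def schedulable_consumption(e_nt, theta_nt, eta_t, mu_t):
    """Energy consumption of a schedulable load under a retail price (sketch).

    Assumed model: demand e_nt is reduced in proportion to how far the
    retail price theta_nt exceeds the wholesale price eta_t, scaled by
    the elasticity coefficient mu_t. Consumption cannot go negative.
    """
    reduction = mu_t * (theta_nt - eta_t)  # assumed response to the price gap
    return max(e_nt - reduction, 0.0)

# At theta == eta there is no price gap, so consumption equals demand.
assert schedulable_consumption(3.0, 0.8, 0.8, 1.5) == 3.0
# A retail price above wholesale curtails consumption: 3.0 - 1.5 * 0.2 = 2.7.
assert abs(schedulable_consumption(3.0, 1.0, 0.8, 1.5) - 2.7) < 1e-9
```

Any elasticity model with the stated qualitative property (higher retail price, lower consumption) would serve the same illustrative purpose here.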
To characterize this dissatisfaction, a dissatisfaction function is defined as shown in equation (8):
where φ_{n,t}^{d} denotes the dissatisfaction level caused to schedulable load n by reducing its energy consumption demand at time t; α_n and β_n denote the two dissatisfaction coefficients of schedulable load n; Δe_{n,t}^{d} denotes the demand reduction amount of schedulable load n; e_{n,t}^{d} denotes the energy demand of schedulable load n at time t; the superscript d denotes the schedulable load identifier; N_d is the schedulable load set; and p_{n,t}^{d} denotes the energy consumption of schedulable load n at time t.
Equation (8) shows that a greater reduction in demand results in a higher level of dissatisfaction of the load unit.
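Equation (8) is likewise not legible in this text; as an illustrative sketch only, a quadratic cost with the two dissatisfaction coefficients is assumed, which reproduces the stated property that a larger demand reduction yields a higher dissatisfaction level. The function name and the quadratic form are assumptions:

```python
def dissatisfaction(delta_e, alpha, beta):
    """Dissatisfaction of a schedulable load for a demand reduction delta_e.

    Assumed form: phi = alpha * delta_e**2 + beta * delta_e, increasing
    in delta_e for non-negative coefficients, as equation (8) requires.
    """
    return alpha * delta_e ** 2 + beta * delta_e

# No reduction, no dissatisfaction; a bigger cut costs more.
assert dissatisfaction(0.0, 0.2, 0.1) == 0.0
assert dissatisfaction(1.0, 0.2, 0.1) < dissatisfaction(2.0, 0.2, 0.1)
```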
In addition, the demand reduction of a schedulable load cannot exceed its allowable range, as shown in inequality (9):
where the minimum and maximum demand reduction amounts of schedulable load n are both known quantities. Once they are known, the range of the actual energy consumption p_{n,t}^{d} of schedulable load n is determined accordingly.
(II) Non-schedulable loads: generally, the energy demands of non-schedulable loads cannot be shifted or curtailed at will, so they must be strictly met at all times, i.e. p_{n,t}^{n} = e_{n,t}^{n}.
where p_{n,t}^{n} and e_{n,t}^{n} respectively denote the energy consumption and energy demand of non-schedulable load n at time t, and the superscript n denotes the non-schedulable load identifier.
Thus, from the load point of view, the goal is to minimize the overall cost on the load side by determining the optimal energy consumption combination for all loads.
where p denotes the energy consumption vector of all participating load units over the whole time period T, and θ_{n,t}^{n} denotes the retail electricity price received by non-schedulable load n at time t.
The above formula consists of two parts, corresponding to the two load types. Specifically, the first term represents the electricity cost of the non-schedulable loads purchasing power from the utility, and the second term represents the electricity cost plus the dissatisfaction cost of the schedulable loads purchasing power from the utility.
For convenience of subsequent discussion, the comprehensive cost of the load side at time t is defined as C_t, so the above formula can be further written in terms of C_t.
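The two-part load-side cost just described can be sketched numerically as follows; the function name and the sample prices, consumptions, and dissatisfaction value are illustrative assumptions:

```python
def load_side_cost(nonsched, sched):
    """Comprehensive load-side cost C_t at one time step (sketch).

    nonsched: list of (theta, p) pairs for non-schedulable loads
              (retail price, energy consumption).
    sched:    list of (theta, p, phi) triples for schedulable loads
              (retail price, energy consumption, dissatisfaction cost).
    The first sum is the electricity cost of the non-schedulable loads;
    the second adds the dissatisfaction cost of the schedulable loads,
    matching the two terms described above.
    """
    cost_n = sum(theta * p for theta, p in nonsched)
    cost_d = sum(theta * p + phi for theta, p, phi in sched)
    return cost_n + cost_d

C_t = load_side_cost([(1.0, 2.0)], [(0.9, 1.5, 0.3)])
assert abs(C_t - (2.0 + 1.35 + 0.3)) < 1e-9
```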
The electric power company, as an intermediary between the end user and the electric power producer, first purchases electric energy from the grid operator at a wholesale price and then sells the purchased electric energy to different types of load units on the load side at a retail price. The goal of the electric company is therefore to maximize revenue by trading in wholesale and retail markets, the mathematical model of which can be expressed as:
where θ denotes the retail electricity price vector set by the electric power company for all load units over the whole time period T; p_t^{tot} denotes the total power purchased by the utility from the grid operator at time t, with the superscript tot denoting the total purchased power identifier; and the lower and upper bounds of the retail electricity price the utility may set for a load unit are given constants. The objective function in the model consists of three terms: the revenue of the utility from selling power to the non-schedulable load units, the revenue of the utility from selling power to the schedulable load units, and the cost of the utility purchasing power from the grid operator. Likewise, for convenience of presentation, the net revenue of the utility at time t is defined as U_t, and the objective function is further written in terms of U_t.
Generally, when power losses are neglected and the power balance rule is followed, the total power purchased by the electric power company equals the total power consumption on the load side at every time, as shown in equation (11): p_t^{tot} equals the sum of the energy consumption of all non-schedulable and all schedulable loads at time t.
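The utility's per-step net revenue, with the balance rule of equation (11) substituted for the purchased power, can be sketched as follows; the function name and sample values are assumptions for illustration:

```python
def utility_net_revenue(nonsched, sched, eta_t):
    """Net revenue U_t of the utility at one time step (sketch).

    nonsched / sched: lists of (theta, p) pairs (retail price, consumption)
    for the two load types. Revenue is retail sales to both load types
    minus the wholesale cost of the purchased power; by equation (11) the
    total purchased power equals the total load-side consumption.
    """
    revenue = sum(th * p for th, p in nonsched) + sum(th * p for th, p in sched)
    p_tot = sum(p for _, p in nonsched) + sum(p for _, p in sched)  # eq. (11)
    return revenue - eta_t * p_tot

U_t = utility_net_revenue([(1.0, 2.0)], [(0.9, 1.5)], eta_t=0.8)
assert abs(U_t - ((2.0 + 1.35) - 0.8 * 3.5)) < 1e-9
```

Note that profit per load is simply (retail price − wholesale price) × consumption, so the utility's pricing incentive and the loads' curtailment incentive pull in opposite directions, which is what the weighted social objective below arbitrates.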
in modeling the utility and load units, it can be found that price-based demand responses are closely related to revenue for the utility and cost of the load units. From a social perspective, therefore, the goal of the system is to maximize the social benefit including the combined cost of the utility revenue and load, as shown in equation (12), where equation (12) is as follows:
where ρ ∈ [0, 1] is a weight parameter representing the relative social value of the utility's revenue and the load units' comprehensive cost. The larger ρ is, the more the society values the utility's profit; conversely, the smaller ρ is, the more the comprehensive cost of the loads influences the social welfare.
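The per-step social welfare used as the return, r_t = ρU_t − (1 − ρ)C_t (stated explicitly in the return set and in claim 3), can be sketched directly; the function name is an assumption:

```python
def immediate_reward(U_t, C_t, rho):
    """Immediate return r_t = rho * U_t - (1 - rho) * C_t.

    rho in [0, 1] weighs the utility's net revenue U_t against the
    load side's comprehensive cost C_t, as in equation (12).
    """
    assert 0.0 <= rho <= 1.0
    return rho * U_t - (1.0 - rho) * C_t

assert immediate_reward(U_t=2.0, C_t=1.0, rho=1.0) == 2.0   # only revenue counts
assert immediate_reward(U_t=2.0, C_t=1.0, rho=0.0) == -1.0  # only cost counts
```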
In order to establish a dynamic retail electricity price capable of adapting to flexible load changes in an unknown electricity market environment, the method first models the retail electricity market within a reinforcement learning framework.
Specifically, the electric power company acts as the agent; all load units serve as the environment; the retail price is the action the agent applies to the environment; the energy demand, energy consumption, and time of the loads constitute the state; and the social welfare (i.e., the weighted sum of the utility's revenue and the load units' comprehensive cost) is the reward.
Second, the dynamic retail pricing problem is modeled as a Markov decision process, which is typically the first step in applying a reinforcement learning algorithm. Without loss of generality, the Markov decision process is represented by the five-tuple <S, A, R, P, γ>, where each element has the following meaning:
1) State set: S = {s_1, s_2, …, s_T}, where s_t = (e_t, p_t, t) denotes the state at time t, consisting of the energy demand e_t of all load units, their energy consumption p_t, and the time index t.
2) Action set: A = {a_1, a_2, …, a_T}, where a_t = θ_t denotes the agent's action at time t, i.e. the retail electricity price θ_t set by the electric power company for all load units at time t.
3) Return set: R = {r_1, r_2, …, r_T}, where r_t = ρU_t − (1 − ρ)C_t denotes the return of the system at time t, i.e. the social welfare at the current time.
4) State transition probability: P(s′ | s, a) denotes the probability that, after taking action a in state s, the environment transitions to state s′ at the next time. Since the energy demand and energy consumption of the loads are influenced by many factors, the state transition probabilities are difficult to obtain. In the present invention the electricity market environment is unknown, so a model-free Q-learning approach is employed to solve the dynamic retail pricing problem.
5) Discount factor: γ ∈ [0, 1] denotes the importance of the subsequent reward relative to the current reward.
Defining a strategy pi: s → A, i.e. mapping of state to action, the pricing problem of retail electricity prices translates into finding an optimal strategy of π*Maximizing the cumulative return of the system, i.e.Since the goal of the system is to maximize the social benefit over the entire time period and the social value returned at any time is equal, γ is 1 in the present invention.
After modeling the dynamic retail price pricing problem for the utility as a markov decision process, a Q-learning algorithm (a model-free reinforcement learning algorithm) is used to analyze how the utility selects the optimal retail price while interacting with all load units to achieve the power system objective.
The basic principle of the Q-learning algorithm is to assign an action value function Q(s, a) to each state-action pair (s, a) and then update it in each iteration to obtain the optimal action value function Q*(s, a). The optimal action value function is defined as the maximum cumulative future discounted return obtained by starting from state s, taking action a, and following the optimal strategy π* thereafter; it satisfies the Bellman equation Q*(s, a) = r(s, a) + γ max_{a′∈A} Q*(s′, a′).
where s′ ∈ S and a′ ∈ A respectively denote the state and action at the next time; r(s, a) denotes the immediate return after taking action a in state s; and Q*(s′, a′) denotes the maximum cumulative future discounted return obtained by starting from state s′, taking action a′, and following the optimal strategy π* thereafter. γ ∈ [0, 1] is the discount factor, through which the algorithm accounts for the influence of the current retail price on the load response over a future period while also reflecting its influence on the immediate load response. Therefore, once the optimal action value function Q*(s_t, a_t) is obtained, the optimal retail electricity price strategy shown in formula (5) can be obtained directly.
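For illustration, one textbook Q-learning update toward the Bellman optimality target can be sketched as follows; the function name, the learning rate symbol alpha, and the toy state labels are assumptions, not the patent's notation:

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=1.0):
    """One Q-learning update toward the Bellman optimality target.

    Q[(s, a)] <- (1 - alpha) * Q[(s, a)]
                 + alpha * (r + gamma * max_a' Q[(s_next, a')])

    alpha is the learning rate and gamma the discount factor; the max
    over next actions is what lets the current price's effect on future
    load responses flow back into today's value estimate.
    """
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * best_next)
    return Q[(s, a)]

Q = defaultdict(float)
actions = (0.8, 1.0)
v = q_update(Q, s="s0", a=0.8, r=1.0, s_next="s1", actions=actions, alpha=0.5)
assert v == 0.5   # 0.5 * (1.0 + 1.0 * 0) against an all-zero table
```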
Corresponding to the embodiment of the method, the invention also discloses a system for determining the price demand response.
Referring to fig. 2, a schematic structural diagram of a price demand response-based determination system disclosed in an embodiment of the present invention, the system is applied to a processor in an electric power company, and the system includes:
a modeling unit 201, configured to model a dynamic retail price pricing problem of an electric power company as a markov decision process;
the action selection unit 202 is configured to monitor the states of all load units at the current time, record them as a first state, and, within the allowable retail price range, select a target retail-price-setting action for the current time using the electricity-price-selection-probability greedy strategy;
a return calculation unit 203, configured to calculate the immediate revenue return after the target retail-price-setting action is executed, monitor the states of all load units at the time following the current time, and record them as a second state;
a function updating unit 204, configured to update a reference action value function to a target action value function based on the first state, the second state, the target retail electricity price making action, and the immediate return of revenue, where the reference action value function is an action value function obtained through last iteration;
a first judging unit 205, configured to judge whether the current time reaches a terminal time;
a second determination unit 206, configured to determine whether the absolute value of the difference between the target action value function and the reference action value function is not greater than a difference threshold if the first judging unit 205 determines yes;
an electricity price policy determination unit 207, configured to, if the second determination unit 206 determines yes, take the target action value function as the optimal action value function and determine the optimal retail electricity price strategy from the optimal action value function;
and the energy consumption calculating unit 208 is used for calculating the optimal energy consumption of the dispatchable load according to the optimal retail electricity price strategy.
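As a sketch only, the cooperation of units 202 through 207 can be expressed as the loop below. The ε-greedy selection, the toy environment `toy`, the episode structure, and all numeric values are assumptions standing in for the real load units, not the claimed system:

```python
import random
from collections import defaultdict

def epsilon_greedy(Q, state, actions, eps):
    """Unit 202: with probability eps explore, otherwise pick the greedy price."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def train(env_step, actions, T, episodes, eps=0.1, alpha=0.1, gamma=1.0, tol=1e-4):
    """Iterate the units' loop until the Q-function change falls below tol.

    env_step(state, action) -> (reward, next_state) stands in for the real
    load-unit environment; its dynamics are whatever the caller supplies.
    """
    Q = defaultdict(float)
    for _ in range(episodes):
        state, max_delta = ("s", 1), 0.0
        for t in range(1, T + 1):                        # unit 205: run to terminal time
            a = epsilon_greedy(Q, state, actions, eps)   # unit 202: select price
            r, nxt = env_step(state, a)                  # unit 203: observe return
            target = r + gamma * max(Q[(nxt, a2)] for a2 in actions)
            delta = alpha * (target - Q[(state, a)])     # unit 204: update Q
            Q[(state, a)] += delta
            max_delta = max(max_delta, abs(delta))
            state = nxt
        if max_delta <= tol:                             # unit 206: convergence test
            break
    # Unit 207: the greedy policy of the converged Q is the price strategy.
    return Q

random.seed(0)
toy = lambda s, a: (a, ("s", s[1] % 3 + 1))  # toy dynamics, not the patent's model
Q = train(toy, actions=(0.8, 1.0), T=3, episodes=50)
assert max(Q.values()) > 0.0
```

In the toy environment the reward equals the chosen price, so the greedy policy converges to the highest price; with the real load model the demand reduction of equation (7) would penalize over-pricing.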
It should be noted that, for the specific working principle of each component in the system embodiment, please refer to the corresponding part of the method embodiment, which is not described herein again.
In summary, the invention discloses a price-demand-response-based determination system. The dynamic retail electricity pricing problem of an electric power company is modeled as a Markov decision process. The states of all load units at the current time are monitored and recorded as a first state; within the allowable retail price range, a target retail-price-setting action for the current time is selected using the electricity-price-selection-probability greedy strategy; the immediate revenue return after executing the target retail-price-setting action is calculated; the states of all load units at the next time are monitored and recorded as a second state; and, based on the first state, the second state, the target retail-price-setting action, and the immediate revenue return, the reference action value function obtained in the previous iteration is updated to a target action value function. When the current time reaches the terminal time and the absolute value of the difference between the target action value function and the reference action value function is no greater than the difference threshold, the target action value function is taken as the optimal action value function, the optimal retail electricity price strategy is determined from it, and the optimal energy consumption of the schedulable loads is then calculated from that strategy, thereby realizing determination based on price demand response.
Because the dynamic retail pricing problem of the power company is modeled as a Markov decision process, determining the optimal retail electricity price strategy from the optimal action value function accounts not only for the influence of the current electricity price on the load's immediate response but also for its influence on the load's response over a future period, so the uncertainty and flexibility of a dynamic electricity market are faithfully captured and the accuracy of price-based demand response is improved.
It should be noted that, besides the field of energy management on the demand side, the present invention can also be applied to decision-making problems in other unknown environments in the smart grid, such as power balance on both sides of supply and demand, scheduling problems of optimal generator sets, and the like.
The state space, the action space and the return definition in the Markov decision process are not unique, and can be redefined according to other targets of a system or an individual; in addition, the selection of the learning rate in the Q-learning algorithm has a great influence on the convergence of the algorithm, so that the selection of the learning rate can be further analyzed and discussed.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (11)
1. A method for determining a response based on a price demand, comprising:
modeling a dynamic retail electricity price pricing problem of an electric power company as a Markov decision process;
monitoring the states of all load units at the current moment, recording the states as a first state, and selecting a target retail electricity price making action at the current moment by using an electricity price selection probability-greedy strategy within an allowable retail electricity price range;
calculating the profit return immediately after the target retail electricity price making action is executed, monitoring the states of all load units at the next moment of the current moment, and recording as a second state;
updating a reference action value function into a target action value function based on the first state, the second state, the target retail electricity price making action and the immediate return of income, wherein the reference action value function is an action value function obtained by last iteration;
judging whether the current time reaches the terminal time;
if so, judging whether the absolute value of the difference between the target action value function and the reference action value function is not greater than a difference threshold value;
if so, taking the target action value function as an optimal action value function, and determining an optimal retail electricity price strategy according to the optimal action value function;
and calculating the optimal energy consumption of the dispatchable load according to the optimal retail electricity price strategy.
2. The determination method according to claim 1, wherein the electricity price selection probability-greedy strategy specifically means: randomly selecting one retail price from the action set with probability ε, or selecting the retail price corresponding to the maximum action value function with probability 1 − ε, where ε denotes the electricity price selection probability.
3. The determination method according to claim 1, wherein the immediate revenue return r_t is expressed as follows:
rt=ρUt-(1-ρ)Ct;
where ρ ∈ [0, 1] is a weight parameter representing the relative social value of the utility's revenue and the load units' comprehensive cost, U_t denotes the net revenue of the utility at time t, and C_t denotes the comprehensive cost of the load side at time t.
4. The determination method according to claim 3, wherein the net revenue U_t of the utility at time t is expressed as follows:
where N_n is the non-schedulable load set; θ_{n,t}^{n} denotes the retail electricity price received by non-schedulable load n at time t; p_{n,t}^{n} denotes the energy consumption of non-schedulable load n at time t, the superscript n denoting the non-schedulable load identifier, the subscript n the load unit index, and the subscript t the time index; p_{n,t}^{d} denotes the energy consumption of schedulable load n at time t, the superscript d denoting the schedulable load identifier; N_d is the schedulable load set; θ_{n,t}^{d} denotes the retail electricity price received by schedulable load n at time t; η_t denotes the wholesale electricity price at time t and satisfies the relation given in the specification; and p_t^{tot} denotes the total power purchased by the utility from the grid operator at time t, the superscript tot denoting the total purchased power identifier;
and the comprehensive cost C_t of the load side at time t is expressed as follows:
5. The determination method according to claim 4, wherein the dissatisfaction level φ_{n,t}^{d} is expressed as follows:
where φ_{n,t}^{d} denotes the dissatisfaction level caused to schedulable load n by reducing its energy consumption demand at time t; α_n and β_n denote the two dissatisfaction coefficients of schedulable load n; Δe_{n,t}^{d} denotes the demand reduction amount of schedulable load n; e_{n,t}^{d} denotes the energy demand of schedulable load n at time t; the superscript d denotes the schedulable load identifier; N_d is the schedulable load set; and p_{n,t}^{d} denotes the energy consumption of schedulable load n at time t.
6. The determination method according to claim 5, wherein the demand reduction Δe_{n,t}^{d} of schedulable load n satisfies the following inequality:
7. The determination method according to claim 1, wherein the target action value function is expressed as follows:
where the target action value function Q_k(s_t, a_t) denotes, at the k-th iteration, the cumulative future discounted return of starting from the state s_t of all load units and executing the target retail-price-setting action a_t; γ denotes the discount factor; the learning rate α ∈ [0, 1] represents the degree to which the newly acquired Q_k value overrides the Q_{k−1} value; Q_{k−1}(s_t, a_t) denotes the reference action value function; s_{t+1} denotes the state of all load units at time t + 1; a_{t+1} denotes the retail-price-setting action at time t + 1; and Q_{k−1}(s_{t+1}, a_{t+1}) denotes the cumulative future discounted return, at iteration k − 1, of starting from the state s_{t+1} of all loads and executing a_{t+1}.
8. The determination method according to claim 1, wherein the optimal retail electricity price strategy is expressed as follows:
where π*(s_t) is the optimal retail electricity price strategy, Q*(s_t, a_t) is the optimal action value function, A is the action set with A = {a_1, a_2, …, a_T}, the time t takes the values t = 1, 2, …, T with T denoting the total number of time intervals, s_t denotes the state of all load units at time t, and a_t denotes the retail-price-setting action at time t.
9. The determination method according to claim 1, wherein the optimal energy consumption is expressed as follows:
where p_{n,t}^{d*} denotes the optimal energy consumption, the subscript n is the load unit index, the subscript t is the time index, and the superscript d is the schedulable load identifier; e_{n,t}^{d} denotes the energy demand of schedulable load n at time t; θ_{n,t}^{d} denotes the retail electricity price received by schedulable load n at time t; η_t denotes the wholesale electricity price at time t and satisfies the relation given in the specification; and μ_t is the electricity price elasticity coefficient, indicating the rate at which the energy demand changes with the retail electricity price at time t.
10. The determination method according to claim 1, further comprising: initializing the action value function, which specifically includes:
obtaining known prior parameter data, substituting the prior parameter data into the predetermined action value function, and initializing the action value function, wherein the initial value of the action value function is 0.
11. A price demand response-based determination system, comprising:
the modeling unit is used for modeling the dynamic retail electricity price pricing problem of the power company into a Markov decision process;
the action selection unit is used for monitoring the states of all load units at the current moment, recording the states as first states, and selecting a target retail price at the current moment to make an action by using a price selection probability-greedy strategy within an allowable retail price range;
the return calculating unit is used for calculating the return immediately after the target retail price making action is executed, monitoring the states of all load units at the next moment of the current moment and recording the states as a second state;
a function updating unit, configured to update a reference action value function to a target action value function based on the first state, the second state, the target retail electricity price making action, and the immediate return of revenue, where the reference action value function is an action value function obtained through last iteration;
the first judgment unit is used for judging whether the current moment reaches the terminal moment;
a second judgment unit configured to judge whether the absolute value of the difference between the target action value function and the reference action value function is not greater than a difference threshold, in a case where the first judgment unit judges yes;
the electricity price strategy determining unit is used for taking the target action value function as an optimal action value function under the condition that the second judging unit judges that the electricity price strategy is positive, and determining an optimal retail electricity price strategy according to the optimal action value function;
and the energy consumption calculating unit is used for calculating the optimal energy consumption of the dispatchable load according to the optimal retail electricity price strategy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110366209.XA CN113052638B (en) | 2021-04-06 | 2021-04-06 | Price demand response-based determination method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113052638A true CN113052638A (en) | 2021-06-29 |
CN113052638B CN113052638B (en) | 2023-11-24 |
Family
ID=76517587
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110366209.XA Active CN113052638B (en) | 2021-04-06 | 2021-04-06 | Price demand response-based determination method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113052638B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190036488A (en) * | 2017-09-27 | 2019-04-04 | 한양대학교 에리카산학협력단 | Real-time decision method and system for industrial load management in a smart grid |
CN110378058A (en) * | 2019-07-26 | 2019-10-25 | 中民新能投资集团有限公司 | A kind of method for building up for the electro thermal coupling microgrid optimal response model comprehensively considering reliability and economy |
KR20190132193A (en) * | 2018-05-18 | 2019-11-27 | 한양대학교 에리카산학협력단 | A Dynamic Pricing Demand Response Method and System for Smart Grid Systems |
CN111105126A (en) * | 2019-10-30 | 2020-05-05 | 国网浙江省电力有限公司舟山供电公司 | Power grid service value making method based on reinforcement learning of user side demand response |
Non-Patent Citations (2)
Title |
---|
Ming Jin et al., "Microgrid to enable optimal distributed energy retail and end-user demand response", Applied Energy * |
Zhai Yafei; Liu Jichun; Liu Junyong, "Pricing strategy of electricity retailers under multiple electricity price forms and load types", Distribution & Utilization (供用电), no. 08 * |
Also Published As
Publication number | Publication date |
---|---|
CN113052638B (en) | 2023-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lu et al. | A dynamic pricing demand response algorithm for smart grid: Reinforcement learning approach | |
Ahrarinouri et al. | Multiagent reinforcement learning for energy management in residential buildings | |
Celik et al. | Electric energy management in residential areas through coordination of multiple smart homes | |
Yang et al. | Decision-making for electricity retailers: A brief survey | |
Meng et al. | A profit maximization approach to demand response management with customers behavior learning in smart grid | |
Yousefi et al. | Optimal real time pricing in an agent-based retail market using a comprehensive demand response model | |
Hatami et al. | A stochastic-based decision-making framework for an electricity retailer: Time-of-use pricing and electricity portfolio optimization | |
JP6236585B2 (en) | Load forecast from individual customer to system level | |
EP3580719A2 (en) | Methods and systems for an automated utility marketplace platform | |
Wan et al. | Price-based residential demand response management in smart grids: A reinforcement learning-based approach | |
Lu et al. | A reinforcement learning-based decision system for electricity pricing plan selection by smart grid end users | |
KR20190132193A (en) | A Dynamic Pricing Demand Response Method and System for Smart Grid Systems | |
Ruan et al. | Time-varying price elasticity of demand estimation for demand-side smart dynamic pricing | |
Yang et al. | Quantifying the benefits to consumers for demand response with a statistical elasticity model | |
Reddy et al. | Computational intelligence for demand response exchange considering temporal characteristics of load profile via adaptive fuzzy inference system | |
Liu et al. | A home energy management system incorporating data-driven uncertainty-aware user preference | |
Oh et al. | A multi-use framework of energy storage systems using reinforcement learning for both price-based and incentive-based demand response programs | |
He et al. | An occupancy-informed customized price design for consumers: A stackelberg game approach | |
Li et al. | Reinforcement learning aided smart-home decision-making in an interactive smart grid | |
CN116227806A (en) | Model-free reinforcement learning method based on energy demand response management | |
KR20180044700A (en) | Demand response management system and method for managing customized demand response program | |
Ahmed et al. | Building load management clusters using reinforcement learning | |
Henni et al. | Industrial peak shaving with battery storage using a probabilistic forecasting approach: Economic evaluation of risk attitude | |
Xiang et al. | Smart households' available aggregated capacity day-ahead forecast model for load aggregators under incentive-based demand response program | |
Wang et al. | Coordinated residential energy resource scheduling with human thermal comfort modelling and renewable uncertainties |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||