CN109461019A - Dynamic demand response pricing method based on fuzzy reinforcement learning - Google Patents

Dynamic demand response pricing method based on fuzzy reinforcement learning

Info

Publication number
CN109461019A
CN109461019A (application CN201811109049.5A)
Authority
CN
China
Prior art keywords
load
model
fuzzy
demand response
reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811109049.5A
Other languages
Chinese (zh)
Inventor
邱守强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201811109049.5A
Publication of CN109461019A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283 Price estimation or determination
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315 Needs-based resource requirements planning or analysis
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Educational Administration (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a dynamic demand response pricing method based on fuzzy reinforcement learning, comprising the steps of: S1, establishing a hierarchical power market model, including a fuzzy load demand response model, a load aggregator optimization model, and an objective function model; S2, solving the model established in step S1 with a fuzzy reinforcement learning algorithm to obtain the optimal retail electricity price. The invention finds a reasonable electricity price while accounting for the fuzzy uncertainty of load response. Addressing the failure of existing dynamic demand response pricing models to consider this uncertainty, it proposes a fuzzy load demand response model, a load aggregator optimization model and an objective function model, together with dynamic demand response pricing steps based on fuzzy reinforcement learning. The method not only fully considers the uncertainty of load response, but also adapts to the dynamically changing electricity market environment, improves computational efficiency, and finds a real-time optimal pricing strategy through optimization, thereby improving grid reliability and reducing energy imbalance.

Description

Dynamic demand response pricing method based on fuzzy reinforcement learning
Technical Field
The invention relates to a dynamic demand response pricing method based on fuzzy reinforcement learning.
Background
With the development of power distribution network communication technology, demand-side response offers flexible adjustment at the load end and has become an effective way to improve grid reliability and reduce energy losses. Price-based demand response lets users change their electricity consumption patterns according to an electricity price signal that varies in real time, thereby reshaping the load curve. Dynamic demand response pricing is a decision-making process that seeks a reasonable electricity price for allocating the system's power services. Existing demand response pricing schemes often employ deterministic models, such as time-of-use pricing, which poorly reflect the uncertainty of energy in a real-time dynamic market. Dynamic pricing models typically rely on linear pricing, lack a well-reasoned pricing process, and fail to capture the complexity of the demand response distribution. It is therefore necessary to build a demand response model that reflects the uncertainty of load demand response.
Reinforcement Learning (RL) is an artificial intelligence technique. It is a branch of machine learning inspired by behavioral psychology and is well suited to decision problems: an agent maximizes the reward of its decisions by continually taking actions in an uncertain environment. Applying reinforcement learning to a pricing model helps to fully account for the uncertainty and flexibility of the power market, and it can be used to solve the dynamic demand response pricing problem under uncertainty.
Disclosure of Invention
The invention aims to overcome the shortcomings of traditional dynamic demand response pricing models and provides a dynamic demand response pricing method based on a fuzzy reinforcement learning algorithm, which incorporates the uncertainty and flexibility of the power market into the electricity price decision.
The technical scheme adopted by the invention is as follows:
a dynamic demand response pricing method based on fuzzy reinforcement learning comprises the following steps:
s1, establishing a hierarchical power market model, comprising a fuzzy load demand response model, a load aggregator optimization model, and an objective function model;
and S2, solving the model established in step S1 with a fuzzy reinforcement learning algorithm to obtain the optimal retail electricity price.
Further, in step S1, establishing the fuzzy load demand response model specifically comprises:
s11, establishing the base load model and the interruptible load model:
In the fuzzy load demand response model, the load comprises an interruptible load and a base load that does not participate in demand response. Since the base load does not respond to price, its model is:

$$e^b_{t,n} = E^b_{t,n}$$

where $e^b_{t,n}$ and $E^b_{t,n}$ respectively denote the energy consumption and the actual energy demand of user $n$ in period $t$; $t \in \{1, 2, 3 \dots T\}$, with $T$ the total number of periods in a day; $n \in \{1, 2, 3 \dots N\}$, with $N$ the total number of users; the superscript $b$ denotes base load;
the interruptible load model satisfies:

$$\xi_t = (\xi_a, \xi_b, \xi_c), \qquad \xi_a, \xi_b, \xi_c < 0, \qquad \lambda_{t,n} \ge \pi_t$$

where $E[\cdot]$ denotes the fuzzy expected value; $e^c_{t,n}$ and $E^c_{t,n}$ respectively denote the interruptible energy consumption and energy demand of user $n$ in period $t$; $\xi_t$ is the price elasticity coefficient of period $t$, a triangular fuzzy number whose value is less than zero; $\lambda_{t,n}$ is the retail electricity price for user $n$ in period $t$; $\pi_t$ is the wholesale electricity price in period $t$; the superscript $c$ denotes interruptible load; and the subscripts $a$, $b$, and $c$ denote the start, middle, and end points of the triangular fuzzy number, respectively;
s12, determining the user's minimum-cost objective model from the base load model and the interruptible load model, in which the first term is the expected value of the total actual load consumption and the second the dissatisfaction degree of user $n$ in period $t$, subject to

$$\alpha_n > 0, \quad \beta_n > 0$$

where $\alpha_n$ and $\beta_n$ are parameters characterizing the user's reaction to the amount of load curtailed, and $D_{\min}$ and $D_{\max}$ denote the minimum and maximum load-shedding amounts of the load, respectively.
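The fuzzy expected-value operator above can be made concrete with a short sketch. The Python snippet below is illustrative only: it uses the standard credibilistic expected value of a triangular fuzzy number, $E[\xi] = (a + 2b + c)/4$, and assumes a linear price-elasticity form for the interruptible load response, since the patent's own response equation is shown only as an image. The function names and the clipping at zero are choices of this sketch, not part of the patent.

```python
def fuzzy_expectation(tri):
    """Credibilistic expected value of a triangular fuzzy number
    (a, b, c): E[xi] = (a + 2b + c) / 4."""
    a, b, c = tri
    return (a + 2.0 * b + c) / 4.0

def expected_interruptible_load(E_c, xi_tri, lam, pi):
    """Expected interruptible consumption under an ASSUMED linear
    elasticity response E[e^c] = E^c * (1 + E[xi] * (lam - pi) / pi),
    clipped at zero; xi < 0, so a retail price above the wholesale
    price reduces consumption."""
    xi = fuzzy_expectation(xi_tri)
    return max(0.0, E_c * (1.0 + xi * (lam - pi) / pi))

# Example: xi_t = (-0.8, -0.5, -0.3), wholesale price 0.5, retail price 0.6
print(expected_interruptible_load(E_c=10.0, xi_tri=(-0.8, -0.5, -0.3),
                                  lam=0.6, pi=0.5))   # about 8.95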
Further, in step S1, the load aggregator optimization model is established to maximize the profit the aggregator earns from the spread between the retail and wholesale electricity prices; the specific model is as follows:
further, in step S1, when the cost of the user and the profit of the load aggregator are considered simultaneously, the objective function model is:
in the formula, rho epsilon [0,1] represents the weight relation of the user cost and the load aggregation quotient.
Further, step S2 specifically comprises:
step S21: initialize the parameters, including: the load energy demand $E_{t,n}$; the price elasticity coefficient $\xi_t$; the load-curtailment reaction parameters $\alpha_n$, $\beta_n$; the minimum and maximum load-shedding amounts $D_{\min}$, $D_{\max}$; the wholesale electricity price $\pi_t$; the reward weight factor $\theta$; and the weight $\rho$ between the user cost and the load aggregator;
step S22: initialize $Q(e_{t,n} \mid E_{t,n}, \lambda_{t,n})$ with every element of the Q table set to zero; set the period $t = 0$ and the iteration count $k = 0$;
step S23: observe the users' energy demand $E_{t,n}$ at $t = 1$;
step S24: select the retail electricity price $\lambda_{t,n}$ with a greedy strategy;
step S25: calculate the reward, i.e. the objective function; observe the users' energy demand $E_{t+1,n}$ in period $t + 1$ and update the FQ value;
step S26: judge whether the maximum period $T$ has been reached; if yes, go to the next step; otherwise set $t = t + 1$ and return to step S24;
step S27: judge whether the Q table has converged to its maximum; if yes, go to the next step; otherwise set $k = k + 1$ and return to step S23;
step S28: output the optimal retail prices for the $T$ periods of the day.
Further, in step S24, the state-action value function is:

$$V(x) = \max_a FQ(s_{k+1}, a)$$

where $FQ(\cdot)$ denotes the FQ value, a fuzzy expected value; $k$ is the iteration count; and $a$ is the action selected in state $s_{k+1}$;
the action is then selected by the greedy principle, where $x$ is a random number in the interval $[0, 1]$ and $\varepsilon$ is the exploration rate.
Further, in step S25, the FQ value is updated by:

$$FQ(s_k, a_k) \leftarrow FQ(s_k, a_k) + \alpha_k \left[ r(s_k, a_k) + \gamma \max_a FQ(s_{k+1}, a) - FQ(s_k, a_k) \right]$$

where $\alpha_k$ is the learning factor, $\gamma$ is the discount factor, and $r(s_k, a_k)$ is the reward for selecting action $a_k$ in state $s_k$.
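To make the selection and update rules concrete, here is a minimal Python sketch. It assumes a finite set of candidate retail prices as the action space and a dictionary-backed FQ table; `select_price` implements one standard reading of the $\varepsilon$-greedy rule (the patent's selection formula appears only as an image), and `update_fq` applies the update equation above with a constant learning rate standing in for $\alpha_k$. All names and data structures are assumptions of this sketch.

```python
import random
from collections import defaultdict

def select_price(FQ, state, prices, epsilon):
    """epsilon-greedy selection (step S24): draw a random x in [0,1);
    explore a random candidate price if x < epsilon, otherwise exploit
    the price with the largest FQ value in the current state."""
    if random.random() < epsilon:
        return random.choice(prices)
    return max(prices, key=lambda a: FQ[(state, a)])

def update_fq(FQ, s_k, a_k, reward, s_next, prices, alpha, gamma):
    """The FQ update quoted above:
    FQ(s_k,a_k) += alpha * [r + gamma * max_a FQ(s_next,a) - FQ(s_k,a_k)]"""
    v_next = max(FQ[(s_next, a)] for a in prices)  # V(x) = max_a FQ(s_{k+1}, a)
    FQ[(s_k, a_k)] += alpha * (reward + gamma * v_next - FQ[(s_k, a_k)])

# A defaultdict(float) Q table starts every entry at zero, matching step S22.
FQ = defaultdict(float)
```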
Compared with the prior art, the invention has the following beneficial effects:
during operation, the uncertainty of the load is fully considered, remedying the failure of existing dynamic demand response pricing models to account for the fuzzy uncertainty of load response. The method suits a power market environment that changes in real time, improves the rationality of dynamic pricing as well as the computational efficiency, finds a real-time optimal pricing strategy through the optimization algorithm, and thereby improves grid reliability and reduces energy imbalance.
Drawings
FIG. 1 is a schematic diagram of the hierarchical power market model.
Fig. 2 is a schematic flow chart of solving for the optimal retail electricity price with the fuzzy reinforcement learning algorithm.
Detailed Description
The embodiments are further described below with reference to the accompanying drawings.
A dynamic demand response pricing method based on fuzzy reinforcement learning comprises the following steps:
s1, establishing a hierarchical power market model, comprising a fuzzy load demand response model, a load aggregator optimization model, and an objective function model;
and S2, solving the model established in step S1 with a fuzzy reinforcement learning algorithm to obtain the optimal retail electricity price.
As shown in fig. 1, the power producer sells energy to the load aggregator at the wholesale price, and the load aggregator then sells it to consumers at the retail price. The information exchanged among the three parties consists mainly of purchase prices and electricity consumption. The information exchange and retail pricing decision mechanism between the load aggregator and the consumers is the dynamic load demand response pricing method based on fuzzy reinforcement learning provided by this embodiment.
Specifically, in step S1, establishing the fuzzy load demand response model comprises:
s11, establishing the base load model and the interruptible load model:
In the fuzzy load demand response model, the load comprises an interruptible load and a base load that does not participate in demand response. Since the base load does not respond to price, its model is:

$$e^b_{t,n} = E^b_{t,n}$$

where $e^b_{t,n}$ and $E^b_{t,n}$ respectively denote the energy consumption and the actual energy demand of user $n$ in period $t$; $t \in \{1, 2, 3 \dots T\}$, with $T$ the total number of periods in a day; $n \in \{1, 2, 3 \dots N\}$, with $N$ the total number of users; the superscript $b$ denotes base load;
the interruptible load model satisfies:

$$\xi_t = (\xi_a, \xi_b, \xi_c), \qquad \xi_a, \xi_b, \xi_c < 0, \qquad \lambda_{t,n} \ge \pi_t$$

where $E[\cdot]$ denotes the fuzzy expected value; $e^c_{t,n}$ and $E^c_{t,n}$ respectively denote the interruptible energy consumption and energy demand of user $n$ in period $t$; $\xi_t$ is the price elasticity coefficient of period $t$, a triangular fuzzy number whose value is less than zero; $\lambda_{t,n}$ is the retail electricity price for user $n$ in period $t$; $\pi_t$ is the wholesale electricity price in period $t$; the superscript $c$ denotes interruptible load; and the subscripts $a$, $b$, and $c$ denote the start, middle, and end points of the triangular fuzzy number, respectively;
s12, determining the user's minimum-cost objective model from the base load model and the interruptible load model, in which the first term is the expected value of the total actual load consumption and the second the dissatisfaction degree of user $n$ in period $t$, subject to

$$\alpha_n > 0, \quad \beta_n > 0$$

where $\alpha_n$ and $\beta_n$ are parameters characterizing the user's reaction to the amount of load curtailed, and $D_{\min}$ and $D_{\max}$ denote the minimum and maximum load-shedding amounts of the load, respectively.
Specifically, in step S1, the load aggregator optimization model is established to maximize the profit the aggregator earns from the spread between the retail and wholesale electricity prices; the specific model is as follows:
specifically, in step S1, when the cost of the user and the profit of the load aggregator are considered, the objective function model is:
in the formula, rho epsilon [0,1] represents the weight relation of the user cost and the load aggregation quotient.
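The weighted objective lends itself to a one-line reward function. A hedged sketch follows: the patent's exact formula appears only as an image, so the sign convention below, rewarding aggregator profit while penalizing user cost, is an assumption of this illustration.

```python
def weighted_reward(user_cost, aggregator_profit, rho):
    """Hypothetical weighted-sum objective: rho in [0, 1] trades off the
    load aggregator's profit against the users' cost. The sign convention
    is an assumption of this sketch, not the patent's formula."""
    return rho * aggregator_profit - (1.0 - rho) * user_cost
```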
Specifically, as shown in fig. 2, step S2 comprises:
step S21: initialize the parameters, including: the load energy demand $E_{t,n}$; the price elasticity coefficient $\xi_t$; the load-curtailment reaction parameters $\alpha_n$, $\beta_n$; the minimum and maximum load-shedding amounts $D_{\min}$, $D_{\max}$; the wholesale electricity price $\pi_t$; the reward weight factor $\theta$; and the weight $\rho$ between the user cost and the load aggregator;
step S22: initialize $Q(e_{t,n} \mid E_{t,n}, \lambda_{t,n})$ with every element of the Q table set to zero; set the period $t = 0$ and the iteration count $k = 0$;
step S23: observe the users' energy demand $E_{t,n}$ at $t = 1$;
step S24: select the retail electricity price $\lambda_{t,n}$ with a greedy strategy; the state-action value function is

$$V(x) = \max_a FQ(s_{k+1}, a)$$

where $FQ(\cdot)$ denotes the FQ value, a fuzzy expected value; $k$ is the iteration count; and $a$ is the action selected in state $s_{k+1}$; the action is then chosen by the greedy principle, where $x$ is a random number in the interval $[0, 1]$ and $\varepsilon$ is the exploration rate;
step S25: calculate the reward, i.e. the objective function; observe the users' energy demand $E_{t+1,n}$ in period $t + 1$ and update the FQ value by

$$FQ(s_k, a_k) \leftarrow FQ(s_k, a_k) + \alpha_k \left[ r(s_k, a_k) + \gamma \max_a FQ(s_{k+1}, a) - FQ(s_k, a_k) \right]$$

where $\alpha_k$ is the learning factor, $\gamma$ is the discount factor, and $r(s_k, a_k)$ is the reward for selecting action $a_k$ in state $s_k$;
step S26: judge whether the maximum period $T$ has been reached; if yes, go to the next step; otherwise set $t = t + 1$ and return to step S24;
step S27: judge whether the Q table has converged to its maximum; if yes, go to the next step; otherwise set $k = k + 1$ and return to step S23;
step S28: output the optimal retail prices for the $T$ periods of the day.
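Putting steps S21 through S28 together, the following Python sketch shows one way the training loop could be organized, reusing the `select_price` and `update_fq` helpers sketched earlier. The `env` object is hypothetical: it stands in for the market simulation, exposing the observed demand (`demand(t)`) and the reward of posting a price (`step(t, price)`); everything about its interface is an assumption of this sketch.

```python
from collections import defaultdict

def train_pricing_policy(env, prices, T, episodes,
                         alpha=0.1, gamma=0.9, epsilon=0.1):
    """Sketch of steps S21-S28: repeat day-long episodes, pricing each of
    the T periods epsilon-greedily and applying the FQ update."""
    FQ = defaultdict(float)                    # S22: Q table starts at zero
    for k in range(episodes):                  # S27 approximated by a fixed
        state = env.demand(1)                  # episode budget; S23: E_{1,n}
        for t in range(1, T + 1):              # S26: sweep the T periods
            price = select_price(FQ, state, prices, epsilon)   # S24
            r = env.step(t, price)             # S25: reward = objective value
            next_state = env.demand(t + 1)     # S25: observe E_{t+1,n}
            update_fq(FQ, state, price, r, next_state, prices, alpha, gamma)
            state = next_state
    return FQ                                  # S28: prices read off the table
```

In practice the outer loop would stop when the Q table converges (step S27) rather than after a fixed number of episodes, and the optimal retail price for each period is the FQ-maximizing action in that period's demand state.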
The load aggregator collects the consumers' electricity demand $E_{t,n}$ and initial parameters such as the user dissatisfaction coefficients, maximizes the objective function via the dynamic load demand response pricing method based on fuzzy reinforcement learning, issues the computed optimized retail price to the consumers, and feeds the electricity demand back to the power production department, which then uses it to guide power production.
The method searches for a reasonable electricity price while considering the fuzzy uncertainty of load response. Addressing the failure of dynamic demand response pricing models to account for this uncertainty, it proposes a load demand response model, a service provider model and an objective function model, and sets out the steps of a dynamic demand response pricing algorithm based on fuzzy reinforcement learning, so that the uncertainty of load response is fully considered and the method adapts to a dynamically changing power market environment.
Although the present invention has been described with reference to the above embodiments, it should be understood that the present invention is not limited to the above embodiments, and other embodiments and modifications may be made by those skilled in the art without departing from the scope of the present invention.

Claims (7)

1. A dynamic demand response pricing method based on fuzzy reinforcement learning is characterized by comprising the following steps:
s1, establishing a hierarchical power market model, comprising a fuzzy load demand response model, a load aggregator optimization model, and an objective function model;
and S2, solving the model established in step S1 with a fuzzy reinforcement learning algorithm to obtain the optimal retail electricity price.
2. The dynamic demand response pricing method based on fuzzy reinforcement learning of claim 1, wherein in step S1 establishing the fuzzy load demand response model specifically comprises:
s11, establishing the base load model and the interruptible load model:
in the fuzzy load demand response model, the load comprises an interruptible load and a base load that does not participate in demand response, the base load model being:

$$e^b_{t,n} = E^b_{t,n}$$

where $e^b_{t,n}$ and $E^b_{t,n}$ respectively denote the energy consumption and the actual energy demand of user $n$ in period $t$; $t \in \{1, 2, 3 \dots T\}$, with $T$ the total number of periods in a day; $n \in \{1, 2, 3 \dots N\}$, with $N$ the total number of users; the superscript $b$ denotes base load;
the interruptible load model satisfies:

$$\xi_t = (\xi_a, \xi_b, \xi_c), \qquad \xi_a, \xi_b, \xi_c < 0, \qquad \lambda_{t,n} \ge \pi_t$$

where $E[\cdot]$ denotes the fuzzy expected value; $e^c_{t,n}$ and $E^c_{t,n}$ respectively denote the interruptible energy consumption and energy demand of user $n$ in period $t$; $\xi_t$ is the price elasticity coefficient of period $t$, a triangular fuzzy number whose value is less than zero; $\lambda_{t,n}$ is the retail electricity price for user $n$ in period $t$; $\pi_t$ is the wholesale electricity price in period $t$; the superscript $c$ denotes interruptible load; and the subscripts $a$, $b$, and $c$ denote the start, middle, and end points of the triangular fuzzy number, respectively;
s12, determining the user's minimum-cost objective model from the base load model and the interruptible load model, in which the first term is the expected value of the total actual load consumption and the second the dissatisfaction degree of user $n$ in period $t$, subject to

$$\alpha_n > 0, \quad \beta_n > 0$$

where $\alpha_n$ and $\beta_n$ are parameters characterizing the user's reaction to the amount of load curtailed, and $D_{\min}$ and $D_{\max}$ denote the minimum and maximum load-shedding amounts of the load, respectively.
3. The dynamic demand response pricing method based on fuzzy reinforcement learning according to claim 2, wherein in step S1 the load aggregator optimization model is established to maximize the profit the aggregator earns from the spread between the retail and wholesale electricity prices, the specific model being as follows:
4. The dynamic demand response pricing method based on fuzzy reinforcement learning according to claim 3, wherein in step S1, when the user's cost and the load aggregator's profit are considered simultaneously, the objective function model is:
where $\rho \in [0, 1]$ is the weight balancing the user cost against the load aggregator's profit.
5. The dynamic demand response pricing method based on fuzzy reinforcement learning according to claim 2, wherein step S2 specifically comprises:
step S21: initialize the parameters, including: the load energy demand $E_{t,n}$; the price elasticity coefficient $\xi_t$; the load-curtailment reaction parameters $\alpha_n$, $\beta_n$; the minimum and maximum load-shedding amounts $D_{\min}$, $D_{\max}$; the wholesale electricity price $\pi_t$; the reward weight factor $\theta$; and the weight $\rho$ between the user cost and the load aggregator;
step S22: initialize $Q(e_{t,n} \mid E_{t,n}, \lambda_{t,n})$ with every element of the Q table set to zero; set the period $t = 0$ and the iteration count $k = 0$;
step S23: observe the users' energy demand $E_{t,n}$ at $t = 1$;
step S24: select the retail electricity price $\lambda_{t,n}$ with a greedy strategy;
step S25: calculate the reward, i.e. the objective function; observe the users' energy demand $E_{t+1,n}$ in period $t + 1$ and update the FQ value;
step S26: judge whether the maximum period $T$ has been reached; if yes, go to the next step; otherwise set $t = t + 1$ and return to step S24;
step S27: judge whether the Q table has converged to its maximum; if yes, go to the next step; otherwise set $k = k + 1$ and return to step S23;
step S28: output the optimal retail prices for the $T$ periods of the day.
6. The dynamic demand response pricing method based on fuzzy reinforcement learning according to claim 5, wherein in step S24 the state-action value function is:

$$V(x) = \max_a FQ(s_{k+1}, a)$$

where $FQ(\cdot)$ denotes the FQ value, a fuzzy expected value; $k$ is the iteration count; and $a$ is the action selected in state $s_{k+1}$;
the action is then selected by the greedy principle, where $x$ is a random number in the interval $[0, 1]$ and $\varepsilon$ is the exploration rate.
7. The dynamic demand response pricing method based on fuzzy reinforcement learning of claim 6, wherein in step S25 the FQ value is updated by:

$$FQ(s_k, a_k) \leftarrow FQ(s_k, a_k) + \alpha_k \left[ r(s_k, a_k) + \gamma \max_a FQ(s_{k+1}, a) - FQ(s_k, a_k) \right]$$

where $\alpha_k$ is the learning factor, $\gamma$ is the discount factor, and $r(s_k, a_k)$ is the reward for selecting action $a_k$ in state $s_k$.
CN201811109049.5A 2018-09-21 2018-09-21 Dynamic demand response pricing method based on fuzzy reinforcement learning Pending CN109461019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811109049.5A CN109461019A (en) 2018-09-21 2018-09-21 Dynamic demand response pricing method based on fuzzy reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811109049.5A CN109461019A (en) 2018-09-21 2018-09-21 Dynamic demand response pricing method based on fuzzy reinforcement learning

Publications (1)

Publication Number Publication Date
CN109461019A true CN109461019A (en) 2019-03-12

Family

ID=65606862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811109049.5A Pending CN109461019A (en) 2018-09-21 2018-09-21 A kind of dynamic need response pricing method based on Fuzzy Reinforcement Learning

Country Status (1)

Country Link
CN (1) CN109461019A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105117810A (en) * 2015-09-24 2015-12-02 国网福建省电力有限公司泉州供电公司 Residential electricity consumption mid-term load prediction method under multistep electricity price mechanism
US20180012137A1 (en) * 2015-11-24 2018-01-11 The Research Foundation for the State University New York Approximate value iteration with complex returns by bounding
US10041844B1 (en) * 2017-04-07 2018-08-07 International Business Machines Corporation Fluid flow rate assessment by a non-intrusive sensor in a fluid transfer pump system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LU HAI et al.: "Demand-side time-of-use electricity price operation strategy considering the uncertainty of distributed generation grid integration", Electric Power Construction (《电力建设》) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404154A (en) * 2020-04-16 2020-07-10 南方电网科学研究院有限责任公司 Power distribution network power supply capacity optimization method, equipment and storage medium
CN111598721A (en) * 2020-05-08 2020-08-28 天津大学 Load real-time scheduling method based on reinforcement learning and LSTM network
CN114004497A (en) * 2021-11-01 2022-02-01 国网福建省电力有限公司厦门供电公司 Large-scale load demand response strategy, system and equipment based on meta reinforcement learning
CN114004497B (en) * 2021-11-01 2024-11-08 国网福建省电力有限公司厦门供电公司 Method, system and equipment for large-scale load demand response strategy based on meta reinforcement learning
CN114881688A (en) * 2022-04-25 2022-08-09 四川大学 Intelligent pricing method for power distribution network considering distributed resource interactive response
CN114881688B (en) * 2022-04-25 2023-09-22 四川大学 Intelligent pricing method for power distribution network considering distributed resource interaction response

Similar Documents

Publication Publication Date Title
CN109242193A (en) A kind of dynamic need response pricing method based on intensified learning
JP5413831B2 (en) Power trading management system, management apparatus, power trading method, and computer program for power trading
Samadi et al. Advanced demand side management for the future smart grid using mechanism design
Iria et al. Real-time provision of multiple electricity market products by an aggregator of prosumers
Moshari et al. Demand-side behavior in the smart grid environment
Klaassen et al. A methodology to assess demand response benefits from a system perspective: A Dutch case study
Qiu et al. Mean-field multi-agent reinforcement learning for peer-to-peer multi-energy trading
CN110852519A (en) Optimal profit method considering various types of loads for electricity selling companies
CN109461019A (en) A kind of dynamic need response pricing method based on Fuzzy Reinforcement Learning
Chen et al. Retail dynamic pricing strategy design considering the fluctuations in day-ahead market using integrated demand response
CN111695943B (en) Optimization management method considering floating peak electricity price
Qiu et al. Coordination for multienergy microgrids using multiagent reinforcement learning
Oh et al. A multi-use framework of energy storage systems using reinforcement learning for both price-based and incentive-based demand response programs
CN112132309B (en) Electricity purchasing optimization method and system for electricity selling company under renewable energy power generation quota system
Wang et al. Deep reinforcement learning for energy trading and load scheduling in residential peer-to-peer energy trading market
CN107220889A (en) The distributed resource method of commerce of microgrid community under a kind of many agent frameworks
Ren et al. Reinforcement Learning-Based Bi-Level strategic bidding model of Gas-fired unit in integrated electricity and natural gas markets preventing market manipulation
Khajeh et al. Blockchain-based demand response using prosumer scheduling
Salazar et al. Dynamic customer demand management: A reinforcement learning model based on real-time pricing and incentives
Chasparis et al. A cooperative demand-response framework for day-ahead optimization in battery pools
Razzak et al. Leveraging Deep Q-Learning to maximize consumer quality of experience in smart grid
Li et al. Online transfer learning-based residential demand response potential forecasting for load aggregator
CN112258210A (en) Market clearing method, device, equipment and medium under market one-side quotation
CN116362635A (en) Regional power grid source-load collaborative scheduling learning optimization method based on master-slave gaming
Oprea et al. A signaling game-optimization algorithm for residential energy communities implemented at the edge-computing side

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190312