CN110598925A - Energy storage in-trading market decision optimization method based on double-Q learning algorithm - Google Patents
Energy storage in-trading market decision optimization method based on double-Q learning algorithm
- Publication number
- CN110598925A CN110598925A CN201910832395.4A CN201910832395A CN110598925A CN 110598925 A CN110598925 A CN 110598925A CN 201910832395 A CN201910832395 A CN 201910832395A CN 110598925 A CN110598925 A CN 110598925A
- Authority
- CN
- China
- Prior art keywords
- energy storage
- decision
- double
- learning algorithm
- market
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000004146 energy storage Methods 0.000 title claims abstract description 59
- 238000000034 method Methods 0.000 title claims abstract description 24
- 238000005457 optimization Methods 0.000 title claims description 7
- 230000009471 action Effects 0.000 claims abstract description 39
- 230000006870 function Effects 0.000 claims abstract description 23
- 230000005611 electricity Effects 0.000 claims abstract description 15
- 230000008569 process Effects 0.000 claims abstract description 8
- 238000013178 mathematical model Methods 0.000 claims abstract description 4
- 238000012549 training Methods 0.000 claims description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 abstract description 7
- 229910052799 carbon Inorganic materials 0.000 abstract description 7
- 230000001186 cumulative effect Effects 0.000 abstract description 7
- 230000007774 longterm Effects 0.000 abstract description 2
- 230000002787 reinforcement Effects 0.000 description 4
- 238000007599 discharging Methods 0.000 description 3
- 230000008901 benefit Effects 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000035515 penetration Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000000630 rising effect Effects 0.000 description 1
- 230000035899 viability Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0637—Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0206—Price or cost determination based on market factors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0207—Discounts or incentives, e.g. coupons or rebates
- G06Q30/0226—Incentive systems for frequent usage, e.g. frequent flyer miles programs or point systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0283—Price estimation or determination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- Entrepreneurship & Innovation (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Tourism & Hospitality (AREA)
- Educational Administration (AREA)
- Software Systems (AREA)
- Operations Research (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Public Health (AREA)
- Water Supply & Treatment (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A method for optimizing energy storage decisions in the trading market based on a double-Q learning algorithm comprises the following steps: establish a mathematical model of energy storage decision-making in the trading market; describe the energy storage operation as a Markov decision process; using real historical market transaction price data, iteratively train on the two data sets with the Double-Q learning algorithm to obtain the trained Q-table; the energy storage then executes, from the trained Q-table, the action that maximizes the decision objective and obtains the cumulative reward under joint arbitrage. The Double-Q learning algorithm of the present invention uses two functions to iteratively update the Q-table, which reduces the impact of the overestimation problem of the Q-learning algorithm and makes the designed arbitrage strategy more stable, so that the long-term arbitrage income of the energy storage is higher. The source of arbitrage is not limited to the electricity market: the carbon market is also included, which significantly increases arbitrage income.
Description
Technical Field
The present invention belongs to the field of engineering technology.
Background Art
With the growing penetration of renewable resources, and given the high uncertainty of wind and solar generation, it is important to balance power supply and demand effectively. An energy storage system can continuously absorb energy and release it at appropriate times to meet large electricity demands, relieve overload on the power grid, optimize the configuration of the grid system, maintain fully stable grid operation, and serve the power needs of different users. As a complement to variable renewable energy, its economic viability is receiving increasing attention. One of the most frequently discussed revenue sources for energy storage is real-time price arbitrage: the storage exploits the spread in real-time electricity market prices, charging when prices are low and discharging when prices are high in order to earn a profit.
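For instance, charging 1 MWh at 200 CNY/MWh and later discharging it at 500 CNY/MWh would capture a gross spread of roughly 300 CNY before round-trip efficiency losses and degradation costs are deducted; these figures are purely illustrative and are not taken from the description.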
Because the growing share of intermittent renewable generation causes large fluctuations in real-time electricity market prices, the decision-making of energy storage in trading markets has received considerable attention from the research community. Even when price spreads rise, however, designing a good strategy that captures significant profit is not easy. The first approach that comes to mind is to forecast prices, but forecast accuracy is difficult to guarantee. Approximate dynamic programming has also been used to derive bidding strategies for energy storage without prior knowledge of the price distribution, but such strategies are often computationally expensive because of the high dimensionality of the state space. Reinforcement learning is an online learning technique distinct from supervised and unsupervised learning, and its Q-learning algorithm provides a data-driven framework for energy storage decision strategies.
Existing reinforcement-learning-based energy storage decision methods have the following shortcomings: decisions are made from electricity price information alone, so the source of decision information is single; and the Q-learning algorithm suffers from an overestimation problem, which makes its performance unstable.
Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art and to propose a method for optimizing energy storage decisions in the trading market based on a double-Q learning algorithm.
The present invention is achieved through the following technical solution.
The method for optimizing energy storage decisions in the trading market based on a double-Q learning algorithm according to the present invention comprises the following steps:
Step 1: Establish a mathematical model of energy storage decision-making in the trading market;
Step 2: Describe the energy storage operation as a Markov decision process;
Step 3: Using real historical market transaction price data, iteratively train on the two data sets with the Double-Q learning algorithm to obtain the trained Q-table;
Step 4: The energy storage executes, from the trained Q-table, the action that maximizes the decision objective and obtains the cumulative reward under joint arbitrage (a minimal sketch of this overall flow is given after this list).
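The following self-contained Python sketch illustrates steps 1-4. It is an illustration under stated assumptions rather than the patented implementation: the price discretization, the three-action space, the simplified immediate-profit reward, the hyper-parameters, and all function names are assumed here, and the storage-level and power constraints of the actual model are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
ACTIONS = (-1, 0, 1)  # charge, idle, discharge (illustrative action space)

def discretize(price, bins):
    # Map a continuous price to a discrete state index (assumed discretization).
    return int(np.digitize(price, bins))

def reward(price, action):
    # Simplified immediate reward: revenue when discharging, cost when charging.
    # The method's actual reward function is not reproduced in this text.
    return price * action

def train_double_q(prices, bins, episodes=3000, alpha=0.1, gamma=0.95, eps=0.1):
    n_s, n_a = len(bins) + 1, len(ACTIONS)
    q_a, q_b = np.zeros((n_s, n_a)), np.zeros((n_s, n_a))   # two Q-tables
    for _ in range(episodes):
        for t in range(len(prices) - 1):
            s, s_next = discretize(prices[t], bins), discretize(prices[t + 1], bins)
            # epsilon-greedy selection over the combined tables
            a = rng.integers(n_a) if rng.random() < eps else int(np.argmax(q_a[s] + q_b[s]))
            r = reward(prices[t], ACTIONS[a])
            # randomly pick which table to update, decoupling selection from evaluation
            if rng.random() < 0.5:
                best = int(np.argmax(q_a[s_next]))
                q_a[s, a] += alpha * (r + gamma * q_b[s_next, best] - q_a[s, a])
            else:
                best = int(np.argmax(q_b[s_next]))
                q_b[s, a] += alpha * (r + gamma * q_a[s_next, best] - q_b[s, a])
    return q_a + q_b

def greedy_rollout(prices, bins, q):
    # Step 4: execute the action with the largest Q value in each state.
    return sum(reward(p, ACTIONS[int(np.argmax(q[discretize(p, bins)]))]) for p in prices)

# Usage with synthetic prices standing in for the historical electricity/carbon data.
prices = rng.uniform(20.0, 80.0, size=500)
bins = np.linspace(20.0, 80.0, 10)
q_table = train_double_q(prices, bins, episodes=50)
print("cumulative reward:", greedy_rollout(prices, bins, q_table))
```

The random choice of which table to update in the inner loop is what distinguishes Double-Q learning from plain Q-learning.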
Further, said step 1 comprises the following steps:
Step 1-1: Determine the objective function of the energy storage decision;
Step 1-2: Determine the stored-energy constraint of the energy storage system;
Step 1-3: Determine the charging and discharging power constraints of the energy storage system.
Further, said step 2 comprises the following steps:
Step 2-1: Define the energy storage action as a function of price;
Step 2-2: Determine the energy storage state space;
Step 2-3: Determine the energy storage action space;
Step 2-4: Determine the action reward function (illustrative notation for these elements is sketched after this list).
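By way of illustration only (the notation below is assumed and is not the patent's own equations (5)-(7)), the elements of steps 2-1 to 2-4 can be written as

$$ s_t = \bigl(\lambda^{e}_{t},\ \lambda^{c}_{t},\ E_{t}\bigr), \qquad a_t \in \mathcal{A} = \{\text{charge},\ \text{idle},\ \text{discharge}\}, \qquad r_t = \lambda^{e}_{t}\bigl(P^{dis}_{t} - P^{ch}_{t}\bigr)\Delta t, $$

where $\lambda^{e}_{t}$ and $\lambda^{c}_{t}$ are the electricity and carbon prices, $E_{t}$ is the stored energy, and $P^{ch}_{t}$, $P^{dis}_{t}$ are the charging and discharging powers implied by the chosen action; in the joint-arbitrage setting of the invention the reward would additionally contain a carbon-market term, whose exact form is defined by the reward function of step 2-4.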
Further, said step 3 comprises the following steps:
Step 3-1: Determine the state of the energy storage system;
Step 3-2: Select an energy storage action according to the ε-greedy strategy;
Step 3-3: Randomly select one of the two functions to update the Q-table values; after 3000 iterations, the Q-value table is obtained.
Compared with the prior art, the beneficial effects of the present invention are: (1) the Double-Q learning algorithm uses two functions to iteratively update the Q-table, which reduces the impact of the overestimation problem of the Q-learning algorithm and makes the designed arbitrage strategy more stable, so that the long-term decision income of the energy storage is higher; (2) the price data are not limited to the electricity market but also include the carbon market, which significantly increases the cumulative reward.
Brief Description of the Drawings
Figure 1 is a flow chart of the decision-making method for energy storage in the trading market.
Figure 2 is a block diagram of the Markov decision process.
Figure 3 is a decision flow chart of the Double-Q learning algorithm.
Detailed Description of the Embodiments
The specific embodiments are described below in conjunction with the accompanying drawings and the working principle.
The method proposed by the present invention for optimizing energy storage decisions in the trading market based on a double-Q learning algorithm uses the double-Q learning algorithm to make decisions in the electricity and carbon market transactions of a given region so as to maximize the cumulative reward. The method flow chart is shown in Figure 1, and the method specifically comprises the following steps:
Step 1: Establish a mathematical model of energy storage decision-making in the electricity market and the carbon market;
Step 2: Describe the energy storage operation as a Markov decision process;
Step 3: Using the real historical carbon price and electricity price data of a regional trading market, iteratively train on the two data sets with the Double-Q learning algorithm to obtain the trained Q-table;
Step 4: The energy storage executes, from the trained Q-table, the action that maximizes the decision objective and obtains the cumulative reward under the decisions of this method.
Further, said step 1 comprises the following steps:
Step 1-1: Determine the objective function of joint energy storage arbitrage;
Step 1-2: Determine the stored-energy constraint of the energy storage system;
Step 1-3: Determine the charging and discharging power constraints of the energy storage system (an illustrative formulation of steps 1-1 to 1-3 is sketched below).
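The equations for steps 1-1 to 1-3 do not appear in the text above. As a generic illustration under assumed notation, a storage-arbitrage model with the same three ingredients can be written as

$$ \max_{P^{ch}_{t},\,P^{dis}_{t}}\ \sum_{t=1}^{T} \lambda_{t}\bigl(P^{dis}_{t} - P^{ch}_{t}\bigr)\Delta t $$

$$ E_{t+1} = E_{t} + \Bigl(\eta_{c}P^{ch}_{t} - \tfrac{P^{dis}_{t}}{\eta_{d}}\Bigr)\Delta t, \qquad E_{\min} \le E_{t} \le E_{\max} $$

$$ 0 \le P^{ch}_{t} \le \bar{P}^{ch}, \qquad 0 \le P^{dis}_{t} \le \bar{P}^{dis} $$

Here $\lambda_t$ is the market price, $E_t$ the stored energy, $\eta_c, \eta_d$ the charging and discharging efficiencies, and $\bar{P}^{ch}, \bar{P}^{dis}$ the power ratings; the invention's joint-arbitrage objective additionally includes carbon-market revenue, whose exact form is not shown here.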
Further, to describe the energy storage operation as a Markov decision process:
Step 2-1: Define the energy storage action as a function of price;
Step 2-2: Determine the energy storage state space function:
S = (P, Q) * E    (5)
Step 2-3: Determine the energy storage action space function;
Step 2-4: Determine the action reward function (one plausible instantiation is sketched below).
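As one plausible instantiation of the reward function of step 2-4 (the symbols, the way the carbon price enters, the per-step energy, and the efficiencies are assumptions made for illustration, not the patent's equation (7)):

```python
def step_reward(elec_price, carbon_price, action, energy_step, eta_c=0.95, eta_d=0.95):
    """Illustrative per-step reward for a charge (-1) / idle (0) / discharge (+1) action."""
    price = elec_price + carbon_price          # assumed way of combining the two markets
    if action > 0:                             # discharge: sell energy_step MWh
        return price * energy_step * eta_d
    if action < 0:                             # charge: buy energy_step MWh
        return -price * energy_step / eta_c
    return 0.0                                 # idle
```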
Further, according to the algorithm flow chart in Figure 3, said step 3 comprises the following steps:
Step 3-1: Obtain the historical price data of electricity and carbon market transactions in a given region, and determine the energy storage system state S according to the state space function (5). Unlike previous work, another kind of price information is added to the decision state space, so that decisions are made over two price distributions;
Step 3-2: Calculate the reward value of every action in each state according to the reward function (7); these values are used to determine the selection of subsequent actions, as shown in Table 1:
Table 1
Step 3-3: According to the ε-greedy strategy, the algorithm selects an action from the action function (6) at random with probability ε ∈ [0, 1], and with probability (1 − ε) selects the action with the largest reward value in the reward value table; this prevents the algorithm from iterating into a local optimum;
Step 3-4: Q-learning is a model-free reinforcement learning technique that can find an optimal action-selection policy for MDP problems. It learns through an action-value function and can ultimately give the desired action from the current state under the optimal policy. One of its advantages is that it can compare the expected values of actions without requiring a model of the environment. However, the max operation in standard Q-learning uses the same values both to select and to evaluate an action, which makes it more likely to pick overestimated values and thus leads to overly optimistic value estimates. To avoid this, selection and evaluation are decoupled: at every state update, one of the two functions (8) and (9) is chosen at random to update the Q-table values, so that action values are not overestimated. After 3000 iterations the Q-value table is obtained, as shown in Table 2; the action with the largest Q value in the table is selected, and the cumulative reward is obtained (the standard form of the two update rules is shown after Table 2).
Table 2
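The decoupled update described in step 3-4 matches standard Double Q-learning. Since functions (8) and (9) themselves are not reproduced above, the standard form (with learning rate $\alpha$ and discount factor $\gamma$) is shown here for reference and is presumably what the two functions correspond to:

$$ Q^{A}(s_t,a_t) \leftarrow Q^{A}(s_t,a_t) + \alpha\Bigl[r_t + \gamma\,Q^{B}\bigl(s_{t+1},\ \arg\max_{a} Q^{A}(s_{t+1},a)\bigr) - Q^{A}(s_t,a_t)\Bigr] $$

$$ Q^{B}(s_t,a_t) \leftarrow Q^{B}(s_t,a_t) + \alpha\Bigl[r_t + \gamma\,Q^{A}\bigl(s_{t+1},\ \arg\max_{a} Q^{B}(s_{t+1},a)\bigr) - Q^{B}(s_t,a_t)\Bigr] $$

With probability 1/2 the first rule is applied, otherwise the second, so the table that selects the maximizing action never evaluates it; this is what removes the systematic overestimation of the single-table maximum.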
Figure 2 is a block diagram of the Markov decision process. The aim of a Markov decision problem is to find an optimal policy, that is, a sequence of actions that maximizes the evaluation function. For the state S at each moment, the agent selects an appropriate action according to the optimal policy. In order to maximize the cumulative reward of the decision objective, the energy storage operation is described as a Markov decision process and its elements, such as states, actions, policies and rewards, are determined. The charging and discharging decision is defined as a function of the price information, and a double-Q learning strategy is designed to optimally control the real-time decisions of the energy storage system in the electricity market and the carbon market.
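In standard notation (assumed here, not quoted from the description), the optimal policy maximizes the expected discounted cumulative reward:

$$ \pi^{*} = \arg\max_{\pi}\ \mathbb{E}\Bigl[\sum_{t=0}^{T} \gamma^{t} r_{t} \,\Bigm|\, \pi\Bigr], $$

where $r_t$ is the reward obtained at step $t$ and $\gamma \in [0,1]$ is the discount factor.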
Figure 3 is the flow chart of the double-Q learning algorithm. Once the price data have been fed into the state space for training, the double-Q learning algorithm performs more stably than the Q-learning algorithm and reduces overestimation. When it is applied to joint energy storage arbitrage, the algorithm is first initialized, the current state is determined, and an energy storage action is selected with the ε-greedy action-selection strategy: with probability ε ∈ [0, 1] the algorithm selects an action at random, and with probability 1 − ε it selects the optimal action. The key lies in the two update functions used by double-Q learning: one is used to determine the value produced by an action, and the other is used to update the Q-value table.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910832395.4A CN110598925A (en) | 2019-09-04 | 2019-09-04 | Energy storage in-trading market decision optimization method based on double-Q learning algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910832395.4A CN110598925A (en) | 2019-09-04 | 2019-09-04 | Energy storage in-trading market decision optimization method based on double-Q learning algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110598925A true CN110598925A (en) | 2019-12-20 |
Family
ID=68857593
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910832395.4A Pending CN110598925A (en) | 2019-09-04 | 2019-09-04 | Energy storage in-trading market decision optimization method based on double-Q learning algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110598925A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529610A (en) * | 2020-11-23 | 2021-03-19 | 天津大学 | End-to-end electric energy trading market user decision method based on reinforcement learning |
CN119494467A (en) * | 2024-11-04 | 2025-02-21 | 北京瑞智德信息技术有限公司 | A method for energy system fault prediction based on knowledge graph |
CN119494467B (en) * | 2024-11-04 | 2025-05-13 | 北京瑞智德信息技术有限公司 | A knowledge graph-based method for energy system fault prediction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11581740B2 (en) | Method, system and storage medium for load dispatch optimization for residential microgrid | |
Way et al. | Empirically grounded technology forecasts and the energy transition | |
Han et al. | Mid-to-long term wind and photovoltaic power generation prediction based on copula function and long short term memory network | |
Qin et al. | Do the benefits outweigh the disadvantages? Exploring the role of artificial intelligence in renewable energy | |
CN112288164B (en) | Wind power combined prediction method considering spatial correlation and correcting numerical weather forecast | |
CN107563539A (en) | Short-term and long-medium term power load forecasting method based on machine learning model | |
CN110598929B (en) | Wind power nonparametric probability interval ultrashort term prediction method | |
CN102479347B (en) | Wind power plant short-term wind speed prediction method and system based on data driving | |
CN110837915B (en) | Low-voltage load point prediction and probability prediction method for power system based on hybrid integrated deep learning | |
CN113449919A (en) | Power consumption prediction method and system based on feature and trend perception | |
CN110188915A (en) | Method and system for optimal configuration of energy storage system in virtual power plant based on scenario set | |
CN110009160A (en) | An Electricity Price Prediction Method Based on Improved Deep Belief Network | |
CN115374995A (en) | Distributed photovoltaic and small wind power station power prediction method | |
CN114676941B (en) | Electric-heat load joint adaptive prediction method and device for integrated energy system in the park | |
CN109118120B (en) | A Multi-objective Decision-Making Method Considering the Sustainable Utilization of Reservoir Scheduling Scheme | |
CN114511132A (en) | Photovoltaic output short-term prediction method and prediction system | |
Li et al. | Research on a novel photovoltaic power forecasting model based on parallel long and short-term time series network | |
Sun et al. | Enhancing financial risk management through lstm and extreme value theory: A high-frequency trading volume approach | |
Gao et al. | Spatio-temporal interpretable neural network for solar irradiation prediction using transformer | |
CN115049115A (en) | RDPG wind speed correction method considering NWP wind speed transverse and longitudinal errors | |
CN110598925A (en) | Energy storage in-trading market decision optimization method based on double-Q learning algorithm | |
Liu et al. | Physics-informed reinforcement learning for probabilistic wind power forecasting under extreme events | |
CN116362136A (en) | Self-dispatching optimization method and system for independent energy storage system | |
CN114861555A (en) | Regional comprehensive energy system short-term load prediction method based on Copula theory | |
CN115713252B (en) | A method for optimizing comprehensive benefit evaluation scheme of hydro-wind-solar-storage multi-energy complementary system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20191220 |
|
RJ01 | Rejection of invention patent application after publication |