CN112258039B - Intelligent scheduling method for defective materials of power system based on reinforcement learning - Google Patents
- Publication number
- CN112258039B (application CN202011144804.0A)
- Authority
- CN
- China
- Prior art keywords
- materials
- reinforcement learning
- scheduling
- power system
- steps
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06312—Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Marketing (AREA)
- General Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Educational Administration (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Game Theory and Decision Science (AREA)
- Development Economics (AREA)
- Public Health (AREA)
- Water Supply & Treatment (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Feedback Control In General (AREA)
Abstract
The invention discloses a reinforcement-learning-based intelligent scheduling method for defective materials of a power system. The method comprises: defining the state, decision, transition equation, reward function, demand, and objective of the dynamic material-warehousing scheduling problem in reinforcement-learning terms; solving the dynamic material-warehousing scheduling problem using a Markov decision process; writing the Bellman equation for power-grid defective materials and selecting a solution strategy; and converting the Bellman equation into a data-driven online update form and determining the scheduling action based on an ε-greedy strategy. The invention addresses the joint control and scheduling problem for emergency materials of a power system on the basis of a Markov stochastic process and reinforcement learning. The algorithm is end-to-end: it does not forecast demand but makes inventory-control and scheduling decisions directly. The method was also verified on a real data set, where it showed good convergence and gain, which supports its usability and practical value.
Description
Technical Field
The invention relates to the technical field of power grid and artificial intelligence scheduling, in particular to an intelligent scheduling method for defective materials of a power system based on reinforcement learning.
Background
Statistical optimization method: the distribution of each kind of emergency demand is modeled from statistical regularities, and a warehousing allocation that is optimal on statistical average is computed through centralized mathematical modeling.
Data prediction method: drawing on data analysis and mining for each region, a sequence-to-sequence model is built for the different demands of each region using artificial intelligence and machine learning methods, so as to forecast the time series; the warehousing system and scheduling are then laid out and optimized centrally on the basis of the forecast.
The statistical optimization method requires complete statistics on every demand distribution in the region, and the optimal allocation must be recomputed whenever a state transition or an emergency occurs, so it consumes substantial computing resources, responds slowly, and has clear limitations. In the data prediction method, feature selection is traditionally based on feature ranking: the importance and relevance of each feature are computed, and the top k features are used as inputs for demand forecasting. The main drawback is that the features with the highest importance and relevance do not necessarily represent the global information of the system well, so the forecasting system is not given the richest available information. In addition, because the forecast is not the final result, a second computation is needed to derive the scheduling and control scheme from it, and this multi-step framework accumulates errors that bias the final result.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, the invention provides a reinforcement-learning-based intelligent scheduling method for defective materials of a power system, which can solve the joint control and scheduling problem for emergency materials of the power system.
To solve the above technical problem, the invention provides the following technical solution: defining the state, decision, transition equation, reward function, demand, and objective of the dynamic material-warehousing scheduling problem in reinforcement-learning terms; solving the dynamic material-warehousing scheduling problem using a Markov decision process; writing the Bellman equation for power-grid defective materials and selecting a solution strategy; and converting the Bellman equation into a data-driven online update form and determining the scheduling action based on an ε-greedy strategy.
In a preferred embodiment of the method, the state at the current time is defined as the materials stored in each warehouse, $S_t = Z \in \mathbb{R}^{n \times m}$, where $Z_{i,j}$ denotes the quantity of material $j$ in warehouse $i$ at the current time.
In a preferred embodiment of the method, given the current state $S_t \in \mathbb{R}^{n \times m}$ and the demand $Q \in \mathbb{R}^{n \times m}$, the warehousing system determines the scheduling scheme $X$ and the purchasing scheme $B$ for that time, where $X_{i,j}$ and $B_{i,j}$ denote, respectively, the outbound quantity and the purchase quantity of material $j$ in warehouse $i$ at the current time.
In a preferred embodiment of the method, after the warehousing system decides on a scheduling and purchasing scheme, the warehousing state undergoes a stochastic state transition at the next time step, and the transition equation is expressed as:
$$S_{t+1} = Z - X + B$$
where, because stored quantities cannot physically be negative and storage space is always limited, a valid decision $(X, B)$ must satisfy the following inequality:
In a preferred embodiment of the method, the main objective of the warehousing system is to meet the demand for emergency materials within and between regions, and the reward function is the lost revenue at the current time minus the cost of purchasing materials, as follows:
where the symbol $(x)^-$ is defined as:
In a preferred embodiment of the method, the MDP problem is solved as follows:
where $\gamma \in [0, 1)$ is the attenuation factor (discount factor).
In a preferred embodiment of the method, the Bellman equation is converted into a data-driven online update form, as follows:
$$V(S_t) \leftarrow (1 - \alpha_t) V(S_t) + \alpha_t \left[ r_t + \gamma V(S_{t+1}) \right]$$
where $\alpha_t$ is the learning rate at time $t$.
In a preferred embodiment of the method, the action is determined with the ε-greedy strategy: with probability $1 - \varepsilon$, the warehouse takes the currently best action so as to maximize $V(S_t)$.
In a preferred embodiment of the method, with probability $\varepsilon$ an action is selected at random, as follows:
where the random action provides exploration: exploring produces a variety of good and bad data from which knowledge is learned, thereby improving the current strategy.
The beneficial effects of the invention are as follows. The invention addresses the joint control and scheduling problem for emergency materials of a power system on the basis of a Markov stochastic process and reinforcement learning; the algorithm is end-to-end, in that it does not forecast demand but makes inventory-control and scheduling decisions directly. The proposed algorithm is an "online" algorithm, i.e., inventory-control and scheduling decisions rely only on observations of past events. It is also a "model-free" algorithm, independent of any assumed stochastic model of uncertain events. Finally, the method was verified on a real data set, where it showed good convergence and gain, which supports its usability and practical value.
Drawings
In order to illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort. In the drawings:
Fig. 1 is a schematic flowchart of the reinforcement-learning-based intelligent scheduling method for defective materials of a power system according to the first embodiment of the invention;
Fig. 2 is a schematic diagram of defective-material scheduling in the method according to the first embodiment of the invention;
Fig. 3 is a schematic diagram of reinforcement-learning scheduling of power-system defective materials in the method according to the first embodiment of the invention;
Fig. 4 is a schematic diagram comparing the profit of the warehousing system under different warehousing capacities for the method according to the second embodiment of the invention;
Fig. 5 is a schematic diagram comparing the profit of the warehousing system under different warehousing capacities (C) and attenuation coefficients (γ) for the method according to the second embodiment of the invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein, and it will be appreciated by those skilled in the art that the present invention may be practiced without departing from the spirit and scope of the present invention and that the present invention is not limited by the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not necessarily enlarged to scale, and are merely exemplary, which should not limit the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Meanwhile, in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and operate, and thus, cannot be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in a specific case to those of ordinary skill in the art.
Example 1
Referring to Figs. 1, 2 and 3, a first embodiment of the invention provides a reinforcement-learning-based intelligent scheduling method for defective materials of a power system, including:
S1: Define the state, decision, transition equation, reward function, demand, and objective of the dynamic material-warehousing scheduling problem in reinforcement-learning terms. It should be noted that:
the state, decision, transition equation, reward function, demand, and objective of the dynamic material-warehousing scheduling problem are defined within the reinforcement learning algorithm for scheduling power-system defective materials;
the state is the warehousing state and the material-defect state at time $t$, the decision is the scheduling scheme and purchasing scheme adopted at that time, and the transition equation describes how the state changes from one time step to the next;
the state at the current time is defined as the materials stored in each warehouse, $S_t = Z \in \mathbb{R}^{n \times m}$,
where $Z_{i,j}$ denotes the quantity of material $j$ in warehouse $i$ at the current time;
given the current state $S_t \in \mathbb{R}^{n \times m}$ and the demand $Q \in \mathbb{R}^{n \times m}$, the warehousing system determines the scheduling scheme $X$ and the purchasing scheme $B$ for that time, where $X_{i,j}$ and $B_{i,j}$ denote, respectively, the outbound quantity and the purchase quantity of material $j$ in warehouse $i$ at the current time.
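The definitions in S1 can be made concrete with a small data-layout sketch. The sizes, the stock and demand values, and the rule used to fill X and B below are all hypothetical and serve only to show that Z, Q, X, and B share the same n × m shape; they are not the decision rule of the invention.

```python
# Hypothetical sizes: n = 3 warehouses, m = 2 material types.
# State S_t = Z, where Z[i][j] is the stock of material j in warehouse i.
Z = [[5, 2],
     [0, 4],
     [3, 3]]

# Demand Q has the same n x m shape as the state.
Q = [[1, 0],
     [2, 1],
     [0, 3]]


def same_shape(a, b):
    """Return True if two nested lists have identical row and column counts."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


# Illustrative decision (an assumption, not the patent's policy):
# ship the requested amount up to the available stock, and purchase nothing.
X = [[min(z, q) for z, q in zip(z_row, q_row)] for z_row, q_row in zip(Z, Q)]
B = [[0 for _ in row] for row in Z]

assert same_shape(Z, Q) and same_shape(Z, X) and same_shape(Z, B)
print(X)  # [[1, 0], [0, 1], [0, 3]]
```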
S2: Solve the dynamic material-warehousing scheduling problem using a Markov decision process. It should be noted that in this step:
after the warehousing system decides on the scheduling and purchasing scheme, the warehousing state undergoes a stochastic state transition at the next time step, and the transition equation is expressed as:
$$S_{t+1} = Z - X + B$$
where, because stored quantities cannot physically be negative and storage space is always limited, a valid decision $(X, B)$ must satisfy the following inequality:
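The inequality itself does not survive in this text. The following is a minimal sketch of the transition equation together with one plausible reading of the constraint, under stated assumptions: quantities are non-negative, a warehouse cannot ship more than it holds, and the total stock of each warehouse after the transition may not exceed a per-warehouse `capacity`. The function names and the capacity parameter are hypothetical.

```python
def next_state(Z, X, B):
    """Apply the transition S_{t+1} = Z - X + B element-wise."""
    return [[z - x + b for z, x, b in zip(zr, xr, br)]
            for zr, xr, br in zip(Z, X, B)]


def is_feasible(Z, X, B, capacity):
    """Check an assumed form of the constraints on a decision (X, B).

    Assumptions (not taken from the source): shipments and purchases are
    non-negative, a warehouse cannot ship more than it holds, and the total
    stock of each warehouse after the transition may not exceed `capacity`.
    """
    for zr, xr, br in zip(Z, X, B):
        if any(x < 0 or b < 0 or x > z for z, x, b in zip(zr, xr, br)):
            return False
        if sum(z - x + b for z, x, b in zip(zr, xr, br)) > capacity:
            return False
    return True


Z = [[5, 2], [0, 4]]
X = [[1, 0], [0, 1]]
B = [[0, 3], [2, 0]]
print(next_state(Z, X, B))  # [[4, 5], [2, 3]]
print(is_feasible(Z, X, B, capacity=10))  # True
```

Checking feasibility before applying the transition keeps invalid decisions out of the learning data, whichever exact form the constraint takes.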
S3: Write the Bellman equation for power-grid defective materials and select a solution strategy. It should also be noted that:
the main objective of the warehousing system is to meet the demand for emergency materials within and between regions, and the reward function is the lost revenue at the current time minus the cost of purchasing materials, as follows:
where the symbol $(x)^-$ is defined as:
solving the MDP problem then gives:
where $\gamma \in [0, 1)$ is the attenuation factor (discount factor).
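To illustrate only the role of the attenuation factor γ, the sketch below computes the discounted sum of a finite reward sequence. It does not reproduce the Bellman equation of the invention, which is not present in this text; the function name and the reward values are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum gamma**t * r_t over a finite reward sequence, for gamma in [0, 1)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    total = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```

Because γ is strictly less than 1, rewards further in the future contribute less, and the sum stays bounded for bounded rewards.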
S4: Convert the Bellman equation into a data-driven online update form, and determine the scheduling action based on an ε-greedy strategy. It should be further noted that in this step:
the Bellman equation is converted into a data-driven online update form as follows:
$$V(S_t) \leftarrow (1 - \alpha_t) V(S_t) + \alpha_t \left[ r_t + \gamma V(S_{t+1}) \right]$$
where $\alpha_t$ is the learning rate at time $t$;
the action is determined with the ε-greedy strategy: with probability $1 - \varepsilon$, the warehouse takes the currently best action so as to maximize $V(S_t)$;
with probability $\varepsilon$, an action is selected at random, as follows:
where the random action provides exploration: exploring produces a variety of good and bad data from which knowledge is learned, thereby improving the current strategy.
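The ε-greedy selection described in S4 can be sketched as follows. This is a generic illustration, not the selection formula of the invention: the candidate actions, the `score` callable (standing in for an estimate such as r_t + γV(S_{t+1})), and all values are hypothetical.

```python
import random


def epsilon_greedy(candidates, score, epsilon, rng=random):
    """Pick a random candidate with probability epsilon, else the best-scoring one."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if rng.random() < epsilon:
        return rng.choice(candidates)      # explore: uniform random action
    return max(candidates, key=score)      # exploit: current best action


# Hypothetical usage: actions are labels, and the scores stand in for
# an estimate such as r_t + gamma * V(S_{t+1}) of each action.
estimated_value = {"ship_all": 4.0, "ship_half": 6.5, "hold": 1.0}
action = epsilon_greedy(list(estimated_value), estimated_value.get, epsilon=0.0)
print(action)  # ship_half, because epsilon = 0 always exploits
```

A larger ε produces more exploratory data at the cost of taking the currently best action less often.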
Preferably, in this embodiment a Bellman equation is formulated for power-grid defective materials and a solution strategy is selected: the Bellman equation is the mathematical form of the scheduling problem, and the solution strategy is chosen so that the optimal scheduling result is obtained more quickly. Converting the Bellman equation into a data-driven online update form means that the required data, such as material demand data and online warehousing data, can be accessed in real time, so that the Bellman equation adapts to the updated state and is better suited to the dynamic warehousing and scheduling problem of power-grid defective materials that the invention addresses; the scheduling action is then determined based on an ε-greedy strategy.
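The online update V(S_t) ← (1 − α_t)V(S_t) + α_t[r_t + γV(S_{t+1})] can be sketched with a tabular value function. Storing V in a dictionary keyed by a tuple encoding of the stock matrix is an assumption made for illustration, as are the reward, learning rate, and state values.

```python
from collections import defaultdict


def td_update(V, state, reward, next_state, alpha, gamma):
    """Apply V(S_t) <- (1 - alpha) V(S_t) + alpha [r_t + gamma V(S_{t+1})] in place."""
    target = reward + gamma * V[next_state]
    V[state] = (1.0 - alpha) * V[state] + alpha * target
    return V[state]


# Value table keyed by a hashable encoding of the stock matrix;
# states that have not been visited start at 0.
V = defaultdict(float)
s_t = ((5, 2), (0, 4))      # hypothetical stock levels before the decision
s_next = ((4, 5), (2, 3))   # stock levels after applying Z - X + B
V[s_next] = 10.0

new_value = td_update(V, s_t, reward=2.0, next_state=s_next, alpha=0.5, gamma=0.9)
print(new_value)  # 0.5 * 0 + 0.5 * (2 + 0.9 * 10) = 5.5
```

Each call uses only one observed transition, which is what allows the update to run as new demand and warehousing data arrive.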
Example 2
Referring to Figs. 4 and 5, a second embodiment of the invention, which differs from the first embodiment, provides a verification of the reinforcement-learning-based intelligent scheduling method for defective materials of a power system, including:
To better verify and explain the technical effect of the method, this embodiment runs a comparison test between a traditional greedy-algorithm-based intelligent scheduling method and the present method, and compares the test results to verify the actual effect of the method.
The traditional greedy-algorithm-based intelligent scheduling method has low convergence and gain. To verify that the present method achieves higher gain and better convergence than the traditional method, the two methods are measured and compared in real time in this embodiment.
Test conditions: (1) the monthly emergency-material demand, i.e., the monthly demand for defective materials, was collected for 15 areas within or around the jurisdiction of Guiyang City, Guizhou Province: Baiyun, Chengbei, Huaxi, Huishui, Jinyang, Kaiyang, Longli, Nanming, Qingzhen, Shuanglong, Wudang, Xifeng, Xiaohe, Xiuwen, and Yunyan;
(2) the state transition equation was adapted to the specific problem so as to fit the application scenario described in this embodiment;
(3) the automatic test equipment was started, and the simulation was run in MATLAB to output the curve diagrams.
Referring to Fig. 4, the solid line is the curve output by the method of the invention and the dotted line is the curve output by the traditional method. As the warehouse storage capacity increases, the average profit of both methods increases, but Fig. 4 shows that the solid line rises more markedly than the dotted line and always stays above it, which indicates that the method of the invention achieves higher gain than the traditional method.
Referring to Fig. 5, the profit of the method of the invention follows an increasing trend as the attenuation coefficient γ and the warehouse storage capacity increase; however, the profit is lowest when γ is 0.95 and highest when γ is 0.8. On this basis, the superiority of the method of the invention is verified.
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.
Claims (5)
1. A reinforcement-learning-based intelligent scheduling method for defective materials of a power system, characterized by comprising:
defining the state, decision, transition equation, reward function, demand, and objective of the dynamic material-warehousing scheduling problem in reinforcement learning;
solving the dynamic material-warehousing scheduling problem using a Markov decision process;
writing the Bellman equation for power-grid defective materials and selecting a solution strategy;
converting the Bellman equation into a data-driven online update form, and determining the scheduling action based on an ε-greedy strategy;
comprising:
where $\gamma \in [0, 1)$ is the attenuation factor;
the objective of the warehousing system is to meet the demand for emergency materials within and between regions, and the reward function is the lost revenue at the current time minus the cost of purchasing materials, as follows:
where the symbol $(x)^-$ is defined as:
where the state at the current time is defined such that $S_t$ represents the materials stored in each warehouse; $Z_{i,j}$ denotes the quantity of material $j$ in warehouse $i$ at the current time; $X_{i,j}$ and $B_{i,j}$ denote, respectively, the outbound quantity and the purchase quantity of material $j$ in warehouse $i$ at the current time; $S_{t+1}$ is given by the transition equation; $X$ and $B$ denote the scheduling scheme and the purchasing scheme determined by the warehousing system at that time; and the current state $S_t \in \mathbb{R}^{n \times m}$ and the demand $Q \in \mathbb{R}^{n \times m}$.
2. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system according to claim 1, characterized in that:
after the warehousing system decides on the scheduling and purchasing scheme, the warehousing state undergoes a stochastic state transition at the next time step, and the transition equation is expressed as:
$$S_{t+1} = Z - X + B.$$
3. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system according to claim 2, characterized in that:
the Bellman equation is converted into a data-driven online update form as follows:
$$V(S_t) \leftarrow (1 - \alpha_t) V(S_t) + \alpha_t \left[ r_t + \gamma V(S_{t+1}) \right]$$
where $\alpha_t$ is the learning rate at time $t$.
4. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system according to claim 3, characterized in that: the action is determined with the ε-greedy strategy, and with probability $1 - \varepsilon$ the warehouse takes the currently best action so as to maximize $V(S_t)$.
5. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system according to claim 4, characterized by further comprising:
with probability $\varepsilon$, an action is selected at random, as follows:
where the randomly selected action provides exploration: exploring produces a variety of good and bad data from which knowledge is learned, thereby improving the current strategy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011144804.0A CN112258039B (en) | 2020-10-23 | 2020-10-23 | Intelligent scheduling method for defective materials of power system based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011144804.0A CN112258039B (en) | 2020-10-23 | 2020-10-23 | Intelligent scheduling method for defective materials of power system based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112258039A (en) | 2021-01-22 |
CN112258039B (en) | 2022-07-22 |
Family
ID=74264872
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011144804.0A Active CN112258039B (en) | 2020-10-23 | 2020-10-23 | Intelligent scheduling method for defective materials of power system based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112258039B (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5552009B2 (en) * | 2010-09-22 | 2014-07-16 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Method, program, and apparatus for determining optimal action in consideration of risk |
CN110365057B (en) * | 2019-08-14 | 2022-12-06 | 南方电网科学研究院有限责任公司 | Distributed energy participation power distribution network peak regulation scheduling optimization method based on reinforcement learning |
CN110517002B (en) * | 2019-08-29 | 2022-11-15 | 烟台大学 | Production control method based on reinforcement learning |
CN111258734B (en) * | 2020-01-16 | 2022-09-23 | 中国人民解放军国防科技大学 | Deep learning task scheduling method based on reinforcement learning |
CN111382359B (en) * | 2020-03-09 | 2024-01-12 | 北京京东振世信息技术有限公司 | Service policy recommendation method and device based on reinforcement learning, and electronic equipment |
CN111639815B (en) * | 2020-06-02 | 2023-09-05 | 贵州电网有限责任公司 | Method and system for predicting power grid defect materials through multi-model fusion |
Also Published As
Publication number | Publication date |
---|---|
CN112258039A (en) | 2021-01-22 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN112258039B (en) | Intelligent scheduling method for defective materials of power system based on reinforcement learning | |
Li et al. | Solving multi-area environmental/economic dispatch by Pareto-based chemical-reaction optimization algorithm | |
CN103955864B (en) | Based on the electric system multiple target differentiation planing method for improving harmonic search algorithm | |
CN106849097A (en) | A kind of active distribution network tidal current computing method | |
CN112668129B (en) | Space load clustering-based intelligent grid dividing method for power distribution network | |
CN116207739B (en) | Optimal scheduling method and device for power distribution network, computer equipment and storage medium | |
CN113162090A (en) | Energy storage system capacity configuration optimization method considering battery module capacity | |
CN115374692B (en) | Double-layer optimization scheduling decision method for regional comprehensive energy system | |
CN113241759A (en) | Power distribution network and multi-microgrid robust scheduling method, electronic equipment and storage medium | |
CN109214565A (en) | A kind of subregion system loading prediction technique suitable for the scheduling of bulk power grid subregion | |
Ebell et al. | Reinforcement learning control algorithm for a pv-battery-system providing frequency containment reserve power | |
CN109615139A (en) | A kind of long-term electricity demand forecasting method in the resident based on cultural genetic algorithm | |
CN115693652A (en) | Power distribution network frame optimization method and device based on power balance and performance cost | |
CN116128315A (en) | Long-short term energy storage planning method, system, medium and equipment | |
CN117557009B (en) | Power efficiency monitoring method and system | |
CN106712050A (en) | Improved leapfrogging algorithm-based power grid reactive power optimization method and device | |
CN110298456A (en) | Plant maintenance scheduling method and device in group system | |
CN115622056A (en) | Energy storage optimization configuration method and system based on linear weighting and selection method | |
CN108805323A (en) | Data predication method and device | |
Jula et al. | Using CMAC to obtain dynamic mutation rate in a metaheuristic memetic algorithm to solve university timetabling problem | |
CN112200366A (en) | Load prediction method and device, electronic equipment and readable storage medium | |
CN113705067B (en) | Microgrid optimization operation strategy generation method, system, equipment and storage medium | |
CN117556969B (en) | Flexible power distribution network distributed reactive power optimization method based on probability scene driving | |
CN112507603B (en) | DNN algorithm-based electric power system robust optimization extreme scene identification method | |
Firoozjaee et al. | A two‐stage simulation‐based framework for optimal resilient generation and transmission expansion planning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |