CN112258039A - Intelligent scheduling method for defective materials of power system based on reinforcement learning - Google Patents

Info

Publication number
CN112258039A
CN112258039A CN202011144804.0A CN202011144804A CN112258039A CN 112258039 A CN112258039 A CN 112258039A CN 202011144804 A CN202011144804 A CN 202011144804A CN 112258039 A CN112258039 A CN 112258039A
Authority
CN
China
Prior art keywords
materials
reinforcement learning
power system
scheduling
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011144804.0A
Other languages
Chinese (zh)
Other versions
CN112258039B (en)
Inventor
俞虹
唐诚旋
蒋群群
陈珏伊
张秀
程文美
代洲
徐一蝶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd
Priority to CN202011144804.0A
Publication of CN112258039A
Application granted
Publication of CN112258039B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06312 Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a reinforcement-learning-based intelligent scheduling method for defective materials of a power system. The method comprises: defining the state, decision, transition equation, reward function, and the demands and objectives of the dynamic material-warehousing scheduling problem in reinforcement learning; solving the dynamic material-warehousing scheduling problem using a Markov decision process; formulating the Bellman equation for power grid defective materials and selecting a solution strategy; and converting the Bellman equation into a data-driven online update form and determining the scheduling action based on an ε-greedy strategy. The invention addresses the joint control and scheduling problem for emergency materials of a power system on the basis of a Markov stochastic process and reinforcement learning. It is an end-to-end algorithm that does not forecast demand but makes inventory control and scheduling decisions directly. The method has been verified on a real data set and shows good convergence and gain, which demonstrates its usability and practical value.

Description

Intelligent scheduling method for defective materials of power system based on reinforcement learning
Technical Field
The invention relates to the technical field of power grid and artificial intelligence scheduling, in particular to an intelligent scheduling method for defective materials of a power system based on reinforcement learning.
Background
Statistical optimization method: the distribution of the various emergency demands is modeled according to statistical rules, and the warehousing allocation that is optimal on statistical average is calculated through centralized mathematical modeling.
Data prediction method: based on data analysis and mining in each region, a sequence-to-sequence model is constructed for the different demands of each region using artificial intelligence and machine learning methods, so that the time series can be predicted; the warehousing system and the scheduling are then laid out and optimized centrally on the basis of the prediction.
The statistical optimization method requires complete statistics on all demand distributions in a region, and the optimal allocation must be recalculated every time a state transition or an emergency occurs, so it consumes substantial computing resources, responds slowly, and has certain limitations. In the data prediction method, traditional feature selection is usually based on feature ranking: the importance and relevance of each feature are calculated and the top k features are taken as the input of the demand prediction. The main drawback of this approach is that the features with the highest importance and relevance do not necessarily represent the global information of the system well, so the prediction system is not given the richest possible information. Moreover, the predicted result is not the final result: a second calculation based on the prediction is needed to obtain the scheduling and control scheme, and this multi-step framework accumulates errors, which causes deviation in the final result.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, the invention provides a reinforcement-learning-based intelligent scheduling method for defective materials of a power system, which addresses the joint control and scheduling of emergency materials of the power system.
To solve the above technical problem, the invention provides the following technical solution: defining the state, decision, transition equation, reward function, and the demands and objectives of the dynamic material-warehousing scheduling problem in reinforcement learning; solving the dynamic material-warehousing scheduling problem using a Markov decision process; formulating the Bellman equation for power grid defective materials and selecting a solution strategy; and converting the Bellman equation into a data-driven online update form and determining the scheduling action based on an ε-greedy strategy.
In a preferred embodiment of the method, the state at the current time is defined as the materials stored in each warehouse, S_t = Z ∈ R^{n×m}, where Z_{i,j} denotes the quantity of material j in warehouse i at the current time.
In a preferred embodiment of the method, according to the current state S_t ∈ R^{n×m} and the demand Q ∈ R^{n×m}, the warehousing system determines a scheduling scheme X and a purchasing scheme B at that time, where X_{i,j} and B_{i,j} denote, respectively, the outbound quantity and the purchase quantity of material j in warehouse i at the current time.
In a preferred embodiment of the method, after the warehousing system decides on a scheduling scheme and a purchasing scheme, the warehousing state undergoes a random state transition at the next time step, and the transition equation is expressed as:
S_{t+1} = Z - X + B
Because stored quantities cannot physically be negative and storage space is always limited, a valid decision (X, B) must satisfy the following inequality:
Figure BDA0002739373000000021
In a preferred embodiment of the method, the main objective of the warehousing system is to meet the emergency material demand within and between regions, and the reward function subtracts the cost of purchasing materials from the lost revenue at the current time, as follows:
Figure BDA0002739373000000022
where the symbol (x)^- is defined as:
Figure BDA0002739373000000023
In a preferred embodiment of the method, solving the MDP problem gives:
Figure BDA0002739373000000031
where γ ∈ [0,1) is the discount factor.
In a preferred embodiment of the method, the Bellman equation is converted into the following data-driven online update form:
V(S_t) ← (1 - α_t) V(S_t) + α_t [r_t + γ V(S_{t+1})]
where α_t is the learning rate at time t.
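As a worked illustration of the online update above, the rule can be rearranged so that the current estimate moves by a fraction α_t of the gap between the one-step target r_t + γV(S_{t+1}) and the estimate itself. The numerical values in the last two lines are hypothetical and are chosen only to show the arithmetic; they do not come from the patent.

```latex
\begin{aligned}
V(S_t) &\leftarrow (1-\alpha_t)\,V(S_t) + \alpha_t\,\bigl[\,r_t + \gamma V(S_{t+1})\,\bigr] \\
       &= V(S_t) + \alpha_t\,\bigl[\,r_t + \gamma V(S_{t+1}) - V(S_t)\,\bigr] \\[4pt]
\text{e.g. } & V(S_t)=10,\quad r_t=2,\quad \gamma=0.9,\quad V(S_{t+1})=12,\quad \alpha_t=0.1: \\
V(S_t) &\leftarrow 0.9\cdot 10 + 0.1\,(2 + 0.9\cdot 12) = 9 + 1.28 = 10.28
\end{aligned}
```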
In a preferred embodiment of the method, the action is determined using the ε-greedy strategy: with probability 1-ε, the warehouse takes the currently best action, i.e. the action that maximizes V(S_t).
In a preferred embodiment of the method, with probability ε, an action is selected at random as follows:
Figure BDA0002739373000000032
where the random action serves as exploration: exploring produces a variety of good and bad data from which knowledge is learned, thereby improving the current policy.
The beneficial effects of the invention are as follows: the invention addresses the joint control and scheduling problem for emergency materials of a power system on the basis of a Markov stochastic process and reinforcement learning; it is an end-to-end algorithm that does not forecast demand but makes inventory control and scheduling decisions directly; the proposed algorithm is an "online" algorithm, i.e. inventory control and scheduling decisions rely only on observations of past events; the proposed algorithm is also a "model-free" algorithm, independent of any assumed stochastic model of uncertain events; in addition, the method has been verified on a real data set and shows good convergence and gain, which demonstrates its usability and practical value.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
fig. 1 is a schematic flowchart illustrating a method for intelligently scheduling defective materials of an electrical power system based on reinforcement learning according to a first embodiment of the present invention;
fig. 2 is a schematic diagram illustrating defect material scheduling of a reinforcement learning-based intelligent scheduling method for defect materials of an electrical power system according to a first embodiment of the present invention;
fig. 3 is a schematic diagram illustrating reinforcement learning scheduling of defective materials of an electric power system according to a reinforcement learning-based intelligent scheduling method of defective materials of an electric power system according to a first embodiment of the present invention;
fig. 4 is a schematic diagram illustrating a comparison of earnings of a warehousing system of the intelligent scheduling method for defective goods and materials of an electric power system based on reinforcement learning according to a second embodiment of the present invention under different warehousing capacities;
fig. 5 is a schematic diagram illustrating a comparison of profits of a warehousing system of the power system defect material intelligent scheduling method based on reinforcement learning according to the second embodiment of the present invention under different warehousing capacities (C) and discount factors (γ).
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not enlarged partially in general scale for convenience of illustration, and the drawings are only exemplary and should not be construed as limiting the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Meanwhile, in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and operate, and thus, cannot be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
Referring to fig. 1, 2 and 3, a first embodiment of the present invention provides a reinforcement-learning-based intelligent scheduling method for defective materials of a power system, including:
S1: Define the state, decision, transition equation, reward function, and the demands and objectives of the dynamic material-warehousing scheduling problem in reinforcement learning. It should be noted that:
the state, decision, transition equation, reward function, and the demands and objectives of the dynamic material-warehousing scheduling problem are defined within the reinforcement learning algorithm specifically for the scheduling of power system defective materials;
the state is the warehousing state and the material defect state at time t, the decision is the scheduling scheme and purchasing scheme adopted at that time, and the transition equation describes how the state changes from one time step to the next;
the state at the current time is defined as the materials stored in each warehouse, S_t = Z ∈ R^{n×m},
where Z_{i,j} denotes the quantity of material j in warehouse i at the current time;
according to the current state S_t ∈ R^{n×m} and the demand Q ∈ R^{n×m}, the warehousing system determines a scheduling scheme X and a purchasing scheme B at that time, where X_{i,j} and B_{i,j} denote, respectively, the outbound quantity and the purchase quantity of material j in warehouse i at the current time.
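The quantities defined in S1 can be pictured as n×m arrays. The following is a minimal sketch, assuming NumPy and hypothetical sizes and values; the simple rule used to fill X and B (ship what is demanded up to the stock on hand, purchase the shortfall) is only a placeholder for illustration and is not the learned policy described in the patent.

```python
import numpy as np

# Hypothetical problem size: n = 3 warehouses, m = 2 material types.
# State S_t = Z: Z[i, j] is the stock of material j in warehouse i.
Z = np.array([[5, 2],
              [0, 4],
              [3, 1]])

# Demand Q: Q[i, j] is the quantity of material j requested at warehouse i.
Q = np.array([[2, 3],
              [1, 0],
              [0, 1]])

# Placeholder decision rule (an assumption, not the patent's policy):
# ship the demanded quantity capped by stock, and purchase the unmet part.
X = np.minimum(Z, Q)   # outbound quantities X[i, j]
B = Q - X              # purchase quantities B[i, j]

print("X =\n", X)
print("B =\n", B)
```

All four arrays share the shape (n, m), which is what lets the transition equation in the next step be written as a single element-wise expression.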
S2: Solve the dynamic material-warehousing scheduling problem using a Markov decision process. It should be noted that in this step:
after the warehousing system decides on the scheduling and purchasing schemes, the warehousing state undergoes a random state transition at the next time step, and the transition equation is expressed as:
S_{t+1} = Z - X + B
Because stored quantities cannot physically be negative and storage space is always limited, a valid decision (X, B) must satisfy the following inequality:
Figure BDA0002739373000000061
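The transition and the validity check on (X, B) can be sketched as follows. The exact inequality is given only as an image in the source, so the constraints below are an assumption derived from the surrounding text: quantities are non-negative, the next state is non-negative, and the total stock in each warehouse stays within a hypothetical per-warehouse `capacity` vector.

```python
import numpy as np

def transition(Z, X, B):
    """Next warehousing state S_{t+1} = Z - X + B (element-wise)."""
    return Z - X + B

def is_valid_decision(Z, X, B, capacity):
    """Check a decision (X, B) under ASSUMED constraints.

    capacity[i] is a hypothetical upper bound on the total stock of
    warehouse i; the patent's own inequality is not reproduced here.
    """
    S_next = transition(Z, X, B)
    return bool(
        np.all(X >= 0)
        and np.all(B >= 0)
        and np.all(S_next >= 0)                      # stock cannot be negative
        and np.all(S_next.sum(axis=1) <= capacity)   # storage space is limited
    )

Z = np.array([[5, 2], [0, 4]])
X = np.array([[2, 1], [0, 3]])
B = np.array([[0, 1], [2, 0]])
capacity = np.array([10, 10])

print(transition(Z, X, B))
print(is_valid_decision(Z, X, B, capacity))
```

A decision that ships more than is in stock, or that pushes a warehouse past its capacity, is rejected by this check before the transition is applied.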
s3: listing Bellman equation aiming at power grid defect materials and selecting a solving strategy. Among them, it is also to be noted that:
the main objective of the warehousing system is to meet the problem of emergency material demand in regions and between regions, and then the reward function is to subtract the cost of purchasing materials from the lost income at the current moment as follows:
Figure BDA0002739373000000062
wherein, the symbol (x)-Comprises the following steps:
Figure BDA0002739373000000063
solving the MDP problem, then:
Figure BDA0002739373000000064
wherein γ ∈ [0,1) is an attenuation Factor (Discount Factor).
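To illustrate the role of the discount factor γ ∈ [0,1), the sketch below computes a discounted sum of rewards, r_0 + γ r_1 + γ² r_2 + ..., which is the standard quantity a Bellman equation with discounting is built around. This is an assumption about the objective, since the patent's own equation is available only as an image, and the reward values are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for a finite reward list."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    total = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# Hypothetical per-step rewards of a warehousing system.
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

A smaller γ makes the system weigh immediate rewards more heavily, while a γ close to 1 makes later rewards count almost as much as current ones.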
S4: Convert the Bellman equation into a data-driven online update form, and determine the scheduling action based on an ε-greedy strategy. It should be further noted that in this step:
the Bellman equation is converted into the following data-driven online update form:
V(S_t) ← (1 - α_t) V(S_t) + α_t [r_t + γ V(S_{t+1})]
where α_t is the learning rate at time t;
the action is determined using the ε-greedy strategy: with probability 1-ε, the warehouse takes the currently best action, i.e. the action that maximizes V(S_t);
with probability ε, an action is selected at random as follows:
Figure BDA0002739373000000071
where the random action serves as exploration: exploring produces a variety of good and bad data from which knowledge is learned, thereby improving the current policy.
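The ε-greedy selection described in S4 can be sketched as below. The sketch assumes a finite list of candidate actions and a hypothetical `value_of` function that scores each candidate with the current value estimate; how candidates are enumerated and scored for the (X, B) decision matrices is not specified in the source.

```python
import random

def epsilon_greedy(actions, value_of, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one.

    actions  : non-empty list of candidate actions (hypothetical encoding)
    value_of : callable mapping an action to its estimated value
    rng      : object providing random() and choice(), e.g. random.Random(seed)
    """
    if rng.random() < epsilon:
        return rng.choice(actions)        # explore
    return max(actions, key=value_of)     # exploit the current estimate

# Hypothetical candidate actions with estimated values.
estimates = {"hold": 0.1, "ship": 0.7, "buy": 0.3}
candidates = list(estimates)

print(epsilon_greedy(candidates, estimates.get, epsilon=0.1,
                     rng=random.Random(0)))
```

With ε = 0 the rule is purely greedy, and with ε = 1 it is purely random; intermediate values trade off exploiting the current estimates against collecting new data.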
Preferably, this embodiment further formulates the Bellman equation for power grid defective materials and selects a solution strategy: the Bellman equation is the mathematical form of the scheduling problem, and the strategy is selected so that the optimal scheduling result is obtained more quickly. Converting the Bellman equation into a data-driven online update form means that the required data, such as material demand data and online warehousing data, can be accessed in real time, so that the update adapts to the latest state and is better suited to the dynamic warehousing and scheduling of power grid defective materials addressed by the invention; the scheduling action is then determined based on the ε-greedy strategy.
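The data-driven online update V(S_t) ← (1 - α_t) V(S_t) + α_t [r_t + γ V(S_{t+1})] can be sketched as a tabular update applied each time a new observation arrives. Storing V in a dictionary keyed by the flattened state, the `state_key` helper, and the numbers used are all assumptions made for illustration; the source does not say how V is represented.

```python
from collections import defaultdict

import numpy as np

def state_key(S):
    """Turn a state matrix into a hashable key (hypothetical encoding)."""
    return tuple(np.asarray(S).ravel().tolist())

def td_update(V, S_t, r_t, S_next, alpha, gamma):
    """Apply one online update to the value table V and return the new value."""
    k, k_next = state_key(S_t), state_key(S_next)
    V[k] = (1 - alpha) * V[k] + alpha * (r_t + gamma * V[k_next])
    return V[k]

V = defaultdict(float)          # unseen states start at value 0.0
S0 = np.array([[1, 0]])         # hypothetical current state
S1 = np.array([[0, 0]])         # hypothetical next state

first = td_update(V, S0, r_t=2.0, S_next=S1, alpha=0.5, gamma=0.9)
second = td_update(V, S0, r_t=2.0, S_next=S1, alpha=0.5, gamma=0.9)
print(first, second)
```

Each call uses only the observed transition (S_t, r_t, S_{t+1}), so no demand forecast or explicit transition model is needed, which matches the "online" and "model-free" character claimed for the method.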
Example 2
Referring to fig. 4 and 5, a second embodiment of the present invention, which differs from the first embodiment, provides a verification of the reinforcement-learning-based intelligent scheduling method for defective materials of a power system, including:
To better verify and explain the technical effect of the method, this embodiment selects a conventional intelligent scheduling method based on a greedy algorithm for a comparative test against the method of the invention, and compares the test results to verify the actual effect of the method.
The convergence and gain of the conventional greedy-algorithm-based intelligent scheduling method are low. To verify that the method of the invention achieves higher gain and better convergence than the conventional method, the conventional greedy-algorithm-based intelligent scheduling method and the method of the invention are measured and compared under the same conditions.
Test conditions: (1) the monthly defective-material (emergency material) demand was collected for 15 areas within or around the jurisdiction of Guiyang City, Guizhou Province: Baiyun, the north city area, Huaxi, Huishui, Jinyang, Kaiyang, Longli, Nanming, Qingzhen, Shuanglong, Wudang, Xifeng, Xiaohe, Xiuwen and Yunyan;
(2) the state transition equation is adapted to the specific problem so as to suit the application scenario described in this embodiment;
(3) the automatic test equipment is started, the simulation is run in MATLAB, and the resulting curves are output.
Referring to fig. 4, the solid line is the curve produced by the method of the invention and the dotted line is the curve produced by the conventional method. As the warehouse storage capacity increases, the average profit of both methods increases, but the solid line rises more markedly than the dotted line and always stays above it, which shows that the method of the invention achieves a higher gain than the conventional method.
Referring to fig. 5, the profit of the method of the invention keeps an increasing trend as the warehouse storage capacity increases under the different discount factors γ; the profit is lowest when γ is 0.95 and highest when γ is 0.8. On this basis, the superiority of the method of the invention is verified.
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (9)

1. A reinforcement-learning-based intelligent scheduling method for defective materials of a power system, characterized by comprising:
defining the state, decision, transition equation, reward function, and the demands and objectives of the dynamic material-warehousing scheduling problem in reinforcement learning;
solving the dynamic material-warehousing scheduling problem using a Markov decision process;
formulating the Bellman equation for power grid defective materials and selecting a solution strategy;
and converting the Bellman equation into a data-driven online update form, and determining the scheduling action based on an ε-greedy strategy.
2. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 1, characterized by comprising:
defining the state at the current time as the materials stored in each warehouse, S_t = Z ∈ R^{n×m},
where Z_{i,j} denotes the quantity of material j in warehouse i at the current time.
3. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 1 or 2, characterized by comprising:
according to the current state S_t ∈ R^{n×m} and the demand Q ∈ R^{n×m}, the warehousing system determines a scheduling scheme X and a purchasing scheme B at that time, where X_{i,j} and B_{i,j} denote, respectively, the outbound quantity and the purchase quantity of material j in warehouse i at the current time.
4. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 5, characterized by comprising:
after the warehousing system decides on the scheduling and purchasing schemes, the warehousing state undergoes a random state transition at the next time step, and the transition equation is expressed as:
S_{t+1} = Z - X + B
Because stored quantities cannot physically be negative and storage space is always limited, a valid decision (X, B) must satisfy the following inequality:
Figure FDA0002739372990000011
5. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 4, characterized by comprising:
the main objective of the warehousing system is to meet the emergency material demand within and between regions, and the reward function subtracts the cost of purchasing materials from the lost revenue at the current time, as follows:
Figure FDA0002739372990000021
where the symbol (x)^- is defined as:
Figure FDA0002739372990000022
6. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 5, characterized by comprising:
Figure FDA0002739372990000023
where γ ∈ [0,1) is the discount factor.
7. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 6, characterized by comprising:
converting the Bellman equation into the following data-driven online update form:
V(S_t) ← (1 - α_t) V(S_t) + α_t [r_t + γ V(S_{t+1})]
where α_t is the learning rate at time t.
8. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 7, characterized in that: the action is determined using the ε-greedy strategy, and with probability 1-ε the warehouse takes the currently best action, i.e. the action that maximizes V(S_t).
9. The reinforcement-learning-based intelligent scheduling method for defective materials of a power system as claimed in claim 8, characterized by further comprising:
with probability ε, an action is selected at random as follows:
Figure FDA0002739372990000031
where the random action serves as exploration: exploring produces a variety of good and bad data from which knowledge is learned, thereby improving the current policy.
CN202011144804.0A 2020-10-23 2020-10-23 Intelligent scheduling method for defective materials of power system based on reinforcement learning Active CN112258039B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011144804.0A CN112258039B (en) 2020-10-23 2020-10-23 Intelligent scheduling method for defective materials of power system based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011144804.0A CN112258039B (en) 2020-10-23 2020-10-23 Intelligent scheduling method for defective materials of power system based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN112258039A (en) 2021-01-22
CN112258039B (en) 2022-07-22

Family

ID=74264872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011144804.0A Active CN112258039B (en) 2020-10-23 2020-10-23 Intelligent scheduling method for defective materials of power system based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN112258039B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120072259A1 (en) * 2010-09-22 2012-03-22 International Business Machines Corporation Determining optimal action in consideration of risk
CN110365057A (en) * 2019-08-14 2019-10-22 南方电网科学研究院有限责任公司 Scheduling optimization method for distributed energy participating in distribution network peak regulation based on reinforcement learning
CN110517002A (en) * 2019-08-29 2019-11-29 烟台大学 Production control method based on reinforcement learning
CN111258734A (en) * 2020-01-16 2020-06-09 中国人民解放军国防科技大学 Deep learning task scheduling method based on reinforcement learning
CN111382359A (en) * 2020-03-09 2020-07-07 北京京东振世信息技术有限公司 Service strategy recommendation method and device based on reinforcement learning and electronic equipment
CN111639815A (en) * 2020-06-02 2020-09-08 贵州电网有限责任公司 Method and system for predicting power grid defect materials through multi-model fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
S.KAMAL CHAHARSOOGHI ET AL.: "A reinforcement learning model for supply chain ordering management: An application to the beer game", 《DECISION SUPPORT SYSTEMS》 *
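张沛 et al.: "Satellite resource allocation algorithm improved by deep reinforcement learning and multi-objective optimization", 《通信学报》 (Journal on Communications) *
汪黎明: "Research on material scheduling methods for zero-inventory management in manufacturing enterprises", 《价值工程》 (Value Engineering) *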

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113657757A (en) * 2021-08-17 2021-11-16 厦门汇银通达数字科技有限公司 Chemical storage scheduling optimization method, medium and equipment based on machine learning
CN117665572A (en) * 2024-01-31 2024-03-08 深圳市双合电气股份有限公司 Synchronous motor rotor conducting bar state evaluation method and system
CN117665572B (en) * 2024-01-31 2024-04-12 深圳市双合电气股份有限公司 Synchronous motor rotor conducting bar state evaluation method and system

Also Published As

Publication number Publication date
CN112258039B (en) 2022-07-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant