CN111401769A - Intelligent power distribution network fault first-aid repair method and device based on deep reinforcement learning
- Publication number
- CN111401769A (application CN202010218227.9A)
- Authority
- CN
- China
- Prior art keywords
- repair
- distribution network
- reinforcement learning
- deep reinforcement
- fault
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
A method and a device for intelligent power distribution network fault emergency repair based on deep reinforcement learning are disclosed. The method comprises the following steps: 1) construct a deep reinforcement learning model, and combine the distance between each fault point and the emergency repair center with the emergency repair task amount into a system state serving as the model's input data; 2) train the neural network on the input data to obtain the system action, namely a distribution network emergency repair resource allocation strategy; 3) substitute the system state and the system action into a reward function to obtain the reward value of the system action, and update the neural network parameters according to the magnitude of the reward value; 4) repeat the above steps until the reward value stabilizes, thereby completing the training process, and allocate distribution network fault emergency repair resources according to the final system action. The invention can greatly reduce fault repair time and improve users' satisfaction with the power supply.
Description
Technical Field
The invention relates to the technical field of power grids, in particular to a power distribution network fault intelligent first-aid repair method and device based on deep reinforcement learning.
Background
In recent years, network technology has developed rapidly, and nearly every industry has modernized to some degree as a result. Existing power supply systems therefore face new challenges in modern power delivery, and power emergency repair systems have emerged to handle sudden problems. In such a system, when a sudden power system fault occurs, the power supply command center receives the alarm and notifies the resource distribution center to schedule resources. This greatly reduces the time from the occurrence of an accident to the start of its handling, optimizes the accident-handling process, improves users' satisfaction with the power supply system, and provides solid power-supply assurance for modernization.
With the continuous expansion of China's power grid and the rapid growth in the number of power consumers, higher demands are placed on the service level of power supply enterprises. Existing power emergency repair scheduling systems, owing to technical and strategic shortcomings, cannot schedule repair resources quickly and effectively, which increases economic losses.
Disclosure of Invention
To address these problems, the invention provides an intelligent power distribution network fault emergency repair method and device based on deep reinforcement learning that simplify resource allocation and reduce emergency repair time.
The technical scheme of the invention comprises the following steps:
Step S1: construct a power distribution network fault emergency repair deep reinforcement learning model, formulate the power distribution network fault emergency repair tasks, and take the system state as the input of the deep neural network;
Step S2: train the neural network on the input system state to obtain the system action;
Step S3: substitute the system state and the system action into the reward function to obtain the reward value of the system action, and update the neural network parameters according to the magnitude of the reward value;
Step S4: repeat steps S1-S3 until the reward value stabilizes, thereby completing the training process, and allocate distribution network fault emergency repair resources according to the final system action.
The system state comprises the distance between a fault point and an emergency repair center and the emergency repair task amount.
In the step S1, the deep reinforcement learning model is built by a neural network, and parameters of the neural network in the model include weight w, bias b, learning rate L and the number of hidden layers of the neural network.
The distribution network fault emergency repair task in step S1 is defined as R_u(d_u, n_u), where d_u denotes the distance of the fault point from the emergency repair center and n_u denotes the emergency repair task amount.
In step S1, the distances between the fault points and the emergency repair center and the emergency repair task amounts are combined into the system state, expressed as S = {d_1, d_2, …, d_U, n_1, n_2, …, n_U}.
The system action in step S2 is defined as a = {f_1, f_2, …, f_U}, where f_u is the amount of resources allocated by the emergency repair center to fault point u, including emergency repair personnel, vehicles, and tools.
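As a concrete illustration of these state and action definitions, the two vectors can be assembled as follows (a minimal sketch; the task list, distances, and allocation values are hypothetical, not taken from the patent):

```python
# System state S = {d_1..d_U, n_1..n_U} and action a = {f_1..f_U} as flat
# lists. All concrete numbers below are illustrative assumptions.

# Each emergency repair task R_u(d_u, n_u): distance to the repair center
# and repair task amount.
tasks = [(4.0, 2), (7.5, 3), (1.2, 1)]  # hypothetical (d_u, n_u) pairs

# System state: all distances first, then all task amounts.
state = [d for d, _ in tasks] + [n for _, n in tasks]

# System action: resources f_u allocated to fault point u (personnel,
# vehicles, and tools aggregated into one scalar per fault here).
action = [3, 5, 1]  # hypothetical allocation

assert len(state) == 2 * len(tasks) and len(action) == len(tasks)
print(state)  # [4.0, 7.5, 1.2, 2, 3, 1]
```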
The reward function in step S3 is defined as r = r_max - T_all, where r_max represents the maximum value of the emergency repair time and T_all represents the time actually taken by the emergency repair.
The emergency repair time T_all consists of two parts, journey time and repair time. The journey time is the time for the vehicle to reach the fault point and can be expressed as d_u / v_u, where v_u indicates the travel speed of the vehicle assigned to fault u; the repair time is the time it takes the repair crew to resolve the fault and depends on the task amount n_u and the allocated resources f_u.
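The reward computation can be sketched numerically. The journey-time term d_u / v_u follows directly from the definitions above; modelling the repair-time term as n_u / f_u (task amount divided by allocated resources) is an assumption introduced here for illustration, as the patent does not give an explicit form:

```python
# Reward r = r_max - T_all, with T_all summed over all fault points.
# The repair-time model n_u / f_u is an illustrative assumption.

def total_repair_time(tasks, speeds, action):
    """T_all: journey time plus repair time over all fault points."""
    t_all = 0.0
    for (d_u, n_u), v_u, f_u in zip(tasks, speeds, action):
        t_all += d_u / v_u  # journey time of the vehicle assigned to u
        t_all += n_u / f_u  # assumed repair time under f_u resources
    return t_all

def reward(tasks, speeds, action, r_max=100.0):
    return r_max - total_repair_time(tasks, speeds, action)

tasks = [(4.0, 2), (7.5, 3)]  # hypothetical (d_u, n_u)
speeds = [40.0, 50.0]         # hypothetical vehicle speeds v_u
action = [2, 3]               # hypothetical resource allocations f_u
print(round(reward(tasks, speeds, action), 2))  # 100 - 2.25 = 97.75
```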
An intelligent power distribution network fault emergency repair device based on deep reinforcement learning comprises:
the data input module is used for inputting a power distribution network fault first-aid repair task;
the system state module is used for establishing a system state formed by combining the distance between the fault point and the emergency repair center and the emergency repair task amount and used as the input of the deep reinforcement learning model;
the system action module is used for carrying out neural network training on the input system state to obtain system action;
the reward module is used for bringing the system state and the system action into a reward function to obtain a reward value of the system action, and updating the neural network parameters according to the magnitude of the reward value;
and the distribution module is used for allocating distribution network fault emergency repair resources according to the final system action.
According to this technical scheme, a deep reinforcement learning model that achieves the training objective can be obtained; by the characteristics of deep reinforcement learning algorithms, the model has strong perception and decision-making capability. Through interaction with the actual emergency repair environment, it provides a solution for complex emergency repair tasks.
The method can find the optimal distribution network fault emergency repair resource allocation method based on the deep reinforcement learning algorithm under the condition of a plurality of fault emergency repair tasks, so as to reduce the fault emergency repair time.
The invention greatly reduces the time for rush repair of faults and improves the power utilization satisfaction of users; meanwhile, the economic loss is reduced.
Drawings
FIG. 1 is a flowchart of a method for intelligently repairing a power distribution network fault based on deep reinforcement learning according to an embodiment of the present invention.
fig. 2 is a schematic structural diagram of a power distribution network fault intelligent first-aid repair device based on deep reinforcement learning according to an embodiment of the present invention.
Detailed Description
The invention comprises the following steps:
Step S1: construct a power distribution network fault emergency repair deep reinforcement learning model, formulate the power distribution network fault emergency repair tasks, and take the system state as the input of the deep reinforcement learning model;
Step S2: train the neural network on the input system state to obtain the system action;
Step S3: substitute the system state and the system action into the reward function to obtain the reward value of the system action, and update the neural network parameters according to the magnitude of the reward value;
Step S4: repeat steps S1-S3 until the reward value stabilizes, thereby completing the training process, and allocate distribution network fault emergency repair resources according to the final system action.
The deep reinforcement learning model is an end-to-end perception and decision system that obtains an optimal strategy through interaction with the environment. At each time step, the model interacts with the environment to obtain a high-dimensional observation and perceives it with reinforcement learning to obtain a specific state feature representation. A value function of each action is evaluated according to the expected emergency repair effect, and the current state is mapped to a corresponding action through a policy. The environment reacts to this action and produces the next observation. Repeating this cycle eventually yields the optimal strategy for achieving the objective.
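The interaction cycle just described, combined with step S4's "repeat until the reward value tends to be stable" stopping rule, can be sketched as below. The environment and the update rule are toy stand-ins (a hill climb on a quadratic reward), not the patent's actual neural network model:

```python
# Observe -> act -> reward -> update loop, stopped once the reward value
# tends to be stable. Environment and update rule are illustrative only.

def env_reward(action):
    # Hypothetical environment: reward is maximal at action = 5.
    return 100.0 - (action - 5.0) ** 2

action, step = 0.0, 0.05
history = []
for episode in range(100000):
    r = env_reward(action)
    history.append(r)
    # Parameter update: move the action in the direction that raises reward.
    action += step if env_reward(action + step) > r else -step
    # Training ends when the reward is near-constant over a recent window.
    if len(history) >= 20 and max(history[-20:]) - min(history[-20:]) < 0.5:
        break

print(round(action, 2))  # settles near the optimal action, 5.0
```

The stability window (20 episodes, tolerance 0.5) is an arbitrary choice standing in for "the reward value tends to be stable".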
In the step S3, the parameter value of the neural network directly affects the distribution strategy of the first-aid repair resources of the power distribution network. Therefore, in order to better adapt to the actual dynamic scene and obtain the optimal resource allocation strategy, the neural network parameters are updated according to the size of the reward value.
In step S4, the reward value is r, which can sufficiently reflect the quality of the resource allocation policy, and the larger the value of r, the better the allocation policy. Meanwhile, the change of the r value reflects the result of the neural network training, and when the r value tends to be stable (namely constant), the neural network training is finished, and the optimal distribution strategy of the emergency repair resources of the power distribution network is obtained.
The system state comprises the distance between a fault point and an emergency repair center and the emergency repair task amount.
In the step S1, the deep reinforcement learning model is built by a neural network, and parameters of the neural network in the model include weight w, bias b, learning rate L and the number of hidden layers of the neural network.
In step S1, the deep reinforcement learning model is built from a neural network. The model interacts with the power distribution network emergency repair environment, through which the neural network training is completed; after training, a corresponding resource allocation strategy is produced according to the actual emergency repair environment.
The number of hidden layers, the per-layer weights w and biases b, the initial value of the learning rate L, and the choice of activation function all affect the training result of the whole neural network and, in turn, the allocation of distribution network fault emergency repair resources.
The distribution network fault emergency repair task in step S1 is defined as R_u(d_u, n_u), where d_u denotes the distance of the fault point from the emergency repair center and n_u denotes the emergency repair task amount.
In step S1, the distances between the fault points and the emergency repair center and the emergency repair task amounts are combined into the system state, expressed as S = {d_1, d_2, …, d_U, n_1, n_2, …, n_U}.
The system action in step S2 is defined as a = {f_1, f_2, …, f_U}, where f_u is the amount of resources allocated by the emergency repair center to fault point u, including emergency repair personnel, vehicles, and tools.
The reward function in step S3 is defined as r = r_max - T_all, where r_max represents the maximum value of the emergency repair time and T_all represents the time actually taken by the emergency repair.
The emergency repair time T_all consists of two parts, journey time and repair time. The journey time is the time for the vehicle to reach the fault point and can be expressed as d_u / v_u, where v_u indicates the travel speed of the vehicle assigned to fault u; the repair time is the time it takes the repair crew to resolve the fault and depends on the task amount n_u and the allocated resources f_u.
Fig. 1 is a flowchart of a power distribution network fault intelligent first-aid repair method based on deep reinforcement learning, as shown in fig. 1, the method includes the following steps:
step 101: the power distribution network emergency repair environment is an actual emergency repair scene. The distribution network fault first-aid repair task is defined as follows: ru(du,nu) (ii) a Wherein d isuIndicating the distance of the repair point from the repair center, nuRepresenting the amount of a first-aid repair task;
step 102: the distance between the fault point and the emergency repair center and the emergency repair task amount are combined into a system state, which can be expressed as S ═ d1,d2,…,dU,n1,n2,…,nU};
Step 103: the reward value is derived from the time spent on emergency repair. The total time T_all consists of two parts, journey time and repair time. The journey time is the time for the vehicle to reach the fault point and can be expressed as d_u / v_u, where v_u indicates the travel speed of the vehicle assigned to fault u; the repair time is the time it takes the repair crew to resolve the fault and depends on the task amount n_u and the allocated resources f_u.
The reward function is defined as r = r_max - T_all, where r_max represents the maximum value of the emergency repair time;
step 104: the system action is output as a neural network, namely a power distribution network first-aid repair resource distribution strategy; system action, defined as a ═ f1,f2,…,fUIn which fuThe resource amount distributed to a fault point u by an emergency repair center specifically comprises emergency repair personnel, emergency repair vehicles, emergency repair tools and the like;
Step 105: establish the deep reinforcement learning model and initialize the deep neural network parameters, such as the weights w, biases b, learning rate L, activation function, and number of hidden layers.
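Step 105's parameter initialization can be sketched as follows. The layer sizes, the scaled-Gaussian initialization scheme, and the ReLU activation are illustrative assumptions, not values specified by the patent:

```python
import random

# Initialise the deep network's parameters: weights w, biases b,
# learning rate L, activation function, and number of hidden layers.

random.seed(42)

U = 3                             # number of fault points (hypothetical)
layer_sizes = [2 * U, 16, 16, U]  # input S = {d, n}, two hidden layers,
                                  # output a = {f_1..f_U}

weights, biases = [], []
for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
    # One weight matrix and one bias vector per layer.
    weights.append([[random.gauss(0.0, fan_in ** -0.5) for _ in range(fan_out)]
                    for _ in range(fan_in)])
    biases.append([0.0] * fan_out)

learning_rate = 0.001  # L

def relu(x):           # assumed activation function
    return x if x > 0.0 else 0.0

print(len(weights), len(weights[0]), len(weights[0][0]))  # 3 6 16
```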
Reinforcement learning obtains the optimal decision by learning directly from interaction with the environment. Applied to a power distribution network emergency repair system, it can adaptively learn the optimal strategy according to the repair conditions and improve the repair scheme. Fig. 2 is a schematic structural diagram of a power distribution network fault intelligent emergency repair device based on deep reinforcement learning provided in an embodiment of the present invention; as shown in fig. 2, the device includes:
the data input module is used for inputting a power distribution network fault first-aid repair task;
the system state module is used for establishing a system state formed by combining the distance between the fault point and the emergency repair center and the emergency repair task amount and used as the input of the deep reinforcement learning model;
the system action module is used for carrying out neural network training on the input system state to obtain system action;
the reward module is used for bringing the system state and the system action into a reward function to obtain a reward value of the system action, and updating the neural network parameters according to the magnitude of the reward value;
and the distribution module is used for allocating distribution network fault emergency repair resources according to the final system action.
According to the invention, a distribution network fault emergency repair task passes sequentially through the system state module, the system action module, and the reward module, and finally through the distribution module, realizing reliable resource allocation and reducing emergency repair time.
The disclosure of the present application also includes the following points:
(1) the drawings of the embodiments disclosed herein only relate to the structures related to the embodiments disclosed herein, and other structures can refer to general designs;
(2) in case of conflict, the embodiments and features of the embodiments disclosed in this application can be combined with each other to arrive at new embodiments;
the above embodiments are only embodiments disclosed in the present disclosure, but the scope of the disclosure is not limited thereto, and the scope of the disclosure should be determined by the scope of the claims.
Claims (9)
1. A power distribution network fault intelligent first-aid repair method based on deep reinforcement learning is characterized by comprising the following steps:
s1, constructing a power distribution network fault first-aid repair deep reinforcement learning model, formulating a power distribution network fault first-aid repair task, and taking a system state as input of the deep reinforcement learning model;
step S2, training the neural network according to the input system state to obtain the system action;
step S3, bringing the system state and the system action into a reward function to obtain a reward value of the system action, and updating the neural network parameters according to the magnitude of the reward value;
and S4, repeating the steps S1-S3 until the reward value tends to be stable, thereby completing the training process and carrying out distribution network fault emergency repair resource allocation according to the final system action.
2. The intelligent power distribution network fault emergency repair method based on deep reinforcement learning of claim 1, wherein the system state comprises a distance between a fault point and an emergency repair center and an emergency repair task amount.
3. The intelligent power distribution network fault emergency repair method based on deep reinforcement learning of claim 1, wherein in step S1, the deep reinforcement learning model is built by a neural network, and parameters of the neural network in the model include weight w, bias b, learning rate L and the number of hidden layers of the neural network.
4. The intelligent power distribution network fault emergency repair method based on deep reinforcement learning of claim 2, wherein the power distribution network fault emergency repair task in step S1 is defined as R_u(d_u, n_u), where d_u denotes the distance of the fault point from the emergency repair center and n_u denotes the emergency repair task amount.
5. The method for intelligently repairing power distribution network faults based on deep reinforcement learning of claim 2, wherein in step S1 the distance between the fault point and the emergency repair center and the emergency repair task amount are expressed as the system state S = {d_1, d_2, …, d_U, n_1, n_2, …, n_U}.
6. The intelligent power distribution network fault emergency repair method based on deep reinforcement learning of claim 1, wherein the system action in step S2 is defined as a = {f_1, f_2, …, f_U}, where f_u is the amount of resources allocated by the emergency repair center to fault point u, including emergency repair personnel, vehicles, and tools.
7. The intelligent power distribution network fault emergency repair method based on deep reinforcement learning of claim 1, wherein the reward function in step S3 is defined as r = r_max - T_all, where r_max represents the maximum value of the emergency repair time and T_all represents the time taken by the emergency repair.
8. The intelligent emergency repair method for power distribution network faults based on deep reinforcement learning of claim 7, wherein the emergency repair time T_all consists of two parts, journey time and repair time; the journey time is the time for the vehicle to reach the fault point and can be expressed as d_u / v_u, where v_u indicates the travel speed of the vehicle assigned to fault u; the repair time is the time it takes the repair crew to resolve the fault and depends on the task amount n_u and the allocated resources f_u.
9. An intelligent power distribution network fault emergency repair device based on deep reinforcement learning, characterized by comprising:
the data input module is used for inputting a power distribution network fault first-aid repair task;
the system state module is used for establishing a system state formed by combining the distance between the fault point and the emergency repair center and the emergency repair task amount and used as the input of the deep reinforcement learning model;
the system action module is used for carrying out neural network training on the input system state to obtain system action;
the reward module is used for bringing the system state and the system action into a reward function to obtain a reward value of the system action, and updating the neural network parameters according to the magnitude of the reward value;
and the distribution module is used for allocating distribution network fault emergency repair resources according to the final system action.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010218227.9A CN111401769A (en) | 2020-03-25 | 2020-03-25 | Intelligent power distribution network fault first-aid repair method and device based on deep reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111401769A (en) | 2020-07-10 |
Family
ID=71413546
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010218227.9A Pending CN111401769A (en) | 2020-03-25 | 2020-03-25 | Intelligent power distribution network fault first-aid repair method and device based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401769A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149347A (en) * | 2020-09-16 | 2020-12-29 | 北京交通大学 | Power distribution network load transfer method based on deep reinforcement learning |
CN113627733A (en) * | 2021-07-16 | 2021-11-09 | 深圳供电局有限公司 | Post-disaster power distribution network dynamic first-aid repair method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110392377A (en) * | 2019-07-19 | 2019-10-29 | 北京信息科技大学 | A kind of 5G super-intensive networking resources distribution method and device |
CN110557769A (en) * | 2019-09-12 | 2019-12-10 | 南京邮电大学 | C-RAN calculation unloading and resource allocation method based on deep reinforcement learning |
WO2020040763A1 (en) * | 2018-08-23 | 2020-02-27 | Siemens Aktiengesellschaft | Real-time production scheduling with deep reinforcement learning and monte carlo tree search |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020040763A1 (en) * | 2018-08-23 | 2020-02-27 | Siemens Aktiengesellschaft | Real-time production scheduling with deep reinforcement learning and monte carlo tree search |
CN110392377A (en) * | 2019-07-19 | 2019-10-29 | 北京信息科技大学 | A kind of 5G super-intensive networking resources distribution method and device |
CN110557769A (en) * | 2019-09-12 | 2019-12-10 | 南京邮电大学 | C-RAN calculation unloading and resource allocation method based on deep reinforcement learning |
Non-Patent Citations (1)
Title |
---|
Deng Zhilong et al.: "A scheduling optimization method based on deep reinforcement learning", Journal of Northwestern Polytechnical University *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149347A (en) * | 2020-09-16 | 2020-12-29 | 北京交通大学 | Power distribution network load transfer method based on deep reinforcement learning |
CN112149347B (en) * | 2020-09-16 | 2023-12-26 | 北京交通大学 | Power distribution network load transfer method based on deep reinforcement learning |
CN113627733A (en) * | 2021-07-16 | 2021-11-09 | 深圳供电局有限公司 | Post-disaster power distribution network dynamic first-aid repair method and system |
CN113627733B (en) * | 2021-07-16 | 2024-08-06 | 深圳供电局有限公司 | Post-disaster power distribution network dynamic rush-repair method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200710 |