CN116154771A - Control method of power equipment, equipment control method and electronic equipment - Google Patents

Control method of power equipment, equipment control method and electronic equipment

Info

Publication number
CN116154771A
CN116154771A
Authority
CN
China
Prior art keywords
instruction
state information
power
power equipment
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310424102.5A
Other languages
Chinese (zh)
Other versions
CN116154771B (en)
Inventor
仪忠凯
蒋蔚
王伟
杨程
杨超
钮孟洋
韩佳澦
印卧涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Damo Institute Hangzhou Technology Co Ltd
Original Assignee
Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Damo Institute Hangzhou Technology Co Ltd filed Critical Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority to CN202310424102.5A priority Critical patent/CN116154771B/en
Publication of CN116154771A publication Critical patent/CN116154771A/en
Application granted granted Critical
Publication of CN116154771B publication Critical patent/CN116154771B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00001Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by the display of information or by user interaction, e.g. supervisory control and data acquisition systems [SCADA] or graphical user interfaces [GUI]
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00006Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment
    • H02J13/00016Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment using a wired telecommunication network or a data transmission bus
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00006Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment
    • H02J13/00022Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment using wireless data transmission
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Feedback Control In General (AREA)

Abstract

The application discloses a control method of power equipment, an equipment control method, and an electronic device. The method comprises the following steps: collecting measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment; mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation represents the mapping relation between different state information and different control instructions, the environment feedback data represents data obtained by interaction with the environment where the power equipment is located, and the target control instruction comprises a power generation scheduling instruction of the power generation equipment and the charge and discharge power of the energy storage equipment; and triggering the target control instruction to control the power equipment. The method and the device solve the technical problems in the related art of low execution efficiency and slow training when power equipment is controlled through a model.

Description

Control method of power equipment, equipment control method and electronic equipment
Technical Field
The present disclosure relates to the field of data processing of electrical devices, and in particular, to a control method of an electrical device, a device control method, and an electronic device.
Background
With the increasing share of renewable energy and the growing complexity of the power grid operating environment, traditional power equipment scheduling cannot meet the real-time regulation requirements of power equipment in rapidly changing environments with inaccurate parameters. Methods for controlling power equipment through a model have therefore emerged; however, existing model training methods depend heavily on model parameters and have low decision efficiency, which results in low execution efficiency and slow training when power equipment is controlled through a model.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the application provides a control method and a device control method for power equipment and electronic equipment, which at least solve the technical problems of low execution efficiency and low training speed of controlling the power equipment through a model in the related technology.
According to an aspect of the embodiments of the present application, there is provided a control method of an electrical device, including: collecting measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the target control instruction comprises: a power generation scheduling instruction of the power generation equipment and charge and discharge power of the energy storage equipment; and triggering the target control instruction to control the power equipment.
According to another aspect of the embodiments of the present application, there is also provided a training method of an instruction inference model, including: acquiring a preset mapping relation corresponding to the power equipment, wherein the preset mapping relation is used for representing the mapping relation between different state information and different control instructions; training the initial instruction reasoning model by using a preset mapping relation and environment feedback data to obtain an enhanced instruction reasoning model, wherein the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located.
According to another aspect of the embodiments of the present application, there is also provided an apparatus control method, including: collecting measurement data of equipment to be controlled in an operation state to obtain first state information of the equipment to be controlled; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the equipment to be controlled, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the preset mapping relation is used for representing the mapping relation between the state information of different equipment and the control instructions; and triggering the target control instruction to control the equipment to be controlled.
According to another aspect of the embodiments of the present application, there is also provided a control method of an electrical device, including: responding to an input instruction acted on an operation interface, and displaying first state information of the power equipment on the operation interface, wherein the first state information is obtained by measuring the running state of the power equipment; and responding to the instruction generation instruction acted on the operation interface, and displaying a target control instruction of the power equipment on the operation interface, wherein the target control instruction is obtained by mapping the state information by using an enhanced instruction reasoning model, the enhanced instruction reasoning model is obtained by training by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the preset mapping relation is used for representing the mapping relation between different state information and different control instructions.
According to another aspect of the embodiments of the present application, there is also provided a control method of an electrical device, including: acquiring first state information of the power equipment by calling an enhancement interface, wherein the enhancement interface comprises enhancement parameters, the parameter values of the enhancement parameters are the first state information, and the first state information is obtained by acquiring measurement data of the power equipment in an operation state; mapping the state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing mapping relations between different state information and different control instructions; and outputting a target control instruction by calling a teaching interface, wherein the teaching interface comprises teaching parameters, and parameter values of the teaching parameters are the target control instruction.
According to another aspect of the embodiments of the present application, there is also provided an electronic device, including: a memory storing an executable program; and a processor for running the program, wherein the program executes the method of any one of the above.
According to another aspect of the embodiments of the present application, there is also provided a computer readable storage medium, including a stored executable program, where the executable program when executed controls a device in which the computer readable storage medium is located to perform the method of any one of the above.
In the embodiment of the application, measurement data of the power equipment in a power generation state and a charging and discharging state are collected to obtain first state information of the power equipment; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation represents the mapping relation between different state information and different control instructions, the environment feedback data represents data obtained by interaction with the environment where the power equipment is located, and the target control instruction comprises a power generation scheduling instruction of the power generation equipment and the charge and discharge power of the energy storage equipment; and the target control instruction is triggered to control the power equipment. It is easy to notice that, by training with the preset mapping relation and environment feedback data, the enhanced instruction reasoning model can be obtained quickly, and because the model can dynamically control the equipment of the power system in real time and accurately, the aim of controlling the equipment of the power system can be achieved quickly. This improves the execution efficiency and training speed of controlling power system equipment, thereby solving the technical problems in the related art of low execution efficiency and slow training when power equipment is controlled through a model.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
fig. 1 is a hardware configuration block diagram of a computer terminal (or mobile device) for implementing a control method of a power device according to an embodiment of the present application;
fig. 2 is a flowchart of a control method of an electric power apparatus according to embodiment 1 of the present application;
FIG. 3 is a schematic diagram of an alternative policy network according to embodiment 1 of the present application;
FIG. 4 is a schematic diagram of an alternative Q network according to embodiment 1 of the present application;
FIG. 5 is a schematic diagram of an alternative power system reinforcement learning scheduling method based on teaching learning acceleration training according to embodiment 1 of the present application;
FIG. 6 is a flowchart of an implementation of an alternative power system reinforcement learning scheduling method based on teaching learning acceleration training according to embodiment 1 of the present application;
FIG. 7 is an alternative reinforcement learning and power system interaction architecture diagram according to embodiment 1 of the present application;
FIG. 8 is a flow chart of a method of training a instructional inference model according to embodiment 2 of the present application;
fig. 9 is a flowchart of a device control method according to embodiment 3 of the present application;
fig. 10 is a flowchart of a control method of an electric power apparatus according to embodiment 4 of the present application;
FIG. 11 is a schematic diagram of an interface for operation of an alternative control method according to embodiment 4 of the present application;
fig. 12 is a flowchart of a control method of an electric power apparatus according to embodiment 5 of the present application;
fig. 13 is a control device of an electric power apparatus according to embodiment 1 of the present application;
FIG. 14 is a training apparatus of a instructional reasoning model according to embodiment 2 of the present application;
fig. 15 is a device control apparatus according to embodiment 3 of the present application;
fig. 16 is a control device of an electric power apparatus according to embodiment 4 of the present application;
fig. 17 is a control device of an electric power apparatus according to embodiment 5 of the present application;
fig. 18 is a block diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the present solution better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be described below in detail with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of the present application.
It should be noted that the terms "enhanced," "taught," and the like in the description and claims of the present application and in the above-described figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, partial terms or terminology appearing in describing embodiments of the present application are applicable to the following explanation:
Reinforcement learning (Reinforcement Learning), a machine learning decision method that, through interaction with the environment, gradually obtains a greater return or achieves a specific goal.
Teaching learning (Learning From Instruction), which may also be referred to as imitation learning, makes decisions like a human expert by learning from an exemplary training set given by the human expert.
Knowledge distillation, a model compression method that extracts the knowledge learned by the original (teacher) model by means of distillation and transfers it to a compressed (student) model, so that the compressed model can learn the same knowledge in a smaller volume.
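As a purely illustrative sketch of the distillation idea defined above (this is not code from the patent; the function names, temperature value, and loss form are assumptions), the core "soft target" term a student model minimizes can be written as a KL divergence between temperature-softened teacher and student outputs:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields softer distributions."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions:
    the term a student minimizes to inherit the teacher's knowledge."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))              # 0.0
print(distillation_loss([0.1, 1.0, 2.0], teacher) > 0)  # True
```

In practice this soft-target term is usually combined with an ordinary hard-label loss; only the distillation term is sketched here.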
Example 1
According to embodiments of the present application, there is also provided an embodiment of a control method of a power device. It should be noted that the steps shown in the flowcharts of the drawings may be performed in a computer system, such as one executing a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that described herein.
The method embodiment provided in the first embodiment of the present application may be executed in a mobile terminal, a computer terminal or a similar computing device. Fig. 1 is a block diagram of a hardware configuration of a computer terminal (or mobile device) for implementing a control method of a power device according to an embodiment of the present application. As shown in fig. 1, the computer terminal 10 (or mobile device) may include one or more processors 102 (shown as 102a, 102b, … ,102n in the figures), which processor 102 may include, but is not limited to, a processing means such as a microprocessor (MCU) or a programmable logic device (FPGA), a memory 104 for storing data, and a transmission means 106 for communication functions. In addition, the computer terminal 10 may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
It should be noted that the one or more processors 102 and/or other data processing circuits described above may be referred to generally herein as "data processing circuits". The data processing circuit may be embodied in whole or in part in software, hardware, firmware, or any other combination. Furthermore, the data processing circuitry may be a single stand-alone processing module, or incorporated, in whole or in part, into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the present application, the data processing circuit acts as a kind of processor control (e.g., selection of a variable-resistance terminal path to interface with).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the control method of the electrical device in the embodiments of the present application, and the processor 102 executes the software programs and modules stored in the memory 104, thereby executing various functional applications and data processing, that is, implementing the control method of the electrical device described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. The specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
It should be noted here that, in some alternative embodiments, the computer device (or mobile device) shown in fig. 1 described above may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be noted that fig. 1 is only one example of a specific example, and is intended to illustrate the types of components that may be present in the computer device (or mobile device) described above.
In the above-described operating environment, the present application provides a control method of the power device as shown in fig. 2. Fig. 2 is a flowchart of a control method of an electric power apparatus according to embodiment 1 of the present application; as shown in fig. 2, the method includes the following steps:
step S202, collecting measurement data of the power equipment in a power generation state and a charging and discharging state, and obtaining first state information of the power equipment.
In the technical solution provided in step S202 of the present application, the above-mentioned power equipment may be any one or more devices that need to be controlled in the power system, and may include, but is not limited to: generator sets and energy storage equipment. The above measurement data may be data acquired by measuring the power equipment in its current power generation state and charge-discharge state, and may include, but is not limited to: the generating and consuming power of the adjustable equipment, the voltage of each node, the current energy state of the energy storage equipment, the predicted power of each renewable energy unit, the predicted power of each node load, the connection state of the transmission lines, the active and reactive load of each node, and time labels. The first state information may be information, obtained by analyzing the measurement data, that represents the power generation state and the charge/discharge state of the power equipment; for example, it may be, but is not limited to, a normal power generation operating state or an abnormal power consumption operating state of the adjustable equipment, an abnormal voltage operating state of a node, and the like.
In an alternative embodiment, when control (e.g. scheduling) of the power equipment in the power system is required, firstly measurement data measured on the power equipment in a power generation state and a charge-discharge state may be collected, and secondly state information of the power equipment may be obtained based on the collected measurement data.
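The collection step above can be sketched as follows. This is a hypothetical illustration only: the container type, field names, and units are assumptions for readability, not structures defined by the patent.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class PowerDeviceState:
    """Hypothetical container for the 'first state information' assembled from
    measurements taken in the generation and charge/discharge states."""
    adjustable_power: Dict[str, float]    # generation/consumption power per adjustable device (MW)
    node_voltages: Dict[str, float]       # voltage at each node (p.u.)
    storage_energy: float                 # current energy state of the storage device (MWh)
    renewable_forecast: Dict[str, float]  # predicted power per renewable unit (MW)
    load_forecast: Dict[str, float]       # predicted load per node (MW)
    line_connected: Dict[str, bool]       # transmission-line connection states
    timestamp: str                        # time label

    def as_vector(self) -> List[float]:
        """Flatten the measurements into the numeric state vector that would be
        fed to the instruction reasoning model."""
        vec = list(self.adjustable_power.values())
        vec += list(self.node_voltages.values())
        vec.append(self.storage_energy)
        vec += list(self.renewable_forecast.values())
        vec += list(self.load_forecast.values())
        vec += [1.0 if c else 0.0 for c in self.line_connected.values()]
        return vec

state = PowerDeviceState(
    adjustable_power={"G1": 50.0}, node_voltages={"N1": 1.01},
    storage_energy=12.5, renewable_forecast={"W1": 8.0},
    load_forecast={"N1": 45.0}, line_connected={"L1": True},
    timestamp="2023-04-20T10:00",
)
print(len(state.as_vector()))  # 6
```

A real deployment would carry many more devices and nodes per field; the vectorization order simply has to stay fixed between training and inference.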
Step S204, mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the target control instruction comprises: and a power generation scheduling instruction of the power generation equipment and charge and discharge power of the energy storage equipment.
The above-mentioned preset mapping relation may be a mapping relation reflecting the correspondence between the state information of different power equipment and control instructions, for example, but not limited to, a mapping relation in an expert knowledge base, where the preset mapping relation may be represented as state-to-expert-scheduling-instruction pairs. For example, an abnormal power generation operating state of an adjustable device may map, but is not limited, to an instruction to stop that device, and an abnormal power consumption operating state may likewise map to a stop instruction. It should be noted that the expert knowledge base is a set of domain knowledge required for solving a problem, derived from domain experts. In this embodiment it may be composed of a large number of discrete power scheduling cases, which may be obtained by querying scheduler history records, consulting expert experience, solving an optimization model, and the like; it may be, for example, but is not limited to, a production knowledge base.
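A minimal, hypothetical sketch of such an expert knowledge base as discrete state-to-instruction cases follows; the state labels and instruction names here are invented for illustration and do not appear in the patent.

```python
# Hypothetical expert knowledge base: discrete "state -> expert scheduling
# instruction" cases, playing the role of the preset mapping relation.
EXPERT_KNOWLEDGE_BASE = {
    "generation_abnormal":  "stop_adjustable_device",
    "consumption_abnormal": "stop_adjustable_device",
    "node_voltage_high":    "reduce_generation_setpoint",
    "storage_low":          "charge_storage_device",
}

def expert_instruction(state_label: str) -> str:
    """Look up the expert control instruction for a recognized state.
    Unseen states fall back to a no-op, leaving them to the learned model."""
    return EXPERT_KNOWLEDGE_BASE.get(state_label, "no_op")

print(expert_instruction("generation_abnormal"))  # stop_adjustable_device
print(expert_instruction("unseen_state"))         # no_op
```

Note that this table covers only the discrete cases the experts have recorded; the learned reasoning model is what generalizes beyond them.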
The enhanced instruction reasoning model may be a multi-layer neural network model obtained by training with the preset mapping relationship and the environment feedback data, where the specific type and number of layers of the neural network are not limited in this embodiment and may be set by the user according to actual requirements. The environment feedback data may be data obtained by interacting with the environment where the power equipment is located; for example, after the enhanced instruction reasoning model sends a "power off" control instruction to an adjustable device in the power equipment, the adjustable device may feed back an execution result to the model, such as "powered off" or "an exception occurred while executing the power-off operation", but is not limited thereto.
The target control instruction may be an instruction output by the enhanced instruction reasoning model for controlling the power equipment, and may include, but is not limited to: a power generation scheduling instruction of the power generation equipment and the charge/discharge power of the energy storage equipment.
In an alternative embodiment, after the first state information of the power equipment is obtained, mapping processing can be performed on the first state information through the enhanced instruction reasoning model to obtain the target control instruction of the power equipment. It should be noted that the mapping processing in this embodiment may include, but is not limited to: one-to-one, many-to-one, and one-to-many mappings; the specific mapping relationship may be set by the user according to actual requirements.
The enhanced instruction reasoning model is obtained by training with the preset mapping relationship and the environment feedback data, where the preset mapping relationship represents the mapping between the state information of different power equipment and control instructions. The state information may include, but is not limited to: the power generation and consumption power of different types of adjustable equipment at the previous moment, different node voltages, the current energy state of the energy storage equipment, the predicted power of different renewable energy units, the predicted load power of different nodes, transmission line connection states, the active and reactive loads of different nodes, and a time label. The control instructions may include, but are not limited to: power generation scheduling instructions of all generator sets in the power equipment (including conventional sets and renewable energy sets) and the charge/discharge power of the energy storage equipment.
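As a rough sketch of the state-to-instruction mapping described above (the network sizes, feature layout, and instruction ordering here are illustrative assumptions, not the patent's actual model):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class InstructionInferenceNet:
    """Minimal multi-layer network mapping a state vector to control instructions.

    The state vector is an assumed flattening of the quantities listed above
    (previous generation/consumption power, node voltages, energy-storage state,
    forecasts, line states, time label); all dimensions are illustrative.
    """
    def __init__(self, state_dim, hidden_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0, 0.1, (state_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0, 0.1, (hidden_dim, action_dim))
        self.b2 = np.zeros(action_dim)

    def infer(self, state):
        # Hidden layer plus linear output: the action entries stand for,
        # e.g., generator set-points followed by storage charge/discharge power.
        h = relu(state @ self.w1 + self.b1)
        return h @ self.w2 + self.b2

net = InstructionInferenceNet(state_dim=12, hidden_dim=32, action_dim=4)
instruction = net.infer(np.ones(12))
```

The point of the sketch is only the shape of the mapping: one continuous state vector in, one continuous instruction vector out, with the layer count and type left to the user as the text states.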
Step S206, triggering a target control instruction to control the power equipment.
In an alternative embodiment, after the enhanced instruction reasoning model outputs the target control instruction of the device to be controlled, the power system may trigger the target control instruction to control the power equipment.
In another alternative embodiment, when the target control instruction is "control the adjustable device to stop running", the power system may trigger the target control instruction to control the adjustable device (i.e., the power equipment).
In another alternative embodiment, when the target control instruction is "decrease the generated power of the adjustable device", the power system may trigger the target control instruction to control the adjustable device (i.e., the power equipment).
In this embodiment of the application, measurement data of the power equipment in the power generation state and the charge/discharge state are acquired to obtain first state information of the power equipment; the first state information is mapped by the enhanced instruction reasoning model to obtain a target control instruction of the power equipment, where the enhanced instruction reasoning model is trained with a preset mapping relationship and environment feedback data, the preset mapping relationship represents the mapping between different state information and different control instructions, the environment feedback data represents data obtained by interacting with the environment where the power equipment is located, and the target control instruction includes a power generation scheduling instruction of the power generation equipment and the charge/discharge power of the energy storage equipment; the target control instruction is then triggered to control the power equipment. It is easy to see that training with the preset mapping relationship and the environment feedback data yields the enhanced instruction reasoning model quickly, and because this model can control the power equipment dynamically, accurately, and in real time, the power equipment can be controlled quickly. This achieves the technical effect of improving the execution efficiency and training speed of model-based control of power equipment, and solves the technical problem in the related art that such control has low execution efficiency and slow training speed.
The above-described method of this embodiment is further described below.
In the above embodiment of the present application, the method further includes: acquiring second state information of the power equipment and a teaching instruction reasoning model, wherein the teaching instruction reasoning model is obtained by training through a preset mapping relation; mapping the second state information by using a teaching instruction reasoning model to obtain a teaching control instruction; mapping the second state information by using the initial instruction reasoning model to obtain an enhanced control instruction, wherein the environment feedback data is feedback data obtained by controlling the power equipment based on the enhanced control instruction; and adjusting model parameters of the initial instruction reasoning model based on the teaching control instruction, the strengthening control instruction and the environment feedback data to obtain the strengthening instruction reasoning model.
The second state information may be information, obtained in advance by the user, that indicates a historical operating state of the power equipment. In this embodiment of the application, the second state information may include, but is not limited to: the historical operating state of the power equipment in a simulation environment of the power system, and the historical operating state of the power equipment in the actually measured environment of the power system. It should be noted that power systems are now highly digitized and many parameters can be obtained through measurement and estimation, so the accuracy of power system simulation environments keeps improving (the gap between simulation and the actual/measured environment keeps shrinking), which supports generating more power grid scenarios and training data through simulation.
The teaching instruction reasoning model can be a multi-layer neural network model obtained by teaching and learning a preset mapping relation, wherein the specific type and the layer number of the neural network are not limited in the embodiment, and a user can set the model according to actual requirements.
In an alternative embodiment, an inference model can first be given the capability of teaching learning; it can then learn the mapping relationships in the expert knowledge base to obtain the teaching instruction reasoning model. By learning the mapping relationships in the expert knowledge base, the teaching instruction reasoning model can make decisions like a human expert; that is, a teaching control instruction output by the teaching instruction reasoning model can be used directly as the final control instruction corresponding to the second state information, without requiring an expert to decide. Therefore, by adding the teaching instruction reasoning model to the training process, the enhanced instruction reasoning model can be trained without supervision, and can quickly and accurately obtain a target control instruction based on the preset mapping relationship and the environment feedback data, thereby improving the execution efficiency and training speed of controlling the power equipment through the model.
The teaching control instruction may be an instruction output by a teaching instruction reasoning model and used for controlling the power equipment, where the teaching control instruction is obtained by mapping the second state information by the teaching instruction reasoning model.
The historical operating state may be the operating state of the power equipment before the current state, for example, but not limited to, the power generation operating state, the power consumption operating state, or the voltage operating states of different nodes before the current state. The reinforcement control instruction may be an instruction for controlling the power equipment obtained by mapping the second state information through the initial instruction reasoning model. The initial instruction reasoning model may be an inference model formed by a multi-layer neural network, and may also be used, through teaching learning, to obtain the teaching instruction reasoning model.
In an optional embodiment, when a user needs to control the power equipment in the power system, the second state information of the power equipment and the teaching instruction reasoning model are first acquired, where the teaching instruction reasoning model is obtained by training with the preset mapping relationship; next, the second state information is mapped through the teaching instruction reasoning model to obtain a teaching control instruction; then, the second state information is mapped by the initial instruction reasoning model to obtain a reinforcement control instruction, where the environment feedback data is feedback data obtained by controlling the power equipment based on the reinforcement control instruction; finally, the model parameters of the initial instruction reasoning model are adjusted based on the teaching control instruction, the reinforcement control instruction, and the environment feedback data to obtain the enhanced instruction reasoning model.
In another alternative embodiment, knowledge distillation can be performed on the preset mapping relationship in the expert knowledge base through teaching learning: a multi-layer neural network imitates the state-action mapping relationships in the expert knowledge base and is trained with the goal of minimizing the deviation between the network's output instruction and the expert knowledge base's instruction, yielding the multi-layer neural network corresponding to the teaching-learning scheduling strategy, i.e., the teaching instruction reasoning model. The teaching instruction reasoning model can then be further trained with environment feedback data to obtain the enhanced instruction reasoning model. After the enhanced instruction reasoning model outputs an inferred control instruction for the power equipment, the power equipment can be scheduled based on that instruction and return a scheduling execution result (i.e., environment feedback data). The enhanced instruction reasoning model can then be trained continuously with the preset mapping relationship and the environment feedback data, that is, its model parameters can be adjusted continuously, yielding an ever more accurate model and continuously improving the execution efficiency and training speed of controlling the power equipment.
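A minimal sketch of the imitation (teaching learning) step described above, with a synthetic expert knowledge base and a linear model standing in for the multi-layer neural network (both are assumptions for illustration):

```python
import numpy as np

# Hypothetical expert knowledge base: discrete (state, expert instruction) cases.
rng = np.random.default_rng(1)
true_w = np.array([[0.5], [-0.2], [0.1]])
states = rng.normal(size=(200, 3))
expert_actions = states @ true_w   # stand-in for expert scheduling instructions

# Teaching learning: train the model so that the deviation between its output
# instruction and the expert-knowledge-base instruction becomes small.
w = np.zeros((3, 1))
lr = 0.1
for _ in range(500):
    pred = states @ w
    grad = 2 * states.T @ (pred - expert_actions) / len(states)  # d(MSE)/dw
    w -= lr * grad

mse = float(np.mean((states @ w - expert_actions) ** 2))  # remaining deviation
```

The training objective here (squared deviation from the expert instruction) matches the "small deviation between the network output instruction and the expert knowledge base instruction" goal in the text; everything else is illustrative.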
First, the second state information may be the historical operating states of the power equipment in the simulation environment and the actually measured environment; second, the second state information is mapped through the teaching instruction reasoning model to obtain a teaching control instruction; then, the second state information is mapped through the initial instruction reasoning model to obtain a reinforcement control instruction; next, a target loss function value can be constructed based on the reinforcement control instruction and the teaching control instruction; finally, the model parameters of the initial instruction reasoning model can be adjusted based on the target loss function value to obtain the enhanced instruction reasoning model.
In the above embodiment of the application, adjusting the model parameters of the initial instruction reasoning model based on the teaching control instruction, the reinforcement control instruction, and the environment feedback data to obtain the enhanced instruction reasoning model includes: determining a first loss function value and a second loss function value based on the environment feedback data, where the first loss function value represents the operating cost of the power equipment when it executes the reinforcement control instruction, and the second loss function value represents the probability that the power of a branch connected to the power equipment exceeds a preset power when the equipment executes the reinforcement control instruction; obtaining the deviation between the reinforcement control instruction and the teaching control instruction to obtain a third loss function value; and adjusting the model parameters based on the first, second, and third loss function values to obtain the enhanced instruction reasoning model.
The first loss function value may be denoted L1 (the operating-cost term) and the second loss function value L2 (the branch-power out-of-limit term), but they are not limited thereto. The above-mentioned preset power may be set in advance by the user and used in determining the second loss function value, which may be determined by the probability that the power of a branch connected to the power equipment exceeds the preset power. The third loss function value, the deviation between the reinforcement control instruction and the teaching control instruction, may be denoted L3, but is not limited thereto.
In an alternative embodiment, the first and second loss function values may first be determined based on the environment feedback data, where the first loss function value is the operating cost of the power equipment when it executes the reinforcement control instruction, and the second loss function value is the probability that the power of a branch connected to the power equipment exceeds the preset power when the equipment executes the reinforcement control instruction; second, the deviation between the reinforcement control instruction and the teaching control instruction can be obtained as the third loss function value; finally, the model parameters can be adjusted based on the first, second, and third loss function values to obtain the enhanced instruction reasoning model.
In another alternative embodiment, two types of neural networks may be constructed during reinforcement learning training, denoted the policy network and the Q network. Fig. 3 is a schematic diagram of an alternative policy network according to embodiment 1 of the present application. As shown in fig. 3, the policy network is a multi-input, multi-output multi-layer neural network: it takes the current system state as input, outputs the mean and variance of the action, and optimizes the scheduling policy with the goal of a smaller relative entropy (Kullback-Leibler divergence); the specific neural network type is not limited in this embodiment and may be set by the user according to actual requirements. As can be seen from fig. 3, inputting the current system state St into the policy network yields the mean μ and variance σ², from which the action a (i.e., the scheduling instruction) can be obtained, where a obeys a Gaussian distribution that can be expressed as a ~ N(μ, σ²).
Fig. 4 is a schematic diagram of an alternative Q network according to embodiment 1 of the present application. As shown in fig. 4, the Q network is a multi-input, single-output multi-layer neural network; the specific neural network type is not limited in this embodiment and may be set by the user according to actual requirements. As can be seen from fig. 4, inputting the current system state St and the action a taken at St into the Q network yields the Q value, which represents the future reward value corresponding to the scheduling policy; in this embodiment of the application, the future reward value may be, but is not limited to, a smaller soft Bellman residual.
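A minimal numeric sketch of the two networks above, with single linear layers standing in for the multi-layer networks and all dimensions assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_network(state, w_mu, w_sigma):
    # Multi-input, multi-output: maps state S_t to the mean and standard
    # deviation of the action distribution (linear stand-ins for the
    # multi-layer network in the text).
    mu = state @ w_mu
    sigma = np.exp(state @ w_sigma)  # exponent keeps the scale positive
    return mu, sigma

def sample_action(state, w_mu, w_sigma):
    # Action a ~ N(mu, sigma^2): the scheduling instruction.
    mu, sigma = policy_network(state, w_mu, w_sigma)
    return rng.normal(mu, sigma)

def q_network(state, action, w_q):
    # Multi-input, single-output: (S_t, a_t) -> scalar future reward estimate.
    return float(np.concatenate([state, action]) @ w_q)

state = np.array([0.2, -0.1, 0.4])
w_mu = rng.normal(0, 0.1, (3, 2))
w_sigma = np.zeros((3, 2))           # sigma = 1 everywhere in this sketch
action = sample_action(state, w_mu, w_sigma)
q_value = q_network(state, action, rng.normal(0, 0.1, 5))
```

The sketch only shows the interfaces: the policy network parameterizes a Gaussian over actions, and the Q network scores a state-action pair; training objectives (KL divergence, soft Bellman residual) are omitted.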
In the above embodiment of the present application, adjusting the model parameters based on the first loss function, the second loss function, and the third loss function to obtain the enhanced instruction inference model includes: obtaining a target weight coefficient corresponding to the third loss function value; obtaining a product of the target weight coefficient and the third loss function value to obtain a fourth loss function value; obtaining the sum of a first loss function value, a second loss function value and a fourth loss function value to obtain a target loss function value; and adjusting the model parameters based on the objective loss function value to obtain the reinforced instruction reasoning model.
The target weight coefficient may be denoted λ. The fourth loss function value may then be expressed as λ·L3. The target loss function value, which is used to adjust the model parameters of the initial instruction reasoning model to obtain the enhanced instruction reasoning model, may be expressed as L = L1 + L2 + λ·L3, but is not limited thereto.
In an alternative embodiment, the target loss function value may be derived from the following equation:

L = (L1 + L2) + λ·L3

where L is the target loss function value, L1 + L2 is the reinforcement loss function value, L3 is the teaching loss function value, and λ·L3 is the fourth loss function value.
In another alternative embodiment, after the objective loss function value is obtained, parameters of the model may be adjusted based on the objective loss function value, so as to obtain the enhanced instruction inference model.
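The combination of loss terms described above can be sketched as follows (the numeric values are illustrative only):

```python
def target_loss(l1_cost, l2_overlimit, l3_deviation, weight):
    """Combine the three loss terms into the target loss value.

    l1_cost:      operating-cost term (first loss value)
    l2_overlimit: branch-power out-of-limit term (second loss value)
    l3_deviation: deviation between reinforcement and teaching instructions
                  (third loss value)
    weight:       target weight coefficient applied to the third term
    """
    fourth = weight * l3_deviation          # fourth loss value
    return l1_cost + l2_overlimit + fourth  # target loss value

loss = target_loss(l1_cost=3.0, l2_overlimit=0.5, l3_deviation=2.0, weight=0.25)
```

With these example inputs the result is 3.0 + 0.5 + 0.25 * 2.0 = 4.0; as the weight coefficient shrinks during training, the teaching deviation contributes less to the total.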
In the initial stage of training, to improve the convergence speed of reinforcement learning, the instructions given by teaching learning should be followed, so the weight coefficient corresponding to the deviation from the teaching-learning guide instruction should be set to a larger value, pushing the reinforcement learning scheduling strategy toward the optimum quickly. As training proceeds, this weight coefficient is gradually reduced, giving full play to the decision-making advantage of reinforcement learning and improving the long-term benefit of the scheduling strategy.
In the above embodiment of the present application, obtaining the target weight coefficient corresponding to the third loss function value includes: determining the current iteration times; updating the initial weight coefficient of the third loss function value based on the current iteration number, the preset attenuation coefficient and the preset updating step length to obtain a target weight coefficient.
The current iteration number may be expressed as t, the preset decay coefficient as β, the preset update step length as Δ, the initial weight coefficient as λ₀, and the target weight coefficient as λt.
In an alternative embodiment, the target weight coefficient may be derived from the following formula:
λt = λ₀ · e^(−β·Δ·t), but is not limited thereto, where e represents the natural constant.
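The exact formula in the original is not reproduced here; the following sketch assumes one exponential-decay form consistent with the description (a weight that starts at the initial value and shrinks with the iteration count under a preset decay coefficient and update step):

```python
import math

def target_weight(t, lam0=1.0, beta=0.01, step=1.0):
    """Decay the teaching-deviation weight with the iteration number t.

    lam0: initial weight coefficient, beta: preset decay coefficient,
    step: preset update step length. The specific decay form and the
    default values here are assumptions; the intent matches the text:
    large early (follow the teaching instruction), smaller later.
    """
    return lam0 * math.exp(-beta * step * t)

early, late = target_weight(0), target_weight(1000)
```

With the defaults, the weight starts at 1.0 and falls smoothly, so early training leans on the teaching guide instruction while later training is dominated by the reinforcement terms.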
In the above embodiment of the present application, the method further includes: acquiring historical state information, wherein the historical state information at least comprises: status information of different devices over a historical period of time; generating a history control instruction corresponding to the history state information based on the history state information; and constructing a preset mapping relation based on the history state information and the history control instruction.
The above-described history period may be a period of time before the current state. The above-mentioned history control instruction may be an instruction for controlling different devices based on the history state information.
In an alternative embodiment, the historical state information of different power equipment can first be obtained; corresponding historical control instructions are then generated based on the historical state information; finally, the preset mapping relationship is constructed from the historical state information and the historical control instructions.
In another alternative embodiment, the expert knowledge base (i.e., the preset mapping relationship) is composed of a large number of discrete scheduling cases, and can be obtained by querying the scheduler's historical scheduling record, consulting expert experience, solving an optimization model, and the like. The expert scheduling instruction includes a scheduling instruction (i.e., a strengthening control instruction) of the energy storage device and a scheduling instruction (i.e., a teaching instruction) of the power generation device.
In the above embodiment of the present application, when the history control instruction includes an enhanced control instruction of the energy storage device, generating, based on the history state information, the history control instruction corresponding to the history state information includes: determining a charge-discharge switching threshold of the energy storage device; based on the payload values and charge-discharge switching thresholds of the different devices, historical control instructions are generated.
The above-described charge-discharge switching threshold may be a threshold designed based on expert experience.
In an alternative embodiment, a net load prediction curve of the power system may first be calculated, where the net load prediction curve is the load prediction curve minus the renewable energy prediction curve; second, a charge-discharge switching threshold of the energy storage device can be designed based on expert experience; finally, the scheduling instruction (i.e., the reinforcement control instruction) of the energy storage device is calculated from the system net load values of different scenarios (such as the simulation environment and the actually measured environment) and the charge-discharge switching threshold, so that the energy storage device charges during net-load valleys and discharges during net-load peaks, promoting renewable energy consumption and peak shaving and valley filling of the electric load.
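The threshold rule above can be sketched as follows (the thresholds, power rating, and sign convention are hypothetical, not values from the text):

```python
def storage_instruction(net_load, low_threshold, high_threshold, power=50.0):
    """Rule-based storage dispatch from the system net load.

    Charge (negative here) in net-load valleys, discharge (positive) in
    peaks; the thresholds stand in for the expert-designed charge-discharge
    switching thresholds.
    """
    if net_load <= low_threshold:
        return -power          # valley: charge the storage device
    if net_load >= high_threshold:
        return power           # peak: discharge the storage device
    return 0.0                 # otherwise: idle

# Net load = load forecast minus renewable forecast, evaluated period by period.
profile = [120, 300, 520]      # MW, hypothetical net-load values
plan = [storage_instruction(p, low_threshold=150, high_threshold=450) for p in profile]
# plan == [-50.0, 0.0, 50.0]
```

The resulting plan charges in the valley period and discharges in the peak period, which is exactly the peak-shaving, valley-filling behavior the text describes.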
In the above embodiment of the present application, in a case where the history control instruction includes a teaching control instruction of the power generation apparatus, the method further includes: acquiring energy storage charging power, load power and renewable energy power in the historical state information; constructing an instruction generating function based on the stored energy charging power, the load power and the renewable energy power; and solving the instruction generating function to obtain the teaching control instruction.
In an alternative embodiment, the energy storage charging power, the load power of different power grid nodes, and the renewable energy power of different power grid nodes under different scenarios can be used as inputs; with the goal of minimizing the operating cost of the power system, and taking into account the power balance constraint of the power system, the power flow safety constraints of different branches, and the output range constraints of each generator set, an instruction generating function is constructed; an optimization solver is then called to solve it, outputting the scheduling instructions (i.e., teaching control instructions) of the different generator sets.
In the above embodiment of the present application, the second status information is obtained, including one of the following: collecting simulation data of the power equipment in a historical operation state to obtain second state information; and acquiring the historical operation state of the power equipment for measurement to obtain second state information.
In an alternative embodiment, the second state information may include, but is not limited to: simulation data collected from the power equipment in a historical operating state, and measurement data collected from the historical operating state of the power equipment.
As the share of renewable energy increases and the grid operating environment becomes more complex, a new knowledge-data-fusion scheduling mode for the power system is needed. To improve the training efficiency of reinforcement learning, the present application proposes a reinforcement learning method accelerated by teaching-learning assistance, and on that basis constructs a novel knowledge-data-fusion power scheduling mode. Fig. 5 is a schematic diagram of an alternative power system reinforcement learning scheduling method based on teaching-learning-accelerated training according to embodiment 1 of the present application. As shown in fig. 5, the system includes a reinforcement learning algorithm part, a power system environment part, a data storage, and a teaching learning part: the reinforcement learning algorithm sends its output actions (i.e., target control instructions) to the power system environment; the power system environment passes the actions to the data storage for storage; the data storage passes scenario case states to teaching learning and, after batch training, passes states, rewards, and safety out-of-limit conditions to the reinforcement learning algorithm; and teaching learning outputs teaching-learning guide instructions that help accelerate the training of the reinforcement learning algorithm.
The reinforcement learning algorithm consists of a multi-layer neural network. The grid state and equipment state of the power system are input into the network, whose parameters can be updated through policy training; after training, the network can output, moment by moment, scheduling instructions for various adjustable equipment across source, network, load, and storage. The power system environment includes the simulation environment provided by a simulator and the actual environment. Power systems are highly digitized and many parameters can be obtained through measurement and estimation, so the accuracy of the power system simulation environment keeps improving (the gap from the actual/measured environment keeps shrinking), which supports generating more power grid scenarios and training data through simulation. The data storage is used for storing training data. Teaching learning constrains the power grid and equipment models by minimizing system cost and applying expert experience; its specific contents include an expert knowledge base formed over a discrete space, imitation learning with a multi-layer neural network, and a teaching scheduling strategy over a continuous space.
FIG. 6 is a flowchart of an implementation of an alternative power system reinforcement learning scheduling method based on teaching learning acceleration training according to embodiment 1 of the present application, as shown in FIG. 6, the method includes the steps of:
The first step: initialize the different types of neural networks in teaching learning and reinforcement learning, and input the learning rate, update step length, and power system simulation environment parameters. Extract a large number of operating scenarios from the historical data of the power system, calculate the corresponding expert scheduling instruction for each operating scenario based on expert experience and optimization modeling, and generate a large number of discrete state-action mapping cases, forming the expert knowledge base. It should be noted that "action" is a reinforcement learning term while "expert scheduling instruction" belongs to the physical problem; they are essentially the same thing. The expert knowledge base consists of a large number of discrete scheduling cases and can be obtained by querying schedulers' historical scheduling records, consulting expert experience, solving an optimization model, and so on. The expert scheduling instructions include the scheduling instructions of the energy storage equipment and the scheduling instructions of the generator sets. The application provides a method for calculating the expert scheduling instructions corresponding to different operating scenarios based on expert experience and optimization modeling; the specific calculation is as follows:
(1) The energy storage scheduling instruction calculating method comprises the following steps: a system payload prediction curve (equal to the load prediction curve minus the renewable energy prediction curve) is calculated. And designing a charge-discharge switching threshold of the energy storage device based on expert experience. According to the system payload values of different scenes and the charge-discharge switching threshold values of the energy storage equipment, the scheduling instructions of the energy storage equipment are calculated, so that the energy storage equipment is charged in the low valley period of the system payload and discharged in the peak period of the system payload, and the renewable energy consumption and the peak load clipping and valley filling of the electric power load are promoted.
(2) The method for calculating the scheduling instructions of the generator sets: take the energy storage charging power, the load power of different power grid nodes, and the renewable energy power of different power grid nodes under different scenarios as inputs; with the goal of minimizing the operating cost of the power system, and taking into account the power balance constraint of the power system, the power flow safety constraints of different branches, and the output range constraints of the different generator sets, construct an optimization problem; call an optimization solver to solve it and output the scheduling instructions of the different generator sets.
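The text calls an optimization solver for this step; as a simplified stand-in (not the patent's formulation), the following merit-order sketch meets demand at low cost subject only to the output range constraints and power balance, omitting network flow constraints:

```python
def merit_order_dispatch(demand, units):
    """Dispatch generator set-points to meet demand cheaply.

    units: list of (cost_per_mw, p_min, p_max). Every unit runs at least at
    p_min (a simplified stand-in for the output range constraint); remaining
    demand is filled cheapest-first so total cost stays low. A real
    implementation would use an optimization solver and also enforce the
    branch power flow safety constraints.
    """
    setpoints = [p_min for _, p_min, _ in units]
    remaining = demand - sum(setpoints)
    if remaining < 0:
        raise ValueError("demand below total minimum output")
    for idx in sorted(range(len(units)), key=lambda i: units[i][0]):
        _, p_min, p_max = units[idx]
        add = min(remaining, p_max - p_min)
        setpoints[idx] += add
        remaining -= add
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return setpoints

units = [(30.0, 10.0, 100.0), (50.0, 5.0, 80.0)]  # hypothetical costs/limits
plan = merit_order_dispatch(120.0, units)
# plan == [100.0, 20.0]; the set-points sum to the 120 MW demand
```

Here the cheap unit is loaded to its upper limit first and the expensive unit covers the rest, which is the cost-minimizing ordering under these simplified constraints.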
The second step: knowledge distillation is performed on the expert knowledge base by means of teaching learning. A multi-layer neural network performs imitation learning on the state-action mapping relation in the expert knowledge base, and the network is trained with the objective of minimizing the deviation between the neural network output instruction and the expert knowledge base instruction, yielding the multi-layer neural network corresponding to the teaching learning scheduling strategy.
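This imitation training can be illustrated with a deliberately minimal sketch that fits a linear model, in place of the multi-layer neural network, to expert (state, action) cases by gradient descent on the squared deviation; all names and hyperparameters are assumptions:

```python
def imitate(cases, lr=0.05, epochs=2000):
    """Fit y ~= w*x + b to expert (state, action) cases by minimizing
    the mean squared deviation between the model output instruction
    and the expert instruction (the training objective of step two)."""
    w, b = 0.0, 0.0
    n = len(cases)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in cases:
            err = (w * x + b) - y       # model output minus expert action
            gw += 2 * err * x / n
            gb += 2 * err / n
        w -= lr * gw                    # gradient descent update
        b -= lr * gb
    return w, b
```

The same loop shape applies to a real multi-layer network; only the model and its gradient change.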
The third step: the power system scheduling strategy is trained by a reinforcement learning method according to the power system simulation environment or the actually measured state of the power system, with teaching learning providing scheduling reference instructions for reinforcement learning.
The training method of the reinforcement learning model is as follows:
(1) The reinforcement learning reward function R (the training goal) consists of three parts: the operation cost C_op of the power system, the branch power out-of-limit penalty C_lim, and the deviation C_dev between the reinforcement learning output instruction and the teaching learning guide instruction, i.e. R = -(C_op + C_lim + ω·C_dev). In the initial stage of training, the instruction given by teaching learning should be referred to in order to improve the convergence rate of reinforcement learning, so the weight coefficient ω corresponding to the teaching instruction deviation is set to a larger value to push the reinforcement learning scheduling strategy quickly toward the optimum; as training proceeds, this weight coefficient is gradually reduced, the decision advantage of reinforcement learning is fully exerted, and the long-term benefit of the scheduling strategy is improved. The update of the weight coefficient ω is accordingly designed as:

ω(k+1) = ω(k) - Δ0 · λ^k

wherein λ is the attenuation coefficient of the instruction weight coefficient ω; ω(0) is the initial value of the teaching instruction weight coefficient; Δ0 is the initial update step of the teaching instruction weight coefficient.
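One plausible reading of this weight schedule, combining the initial value, the initial update step and the attenuation coefficient named above, is sketched below; the exact functional form and the zero floor are assumptions, and all numeric defaults are hypothetical:

```python
def guide_weight(k, w0=1.0, step0=0.05, decay=0.9):
    """Weight on the teaching-guidance deviation term after k updates.

    Assumed schedule: w(i+1) = w(i) - step0 * decay**i, starting from
    w0, floored at zero so the deviation penalty never flips sign.
    The weight starts large (trust the teacher) and shrinks over
    training, matching the description in the text.
    """
    w = w0
    for i in range(k):
        w = max(w - step0 * decay**i, 0.0)
    return w
```

With these defaults the weight decays geometrically toward w0 - step0/(1 - decay) = 0.5 rather than to zero, so step0 and decay would be tuned to reach the desired final emphasis on reinforcement learning.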
(2) The state space of reinforcement learning comprises the power generation and power consumption of different types of adjustable equipment at the previous moment, the voltages at different nodes, the current energy state of the energy storage equipment, the predicted power of different renewable energy units, the predicted power of the loads at different nodes, the connection states of the transmission lines, the active and reactive loads at different nodes, and a time label. The action space of reinforcement learning comprises the power generation scheduling instruction of each generator set (comprising conventional sets and renewable energy sets) and the charge/discharge power of the energy storage equipment.
(3) In the reinforcement learning training process, two types of neural networks, namely a policy network and a Q network, are constructed, as shown in fig. 3 and fig. 4. The Q network is used to evaluate the future reward value corresponding to taking action a_t in state S_t; the policy network outputs an action a_t (a scheduling instruction) according to the current system state S_t so as to achieve a larger future reward. The Q network takes the current state of the system and the reinforcement learning output action as input and the Q value (representing the future reward) as output, and evaluates the future reward of the scheduling strategy with the objective of minimizing the soft Bellman residual; the policy network takes the current state of the system as input and the mean and variance of the action as output, and optimizes the scheduling strategy with the objective of minimizing the Kullback-Leibler divergence.
Fig. 7 is an optional reinforcement learning and power system interaction structure diagram according to embodiment 1 of the present application. As shown in fig. 7, the power grid regulation center (the reinforcement learning agent) continuously interacts with the power grid and the adjustable devices in the training environment by sending actions (i.e., scheduling instructions) in real time, and the power grid and the adjustable devices in the training environment feed states (i.e., system states) back to the power grid regulation center in real time. A sample space (comprising the current state, the current action, the current reward and the state at the next moment) is thereby generated and updated; the different types of neural networks in reinforcement learning are trained on the sample data of this sample space, and their parameters are dynamically updated, realizing continuous optimization of the scheduling strategy.
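The interaction loop of fig. 7 can be sketched as follows; `env_step` and `policy` stand in for the grid simulator and the policy network, and all names and values are illustrative assumptions rather than the application's implementation:

```python
from collections import deque

def interact(env_step, policy, steps=10, buffer_size=1000):
    """Collect (state, action, reward, next_state) samples by letting a
    dispatch agent interact with a (simulated) grid environment.

    env_step(state, action) -> (reward, next_state) plays the grid;
    policy(state) -> action plays the regulation center. The deque is
    the sample space from which the networks would later be trained.
    """
    buffer = deque(maxlen=buffer_size)          # bounded sample space
    state = 0.0
    for _ in range(steps):
        action = policy(state)                  # send scheduling instruction
        reward, next_state = env_step(state, action)  # environment feedback
        buffer.append((state, action, reward, next_state))
        state = next_state
    return list(buffer)
```

Training would then repeatedly sample minibatches from this buffer to update the Q network and policy network, which is omitted here.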
The application provides a power system reinforcement learning scheduling method based on teaching learning acceleration training, which is used for solving the scheduling problem of a complex power system. Compared with the prior art, the method has the advantages of high decision efficiency, high convergence speed and low dependence on environmental models and parameters. The specific contents are as follows:
1. the decision efficiency is high: the method and the device fully utilize the rapid decision advantage of reinforcement learning, and compared with an optimization method considering safety constraint, the method and the device do not depend on an accurate power system model any more, and the decision efficiency is higher.
2. The convergence speed is high: the present application adopts expert knowledge and a teaching learning method to guide the reinforcement learning output actions, prompting the reinforcement learning algorithm to converge quickly toward the optimum; compared with the traditional reinforcement learning method, the convergence speed is higher.
3. The degree of dependence on environmental models and parameters is low: the method adopts a knowledge-data fusion reinforcement learning architecture, takes a scheduling instruction as the action quantity of reinforcement learning, takes the running environment of the electric power system as the exploration object of reinforcement learning, and dynamically adjusts and updates the scheduling strategy according to the actual feedback state of the electric power system. The training and optimization of the scheduling strategy can be completed through continuous interaction with the environment, and compared with the existing products and technologies, the method has the advantages that the dependence on the power system model and parameters is greatly reduced.
The key technical innovation points of the application are listed as follows:
(1) A teaching learning scheduling method based on expert knowledge distillation is provided: the limited, discrete scheduling cases in the expert knowledge base are distilled into a multi-layer neural network, which can then output continuous guiding instructions for arbitrary states, so that scheduling experience and knowledge can be provided even for scenes lacking expert experience.
(2) Compared with the existing scheduling method, the proposed method has higher decision efficiency compared with the optimization method and has higher convergence speed compared with the traditional reinforcement learning method.
(3) The method realizes the fusion of expert knowledge and a data driving method, and fully exerts respective advantages of teaching learning and reinforcement learning. In the initial stage of training, training acceleration is carried out on reinforcement learning by using expert knowledge and teaching learning guiding instructions, and rapid optimization of reinforcement learning algorithm is promoted. With the continuous deep training, the bottleneck of expert experience and knowledge model can be further broken through by combining environmental feedback and mass data.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus a necessary general hardware platform, but that it may also be implemented by means of hardware. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method described in the embodiments of the present application.
Example 2
In accordance with embodiments of the present application, there is also provided an embodiment of a training method of an instruction inference model, it being noted that the steps illustrated in the flowchart of the figures may be performed in a computer system, such as a set of computer executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different from that illustrated herein.
The present application provides a training method of an instruction reasoning model as shown in fig. 8. Fig. 8 is a flowchart of a training method of an instruction reasoning model according to embodiment 2 of the present application; as shown in fig. 8, the method may include the steps of:
step S802, a preset mapping relation corresponding to the power equipment is obtained, wherein the preset mapping relation is used for representing the mapping relation between different state information and different control instructions;
step S804, training an initial instruction reasoning model by using a preset mapping relation and environment feedback data to obtain an enhanced instruction reasoning model, wherein the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located;
in an optional embodiment, when the instruction inference model needs to be trained, a preset mapping relation corresponding to the power equipment can be obtained first, wherein the preset mapping relation is used for representing mapping relations between different state information and different control instructions; and training a preset mapping relation and environment feedback data to obtain an enhanced instruction reasoning model, wherein the environment feedback data is used for representing data obtained based on interaction with the environment where the power equipment is located.
Example 3
In accordance with the embodiments of the present application, there is also provided an embodiment of a device control method, it being noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
The present application provides a device control method as shown in fig. 9. Fig. 9 is a flowchart of a device control method according to embodiment 3 of the present application, as shown in fig. 9, which may include the steps of:
step S902, collecting measurement data of equipment to be controlled in an operation state to obtain first state information of the equipment to be controlled;
step S904, mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the equipment to be controlled, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relations between the state information of different equipment and the control instructions, and the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located;
Step S906, triggering a target control command to control the device to be controlled.
The device to be controlled may be a device that needs to perform power scheduling in the power device, and may be a plurality of devices or one device, and the specific number is not limited in this embodiment.
In an alternative embodiment, when the device to be controlled in the power equipment needs to be controlled, measurement data of the device to be controlled in the operation state can be collected first to obtain the first state information of the device to be controlled; secondly, the first state information is mapped by using the enhanced instruction reasoning model to obtain the target control instruction of the device to be controlled, wherein the enhanced instruction reasoning model is trained by using the preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relations between the state information of different equipment and the control instructions, and the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located.
Example 4
In accordance with the embodiments of the present application, there is also provided an embodiment of a device control method, it being noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
The present application provides a control method of an electrical device as shown in fig. 10. Fig. 10 is a flowchart of a control method of an electric power apparatus according to embodiment 4 of the present application, as shown in fig. 10, the method including the steps of:
Step S1002, in response to an input instruction acting on an operation interface, displaying first state information of the power equipment on the operation interface, wherein the first state information is obtained by measuring an operation state of the power equipment;
step S1004, a command is generated in response to a command acting on an operation interface, and a target control command of the power equipment is displayed on the operation interface, wherein the target control command is obtained by mapping state information by using an enhanced command inference model, the enhanced command inference model is obtained by training by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing mapping relations between different state information and different control commands.
Fig. 11 is a schematic diagram of an operation interface of an alternative control method according to embodiment 4 of the present application, as shown in fig. 11, including: the display area is used for displaying state information of the power equipment and a target control instruction of the power equipment; an input instruction input area for receiving an input instruction input by a user; the instruction generation instruction input area is used for receiving an instruction generation instruction input by a user.
In an alternative embodiment, in response to an input instruction acted on the operation interface by a user, first state information of the power equipment can be displayed on the operation interface, wherein the first state information is obtained by measuring the operation state of the power equipment; and then, responding to an instruction generation instruction acted on the operation interface by a user, and displaying a target control instruction of the power equipment on the operation interface, wherein the target control instruction is obtained by mapping state information by utilizing an enhanced instruction reasoning model, the enhanced instruction reasoning model is obtained by training by utilizing a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing mapping relations between different state information and different control instructions.
Example 5
According to embodiments of the present application, there is also provided an embodiment of a control method of an electrical device, it being noted that the steps shown in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in an order different from that shown or described herein.
The present application provides a control method of an electrical device as shown in fig. 12. Fig. 12 is a flowchart of a control method of an electric power apparatus according to embodiment 5 of the present application, as shown in fig. 12, the method including the steps of:
Step S1202, acquiring first state information of the power equipment by calling an enhancement interface, wherein the enhancement interface comprises enhancement parameters, the parameter values of the enhancement parameters are the first state information, and the first state information is acquired by acquiring measurement data of the power equipment in an operation state;
step S1204, mapping the state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing mapping relations between different state information and different control instructions;
step S1206, outputting a target control instruction by calling a teaching interface, wherein the teaching interface comprises teaching parameters, and parameter values of the teaching parameters are the target control instruction.
The enhanced interface may be an interface for the enhanced instruction inference model to obtain the first state information from the power device. The teaching interface may be an interface for outputting the target control command to the power equipment by the enhanced command inference model.
In an alternative embodiment, first state information of the power equipment can be obtained by calling an enhancement interface, wherein the enhancement interface comprises enhancement parameters, parameter values of the enhancement parameters are the first state information, and the first state information is obtained by collecting measurement data of the power equipment in an operation state; secondly, mapping the state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing mapping relations between different state information and different control instructions; and outputting a target control instruction to the power equipment by calling a teaching interface, wherein the teaching interface comprises teaching parameters, and parameter values of the teaching parameters are the target control instruction.
Example 6
According to an embodiment of the present application, there is also provided an apparatus control device for implementing the control method of an electric apparatus described above, and fig. 13 is a control device of an electric apparatus according to embodiment 1 of the present application, as shown in fig. 13, the device including: an acquisition module 1302, a first processing module 1304, and a control module 1306.
The acquisition module is used for acquiring measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment; the first processing module is used for mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the target control instruction comprises: a power generation scheduling instruction of the power generation equipment and charge and discharge power of the energy storage equipment; the control module is used for triggering the target control instruction to control the power equipment.
It should be noted that the above-mentioned acquisition module 1302, first processing module 1304, and control module 1306 correspond to steps S202 to S206 in embodiment 1; the three modules implement the same examples and application scenarios as the corresponding steps, but are not limited to the disclosure of embodiment 1 above. The above-mentioned modules or units may be hardware components, or software components stored in a memory (for example, the memory 104) and processed by one or more processors (for example, the processors 102a, 102b, ..., 102n); the above modules may also be part of the apparatus and may be executed in the computer terminal 10 provided in embodiment 1.
In the above embodiments of the present application, the apparatus further includes: the device comprises a first acquisition module, a second processing module, a third processing module and an adjustment module.
The first acquisition module is used for acquiring second state information of the power equipment and a teaching instruction reasoning model, wherein the teaching instruction reasoning model is trained by using the preset mapping relation; the second processing module is used for mapping the second state information by using the teaching instruction reasoning model to obtain a teaching control instruction; the third processing module is used for mapping the second state information by using the initial instruction reasoning model to obtain an enhanced control instruction, wherein the environment feedback data is feedback data obtained by controlling the power equipment based on the enhanced control instruction; the adjustment module is used for adjusting the model parameters of the initial instruction reasoning model based on the teaching control instruction, the enhanced control instruction and the environment feedback data to obtain the enhanced instruction reasoning model.
In the above embodiments of the present application, the adjusting module includes: the device comprises a first determining unit, a first acquiring unit and an adjusting unit.
The first determining unit is used for determining a first loss function value and a second loss function value based on the environment feedback data, wherein the first loss function value is used for representing the operation cost of the power equipment when the power equipment executes the enhanced control instruction, and the second loss function value is used for representing the probability that the power on a branch connected with the power equipment exceeds a preset power when the power equipment executes the enhanced control instruction; the first acquisition unit is used for acquiring the deviation between the enhanced control instruction and the teaching control instruction to obtain a third loss function value; the adjusting unit is used for adjusting the model parameters based on the first loss function value, the second loss function value and the third loss function value to obtain the enhanced instruction reasoning model.
In the above embodiments of the present application, the adjusting unit includes: the device comprises a first acquisition subunit, a second acquisition subunit, a third acquisition subunit and an adjustment subunit.
The first acquisition subunit is used for acquiring a target weight coefficient corresponding to the third loss function value; the second acquisition subunit is used for acquiring the product of the target weight coefficient and the third loss function value to obtain a fourth loss function value; the third acquisition subunit is used for acquiring the sum of the first loss function value, the second loss function value and the fourth loss function value to obtain a target loss function value; the adjusting subunit is used for adjusting the model parameters based on the target loss function value to obtain the enhanced instruction reasoning model.
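As a minimal sketch of the loss combination described above (assuming scalar loss values; all function and parameter names are illustrative and not taken from the original):

```python
def target_loss(cost_loss, overload_loss, deviation_loss, weight):
    """Combine the three loss values described above.

    cost_loss      -- first loss value: operating cost under the enhanced instruction
    overload_loss  -- second loss value: probability of branch power exceeding the limit
    deviation_loss -- third loss value: deviation between enhanced and teaching instructions
    weight         -- target weight coefficient applied to the deviation term
    """
    fourth_loss = weight * deviation_loss            # fourth loss value
    return cost_loss + overload_loss + fourth_loss   # target loss value
```

The target loss value would then drive a parameter update of the initial instruction reasoning model in the usual gradient-based fashion.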
In the above embodiment of the present application, the first obtaining subunit is further configured to: determine the current iteration count; and update the initial weight coefficient of the third loss function value based on the current iteration count, a preset attenuation coefficient and a preset update step size to obtain the target weight coefficient.
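The exact update formula is not given in the text; one plausible reading, in which the initial weight is multiplied by the attenuation coefficient once per preset number of iterations, could be sketched as follows (all names are assumptions):

```python
def update_weight(initial_weight, iteration, decay, step):
    """Illustrative decay schedule for the target weight coefficient:
    apply the attenuation coefficient `decay` once every `step` iterations,
    so the deviation (imitation) term fades as training progresses."""
    return initial_weight * decay ** (iteration // step)
```

Decaying the weight this way lets the teaching control instruction dominate early training and the environment feedback dominate later, which matches the motivation stated in the embodiments.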
In the above embodiments of the present application, the apparatus further includes: the device comprises a second acquisition module, a generation module and a construction module.
The second obtaining module is configured to obtain historical state information, where the historical state information at least includes: status information of different devices over a historical period of time; the generation module is used for generating a history control instruction corresponding to the history state information based on the history state information; the construction module is used for constructing a preset mapping relation based on the historical state information and the historical control instruction.
In the foregoing embodiments of the present application, in a case where the history control instruction includes an enhanced control instruction of the energy storage device, the generating module includes: a second determination unit and a generation unit.
The second determining unit is used for determining a charge-discharge switching threshold of the energy storage device; the generation unit is used for generating the historical control instruction based on the net load values of the different devices and the charge-discharge switching threshold.
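A minimal sketch of such a threshold rule (the sign convention and all names are assumptions, not from the original): charge the energy storage device when the net load is below the switching threshold, and discharge it when the net load is above.

```python
def storage_instruction(net_load, threshold, max_power):
    """Illustrative charge-discharge rule for the energy storage device.
    Returns a positive value for discharge power and a negative value for
    charge power, both clamped to the device's maximum power."""
    if net_load > threshold:
        # Net load exceeds the switching threshold: discharge to cover the excess.
        return min(net_load - threshold, max_power)
    # Net load below the threshold: absorb the surplus by charging.
    return -min(threshold - net_load, max_power)
```

Applying this rule to each historical net load value yields one historical control instruction per time step.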
In the above embodiment of the present application, in a case where the history control instruction includes a teaching control instruction of the power generation apparatus, the generating module further includes: the system comprises a second acquisition unit, a construction unit and a solving unit.
The second acquisition unit is used for acquiring energy storage charging power, load power and renewable energy power in the historical state information; the construction unit is used for constructing an instruction generating function based on the energy storage charging power, the load power and the renewable energy source power; and the solving unit is used for solving the instruction generating function to obtain the teaching control instruction.
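The instruction generating function itself is not spelled out; a plausible power-balance sketch (illustrative names, assuming the generation instruction must cover load plus storage charging net of renewable output) is:

```python
def generation_dispatch(storage_charge_power, load_power, renewable_power):
    """Illustrative power-balance form of the instruction generating function:
    the power generation scheduling instruction equals load plus energy storage
    charging power, minus renewable output, clamped at zero (generation
    cannot be negative)."""
    return max(load_power + storage_charge_power - renewable_power, 0.0)
```

Solving the function per historical time step produces the teaching control instructions used to train the teaching instruction reasoning model.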
In the foregoing embodiment of the present application, the first obtaining module includes: a first acquisition unit and a second acquisition unit.
The first acquisition unit is used for collecting simulation data of the power equipment in a historical operation state to obtain the second state information; the second acquisition unit is used for measuring the power equipment in a historical operation state to obtain the second state information.
Example 7
According to an embodiment of the present application, there is further provided a training apparatus for an instruction reasoning model, for implementing the above training method of an instruction reasoning model. Fig. 14 shows a training apparatus for an instruction reasoning model according to embodiment 2 of the present application; as shown in fig. 14, the apparatus includes: an acquisition module 1402 and a training module 1404.
The acquisition module is used for acquiring a preset mapping relation corresponding to the power equipment, wherein the preset mapping relation is used for representing the mapping relation between different state information and different control instructions; the training module is used for training an initial instruction reasoning model by using the preset mapping relation and environment feedback data to obtain an enhanced instruction reasoning model, wherein the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located.
It should be noted that the acquisition module 1402 and the training module 1404 correspond to steps S802 to S804 in embodiment 2; the two modules share the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the foregoing embodiment. It should also be noted that the above-mentioned modules or units may be hardware components, or software components stored in a memory (for example, the memory 104) and processed by one or more processors (for example, the processors 102a, 102b, …, 102n), or the modules may be part of the apparatus and executed in the computer terminal 10 provided in the first embodiment.
Example 8
There is also provided, according to an embodiment of the present application, a device control apparatus for implementing the above device control method. Fig. 15 shows a device control apparatus according to embodiment 3 of the present application; as shown in fig. 15, the apparatus includes: an acquisition module 1502, a processing module 1504, and a control module 1506.
The acquisition module is used for acquiring measurement data of the equipment to be controlled in an operating state to obtain first state information of the equipment to be controlled; the processing module is used for mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the equipment to be controlled, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between the state information of different devices and the control instructions, and the environment feedback data is used for representing data obtained by interaction with the environment where the equipment is located; the control module is used for triggering the target control instruction to control the equipment to be controlled.
It should be noted that the above-mentioned acquisition module 1502, processing module 1504 and control module 1506 correspond to steps S902 to S906 in embodiment 3; the three modules share the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the foregoing embodiment. It should also be noted that the above-mentioned modules or units may be hardware components, or software components stored in a memory (for example, the memory 104) and processed by one or more processors (for example, the processors 102a, 102b, …, 102n), or the modules may be part of the apparatus and executed in the computer terminal 10 provided in the first embodiment.
Example 9
According to an embodiment of the present application, there is also provided a control device of power equipment, for implementing the above control method of power equipment. Fig. 16 shows a control device of power equipment according to embodiment 4 of the present application; as shown in fig. 16, the device includes: a first display module 1602 and a second display module 1604.
The first display module is used for responding to an input instruction acted on the operation interface and displaying first state information of the power equipment on the operation interface, wherein the first state information is obtained by measuring the running state of the power equipment; the second display module is used for responding to an instruction generation instruction acting on the operation interface and displaying a target control instruction of the power equipment on the operation interface, wherein the target control instruction is obtained by mapping state information by utilizing an enhanced instruction reasoning model, the enhanced instruction reasoning model is obtained by training by utilizing a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing mapping relations between different state information and different control instructions.
It should be noted that the first display module 1602 and the second display module 1604 correspond to steps S1002 to S1004 in embodiment 4; the two modules share the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the foregoing embodiment. It should also be noted that the above-mentioned modules or units may be hardware components, or software components stored in a memory (for example, the memory 104) and processed by one or more processors (for example, the processors 102a, 102b, …, 102n), or the modules may be part of the apparatus and executed in the computer terminal 10 provided in the first embodiment.
Example 10
According to an embodiment of the present application, there is also provided a control device of power equipment, for implementing the above control method of power equipment. Fig. 17 shows a control device of power equipment according to embodiment 5 of the present application; as shown in fig. 17, the device includes: an acquisition module 1702, a processing module 1704, and an output module 1706.
The acquisition module is used for acquiring first state information of the power equipment by calling an enhancement interface, wherein the enhancement interface includes an enhancement parameter, the parameter value of the enhancement parameter is the first state information, and the first state information is obtained by collecting measurement data of the power equipment in an operating state; the processing module is used for mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the preset mapping relation is used for representing the mapping relation between different state information and different control instructions; the output module is used for outputting the target control instruction by calling a teaching interface, wherein the teaching interface includes a teaching parameter, and the parameter value of the teaching parameter is the target control instruction.
It should be noted that the above-mentioned acquisition module 1702, processing module 1704 and output module 1706 correspond to steps S1202 to S1206 in embodiment 5; the three modules share the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the foregoing embodiment. It should also be noted that the above-mentioned modules or units may be hardware components, or software components stored in a memory (for example, the memory 104) and processed by one or more processors (for example, the processors 102a, 102b, …, 102n), or the modules may be part of the apparatus and executed in the computer terminal 10 provided in the first embodiment.
Example 11
Embodiments of the present application may provide a computer device, which may be any one of a group of computer terminals. Alternatively, in the present embodiment, the above-mentioned computer device may be replaced with a terminal device such as a mobile terminal.
Alternatively, in this embodiment, the above-mentioned computer device may be located in at least one network device among a plurality of network devices of the computer network.
In this embodiment, the above-described computer device may execute the program code of the following steps in the device control method: collecting measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the target control instruction comprises: a power generation scheduling instruction of the power generation equipment and charge and discharge power of the energy storage equipment; and triggering the target control instruction to control the power equipment.
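The three steps above can be sketched as a single control cycle (all callables are placeholders for the measurement, model inference, and triggering mechanisms; nothing here is taken from the original implementation):

```python
def control_step(measure, infer, trigger):
    """One control cycle: collect measurement data, map the resulting state
    information to a target control instruction with the enhanced instruction
    reasoning model, then trigger the instruction on the power equipment."""
    state = measure()             # first state information of the power equipment
    instruction = infer(state)    # enhanced instruction reasoning model mapping
    trigger(instruction)          # trigger the target control instruction
    return instruction
```

In deployment, `measure` would read telemetry of the power generation and charge-discharge states, `infer` would wrap the trained model, and `trigger` would dispatch the generation scheduling instruction and the charge-discharge power.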
Alternatively, fig. 18 is a block diagram of a computer device according to an embodiment of the present application. As shown in fig. 18, the computer device A may include: one or more processors 1802 (only one is shown), a memory 1804, a memory controller, and a peripheral interface, wherein the peripheral interface is coupled to a radio frequency module, an audio module, and a display.
The memory may be used to store software programs and modules, such as program instructions/modules corresponding to the device control methods and apparatuses in the embodiments of the present application; the processor executes the software programs and modules stored in the memory, thereby executing various functional applications and data processing, that is, implementing the device control method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located relative to the processor, which may be connected to the computer device A via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps: collecting measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the target control instruction comprises: a power generation scheduling instruction of the power generation equipment and charge and discharge power of the energy storage equipment; and triggering the target control instruction to control the power equipment.
Optionally, the above processor may further execute program code for: acquiring second state information of the power equipment and a teaching instruction reasoning model, wherein the teaching instruction reasoning model is obtained by training with the preset mapping relation; mapping the second state information by using the teaching instruction reasoning model to obtain a teaching control instruction; mapping the second state information by using the initial instruction reasoning model to obtain an enhanced control instruction, wherein the environment feedback data is feedback data obtained by controlling the power equipment based on the enhanced control instruction; and adjusting model parameters of the initial instruction reasoning model based on the teaching control instruction, the enhanced control instruction and the environment feedback data to obtain the enhanced instruction reasoning model.
Optionally, the above processor may further execute program code for: determining a first loss function value and a second loss function value based on the environment feedback data, wherein the first loss function value is used for representing the operating cost of the power equipment in the case that the power equipment executes the enhanced control instruction, and the second loss function value is used for representing the probability that the power of the branch connected with the power equipment exceeds a preset power in the case that the power equipment executes the enhanced control instruction; obtaining the deviation between the enhanced control instruction and the teaching control instruction to obtain a third loss function value; and adjusting the model parameters based on the first loss function value, the second loss function value and the third loss function value to obtain the enhanced instruction reasoning model.
Optionally, the above processor may further execute program code for: obtaining a target weight coefficient corresponding to the third loss function value; obtaining the product of the target weight coefficient and the third loss function value to obtain a fourth loss function value; obtaining the sum of the first loss function value, the second loss function value and the fourth loss function value to obtain a target loss function value; and adjusting the model parameters based on the target loss function value to obtain the enhanced instruction reasoning model.
Optionally, the above processor may further execute program code for: determining the current iteration count; and updating the initial weight coefficient of the third loss function value based on the current iteration count, a preset attenuation coefficient and a preset update step size to obtain the target weight coefficient.
Optionally, the above processor may further execute program code for: acquiring historical state information, wherein the historical state information at least comprises: status information of different devices over a historical period of time; generating a history control instruction corresponding to the history state information based on the history state information; and constructing a preset mapping relation based on the history state information and the history control instruction.
Optionally, the above processor may further execute program code for: determining a charge-discharge switching threshold of the energy storage device; and generating the historical control instruction based on the net load values of the different devices and the charge-discharge switching threshold.
Optionally, the above processor may further execute program code for: acquiring energy storage charging power, load power and renewable energy power in the historical state information; constructing an instruction generating function based on the stored energy charging power, the load power and the renewable energy power; and solving the instruction generating function to obtain the teaching control instruction.
Optionally, the above processor may further execute program code for one of the following steps: collecting simulation data of the power equipment in a historical operation state to obtain the second state information; or measuring the power equipment in a historical operation state to obtain the second state information.
By adopting the embodiment of the application, measurement data of the power equipment in the power generation state and the charging and discharging state are collected to obtain first state information of the power equipment; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the target control instruction includes: a power generation scheduling instruction of the power generation equipment and the charge and discharge power of the energy storage equipment; the target control instruction is then triggered to control the power equipment. It can be seen that, by training with the preset mapping relation and the environment feedback data, the enhanced instruction reasoning model can be obtained quickly, and because the enhanced instruction reasoning model can dynamically control the power equipment in real time and accurately, the power equipment can be controlled quickly. This achieves the technical effect of improving the execution efficiency and the training speed of controlling power equipment through a model, thereby solving the technical problems of low execution efficiency and low training speed when controlling power equipment through a model in the related art.
It will be appreciated by those of ordinary skill in the art that the configuration shown in the figures is merely illustrative, and that the computer device may be a terminal device such as a smart phone, a tablet computer, a palm computer, a mobile internet device (Mobile Internet Device, MID), or a PAD. Fig. 18 does not limit the structure of the above-described computer device. For example, the computer device A may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 18, or have a different configuration than shown in fig. 18.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing the relevant hardware of a terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
Example 12
Embodiments of the present application also provide a computer-readable storage medium. Alternatively, in this embodiment, the storage medium may be used to store the program code executed by the device control method provided in the first embodiment.
Alternatively, in this embodiment, the storage medium may be located in any one of the computer devices in the computer terminal group in the computer network, or in any one of the mobile terminals in the mobile terminal group.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: collecting measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment; the first state information is mapped by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with the environment where the power equipment is located, and the target control instruction comprises: a power generation scheduling instruction of the power generation equipment and charge and discharge power of the energy storage equipment; and triggering the target control instruction to control the power equipment.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: acquiring second state information of the power equipment and a teaching instruction reasoning model, wherein the teaching instruction reasoning model is obtained by training with the preset mapping relation; mapping the second state information by using the teaching instruction reasoning model to obtain a teaching control instruction; mapping the second state information by using the initial instruction reasoning model to obtain an enhanced control instruction, wherein the environment feedback data is feedback data obtained by controlling the power equipment based on the enhanced control instruction; and adjusting model parameters of the initial instruction reasoning model based on the teaching control instruction, the enhanced control instruction and the environment feedback data to obtain the enhanced instruction reasoning model.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: determining a first loss function value and a second loss function value based on the environment feedback data, wherein the first loss function value is used for representing the operating cost of the power equipment in the case that the power equipment executes the enhanced control instruction, and the second loss function value is used for representing the probability that the power of the branch connected with the power equipment exceeds a preset power in the case that the power equipment executes the enhanced control instruction; obtaining the deviation between the enhanced control instruction and the teaching control instruction to obtain a third loss function value; and adjusting the model parameters based on the first loss function value, the second loss function value and the third loss function value to obtain the enhanced instruction reasoning model.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: obtaining a target weight coefficient corresponding to the third loss function value; obtaining the product of the target weight coefficient and the third loss function value to obtain a fourth loss function value; obtaining the sum of the first loss function value, the second loss function value and the fourth loss function value to obtain a target loss function value; and adjusting the model parameters based on the target loss function value to obtain the enhanced instruction reasoning model.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: determining the current iteration count; and updating the initial weight coefficient of the third loss function value based on the current iteration count, a preset attenuation coefficient and a preset update step size to obtain the target weight coefficient.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: acquiring historical state information, wherein the historical state information at least comprises: status information of different devices over a historical period of time; generating a history control instruction corresponding to the history state information based on the history state information; and constructing a preset mapping relation based on the history state information and the history control instruction.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: determining a charge-discharge switching threshold of the energy storage device; and generating the historical control instruction based on the net load values of the different devices and the charge-discharge switching threshold.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing the steps of: acquiring energy storage charging power, load power and renewable energy power in the historical state information; constructing an instruction generating function based on the stored energy charging power, the load power and the renewable energy power; and solving the instruction generating function to obtain the teaching control instruction.
Optionally, in the present embodiment, the storage medium is further configured to store program code for performing one of the following steps: collecting simulation data of the power equipment in a historical operation state to obtain the second state information; or measuring the power equipment in a historical operation state to obtain the second state information.
The foregoing embodiment numbers of the present application are merely for description and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the description of each embodiment has its own emphasis; for a portion not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary; for example, the division of the units is merely a logical function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces, units or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The foregoing is merely a preferred embodiment of the present application. It should be noted that modifications and adaptations may be made by those skilled in the art without departing from the principles of the present application, and such modifications and adaptations are also intended to fall within the protection scope of the present application.

Claims (14)

1. A control method of an electric power apparatus, characterized by comprising:
collecting measurement data of the power equipment in a power generation state and a charging and discharging state to obtain first state information of the power equipment;
mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between different state information and different control instructions, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the target control instruction comprises: a power generation scheduling instruction for power generation equipment and a charge and discharge power for energy storage equipment;
and triggering the target control instruction to control the power equipment.
2. The method according to claim 1, wherein the method further comprises:
acquiring second state information of the power equipment and a teaching instruction reasoning model, wherein the teaching instruction reasoning model is obtained by training through the preset mapping relation;
mapping the second state information by using the teaching instruction reasoning model to obtain a teaching control instruction;
mapping the second state information by using an initial instruction reasoning model to obtain an enhanced control instruction, wherein the environment feedback data is feedback data obtained by controlling the power equipment based on the enhanced control instruction;
and adjusting model parameters of the initial instruction reasoning model based on the teaching control instruction, the enhanced control instruction and the environment feedback data to obtain the enhanced instruction reasoning model.
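As a hedged illustration rather than the patented implementation, the training flow of claim 2 can be sketched as follows: a teaching model trained from the preset mapping relation and an initial instruction reasoning model both map the same state, and the initial model is adjusted using environment feedback together with its deviation from the teaching output. The linear models, the toy feedback signal, and all names are assumptions:

```python
# Hypothetical sketch of the claim-2 training flow, not the patented
# implementation: a fixed teaching model (trained from the preset mapping
# relation) and an initial instruction reasoning model map the same state;
# the initial model is adjusted using environment feedback plus its
# deviation from the teaching control instruction.
import numpy as np

rng = np.random.default_rng(0)

w_teach = rng.normal(size=(4, 2))  # teaching model parameters (fixed)
w = rng.normal(size=(4, 2))        # initial model parameters (adjusted below)
w0 = w.copy()                      # kept only to measure training progress

def teaching_model(state):
    return state @ w_teach         # teaching control instruction

def initial_model(state):
    return state @ w               # enhanced control instruction

def environment_feedback(cmd):
    # stand-in for executing the instruction on the power equipment:
    # here, larger instructions incur a larger running cost
    return float(np.sum(cmd ** 2))

lr = 0.01
for _ in range(200):
    state = rng.normal(size=4)            # second state information
    teach_cmd = teaching_model(state)
    rl_cmd = initial_model(state)
    cost = environment_feedback(rl_cmd)   # environment feedback data
    # gradient of (running cost + deviation from teaching) w.r.t. w;
    # the deviation term pulls the learned policy toward the teacher
    grad = np.outer(state, 2 * rl_cmd + 2 * (rl_cmd - teach_cmd))
    w -= lr * grad
```

Regularizing reinforcement learning with a deviation-from-teacher term is one common way to stabilize early training; here the deviation plays the role of the third loss function value defined in claim 3.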
3. The method of claim 2, wherein adjusting the model parameters of the initial instruction reasoning model based on the teaching control instruction, the enhanced control instruction and the environment feedback data to obtain the enhanced instruction reasoning model comprises:
determining a first loss function value and a second loss function value based on the environment feedback data, wherein the first loss function value is used for representing the running cost of the power equipment when the power equipment executes the enhanced control instruction, and the second loss function value is used for representing the probability that the power of a branch connected with the power equipment exceeds a preset power when the power equipment executes the enhanced control instruction;
obtaining a deviation between the enhanced control instruction and the teaching control instruction to obtain a third loss function value;
and adjusting the model parameters based on the first loss function value, the second loss function value and the third loss function value to obtain the enhanced instruction reasoning model.
4. The method of claim 3, wherein adjusting the model parameters based on the first loss function value, the second loss function value and the third loss function value to obtain the enhanced instruction reasoning model comprises:
obtaining a target weight coefficient corresponding to the third loss function value;
obtaining the product of the target weight coefficient and the third loss function value to obtain a fourth loss function value;
obtaining the sum of the first loss function value, the second loss function value and the fourth loss function value to obtain a target loss function value;
and adjusting the model parameters based on the target loss function value to obtain the enhanced instruction reasoning model.
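Claim 4 combines the loss values arithmetically: the fourth loss function value is the target weight coefficient times the third, and the target loss function value is the sum of the first, second, and fourth values. A minimal sketch with made-up numbers (the function name and weight are illustrative):

```python
# Illustrative combination of the loss values described in claims 3 and 4.
def target_loss(first_loss, second_loss, third_loss, target_weight=0.5):
    """first_loss: running cost under the enhanced control instruction;
    second_loss: penalty for branch power exceeding the preset power;
    third_loss: deviation between enhanced and teaching control instructions;
    target_weight: weight coefficient corresponding to the third loss value."""
    fourth_loss = target_weight * third_loss       # fourth loss function value
    return first_loss + second_loss + fourth_loss  # target loss function value

# example with made-up values
loss = target_loss(first_loss=1.2, second_loss=0.05, third_loss=0.4)
```

The weight coefficient controls how strongly the model is held to the teaching instruction relative to the cost and overload terms.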
5. The method according to claim 2, wherein the method further comprises:
acquiring historical state information, wherein the historical state information at least comprises: state information of different devices in a historical time period;
generating a historical control instruction corresponding to the historical state information based on the historical state information;
and constructing the preset mapping relation based on the historical state information and the historical control instruction.
6. The method of claim 5, wherein, in a case where the historical control instruction includes an enhanced control instruction for an energy storage device, generating the historical control instruction corresponding to the historical state information based on the historical state information comprises:
determining a charge-discharge switching threshold of the energy storage device;
and generating the historical control instruction based on the load value of each device and the charge-discharge switching threshold.
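The threshold rule of claim 6 can be illustrated with a small sketch; the instruction encoding, threshold value, and function name are assumptions, not the patented implementation:

```python
# Hypothetical generation of a storage control instruction from a
# charge-discharge switching threshold (claim 6); values are made up.
def storage_instruction(load_value, switching_threshold):
    """Charge when the load is below the switching threshold (absorb surplus
    energy), otherwise discharge to support the load."""
    return "charge" if load_value < switching_threshold else "discharge"

instructions = [storage_instruction(v, switching_threshold=100.0)
                for v in (60.0, 100.0, 140.0)]
```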
7. The method according to claim 5, wherein, in a case where the historical control instruction includes a teaching control instruction for power generation equipment, the method further comprises:
acquiring energy storage charging power, load power and renewable energy power from the historical state information;
constructing an instruction generating function based on the energy storage charging power, the load power and the renewable energy power;
and solving the instruction generating function to obtain the teaching control instruction.
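One plausible reading of claim 7's instruction generating function is a power balance solved for the generator output; this sketch is an assumption, not the patented formulation:

```python
# Hypothetical power-balance reading of claim 7's instruction generating
# function: generation + renewable = load + storage charging, solved for
# the generator's teaching control instruction. Units and names assumed.
def generation_instruction(storage_charging_kw, load_kw, renewable_kw):
    required = load_kw + storage_charging_kw - renewable_kw
    return max(required, 0.0)  # a generator cannot absorb power

cmd = generation_instruction(storage_charging_kw=20.0, load_kw=150.0,
                             renewable_kw=80.0)
```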
8. The method of claim 2, wherein obtaining the second state information comprises one of the following:
collecting simulation data of the power equipment in a historical operation state to obtain the second state information;
and measuring the historical operation state of the power equipment to obtain the second state information.
9. A method for training an instruction reasoning model, comprising:
acquiring a preset mapping relation corresponding to the power equipment, wherein the preset mapping relation is used for representing the mapping relation between different state information and different control instructions;
training an initial instruction reasoning model by using the preset mapping relation and environment feedback data to obtain an enhanced instruction reasoning model, wherein the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located.
10. A device control method, characterized by comprising:
collecting measurement data of equipment to be controlled in an operation state to obtain first state information of the equipment to be controlled;
mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the equipment to be controlled, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the preset mapping relation is used for representing the mapping relation between the state information of different equipment and the control instructions, and the environment feedback data is used for representing data obtained by interaction with the environment where the equipment to be controlled is located;
and triggering the target control instruction to control the equipment to be controlled.
11. A control method of an electric power apparatus, characterized by comprising:
responding to an input instruction acting on an operation interface, and displaying first state information of the power equipment on the operation interface, wherein the first state information is obtained by measuring the operation state of the power equipment;
and responding to an instruction generation instruction acting on the operation interface, and displaying a target control instruction of the power equipment on the operation interface, wherein the target control instruction is obtained by mapping the first state information by using an enhanced instruction reasoning model, the enhanced instruction reasoning model is obtained by training with a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing the mapping relation between different state information and different control instructions.
12. A control method of an electric power apparatus, characterized by comprising:
acquiring first state information of the power equipment by calling a strengthening interface, wherein the strengthening interface comprises a strengthening parameter, a parameter value of the strengthening parameter is the first state information, and the first state information is obtained by collecting measurement data of the power equipment in an operation state;
mapping the first state information by using an enhanced instruction reasoning model to obtain a target control instruction of the power equipment, wherein the enhanced instruction reasoning model is trained by using a preset mapping relation and environment feedback data, the environment feedback data is used for representing data obtained by interaction with an environment where the power equipment is located, and the preset mapping relation is used for representing the mapping relation between different state information and different control instructions;
and outputting the target control instruction by calling a teaching interface, wherein the teaching interface comprises a teaching parameter, and a parameter value of the teaching parameter is the target control instruction.
13. An electronic device, comprising:
a memory storing an executable program;
a processor for executing the program, wherein the program, when run, performs the method of any one of claims 1 to 12.
14. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored executable program, wherein the executable program when run controls a device in which the computer readable storage medium is located to perform the method of any one of claims 1 to 12.
CN202310424102.5A 2023-04-17 2023-04-17 Control method of power equipment, equipment control method and electronic equipment Active CN116154771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310424102.5A CN116154771B (en) 2023-04-17 2023-04-17 Control method of power equipment, equipment control method and electronic equipment

Publications (2)

Publication Number Publication Date
CN116154771A true CN116154771A (en) 2023-05-23
CN116154771B (en) 2023-07-21

Family

ID=86352786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310424102.5A Active CN116154771B (en) 2023-04-17 2023-04-17 Control method of power equipment, equipment control method and electronic equipment

Country Status (1)

Country Link
CN (1) CN116154771B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117713153A (en) * 2023-12-08 2024-03-15 广州汇电云联数科能源有限公司 System and method for day-ahead market optimization scheduling

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255443A (en) * 2018-08-07 2019-01-22 阿里巴巴集团控股有限公司 The method and device of training deeply learning model
CN112862281A (en) * 2021-01-26 2021-05-28 中国电力科学研究院有限公司 Method, device, medium and electronic equipment for constructing scheduling model of comprehensive energy system
CN113541192A (en) * 2021-07-27 2021-10-22 重庆大学 Offshore wind farm reactive power-voltage coordination control method based on deep reinforcement learning
US20210356923A1 (en) * 2020-05-15 2021-11-18 Tsinghua University Power grid reactive voltage control method based on two-stage deep reinforcement learning
CN114880932A (en) * 2022-05-12 2022-08-09 中国电力科学研究院有限公司 Power grid operating environment simulation method, system, equipment and medium
CN115296306A (en) * 2022-07-12 2022-11-04 华电电力科学研究院有限公司 Thermal power plant frequency control method and system based on reinforcement learning algorithm
CN115542736A (en) * 2022-09-28 2022-12-30 阿里巴巴达摩院(杭州)科技有限公司 Device control method, computer-readable storage medium, and computer terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHONGKAI YI ET AL.: "An Improved Two-Stage Deep Reinforcement Learning Approach for Regulation Service Disaggregation in a Virtual Power Plant", 《IEEE TRANSACTIONS ON SMART GRID》, vol. 13, no. 4, pages 2844 - 2858, XP011912245, DOI: 10.1109/TSG.2022.3162828 *
SONG WEIYE ET AL.: "Self-evolving power smoothing control method for offshore wind power clusters based on deep reinforcement learning", 《ELECTRIC POWER》, vol. 56, no. 3 *

Also Published As

Publication number Publication date
CN116154771B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN112186799B (en) Distributed energy system autonomous control method and system based on deep reinforcement learning
CN112117760A (en) Micro-grid energy scheduling method based on double-Q-value network deep reinforcement learning
Jasmin et al. Reinforcement learning approaches to economic dispatch problem
CN116154771B (en) Control method of power equipment, equipment control method and electronic equipment
CN112329948A (en) Multi-agent strategy prediction method and device
CN110326008A (en) Machine learning is integrated into control system
CN109672795A (en) Call center resource management method and device, electronic equipment, storage medium
CN110781969B (en) Air conditioner air volume control method, device and medium based on deep reinforcement learning
CN112491094B (en) Hybrid-driven micro-grid energy management method, system and device
CN113361680A (en) Neural network architecture searching method, device, equipment and medium
CN113627533B (en) Power equipment overhaul decision generation method based on reinforcement learning
CN115542736B (en) Device control method, computer-readable storage medium, and computer terminal
CN111799820B (en) Double-layer intelligent hybrid zero-star cloud energy storage countermeasure regulation and control method for power system
CN109543879A (en) Load forecasting method and device neural network based
CN115953009B (en) Scheduling method of power system and training method of scheduling decision model
CN115525979B (en) Multi-time scale evaluation method and system for schedulable capacity of active power distribution network
CN112224083A (en) Electric vehicle charging method and device based on Internet of things
CN115268259A (en) PID control loop setting method and device
CN115629576A (en) Non-invasive flexible load aggregation characteristic identification and optimization method, device and equipment
CN115438588A (en) Temperature prediction method, system, equipment and storage medium of lithium battery
CN115360768A (en) Power scheduling method and device based on muzero and deep reinforcement learning and storage medium
CN116404745B (en) Intelligent regulation and control auxiliary decision-making system and method for power distribution network based on big data
CN112163709B (en) Method and device for electricity utilization promotion, storage medium, and electronic device
CN117808259A (en) Method and device for acquiring energy scheduling strategy
CN117876156B (en) Multi-task-based electric power Internet of things terminal monitoring method, electric power Internet of things terminal and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant