CN118131045A

CN118131045A - Mobile energy storage online decision method and device based on porous electrode aging model

Info

Publication number: CN118131045A
Application number: CN202410089917.7A
Authority: CN
Inventors: 何冠楠; 丁永康; 陈新江; 宋洁
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2024-01-22
Filing date: 2024-01-22
Publication date: 2024-06-04

Abstract

The invention relates to the technical field of data processing, and discloses a mobile energy storage online decision method and device based on a porous electrode aging model, wherein the method comprises the following steps: acquiring an initial state of health of a battery; determining initial state parameters at a first moment based on the initial state of health of a battery and a pre-trained mobile energy storage online decision model based on deep reinforcement learning; determining an action parameter at a first moment based on the initial state parameter and the decision model; if the working state is a charging state, determining the battery aging amount of the mobile energy storage of the next station position at the second moment based on the power corresponding to the first moment, the initial state parameter and a pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron, and updating the initial state parameter based on the decision model to obtain a first target state parameter; and if the working state is a discharge state or a holding state, updating the initial state parameters by using the decision model to obtain second target state parameters.

Description

Mobile energy storage online decision method and device based on porous electrode aging model

Technical Field

The invention relates to the technical field of data processing, in particular to a mobile energy storage online decision method and device based on a porous electrode aging model.

Background

Mobile energy storage is a novel energy storage technology with a more flexible deployment mode; by charging and discharging at different power grid nodes, it can perform energy arbitrage and relieve power grid congestion. However, battery aging in mobile energy storage is affected by factors such as charge and discharge rate and temperature, and aging degrades the functions of the mobile energy storage, for example through attenuation of energy capacity and power capacity, which in turn affects mobile energy storage operation decisions.

The existing energy storage technology is mainly oriented to fixed energy storage, for example, the capacity of an energy storage battery may gradually decrease along with the passage of time, and the capacity attenuation rate of the battery can be calculated by comparing the capacities of new and old batteries, so that the aging degree of the battery is estimated.

However, existing aging evaluation models for electrochemical energy storage are not suitable for mobile energy storage. First, existing aging evaluation models are mainly oriented to fixed energy storage, whereas the environment (such as temperature) and working conditions (such as charge and discharge power) faced by mobile energy storage are more complex, which makes battery aging evaluation more complex as well. Directly applying an existing fixed energy storage aging evaluation model in the mobile energy storage scheduling decision process causes estimation deviation of the battery aging condition and reduces the decision accuracy and benefit of mobile energy storage. Secondly, existing battery aging evaluation models are mostly linear aging models, mechanism models, or data-driven models. Linear aging models are too simple to accurately describe the nonlinear factors that influence energy storage aging. Mechanism models evaluate energy storage aging with high accuracy and good interpretability, but their modeling is complex and online calculation is difficult. Data-driven models do not need complex modeling, but they lack physical interpretation and have high data requirements. Thirdly, battery aging degrades the functions of mobile energy storage (such as attenuation of energy capacity, power capacity, and charge-discharge efficiency), thereby affecting the decision accuracy of mobile energy storage, yet current research rarely integrates battery aging evaluation into mobile energy storage decision optimization. Battery aging evaluation and mobile energy storage optimal scheduling need to be integrated in a unified decision framework to improve the accuracy of both battery aging evaluation and mobile energy storage decisions.

Disclosure of Invention

In view of the above, the invention provides a mobile energy storage online decision method and device based on a porous electrode aging model, so as to solve the problems that an existing fixed energy storage aging evaluation model is directly applied in a mobile energy storage scheduling decision process, estimation deviation of battery aging conditions is caused, and then decision accuracy and benefit are reduced.

In a first aspect, the present invention provides a mobile energy storage online decision method based on a porous electrode aging model, the method comprising: acquiring an initial state of health of a battery; based on the initial state of health of the battery, determining initial state parameters of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning; determining action parameters of the mobile energy storage at a first moment by utilizing a pre-trained mobile energy storage online decision model based on deep reinforcement learning based on initial state parameters; wherein, the action parameters include: the power, the working state and the next site position corresponding to the first moment; if the working state is a charging state, determining the battery aging amount of the mobile energy storage of the next station position at the second moment based on the power corresponding to the first moment, the initial state parameters and a pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron, and updating the initial state parameters based on the pre-trained mobile energy storage on-line decision model based on deep reinforcement learning to obtain a first target state parameter; and if the working state is a discharging state or a holding state, updating initial state parameters by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters.

In an alternative embodiment, the initial state parameters include: the remaining energy of the mobile energy storage at the first moment, the stock energy cost of the mobile energy storage at the first moment, the site marginal electricity price of the site at the first moment, the temperature of the site at the first moment, and the state of health of the mobile energy storage at the first moment; determining the battery aging amount of the mobile energy storage at the next site position at the second moment based on the power corresponding to the first moment, the initial state parameters and the pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron comprises: determining a termination remaining capacity based on the power corresponding to the first moment and the current remaining capacity; and determining the battery aging amount of the mobile energy storage at the next site position at the second moment by using the pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron, based on the power corresponding to the first moment, the current remaining capacity, the termination remaining capacity, the temperature of the site at the first moment, and the state of health of the mobile energy storage at the first moment.

In an alternative embodiment, the method further comprises: detecting whether the second moment meets the termination moment parameter; and if the second moment does not meet the termination moment parameter, repeating the steps from determining the initial state parameters of the mobile energy storage at the first moment by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning, through updating the initial state parameters by using that model to obtain the second target state parameters if the working state is the discharge state or the holding state, until the second moment meets the termination moment parameter.

In an alternative embodiment, the method further comprises: determining a target reward function based on the target state parameters by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning.

In an alternative embodiment, updating the initial state parameters based on a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the first target state parameters includes:

The first target state parameter is determined by the following formula:

Wherein $E_h$ is the remaining energy of the mobile energy storage at time $h$, $E_{h'}$ is the remaining energy of the mobile energy storage at time $h'$, $n$ is the current site position, $n'$ is the next site position, $h$ is the first moment, $h'$ is the second moment, $\mathrm{COST}_{E,h}$ is the stock energy cost of the mobile energy storage at time $h$, $\mathrm{COST}_{E,h'}$ is the stock energy cost of the mobile energy storage at time $h'$, $\lambda_{n,h}$ is the site marginal electricity price of the current site $n$ at time $h$, $\lambda_{n',h'}$ is the site marginal electricity price of the next site $n'$ at time $h'$, $\mathrm{TEMP}_{n,h}$ is the temperature of the current site $n$ at time $h$, $\mathrm{TEMP}_{n',h'}$ is the temperature of the next site $n'$ at time $h'$, $\mathrm{SOH}_h$ is the state of health of the mobile energy storage at time $h$, $\mathrm{SOH}_{h'}$ is the state of health of the mobile energy storage at time $h'$, the charging power term is the charging power of the mobile energy storage at the next site at time $h'$, LMP(·) is the site marginal electricity price updating function, Temperature(·) is the ambient temperature updating function, soh(·) is the state-of-health updating function of the mobile energy storage battery, and $\mathrm{duration}_{n,n'}$ is the time distance between sites $n$ and $n'$.

In an alternative embodiment, updating the initial state parameters based on a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the second target state parameters includes: if the working state is a discharge state, determining a second target state parameter by the following formula:

Wherein the discharge power term is the discharge power of the mobile energy storage at the next site at the second moment, $E_h$ is the remaining energy of the mobile energy storage at time $h$, $E_{h'}$ is the remaining energy of the mobile energy storage at time $h'$, $n$ is the current site position, $n'$ is the next site position, $h$ is the first moment, $h'$ is the second moment, $\mathrm{COST}_{E,h}$ is the stock energy cost of the mobile energy storage at time $h$, $\mathrm{COST}_{E,h'}$ is the stock energy cost of the mobile energy storage at time $h'$, $\lambda_{n,h}$ is the site marginal electricity price of the current site $n$ at time $h$, $\lambda_{n',h'}$ is the site marginal electricity price of the next site $n'$ at time $h'$, $\mathrm{TEMP}_{n,h}$ is the temperature of the current site $n$ at time $h$, $\mathrm{TEMP}_{n',h'}$ is the temperature of the next site $n'$ at time $h'$, $\mathrm{SOH}_h$ is the state of health of the mobile energy storage at time $h$, $\mathrm{SOH}_{h'}$ is the state of health of the mobile energy storage at time $h'$, the charging power term is the charging power of the mobile energy storage at the next site at time $h'$, LMP(·) is the site marginal electricity price updating function, Temperature(·) is the ambient temperature updating function, soh(·) is the state-of-health updating function of the mobile energy storage battery, and $\mathrm{duration}_{n,n'}$ is the time distance between sites $n$ and $n'$;

If the operating state is a hold state, a second target state parameter is determined by the following formula:

Wherein Action (n) is the Action selection for the second site.

In an alternative embodiment, determining the target reward function based on the target state parameter using a pre-trained mobile energy storage online decision model based on deep reinforcement learning, comprises:

if the operating state is a charging state, determining a target rewarding function by using the following formula:

$r_h = -C^{TRA} - C^{DEG}$; wherein $C^{TRA}$ is the transportation cost of the mobile energy storage, and $C^{DEG}$ is the aging cost of the mobile energy storage;

if the working state is a discharge state, determining a target rewarding function by using the following formula:

Wherein $\lambda_{n',h'}$ is the node marginal electricity price of node $n'$ at time $h'$, the discharge power term is the discharge power of the mobile energy storage at node $n'$ at time $h'$, $\Delta h$ is the time interval, $\mathrm{COST}_{E,h'}$ is the stock energy cost of the mobile energy storage at time $h'$, and $E_h$ is the remaining energy of the mobile energy storage at time $h$.

If the operating state is a hold state, the target reward function is determined using the following formula:

$r_h = -C^{TRA} - C^{DEG}$.
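The charging-state and holding-state reward stated above, r_h = -C^TRA - C^DEG, can be illustrated with a minimal sketch. The function name, argument names, and numeric values are hypothetical and chosen only for illustration. The discharge-state reward is not sketched because its formula does not appear in the text.

```python
def reward_charge_or_hold(transport_cost: float, degradation_cost: float) -> float:
    """Reward r_h = -C^TRA - C^DEG for the charging and holding states."""
    return -transport_cost - degradation_cost


# Hypothetical costs for one decision step (units are whatever the operator uses).
r_h = reward_charge_or_hold(transport_cost=12.0, degradation_cost=3.5)
print(r_h)
```

Because both terms are costs, the reward in these two states is never positive when the costs are non-negative; any profit would have to come from discharge steps.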

In a second aspect, the invention provides a mobile energy storage online decision device based on a porous electrode aging model, which comprises an acquisition module for acquiring an initial health state of a battery; the first determining module is used for determining initial state parameters of the mobile energy storage at a first moment by utilizing a pre-trained mobile energy storage on-line decision model based on deep reinforcement learning based on the initial state of the battery; the second determining module is used for determining action parameters of the mobile energy storage at the first moment by utilizing a pre-trained mobile energy storage online decision model based on deep reinforcement learning based on initial state parameters; wherein, the action parameters include: the power, the working state and the next site position corresponding to the first moment; the third determining module is used for determining the battery aging amount of the mobile energy storage of the next station position at the second moment based on the power corresponding to the first moment and a mobile energy storage aging evaluation model of the multi-layer perceptron trained in advance if the working state is a charging state, and updating initial state parameters based on a mobile energy storage online decision model trained in advance based on deep reinforcement learning to obtain first target state parameters; and the fourth determining module is used for updating the initial state parameters by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters if the working state is a discharging state or a holding state.

In a third aspect, the present invention provides a computer device comprising: a memory and a processor that are communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the mobile energy storage online decision method based on the porous electrode aging model according to the first aspect or any one of its corresponding embodiments.

In a fourth aspect, the present invention provides a computer readable storage medium, on which computer instructions are stored, the computer instructions being configured to cause a computer to perform the mobile energy storage online decision method based on the porous electrode aging model according to the first aspect or any one of the embodiments corresponding thereto.

The invention provides a mobile energy storage online decision method and device based on a porous electrode aging model, wherein the method comprises the following steps: acquiring an initial state of health of a battery; based on the initial state of health of the battery, determining initial state parameters of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning; determining action parameters of the mobile energy storage at a first moment by utilizing a pre-trained mobile energy storage online decision model based on deep reinforcement learning based on initial state parameters; wherein, the action parameters include: the power, the working state and the next site position corresponding to the first moment; if the working state is a charging state, determining the battery aging amount of the mobile energy storage of the next station position at the second moment based on the power corresponding to the first moment and a mobile energy storage aging evaluation model of the multi-layer perceptron trained in advance, and updating initial state parameters based on a mobile energy storage online decision model based on deep reinforcement learning trained in advance to obtain first target state parameters; and if the working state is a discharging state or a holding state, updating initial state parameters by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters.

According to the invention, through the initial state of health of the battery and the pre-trained mobile energy storage online decision model based on the deep reinforcement learning, the initial state parameter of the mobile energy storage at the first moment and the action parameter of the mobile energy storage at the first moment can be determined, then the corresponding processing mode is determined through the different working states of the mobile energy storage, namely, if the working states are charging states, the battery aging amount of the mobile energy storage is determined through the pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron, then the initial state parameter is updated through the pre-trained mobile energy storage online decision model based on the deep reinforcement learning, and if the working states are discharging states or keeping states, the initial state parameter is updated through the pre-trained mobile energy storage online decision model based on the deep reinforcement learning, so that the second target state parameter is obtained. Therefore, the invention combines the mobile energy storage aging evaluation model based on the multi-layer perceptron with the mobile energy storage online decision model based on the deep reinforcement learning, and can accurately determine the battery aging condition in the mobile energy storage scheduling decision process, thereby improving the decision accuracy and benefit of the mobile energy storage.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a mobile energy storage online decision method based on a porous electrode aging model according to an embodiment of the invention;

FIG. 2 is a schematic diagram of a mobile energy storage online decision method based on a porous electrode aging model according to an embodiment of the present invention;

FIG. 3 is a flow chart of another mobile energy storage online decision method based on a porous electrode aging model according to an embodiment of the present invention;

FIG. 4 is a block diagram of a mobile energy storage online decision device based on a porous electrode aging model according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

As described in the related art, mobile energy storage is a novel energy storage technology with a more flexible deployment mode; by charging and discharging at different power grid nodes, it can perform energy arbitrage and relieve power grid congestion. However, battery aging in mobile energy storage is affected by factors such as charge and discharge rate and temperature, and aging degrades the functions of the mobile energy storage, for example through attenuation of energy capacity and power capacity, which in turn affects mobile energy storage operation decisions.

Based on this, the invention determines the initial state parameters and the action parameters of the mobile energy storage at the first moment from the initial state of health of the battery and the pre-trained mobile energy storage online decision model based on deep reinforcement learning, and then selects the processing mode according to the working state of the mobile energy storage. If the working state is a charging state, the battery aging amount of the mobile energy storage is determined by the pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron, and the initial state parameters are then updated by the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the first target state parameters. If the working state is a discharging state or a holding state, the initial state parameters are updated directly by the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the second target state parameters. Therefore, the invention combines the mobile energy storage aging evaluation model based on the multi-layer perceptron with the mobile energy storage online decision model based on deep reinforcement learning, and can accurately determine the battery aging condition in the mobile energy storage scheduling decision process, thereby improving the decision accuracy and benefit of the mobile energy storage.

In accordance with an embodiment of the present invention, there is provided an embodiment of a mobile energy storage online decision method based on a porous electrode aging model, it being noted that the steps illustrated in the flowchart of the figures may be performed in a computer system, such as a set of computer executable instructions, and, although a logical sequence is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in a different order than that illustrated herein.

In this embodiment, a mobile energy storage online decision method based on a porous electrode aging model is provided, which may be used in a computer device such as a computer or a server. FIG. 1 is a schematic flow diagram of a mobile energy storage online decision method based on a porous electrode aging model according to an embodiment of the present invention. As shown in FIG. 1, the flow includes the following steps:

Step S101, obtaining an initial state of health of the battery.

The initial state of health of a battery refers to the performance and capacity state of the battery after manufacture. It is typically expressed as a percentage, namely the ratio of the battery's current capacity to its original design capacity, and it depends on the manufacturing process and quality control of the battery as well as on its materials and design. The higher the initial state of health, the better the performance and capacity of the battery, and the better it can meet the use requirements. The state of health of the battery is not constant: the performance and capacity of the battery gradually decrease over its service time. Therefore, during long-term use, the state of health of the battery needs to be checked and evaluated periodically to see whether the performance and capacity of the battery meet the use requirements, and measures need to be taken in time to maintain the battery.
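The percentage definition of state of health given above (current capacity relative to original design capacity) can be written as a short sketch. The function name and the example capacities are hypothetical; how capacity is actually measured is not specified in the text.

```python
def state_of_health(current_capacity_ah: float, design_capacity_ah: float) -> float:
    """Return the state of health as a percentage of the original design capacity."""
    if design_capacity_ah <= 0:
        raise ValueError("design capacity must be positive")
    return 100.0 * current_capacity_ah / design_capacity_ah


# Hypothetical example: a battery designed for 100 Ah that currently delivers 92 Ah.
soh_percent = state_of_health(92.0, 100.0)
print(f"SOH = {soh_percent:.1f}%")
```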

Mobile energy storage is portable, movable energy storage equipment that stores electric energy in a mobile device by using technologies such as storage batteries and fuel cells, so as to meet requirements such as emergency power supply and mobile equipment charging. The advantages of a mobile energy storage device are its flexibility, portability and mobility. Compared with a traditional fixed energy storage system, a mobile energy storage device can provide power at any time and in any place, is not limited by geographic position, and has wider application scenarios.

Step S102, based on the initial state of health of the battery, determining initial state parameters of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning.

The pre-trained mobile energy storage online decision model based on deep reinforcement learning is a model combining deep learning and reinforcement learning technologies and is used for making real-time decisions in a mobile energy storage system. The model can learn an optimal decision strategy through a reinforcement learning algorithm according to the real-time state and the historical data (namely the initial state of health) of the system so as to realize the optimal management and control of the energy storage system. The initial state of health of the battery may be used as an input, and the initial state parameter at the first time may be used as an output.

Preferably, the initial state parameter may be $s_h = (E_h, \mathrm{COST}_{E,h}, \lambda_{n,h}, \mathrm{TEMP}_{n,h}, \mathrm{SOH}_h)$, wherein $s_h$ is the initial state parameter at time $h$, $E_h$ is the remaining energy of the mobile energy storage at time $h$, $\mathrm{COST}_{E,h}$ is the stock energy cost of the mobile energy storage at time $h$, $\lambda_{n,h}$ is the site marginal electricity price at the current site position $n$ at time $h$, $\mathrm{TEMP}_{n,h}$ is the temperature at the current site position $n$ at time $h$, $\mathrm{SOH}_h$ is the state of health of the mobile energy storage at time $h$, and $h$ is the first moment. $\mathrm{COST}_{E,h}$ and $\lambda_{n,h}$ may be values set at the current site position $n$; $E_h$ and $\mathrm{TEMP}_{n,h}$ may be measured directly by a measuring instrument or obtained by calculation, which is not specifically limited herein.
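The five state components listed above can be grouped into a single record, as in the following minimal sketch. The class name, field names, units, and example values are assumptions made for illustration; the text does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """State of the mobile energy storage at time h (field names are illustrative)."""
    energy: float       # E_h: remaining energy of the mobile energy storage
    energy_cost: float  # COST_{E,h}: stock energy cost
    lmp: float          # lambda_{n,h}: marginal electricity price at the current site n
    temperature: float  # TEMP_{n,h}: temperature at the current site n
    soh: float          # SOH_h: state of health, here as a fraction in [0, 1]

    def as_vector(self) -> list[float]:
        """Flatten the state into the feature vector a decision model would consume."""
        return [self.energy, self.energy_cost, self.lmp, self.temperature, self.soh]


# Hypothetical initial state for a battery whose initial state of health is 100%.
s_h = State(energy=1.0, energy_cost=30.0, lmp=45.0, temperature=25.0, soh=1.0)
print(s_h.as_vector())
```

Making the record immutable (`frozen=True`) fits the step-by-step description in the text, where each update produces a new target state rather than modifying the previous one.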

Step S103, determining action parameters of the mobile energy storage at a first moment by utilizing a pre-trained mobile energy storage online decision model based on deep reinforcement learning based on initial state parameters; wherein, the action parameters include: the power, the working state and the next site position corresponding to the first moment.

After determining the initial state parameter, the initial state parameter may be used as an input, and the motion parameter of the mobile energy storage at the first moment may be used as an output. Wherein, the action parameters include: the power, the working state and the next site position corresponding to the first moment.

Preferably, the action parameter may be $a_h = (n', cdh, pwr)$, wherein $n'$ is the destination node of the mobile energy storage, $cdh$ is the charge and discharge selection of the mobile energy storage (a charging state, a discharging state, or a holding state), and $pwr$ is the charge and discharge power of the mobile energy storage. The charge and discharge power can be calculated as power = energy / time. The next site position characterizes the site that follows the current site position, i.e., the next destination to which the mobile energy storage moves.
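The action triple (n', cdh, pwr) and the relation power = energy / time can be sketched as follows. The type names, field names, units, and example values are hypothetical and serve only to make the structure concrete.

```python
from dataclasses import dataclass
from enum import Enum


class WorkingState(Enum):
    """The three charge and discharge selections named in the text."""
    CHARGE = "charge"
    DISCHARGE = "discharge"
    HOLD = "hold"


@dataclass(frozen=True)
class Action:
    next_site: int     # n': destination node of the mobile energy storage
    cdh: WorkingState  # charge and discharge selection
    pwr: float         # charge and discharge power


def average_power(energy: float, duration: float) -> float:
    """Power = energy / time, as stated in the text (units must be consistent)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return energy / duration


# Hypothetical action: move to site 2 and charge 2.0 energy units over 4.0 time units.
a_h = Action(next_site=2, cdh=WorkingState.CHARGE, pwr=average_power(2.0, 4.0))
print(a_h)
```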

Step S104, if the working state is a charging state, determining the battery aging amount of the mobile energy storage of the next station position at the second moment based on the power corresponding to the first moment, the initial state parameters and the mobile energy storage aging evaluation model of the multi-layer perceptron trained in advance, and updating the initial state parameters based on the mobile energy storage online decision model trained in advance based on deep reinforcement learning to obtain the first target state parameters.

With reference to FIG. 2, the pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron is a supervised learning model based on a feedforward neural network. The multi-layer perceptron comprises an input layer, a hidden layer and an output layer. The input layer receives the data features; the hidden layer is composed of a plurality of neurons and extracts features and learns data representations from the input layer. This embodiment adopts a regression model based on the multi-layer perceptron, takes the charge and discharge rate, the initial SOC, the termination SOC, the current battery state of health (SOH) and the temperature as input features, takes data generated with PETLION as training data, and fits and outputs the attenuation ΔSOH of the battery state of health.
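The shape of such a regression model can be illustrated with a minimal forward-pass sketch in plain Python. The hidden-layer width, the ReLU activation, and the random placeholder weights are assumptions: the text fixes only the five input features and the single ΔSOH output, and a usable model would be fitted to the PETLION-generated training data rather than left untrained as here.

```python
import random


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


class TinyMLP:
    """Feedforward regressor: 5 inputs -> one hidden layer (ReLU) -> 1 output."""

    def __init__(self, n_in: int = 5, n_hidden: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Placeholder weights only; these are NOT trained parameters.
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def predict(self, features: list[float]) -> float:
        """Map one feature vector to a single scalar (standing in for delta SOH)."""
        if len(features) != len(self.w1[0]):
            raise ValueError("unexpected number of input features")
        hidden = [
            relu(sum(w * x for w, x in zip(row, features)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


# Feature order follows the text: rate, initial SOC, termination SOC, SOH, temperature.
features = [0.5, 0.2, 0.8, 0.98, 25.0]  # hypothetical values
model = TinyMLP()
delta_soh = model.predict(features)
print(delta_soh)
```

In practice the inputs would likely be normalized before training, since temperature and SOC differ in scale by orders of magnitude; that preprocessing is not described in the text and is omitted here.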

Specifically, the termination remaining capacity (termination SOC) of the mobile energy storage can be determined from the power corresponding to the first moment; the power corresponding to the first moment, the termination remaining capacity and the initial state parameters are then used as inputs to obtain the attenuation ΔSOH of the battery state of health.
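One simple way to obtain a termination SOC from the power is sketched below. It assumes constant power over the interval, no conversion losses, positive power for charging, and clamping to the range [0, 1]; none of these details is stated in the text, and all names and values are hypothetical.

```python
def termination_soc(
    initial_soc: float, power: float, duration: float, capacity: float
) -> float:
    """Estimate the SOC at the end of an interval of constant-power operation.

    Assumes power > 0 means charging and that power * duration and capacity
    use consistent energy units (for example MW * h and MWh).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    soc = initial_soc + power * duration / capacity
    # Clamp to the physically meaningful range.
    return min(1.0, max(0.0, soc))


# Hypothetical example: start at 20% SOC, charge at 0.5 MW for 1 h into a 2 MWh battery.
soc_end = termination_soc(initial_soc=0.2, power=0.5, duration=1.0, capacity=2.0)
print(f"termination SOC = {soc_end:.2f}")
```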

After the attenuation of the battery health state is determined, the initial state parameters can be updated through a state transition equation in a pre-trained mobile energy storage online decision model based on deep reinforcement learning, so that the first target state parameters are obtained. How the initial state parameters are updated by the state transition equation is described in detail below.

Step S105, if the working state is the discharging state or the holding state, updating the initial state parameters by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters.

If the working state is a discharging state or a holding state (the holding state meaning neither charging nor discharging), the initial state parameters can be updated directly through the state transition equation of the pre-trained mobile energy storage online decision model based on deep reinforcement learning, so as to obtain the second target state parameters. How the initial state parameters are updated by the state transition equation is described in detail below.
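The branching in steps S103 to S105 can be summarized in a minimal control-flow sketch: the aging model is consulted only when the chosen working state is charging, and otherwise the state is updated directly. The policy, aging model, and transition function below are hypothetical stand-ins with made-up behavior; they are not the trained decision model, the trained perceptron, or the state transition equation of the embodiment.

```python
from typing import Callable, Optional

State = dict
Action = dict


def decision_step(
    state: State,
    policy: Callable[[State], Action],
    aging_model: Callable[[State, float], float],
    transition: Callable[[State, Action, Optional[float]], State],
) -> State:
    """Run one decision step and return the target state parameters."""
    action = policy(state)
    if action["cdh"] == "charge":
        # Charging: evaluate battery aging first, then update the state.
        delta_soh = aging_model(state, action["pwr"])
        return transition(state, action, delta_soh)
    # Discharging or holding: update the state directly.
    return transition(state, action, None)


# --- Hypothetical stand-ins, for illustration only ---
def toy_policy(state: State) -> Action:
    cdh = "charge" if state["h"] % 2 == 0 else "hold"
    return {"next_site": (state["site"] + 1) % 3, "cdh": cdh, "pwr": 0.5}


def toy_aging_model(state: State, power: float) -> float:
    return 1e-4  # fixed placeholder for delta SOH


def toy_transition(state: State, action: Action, delta_soh: Optional[float]) -> State:
    new_state = dict(state)
    new_state["site"] = action["next_site"]
    new_state["h"] = state["h"] + 1
    if delta_soh is not None:
        new_state["soh"] = state["soh"] - delta_soh
    return new_state


state = {"h": 0, "site": 0, "soh": 1.0}
termination_h = 4  # hypothetical termination moment
while state["h"] < termination_h:
    state = decision_step(state, toy_policy, toy_aging_model, toy_transition)
print(state)
```

Passing the three components in as callables keeps the branching logic separate from any particular model, so a trained policy or aging evaluator could be substituted without changing the loop.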

In this embodiment, a mobile energy storage online decision method based on a porous electrode aging model is provided, which may be used in the above-mentioned computer device, such as a computer or a server. FIG. 3 is a schematic flow chart of the mobile energy storage online decision method based on the porous electrode aging model according to an embodiment of the present invention. As shown in FIG. 3, the flow includes the following steps:

Step S201, obtaining an initial state of health of the battery. For details, refer to step S101 in the embodiment shown in fig. 1; they are not repeated here.

Step S202, based on the initial state of health of the battery, determining initial state parameters of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning. For details, refer to step S102 in the embodiment shown in fig. 1; they are not repeated here.

Step S203, based on the initial state parameters, determining action parameters of the mobile energy storage at the first moment by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning; wherein the action parameters include: the power corresponding to the first moment, the working state and the next site position. For details, refer to step S103 in the embodiment shown in fig. 1; they are not repeated here.

Step S204, if the working state is a charging state, determining the battery aging amount of the mobile energy storage at the next station position at the second moment based on the power corresponding to the first moment, the initial state parameters and the mobile energy storage aging evaluation model of the multi-layer perceptron trained in advance, and updating the initial state parameters based on the mobile energy storage on-line decision model trained in advance based on deep reinforcement learning to obtain the first target state parameters.

Specifically, the initial state parameters include: the remaining energy of the mobile energy storage at the first moment, the stock energy cost of the mobile energy storage at the first moment, the site marginal electricity price of the site at the first moment, the temperature of the site at the first moment, and the state of health of the mobile energy storage at the first moment; the step S204 includes:

Step S2041, determining a termination remaining power based on the current remaining power and the power corresponding to the first moment.

The termination remaining power characterizes the remaining-power threshold at which the battery is deemed to have finished charging. Specifically, the power corresponding to the first moment is obtained; this power represents the charging power of the battery at the first moment. The current remaining power of the battery is also obtained. The charging current of the battery at the first moment is then calculated from the power at the first moment and the current remaining power, where charging current = power / voltage. Based on the charging current and the current remaining power, the trend of the battery's remaining power over a future period is predicted, for example by using the charging curve of the battery or another relevant model. The termination remaining power is determined from the predicted trend.
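As a simplified illustration of step S2041, the sketch below estimates the termination remaining power with a plain energy balance over one time interval instead of the current-based prediction described above. The rated capacity, the interval length, the charging efficiency, the upper SOC bound, and the function name `terminal_soc` are assumptions introduced for the example.

```python
def terminal_soc(soc_now, power_kw, dt_h, capacity_kwh, efficiency=1.0, soc_max=1.0):
    """Estimate the termination SOC after charging at constant power for dt_h hours."""
    if capacity_kwh <= 0:
        raise ValueError("capacity_kwh must be positive")
    # Fraction of the rated capacity added during the interval.
    gained = power_kw * dt_h * efficiency / capacity_kwh
    # The battery cannot be charged beyond its upper SOC bound.
    return min(soc_max, soc_now + gained)


# Illustrative numbers: 30 % SOC, 50 kW for one hour into a 200 kWh pack.
soc_end = terminal_soc(soc_now=0.30, power_kw=50.0, dt_h=1.0, capacity_kwh=200.0)
```

With these numbers the pack gains a quarter of its capacity, so `soc_end` is approximately 0.55.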

Step S2042, based on the power corresponding to the first moment, the current remaining power, the termination remaining power and the temperature of the site at the first moment, determining the battery aging amount of the mobile energy storage at the next site position at the second moment by using the pre-trained multi-layer perceptron mobile energy storage aging evaluation model.

After the termination remaining power is determined, the initial state parameters and the termination remaining power are input to the model, and the battery aging amount of the mobile energy storage at the next site position at the second moment is determined. The battery aging can be determined as SOH_{h+1} = SOH_h + ΔSOH, where SOH_{h+1} is the battery aging amount.

Step S2043, updating initial state parameters based on a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain first target state parameters.

Step S205, if the working state is the discharging state or the holding state, updating the initial state parameters by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the second target state parameters. For details, refer to step S105 in the embodiment shown in fig. 1; they are not repeated here.
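The branching between steps S204 and S205 can be sketched as below. This is an illustrative outline only: the `State` and `Action` containers, the `step` function, the toy transition, and the constant aging value are hypothetical stand-ins, since the actual state transition equation and the trained aging model are not given here. The aging model is assumed to return a negative ΔSOH so that SOH_h + ΔSOH decreases.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    energy: float        # remaining energy of the mobile energy storage
    energy_cost: float   # stock energy cost
    price: float         # site marginal electricity price
    temperature: float   # site temperature
    soh: float           # state of health


@dataclass(frozen=True)
class Action:
    power: float         # power corresponding to the first moment
    mode: str            # "charge", "discharge" or "hold"
    next_site: int       # next site position


def step(state, action, aging_model, transition):
    """Apply S204 (charging) or S205 (discharging/holding) to one state."""
    if action.mode == "charge":
        delta_soh = aging_model(state, action)            # S2041-S2042
        aged = replace(state, soh=state.soh + delta_soh)  # SOH_h + dSOH
        return transition(aged, action)                   # S2043: first target state
    if action.mode in ("discharge", "hold"):
        return transition(state, action)                  # S205: second target state
    raise ValueError(f"unknown working state: {action.mode}")


def toy_transition(state, action):
    """Placeholder transition: only the stored energy changes (1 h interval)."""
    sign = {"charge": 1.0, "discharge": -1.0, "hold": 0.0}[action.mode]
    return replace(state, energy=state.energy + sign * action.power)


s0 = State(energy=100.0, energy_cost=0.05, price=0.08, temperature=25.0, soh=0.95)
s1 = step(s0, Action(20.0, "charge", 2), lambda s, a: -0.001, toy_transition)
s2 = step(s0, Action(20.0, "hold", 2), lambda s, a: -0.001, toy_transition)
```

Only the charging branch calls the aging model, which mirrors the text: discharging and holding go straight to the state transition.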

In an optional embodiment, in step S204, updating the initial state parameters based on the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the first target state parameters includes:

The first target state parameter is determined by the following formula:

In an alternative embodiment, in step S205, updating the initial state parameters based on the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the second target state parameters includes:

If the working state is a discharge state, determining a second target state parameter by the following formula:

Wherein Action(n) is the action selected for the second site.

In an alternative embodiment, the method further comprises:

Step a1, detecting whether the second moment meets the termination moment parameter.

The termination moment parameter may be used to characterize a threshold for terminating the loop. The termination moment parameter may be, for example, moment 24 or moment 25, and is not specifically limited herein. Detecting whether the second moment meets the termination moment parameter may include two cases. Case one: the second moment meets the termination moment parameter. Case two: the second moment does not meet the termination moment parameter.

Step a2, if the second moment does not meet the termination moment parameter, repeatedly executing the steps from determining, based on the initial state of health of the battery, the initial state parameters of the mobile energy storage at the first moment by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning, through updating, if the working state is the discharging state or the holding state, the initial state parameters by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning, until the second moment meets the termination moment parameter.

For case two: when the second moment does not meet the termination moment parameter, steps S202 to S205 may be repeatedly performed until the second moment meets the termination moment parameter. For the specific processing, refer to steps S202 to S205 above; it is not repeated here.
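The repeat-until-termination logic of steps a1 and a2 can be sketched as a simple loop. The names `run_episode`, `policy`, `step_fn` and `travel_time` are hypothetical, and reading "meets the termination moment parameter" as "the second moment has reached `end_time`" is an assumption made for the example.

```python
def run_episode(state, policy, step_fn, start_time, end_time, travel_time):
    """Repeat decide-and-update until the second moment reaches end_time."""
    h = start_time
    trajectory = [(h, state)]
    while h < end_time:                    # step a1: termination check
        action = policy(state, h)          # choose the action (S203)
        state = step_fn(state, action)     # update the state (S204 / S205)
        dt = travel_time(action)           # time needed to reach the next site
        if dt <= 0:
            raise ValueError("travel_time must be positive to guarantee termination")
        h += dt                            # second moment h'
        trajectory.append((h, state))
    return trajectory


# Toy run: always hold, state unchanged, one time unit per move, stop at 24.
trajectory = run_episode(
    state=0,
    policy=lambda s, h: "hold",
    step_fn=lambda s, a: s,
    start_time=0,
    end_time=24,
    travel_time=lambda a: 1,
)
```

The toy run visits moments 0 through 24, so the trajectory holds 25 entries.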

According to the mobile energy storage online decision method based on the porous electrode aging model, through determining the first target state parameter or the second target state parameter corresponding to each moment, the corresponding result of each site position, namely whether the mobile energy storage can go to the next site or not, can be determined.

The reward function is a core concept in reinforcement learning: it determines the feedback that the agent receives from the environment after taking a certain action. This feedback may be positive (a reward) or negative (a penalty). The design of the reward function is critical to the learning efficiency and effectiveness of the agent. By taking the target state parameters as input, a target reward function may be determined, in order to determine whether the mobile energy storage may proceed to the next site.

Specifically, the determining the target rewarding function based on the target state parameter by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning includes:

r_h = -C^TRA - C^DEG.

According to the mobile energy storage online decision method based on the porous electrode aging model, battery aging evaluation is integrated into the inventory path decision of the mobile energy storage. The battery health of the mobile energy storage system is monitored in real time according to the decision information, and decisions for the mobile energy storage are made in real time according to the battery aging condition, so that the mobile energy storage adapts better to complex operating environments.

In an alternative embodiment, the porous electrode theory is a model describing the physical and electrochemical properties of the battery electrode. It assumes that the electrode is a porous medium containing electrolyte and active materials (e.g., lithium salts and electrode materials in lithium ion batteries), and it is suitable for describing different types of batteries, such as lithium ion batteries, fuel cells and supercapacitors. The model uses differential equations at the electrode scale to describe the physical and chemical processes within the electrode, based on the principles of conservation of mass, conservation of energy and conservation of charge. The model mainly considers the solid electrolyte interface (SEI) growth and lithium plating mechanisms of the battery. In a battery aging model based on the porous electrode theory, the transport of lithium ions in the electrolyte at the electrode scale is obtained by solving a charge conservation equation and an ion conservation equation. The charge conservation equations in the solid electrode and the electrolyte are as follows:

Wherein the coefficient in the solid-phase equation is the equivalent electronic conductivity at the electrode scale; φ_s is the solid-phase potential; j_tot is the total volumetric current density; the coefficients in the electrolyte equation are the equivalent conductivity and the equivalent diffusivity of lithium ions in the electrolyte; φ_e is the electrolyte potential; and c_e is the lithium ion concentration in the electrolyte.

The ion conservation equation in the electrolyte and active material particles is as follows

Wherein ε is the porosity of the electrode, t is the time, t₊ is the migration (transference) number of lithium ions in the electrolyte, F is the Faraday constant, c_s is the solid-phase lithium ion concentration, r is the particle radius, and D_s is the ion diffusion coefficient inside the particles.

SEI growth and lithium plating are two important phenomena in lithium ion batteries during charge and discharge. They have a significant impact on the performance and life of the battery, and the corresponding reactions occur primarily at the anode. Thus, a total of three reactions can occur at the anode, i.e., the total volumetric current density j_tot is

j_tot = j_int + j_SEI + j_lpl;

Where j_int is the transfer current density of the lithium intercalation reaction, j_SEI is the current density of the solid electrolyte interface growth process, and j_lpl is the current density of the lithium plating process.

The transfer current density j _int of the lithium intercalation reaction is calculated by the Butler-Volmer equation, which is an empirical equation relating the reaction rate at the electrode to the overpotential at the electrode (the difference between the actual potential of the electrode and the equilibrium potential):

Where a is the specific surface area, i_{0,int} is the exchange current density, α_{a,int} and α_{c,int} are the transfer coefficients of the anodic reaction and the cathodic reaction, respectively, R is the ideal gas constant, T is the temperature, and η_int is the overpotential.
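For reference, the Butler-Volmer relation is commonly written in the textbook form below, using the symbols defined above (with the transfer coefficients written as α) and the Faraday constant F. This is the standard expression stated as an assumption; the formula in the original filing may differ in sign convention or normalization.

```latex
j_{int} = a \, i_{0,int} \left[
  \exp\!\left( \frac{\alpha_{a,int} F \eta_{int}}{R T} \right)
  - \exp\!\left( -\frac{\alpha_{c,int} F \eta_{int}}{R T} \right)
\right]
```

The first exponential is the anodic branch and the second is the cathodic branch; at zero overpotential the two cancel and no net current flows.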

The current density j_SEI of the solid electrolyte interface growth process is calculated from the cathodic Tafel expression. The Tafel expression is derived from the Butler-Volmer equation as a limiting case; it is a simplified equation describing the relationship between current density and overpotential during electrochemical reactions, and it applies in the region where the current density is much larger than the exchange current density:

Wherein k_{0,SEI} is the kinetic rate constant, α_{c,SEI} is the transfer coefficient of the cathodic reaction, φ_e is the potential in the electrolyte, R_film is the electrolyte particle radius, U_SEI is the equilibrium potential of the SEI growth reaction, and the remaining concentration symbol denotes the concentration of the electrolyte at the graphite surface.
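The statement that the cathodic Tafel expression is a limiting case of the Butler-Volmer equation can be checked numerically. The sketch below uses the generic textbook forms with a single exchange current density `i0`; it is not the SEI expression of this application, which additionally involves a rate constant and a surface concentration. The parameter values are illustrative assumptions.

```python
import math

F = 96485.33212   # Faraday constant, C/mol
R = 8.314462618   # ideal gas constant, J/(mol K)


def butler_volmer(i0, alpha_a, alpha_c, eta, temp_k):
    """Textbook Butler-Volmer current density (anodic minus cathodic branch)."""
    f = F / (R * temp_k)
    return i0 * (math.exp(alpha_a * f * eta) - math.exp(-alpha_c * f * eta))


def cathodic_tafel(i0, alpha_c, eta, temp_k):
    """Cathodic Tafel limit: the anodic branch is dropped."""
    return -i0 * math.exp(-alpha_c * F * eta / (R * temp_k))


eta = -0.2  # V, strongly cathodic overpotential (illustrative value)
bv = butler_volmer(1.0, 0.5, 0.5, eta, 298.15)
tafel = cathodic_tafel(1.0, 0.5, eta, 298.15)
rel_err = abs(bv - tafel) / abs(bv)  # small when |eta| is large
```

At this overpotential the two expressions agree to better than 0.1 %, whereas close to equilibrium (a few millivolts) the dropped anodic branch is no longer negligible and the Tafel form overestimates the current.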

The current density j_lpl of the lithium plating process is calculated from the cathodic Tafel expression as follows:

Where i_{0,lpl} is the exchange current density of the lithium plating process and α_{c,lpl} is the transfer coefficient of the cathodic reaction.

The mass balance of SEI and metallic lithium can be expressed as

Wherein c_SEI and c_Li are the molar concentrations of the solid electrolyte interface and of lithium, respectively, per unit volume of the electrode, and β is the proportion of plated lithium that forms the solid electrolyte interface.

Based on the battery aging model of the porous electrode theory, the charge-discharge cycling of the battery under different working conditions of temperature, SOC, power and the like can be simulated. The extent of battery aging can be assessed by tracking key variables in the model, such as lithium ion concentration, electrolyte consumption and solid electrolyte interface thickness. By analyzing indicators such as the capacity loss and the internal resistance increase of the battery, the aging degree and the real-time SOH of the battery can be evaluated.

In this embodiment, a mobile energy storage online decision device based on a porous electrode aging model is further provided. The device is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

The embodiment provides a mobile energy storage online decision device based on a porous electrode aging model, as shown in fig. 4, which comprises: an obtaining module 401, configured to obtain an initial state of health of a battery; a first determining module 402, configured to determine, based on an initial state of health of the battery, an initial state parameter of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning; a second determining module 403, configured to determine, based on the initial state parameter, an action parameter of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning; wherein, the action parameters include: the power, the working state and the next site position corresponding to the first moment; the third determining module 404 is configured to determine, if the working state is a charging state, an aging amount of the battery for mobile energy storage at a next site location at a second moment based on the power corresponding to the first moment and a pre-trained mobile energy storage aging evaluation model of the multi-layer perceptron, and update an initial state parameter based on a pre-trained mobile energy storage online decision model based on deep reinforcement learning, so as to obtain a first target state parameter; and a fourth determining module 405, configured to update the initial state parameter with a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain a second target state parameter if the working state is a discharging state or a holding state.

In an alternative embodiment, the initial state parameters include: the remaining energy of the mobile energy storage at the first moment, the stock energy cost of the mobile energy storage at the first moment, the site marginal electricity price of the site at the first moment, the temperature of the site at the first moment, and the state of health of the mobile energy storage at the first moment; wherein the third determining module 404 includes: a first determining unit, used for determining the termination remaining power based on the power corresponding to the first moment and the current remaining power; and a second determining unit, used for determining the battery aging amount of the mobile energy storage at the next site position at the second moment by using the pre-trained multi-layer perceptron mobile energy storage aging evaluation model, based on the power corresponding to the first moment, the current remaining power, the termination remaining power and the temperature of the site at the first moment.

In an alternative embodiment, the apparatus further comprises: a detection module, used for detecting whether the second moment meets the termination moment parameter; and a repeated execution module, used for, if the second moment does not meet the termination moment parameter, repeatedly executing the steps from determining, based on the initial state of health of the battery, the initial state parameters of the mobile energy storage at the first moment by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning, through updating, if the working state is the discharging state or the holding state, the initial state parameters by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the second target state parameters, until the second moment meets the termination moment parameter.

In an alternative embodiment, the apparatus further comprises: and a fifth determining module, configured to determine a target rewarding function based on the target state parameter by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning.

In an alternative embodiment, the third determining module 404 includes: the first target state parameter is determined by the following formula:

Wherein E_h is the remaining energy of the mobile energy storage at time h; E_{h'} is the remaining energy of the mobile energy storage at time h'; n is the current site position; n' is the next site position; h is the first moment; h' is the second moment; COST_{E,h} is the stock energy cost of the mobile energy storage at time h; COST_{E,h'} is the stock energy cost of the mobile energy storage at time h'; λ_{n,h} is the site marginal electricity price of the current site n at time h; λ_{n',h'} is the site marginal electricity price of the next site n' at time h'; TEMP_{n,h} is the temperature of the current site n at time h; TEMP_{n',h'} is the temperature of the next site n' at time h'; SOH_h is the state of health of the mobile energy storage at time h; SOH_{h'} is the state of health of the mobile energy storage at time h'; the power symbol denotes the charging power of the mobile energy storage at the next site at the second moment h'; LMP(·) is the site marginal electricity price update function; Temperature(·) is the ambient temperature update function; soh(·) is the state-of-health update function of the mobile energy storage battery; and Duration_{n,n'} is the time distance between n and n'.

In an alternative embodiment, the fourth determining module 405 includes: if the working state is a discharge state, determining a second target state parameter by the following formula:

Wherein Action(n) is the action selected for the second site.

In an alternative embodiment, the fifth determining module includes: if the operating state is a charging state, determining a target rewarding function by using the following formula:

r_h = -C^TRA - C^DEG.

Further functional descriptions of the above modules and units are the same as those of the corresponding embodiments above, and are not repeated here.

According to the mobile energy storage online decision device based on the porous electrode aging model, the initial state parameters of the mobile energy storage at the first moment and the action parameters of the mobile energy storage at the first moment can be determined from the initial state of health of the battery and the pre-trained mobile energy storage online decision model based on deep reinforcement learning. The corresponding processing is then determined by the working state of the mobile energy storage. If the working state is a charging state, the battery aging amount of the mobile energy storage is first determined by the pre-trained multi-layer perceptron mobile energy storage aging evaluation model, and the initial state parameters are then updated by the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the first target state parameters. If the working state is a discharging state or a holding state, the initial state parameters are updated by the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain the second target state parameters. Therefore, the invention combines the mobile energy storage aging evaluation model based on the multi-layer perceptron with the mobile energy storage online decision model based on deep reinforcement learning, and can accurately determine the battery aging condition during mobile energy storage scheduling decisions, thereby improving the decision accuracy and the benefit of the mobile energy storage.

The mobile energy storage online decision device based on the porous electrode aging model in this embodiment is presented in the form of functional units, where a functional unit refers to an ASIC (Application Specific Integrated Circuit), a processor and a memory that execute one or more software or firmware programs, and/or other devices that can provide the above functions.

The embodiment of the invention also provides a computer device which is provided with the mobile energy storage online decision device based on the porous electrode aging model shown in the figure 4.

Referring to fig. 5, fig. 5 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention. As shown in fig. 5, the computer device includes: one or more processors 10, a memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 5 as an example.

The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, generic array logic, or any combination thereof.

Wherein the memory 20 stores instructions executable by the at least one processor 10, so that the at least one processor 10 performs the method of the embodiments described above.

The memory 20 may include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, the memory 20 may optionally include memory located remotely from the processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.

The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.

The embodiments of the present invention also provide a computer-readable storage medium. The method according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as computer code that is recorded on a storage medium, or as computer code that is originally stored in a remote storage medium or a non-transitory machine-readable storage medium, downloaded through a network, and stored in a local storage medium, so that the method described herein can be processed by software stored on a storage medium using a general-purpose computer, a special-purpose processor, or programmable or special-purpose hardware. The storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid state disk or the like; further, the storage medium may also comprise a combination of memories of the kinds described above. It will be appreciated that a computer, processor, microprocessor controller or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the methods illustrated by the above embodiments.

Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims

1. The mobile energy storage online decision-making method based on the porous electrode aging model is characterized by comprising the following steps of:

Acquiring an initial state of health of a battery;

Based on the initial state of health of the battery, determining initial state parameters of the mobile energy storage at a first moment by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning;

Determining action parameters of the mobile energy storage at a first moment by utilizing a pre-trained mobile energy storage online decision model based on deep reinforcement learning based on the initial state parameters; wherein the action parameters include: the power, the working state and the next site position corresponding to the first moment;

if the working state is a charging state, determining the battery aging amount of the mobile energy storage of the next station position at the second moment based on the power corresponding to the first moment, the initial state parameter and a mobile energy storage aging evaluation model of the multi-layer perceptron trained in advance, and updating the initial state parameter based on a mobile energy storage online decision model trained in advance and based on deep reinforcement learning to obtain a first target state parameter;

And if the working state is a discharging state or a holding state, updating the initial state parameters by using a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters.

2. The mobile energy storage online decision method based on the porous electrode aging model according to claim 1, wherein the initial state parameters comprise: the remaining energy of the mobile energy storage at the first moment, the stock energy cost of the mobile energy storage at the first moment, the site marginal electricity price of the site at the first moment, the temperature of the site at the first moment, and the state of health of the mobile energy storage at the first moment; and determining the battery aging amount of the mobile energy storage at the next site position at the second moment based on the power corresponding to the first moment, the initial state parameters and the pre-trained multi-layer perceptron mobile energy storage aging evaluation model comprises the following steps:

determining a termination residual capacity based on the power corresponding to the first moment and the current residual capacity;

Based on the power corresponding to the first moment, the current remaining power, the termination remaining power, the temperature of the site at the first moment and the state of health of the mobile energy storage at the first moment, determining the battery aging amount of the mobile energy storage at the next site position at the second moment by using the pre-trained multi-layer perceptron mobile energy storage aging evaluation model.

3. The mobile energy storage online decision method based on the porous electrode aging model according to claim 1, further comprising:

Detecting whether the second moment meets a termination moment parameter;

And if the second moment does not meet the termination moment parameter, repeatedly executing the steps from determining, based on the initial state of health of the battery, the initial state parameters of the mobile energy storage at the first moment by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning, through updating, if the working state is the discharging state or the holding state, the initial state parameters by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning, until the second moment meets the termination moment parameter.

4. The mobile energy storage online decision method based on the porous electrode aging model according to claim 1, further comprising:

and determining a target rewarding function by utilizing a pre-trained mobile energy storage online decision model based on deep reinforcement learning based on the target state parameters.

5. The mobile energy storage online decision method based on the porous electrode aging model according to claim 1, wherein updating the initial state parameters based on a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain first target state parameters comprises:

The first target state parameter is determined by the following formula:

6. The mobile energy storage online decision method based on the porous electrode aging model according to claim 1, wherein updating the initial state parameters based on a pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters comprises:

if the working state is a holding state, determining a second target state parameter by the following formula:

wherein Action(n) is the action selection for the second site.

7. The mobile energy storage online decision method based on the porous electrode aging model according to claim 4, wherein determining the target reward function based on the target state parameter by using the pre-trained mobile energy storage online decision model based on deep reinforcement learning comprises:

wherein λ_{n',h'} is the nodal marginal electricity price at node n' at time h'; [symbol not reproduced] is the discharging power of the mobile energy storage at node n' at time h'; Δh is the time interval; COST_{E,h'} is the cost of the stored energy of the mobile energy storage at time h'; and E_h is the remaining energy of the mobile energy storage at time h;

if the working state is a holding state, determining the target reward function by using the following formula:

r_h = -C^TRA - C^DEG.
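As a worked illustration of the holding-state reward above, the following Python sketch evaluates r_h = -C^TRA - C^DEG for given cost values. The function name `hold_state_reward` and its parameter names are hypothetical. This claim does not define C^TRA and C^DEG, so the sketch simply treats them as two cost terms supplied by the caller; reading them as a transport cost and a degradation cost is an assumption based on their superscripts.

```python
def hold_state_reward(c_tra: float, c_deg: float) -> float:
    """Holding-state reward r_h = -C^TRA - C^DEG.

    c_tra and c_deg are the two cost terms of the formula; their
    interpretation (transport and degradation cost) is an assumption.
    """
    return -c_tra - c_deg


# Example: with cost terms 2.0 and 0.5 the reward is -2.5.
example_reward = hold_state_reward(2.0, 0.5)
```

Because both terms enter with a negative sign, the holding-state reward in this form is never positive when the costs are non-negative.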

8. A mobile energy storage online decision device based on a porous electrode aging model, characterized in that the device comprises:

an acquisition module, configured to acquire the initial state of health of the battery;

a first determining module, configured to determine initial state parameters of the mobile energy storage at a first moment based on the initial state of health of the battery, using a pre-trained mobile energy storage online decision model based on deep reinforcement learning;

a second determining module, configured to determine action parameters of the mobile energy storage at the first moment based on the initial state parameters, using the pre-trained mobile energy storage online decision model based on deep reinforcement learning; wherein the action parameters include: the power, the working state, and the next site position corresponding to the first moment;

a third determining module, configured to, if the working state is a charging state, determine the battery aging amount of the mobile energy storage at the next site position at a second moment based on the power corresponding to the first moment and a pre-trained multi-layer perceptron mobile energy storage aging evaluation model, and update the initial state parameters based on the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain first target state parameters; and

a fourth determining module, configured to, if the working state is a discharge state or a holding state, update the initial state parameters using the pre-trained mobile energy storage online decision model based on deep reinforcement learning to obtain second target state parameters.

9. A computer device, comprising:

a memory and a processor, the memory and the processor being communicatively connected to each other, the memory having stored therein computer instructions, and the processor executing the computer instructions to perform the mobile energy storage online decision method based on the porous electrode aging model of any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the mobile energy storage online decision method based on a porous electrode aging model according to any one of claims 1 to 7.