CN113536671B - Lithium battery life prediction method based on LSTM - Google Patents

Lithium battery life prediction method based on LSTM

Info

Publication number
CN113536671B
CN113536671B
Authority
CN
China
Prior art keywords
life prediction
lstm
model
lithium battery
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110778444.8A
Other languages
Chinese (zh)
Other versions
CN113536671A (en)
Inventor
马剑
徐沛洋
许庶
陶来发
程玉杰
丁宇
王超
索明亮
吕琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202110778444.8A priority Critical patent/CN113536671B/en
Publication of CN113536671A publication Critical patent/CN113536671A/en
Application granted granted Critical
Publication of CN113536671B publication Critical patent/CN113536671B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/04Ageing analysis or optimisation against ageing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E60/00Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation
    • Y02E60/10Energy storage using batteries

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Secondary Cells (AREA)
  • Charge And Discharge Circuits For Batteries Or The Like (AREA)

Abstract

The invention provides an LSTM-based lithium battery life prediction method comprising the following steps: acquiring a capacity degradation data set of a lithium battery; preprocessing the capacity degradation data set; and constructing an LSTM-based remaining life prediction model. After the LSTM-based remaining life prediction model is built, three local lithium battery life prediction models and a global life prediction model at the central server side are further built. Each LSTM unit of the network comprises a forgetting gate, an input gate and an output gate. The three local lithium battery life prediction models share the same structure, consisting of two LSTM layers, two Dropout layers and a top prediction output layer.

Description

Lithium battery life prediction method based on LSTM
Technical Field
The invention relates to the technical field of fault testing and prognostics for high-end equipment, and in particular to an LSTM-based lithium battery life prediction method.
Background
In recent years, with increasing industrialization, group products have gradually evolved from single, simple devices into intelligent, complex ones; they are applied in more and more fields, and correspondingly higher demands are placed on their reliability. Prognostics and Health Management (PHM) provides a feasible way to improve the reliability of group products: by applying data mining, information fusion and related techniques to the real-time state data of a product, PHM can reduce maintenance costs and improve product reliability.
Applying artificial intelligence to fault diagnosis and prognostics brings both challenges and opportunities, and how to diagnose faults quickly and produce accurate predictions with artificial intelligence has become a research hot spot. Group products are of many types and the failure modes of each type are complex and varied; developing diagnosis and prediction models for every possible failure mode is impractical, and collecting complete state data over a product's whole life cycle is extremely costly. The operating-state data collected from different group products are only partially similar, so directly pooling the data does not help the accuracy of the prediction results and makes joint modeling very difficult. The "data island" dilemma, together with the data-privacy laws issued in recent years, further restricts the free exchange of data between enterprises and industries, and these problems seriously hinder the further development of artificial intelligence techniques.
Research on fault prediction and health management is therefore of great significance for group products. On the one hand, it avoids wasted maintenance time and reduces maintenance cost. On the other hand, compared with traditional scheduled maintenance and run-to-failure maintenance, applying PHM to a product can effectively improve its reliability and safety and reduce the occurrence of catastrophic accidents. In practical applications, however, data privacy must be protected, the data that can actually be collected are insufficient, collecting data is expensive, and products of different enterprises operate under different working conditions, so directly pooling the data for modeling performs poorly.
Life prediction includes end-of-life (EOL) prediction and remaining useful life (RUL) prediction; the relationship between the two can be expressed as RUL = EOL − T, where T is a given time. Analytical-model-based RUL prediction builds a mathematical model describing the product degradation process by studying the physical failure mechanism; the model parameters must be continuously updated from the product's state data, and historical data are finally used to predict the remaining life of the product. Data-driven methods instead analyze the information behind product failure with advanced data-mining techniques, building the prediction model with probability statistics or machine learning, which lowers the physics and mathematics requirements placed on researchers while keeping the prediction accuracy high. Artificial neural networks are the most frequently used models in machine-learning-based life prediction. Deep learning models such as recurrent neural networks have strong feature self-learning ability, and how to apply them to life prediction is a research hot spot. In summary, data-driven life prediction does not require studying the physical mechanism behind product failure, lowers the requirements on model designers, and has been widely applied to remaining life prediction for many kinds of products.
In summary, because group products are complex and diverse, the failure mechanisms behind them are also complex and diverse, and analytical-model-based remaining life prediction, which requires studying the corresponding failure mechanism and building a dedicated model for each failure mode, is not suitable. The invention therefore selects a data-driven remaining life prediction method based on the long short-term memory (LSTM) network, which avoids the heavy dependence on large amounts of data of probability-statistical methods, accurately predicts the remaining life of group products, effectively guides product maintenance, and improves product reliability.
Disclosure of Invention
In order to solve the problems existing in the prior art, the invention provides an LSTM-based remaining life prediction method.
According to one aspect of the invention, an LSTM-based lithium battery life prediction method comprises:
acquiring a capacity degradation data set of a lithium battery; preprocessing the capacity degradation data set, the preprocessing comprising data normalization, for which the min-max method is used, and dividing the data set into a training set and a test set before model training; constructing an LSTM-based remaining life prediction model, where the remaining life prediction model is based on the recurrent neural network (RNN) for processing time-series data and on the long short-term memory (LSTM) network, the LSTM being used to eliminate the gradient explosion and gradient vanishing problems of the RNN; after the LSTM-based remaining life prediction model is constructed, further constructing three local lithium battery life prediction models and a global life prediction model at the central server side, where each LSTM unit of the network comprises a forgetting gate, an input gate and an output gate, and the three local lithium battery life prediction models share the same structure consisting of two LSTM layers, two Dropout layers that prevent overfitting, and a top prediction output layer; and finally, feeding the training set into the constructed local lithium battery life prediction models and the central-server global life prediction model, performing model training, and outputting the predicted remaining life of the lithium battery.
According to an embodiment of the invention, the activation function of the two LSTM layers is the hyperbolic tangent (tanh) function, the activation function of the final output Dense layer is linear, and the Dropout rate is set to 0.3.
Further, the forgetting gate performs the following operation:
information from the previous time step is selectively discarded before being passed on: h_{t-1} and x_t are substituted into the following formula to compute a vector with values in [0, 1], which indicates how much of the cell state C_{t-1} is retained or discarded; 0 means nothing is retained and 1 means everything is retained:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f).
Further, it is then decided what new information to add to the cell state: i_t is the weight coefficient of the updated information, obtained by substituting h_{t-1} and x_t into the first formula below; the activation function tanh is then applied to h_{t-1} and x_t to generate the new candidate state vector C̃_t:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
Further, the state information is updated by the following formula:
C_t = f_t * C_{t-1} + i_t * C̃_t
the output of cells also needs to be according to h t-1 And x t To judge, first, h t-1 And x t Carry-in
Figure BDA0003156733510000034
Obtaining the judging condition, then taking Ct into the tanh activation function to calculate a value of [ -1,1]And multiplying the vector between the two vectors by a judging condition to obtain a final output, wherein: />
o t =σ(W o [h t-1 ,x t ]+b o )
h t =o t *tanh(C t )。
Further, during model training, the batch size batch_size = 32, the single-sample data length is 50, the optimizer Adam is selected, the learning rate is 0.001, and the number of training epochs is 20.
Further, after training, each local lithium battery life prediction model sends its weight coefficients and loss value to the central-server global life prediction model; the central server then further processes the weights received from the lithium battery product ends by weighting the weight values.
Further, the weight value is determined by:
w = Σ_{i=1}^{k} (loss_i / Σ_{j=1}^{k} loss_j) · w_i
where w_i denotes the weights of the i-th product-end prediction model, loss_i denotes the loss value of the i-th product-end prediction model, w denotes the weights of the central-server prediction model, and k denotes the number of product ends; the calculated w is then sent by the central server to each single product end to update its weights, and the number of communication rounds between the single device ends and the central server end is set to 20.
Further, after model training, the initialization parameters of the local prediction model for the next iteration of the local lithium battery life prediction model are determined as follows: the structure of the previous round's local life prediction model is adopted directly and kept unchanged, the weight parameters and structural parameters of the first four layers are frozen, the weights of the remaining layers are randomized, and the next round of training is performed, with the state data of the previous 50 time steps as input and the state data of the next time step as output.
Further, by adopting the initialization parameter determining method, the average accuracy of the prediction result is 95.48%.
Drawings
Various embodiments or examples ("examples") of the present disclosure are disclosed in the following detailed description and drawings. The drawings are not necessarily drawn to scale. In general, the operations of the disclosed methods may be performed in any order, unless otherwise specified in the claims. In the accompanying drawings:
FIG. 1 illustrates the overall framework of the present invention for group product remaining life prediction;
FIG. 2 illustrates the federated transfer learning process for group product life prediction according to the present invention;
FIG. 3 is a design of a single product side local life prediction model and a central server side global life prediction model;
FIG. 4 is a schematic diagram of an LSTM network architecture;
FIG. 4A is an LSTM forget gate cell;
FIG. 4B is an LSTM input gate unit;
FIG. 4C is an LSTM update operation;
FIG. 4D is the LSTM output gate operation;
FIG. 5 is the LSTM life prediction model;
FIG. 6 shows raw data of a lithium battery at 25 °C;
FIG. 7 shows the normalized lithium battery data at 25 °C;
FIG. 8 compares lithium battery data before and after smoothing;
FIG. 9 is the model structure of the LSTM-based single-product-end lithium battery life prediction model;
FIG. 10 is an example plot of local prediction results at a single product end;
FIG. 11 shows the LSTM-based transfer learning strategies;
FIG. 12 is an example plot of the prediction results of scheme one;
FIG. 13 is an example plot of the prediction results of scheme two;
FIG. 14 is an example plot of the prediction results of scheme three.
Detailed Description
Before explaining one or more embodiments of the disclosure in detail, it is to be understood that the embodiments are not limited in their application to the details of construction and to the steps or methods set forth in the following description or illustrated in the drawings.
The prediction of the remaining life of group products is studied here, and a general framework for group-product remaining life prediction is designed first. The client-server architecture is one of the most common and is relatively simple. In this framework, multiple participants (also referred to as users or clients) that hold local data collaboratively train a machine learning model for all participants with the aid of a central server (also referred to as a parameter server or aggregation server). The workflow under this framework is as follows: (1) each participant trains the model locally to obtain parameters such as the model weights and then sends them to the central server for further processing; (2) the central server takes the parameters uploaded by the participants and computes a weighted average; (3) the central server sends the weighted-average result back to each participant; (4) each participant updates its local model with the parameters issued by the central server. These steps are repeated until a given number of iterations is reached, yielding the final model.
Specifically, the overall framework designed in the invention for group-product remaining life prediction is shown in FIG. 1. As shown in FIG. 1, the framework comprises a plurality of single product ends and a central server end; three single products are shown as an example. Each single product end obtains the raw data of the group product (for example, a lithium battery) through means such as sensors; the raw data are first preprocessed (data normalization, data smoothing, data set division and the like), training is then performed locally at the single product end, and the resulting weight parameters are uploaded to the central server end. The three single product ends in the product-server architecture of FIG. 1 do not interact with one another; they only upload the weight parameters of the remaining life prediction models trained at the single product ends to the server end. The server end then performs weighted averaging on the weight parameters and sends the processed weights back to the three single product ends, which use the idea of transfer learning to select part of the parameters for updating their models. These operations are repeated until the global remaining life prediction model at the server end converges.
FIG. 2 is a flow chart of federated transfer learning for group-product remaining life prediction according to the invention. As shown in FIG. 2, the flow is as follows: at the start, the raw data of the group products are obtained; the raw data are fed into the remaining life prediction model at each single product end, which is trained to obtain its weight parameters; each single product end sends the trained weight parameters to the server end; the server end gathers the weights and loss values of all local life prediction models, weights them, and sends the result back to each single product end; after receiving the weight parameters issued by the server, each single product end selects an appropriate transfer learning strategy to update its local life prediction model; on the basis of the updated local life prediction model, the single product end again receives raw data, trains the model, and uploads the weight parameters of its local remaining life prediction model. This constitutes one complete cycle. It is then checked whether the number of cycles has reached the preset number of iterations; if not, training of the single-product local life prediction models continues, otherwise the model of any one product end inherits all the weight parameters, giving the final overall model.
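For concreteness, this cycle can be sketched as follows. This is only a minimal illustration: NumPy vectors stand in for model weights, and local_train is a hypothetical placeholder for the local LSTM training step, not the actual implementation of the invention.

```python
# Minimal sketch of the client-server training cycle described above.
import numpy as np

rng = np.random.default_rng(0)
NUM_CLIENTS, NUM_ROUNDS, DIM = 3, 20, 8      # 20 communication rounds, as in the text

def local_train(weights, client_id):
    """Stand-in for local training: returns updated weights and a loss value."""
    updated = weights + 0.01 * rng.standard_normal(DIM)
    loss = float(np.abs(updated).mean()) + 0.1 * client_id
    return updated, loss

global_weights = rng.standard_normal(DIM)
for _ in range(NUM_ROUNDS):
    client_weights, client_losses = [], []
    for cid in range(NUM_CLIENTS):                         # (1) each participant trains locally
        w, loss = local_train(global_weights, cid)
        client_weights.append(w)                           # (2) parameters are sent to the server
        client_losses.append(loss)
    coeffs = np.array(client_losses) / sum(client_losses)  # loss-based weighting coefficients
    global_weights = sum(c * w for c, w in zip(coeffs, client_weights))  # (3) weighted average
    # (4) the averaged weights are sent back and each client updates its local model
print(global_weights)
```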
After the overall framework for group-product remaining life prediction has been designed, the task is completed according to the framework. First, the single-product local life prediction model is designed, which mainly comprises two parts: data analysis and preprocessing, and construction of the single-product local life prediction model; the global life prediction model at the central server end is then built. The specific flow is shown in FIG. 3. The product operating-state data collected by sensors usually need to be preprocessed before they can be used for training and testing the subsequent life prediction model. The application chooses a machine-learning-based method to build the product-end local life prediction model.
An exemplary module of the present invention will be described in detail with reference to fig. 3.
Step 1: monomer product end data analysis and preprocessing
Because of non-steady-state effects during product operation or abnormalities when the sensors collect data, the product operating-state data collected by sensors often contain large fluctuations or abnormal values, and a model trained directly on the collected data usually predicts poorly, so preprocessing is usually required before the data are used for life prediction. Data preprocessing mainly comprises three parts: data normalization, data smoothing, and data set division.
Step 1.1: data normalization
The scale of the data acquired each time differs; if one variable has a very large range, the influence of variables with small ranges on model training may be drowned out, so normalization is needed. Two data normalization methods are commonly used: 0-mean normalization (Z-score standardization) and min-max normalization (Min-Max Normalization).
0-mean normalization processes each data point with the following formula; the processed data follow the standard normal distribution, i.e., the mean is 0 and the standard deviation is 1. In the formula, x denotes an acquired sample value, μ denotes the mean of all sample data, and σ denotes the standard deviation of all sample data:
x' = (x − μ) / σ
Min-max normalization maps the sample values into [0, 1] by the following transformation; for sample data X = (x_1, x_2, ..., x_n), the normalization is:
x' = (x − x_min) / (x_max − x_min)
Each of the two normalization methods has advantages and disadvantages. 0-mean normalization is relatively complex, since some statistics must be computed from the samples in advance. Min-max normalization is simpler, essentially transforming the sample data into values between 0 and 1; when no distance metric is involved, min-max normalization is the simpler choice. In practice, the appropriate method is selected according to the specific situation.
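As an illustration, both normalization options can be written as short NumPy functions; this is a sketch, and the capacity values below are made-up example numbers.

```python
# Sketch of the two normalization options discussed above.
import numpy as np

def z_score_normalize(x):
    """0-mean normalization: result has mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def min_max_normalize(x):
    """min-max normalization: maps the samples into the interval [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

capacity = np.array([2.00, 1.95, 1.88, 1.79, 1.64])   # illustrative capacity values
print(min_max_normalize(capacity))                     # 1.0 ... 0.0
```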
Step 1.2: data smoothing
The collected data often contain burrs and noise, and using them directly for prediction and analysis gives poor results, so the raw sample data must be smoothed. A locally weighted regression (LWR) algorithm is often used for data smoothing. Its workflow is as follows: the samples are first divided into several intervals, a polynomial is fitted to each local sample, the fitted value is estimated by least squares, and the smoothing of the raw samples is thus completed. The main idea of locally weighted regression is to compute different weight coefficients according to the distances between the other points and the observed sample point, multiply the sample points by the corresponding fitting weights and sum them to obtain the fitted value of the observed point; processing every sample point in this way yields smooth, denoised data. The mathematical principle is as follows:
A window size of 2K is set in advance. For any observation point q_k (k = 1, 2, ..., N) in the sample set Q = {q_1, q_2, ..., q_N}, its weighted fitted value is obtained from the following formula, where the sum runs over the sample points q_i in the window of size 2K around q_k:
q̂_k = Σ_i w_i(q_k) · q_i
The weight coefficient w_i(q_k) is chosen so that the farther a sample point q_i is from the observation point q_k, the smaller w_i(q_k) becomes, and the closer q_i is to q_k, the larger w_i(q_k) becomes (a distance-decaying kernel, such as a Gaussian or tricube function, is typically used). In this way, abnormal noise points in the raw data set are effectively suppressed.
The locally weighted regression algorithm effectively removes noise from the raw data set, so that the plotted curve of the raw data becomes smooth and overfitting or underfitting is avoided.
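A minimal smoothing sketch in this spirit is shown below. The exact kernel is not reproduced in this text, so a Gaussian distance-decaying weight is assumed here, a zeroth-order local fit (weighted average) is used for brevity instead of the full local polynomial least-squares fit, and the window half-width K and bandwidth are illustrative parameters.

```python
# Sketch of locally weighted smoothing over a window of half-width K.
import numpy as np

def lwr_smooth(q, K=10, bandwidth=5.0):
    q = np.asarray(q, dtype=float)
    smoothed = np.empty_like(q)
    idx = np.arange(len(q))
    for k in range(len(q)):
        lo, hi = max(0, k - K), min(len(q), k + K + 1)     # local 2K window around q_k
        d = idx[lo:hi] - k
        w = np.exp(-(d ** 2) / (2.0 * bandwidth ** 2))     # weights shrink with distance
        w /= w.sum()
        smoothed[k] = np.dot(w, q[lo:hi])                  # weighted fitting value of q_k
    return smoothed

noisy = np.sin(np.linspace(0, 3, 200)) + 0.05 * np.random.default_rng(1).standard_normal(200)
print(lwr_smooth(noisy)[:5])
```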
Step 1.3: data set partitioning
After data normalization and data smoothing, the data set must be divided into a training set and a test set before model training. K-fold cross-validation is one data set division method: in each round, a different portion of the data set serves as the test set and the rest as the training set, so the data are fully used; it is well suited to classification tasks. The hold-out method is another commonly used division method; it is easy to apply and suitable when the amount of sample data is sufficient. In practice, the appropriate method is chosen according to the task requirements and the size of the data set.
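For example, the hold-out division used later in the embodiment (60% training, 40% test, in time order) can be sketched as:

```python
# Sketch of a hold-out split that keeps the degradation data in time order.
import numpy as np

def hold_out_split(data, train_fraction=0.6):
    data = np.asarray(data)
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

train, test = hold_out_split(np.arange(1000))
print(len(train), len(test))    # 600 400
```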
Step 2: construction of LSTM-based residual life prediction model
With the development of sensor technology, more and more product operating-state data are obtained from sensors, and traditional shallow machine learning algorithms struggle to handle such massive data. Deep networks, with their strong nonlinear mapping ability and high-dimensional feature extraction ability, are well suited to this situation. The RNN is a neural network for processing time-series data; the group products studied here have operating-state data that decay slowly over time and therefore constitute time series, so the RNN is applicable. However, the plain RNN suffers from gradient explosion and gradient vanishing, which the long short-term memory (LSTM) network solves through selective forgetting. LSTM is therefore selected here to predict the remaining life of group products; FIG. 4 is a schematic diagram of the LSTM network structure according to the invention.
As shown in FIG. 4, one LSTM cell contains three gates that control the cell state, called the forget gate, the input gate and the output gate.
Information from the previous time step is usually selectively discarded before being passed to the next time step, and the forget gate performs this operation. h_{t-1} and x_t are substituted into the formula below to compute a vector with values in [0, 1], whose value indicates how much of the cell state C_{t-1} is retained or discarded: 0 means nothing is retained and 1 means everything is retained. The forget gate is shown in FIG. 4A, where:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
The next step is to decide which new information to add to the cell state. i_t is the weight coefficient of the updated information and is obtained by substituting h_{t-1} and x_t into the first formula below; the activation function tanh is then applied to h_{t-1} and x_t to generate the new candidate state vector C̃_t. These two steps are shown in FIG. 4B, where:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
The state information is then updated by the following formula, so that the update covers part of the state information from the previous time step and part from the current time step. The update operation is shown in FIG. 4C, where:
C_t = f_t * C_{t-1} + i_t * C̃_t
the output of cells also needs to be according to h t-1 And x t To judge, first, h t-1 And x t Carry-in
Figure BDA0003156733510000084
Obtaining judging conditions, then C t Bringing the tanh activation function to calculate a value of [ -1,1]And multiplying the vector of the two by a judging condition to obtain the final output. This step is shown in fig. 4D, wherein:
o t =σ(W o [h t-1 ,x t ]+b o )
h t =o t *tanh(C t )
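The four gate equations above can be traced with a small NumPy implementation of a single LSTM cell step; this is only a sketch, with randomly initialized parameters and illustrative dimensions.

```python
# One LSTM cell step implementing the forget/input/output gate equations above.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """p holds W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o acting on the concatenation [h, x]."""
    hx = np.concatenate([h_prev, x_t])
    f_t = sigmoid(p["W_f"] @ hx + p["b_f"])         # forget gate
    i_t = sigmoid(p["W_i"] @ hx + p["b_i"])         # input gate
    c_tilde = np.tanh(p["W_c"] @ hx + p["b_c"])     # candidate state vector
    c_t = f_t * c_prev + i_t * c_tilde              # cell state update
    o_t = sigmoid(p["W_o"] @ hx + p["b_o"])         # output gate (judging condition)
    h_t = o_t * np.tanh(c_t)                        # final output
    return h_t, c_t

hidden, inputs = 4, 1
rng = np.random.default_rng(0)
params = {k: rng.standard_normal((hidden, hidden + inputs)) if k.startswith("W")
          else np.zeros(hidden)
          for k in ["W_f", "b_f", "W_i", "b_i", "W_c", "b_c", "W_o", "b_o"]}
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_step(np.array([0.5]), h, c, params)
print(h)
```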
step 3: construction of monomer product end local life prediction model and central server end global life prediction model
After the residual life prediction model based on the LSTM is built, a single product end local life prediction model and a central server end global life prediction model are further built.
Both the local and the global life prediction models are built from an LSTM layer, a Dropout layer and a Dense layer. In general, an initial model is designed from experience, and the number of LSTM layers is then increased or decreased according to the accuracy of the prediction results on the input samples and the length of the prediction horizon; one to three LSTM layers usually satisfy the accuracy requirement. FIG. 5 shows an example LSTM life prediction model according to the invention with 2 LSTM layers and 2 Dense layers. The input is the state data of the product, and the output in this example is the predicted life of the product.
In the invention, to facilitate the interactive updating of parameters between the central server end and the single product ends, the life prediction model structure at the central server end is identical to that at the single product ends, which guarantees convergence of the global model; therefore, when the models are built, the central-server life prediction model and the single-product-end life prediction models are completed in a single construction process.
With the model structure of the single-product-end life prediction models fixed, the aggregated single-product-end weight parameters are used as initialization parameters. Therefore, the weight parameters of the single product ends are first aggregated, and it is then considered which part is selected as the initial parameters of the single-product-end prediction model for the next iteration.
First, the parameter aggregation of the single-product-end local life prediction models is described. The existing approach weights the model weights according to the number of samples at each product end, so that products with more samples receive larger coefficients; however, it ignores the influence of the model loss value on the coefficient, so the prediction accuracy of the global model is not high enough. The invention instead weights the weight values comprehensively according to the loss value and the number of samples, as shown in formula (1.1), where w_i denotes the weights of the i-th product-end prediction model, loss_i denotes the mean-square-error value of the i-th product-end prediction model, w denotes the weights of the central-server prediction model, and k denotes the number of single product ends. The idea is that a single-product-end model with a large loss value should occupy a larger share of the global model, since its local data have a large influence on the global model. The w computed by the central server is then sent to the single product ends to update their weights.
w = Σ_{i=1}^{k} (loss_i / Σ_{j=1}^{k} loss_j) · w_i        (1.1)
The above process is one complete iteration of the information exchange between the single product ends and the central server end; it must be repeated many times for the global model to reach a satisfactory prediction performance. A fixed number of iterations m is usually preset, and the process stops after it has been repeated m times.
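A sketch of this aggregation step, assuming the loss-normalized form of formula (1.1) reconstructed above in which the coefficients are the normalized loss values, might look as follows; the layer-wise weight lists and the toy values are illustrative.

```python
# Sketch of loss-weighted aggregation: product-end models with a larger loss value
# receive a larger coefficient in the global average (applied layer by layer).
import numpy as np

def aggregate(client_weights, client_losses):
    """client_weights: per-client lists of per-layer arrays; client_losses: one loss per client."""
    losses = np.asarray(client_losses, dtype=float)
    coeffs = losses / losses.sum()                 # larger loss -> larger coefficient
    num_layers = len(client_weights[0])
    return [sum(coeffs[i] * client_weights[i][layer] for i in range(len(client_weights)))
            for layer in range(num_layers)]

w_clients = [[np.full((2, 2), v)] for v in (1.0, 2.0, 3.0)]   # toy single-layer weights
print(aggregate(w_clients, client_losses=[0.10, 0.20, 0.70])[0])
```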
Next, the choice of which part to use as the initialization parameters of the local prediction model for the next iteration is described. Because the global life prediction model here is LSTM-based, a way of combining the LSTM with model-based transfer learning is needed. The approach is to fix the structural parameters and weight parameters of the LSTM layer by layer, then re-input the data to train the remaining layers and obtain new weight parameters, and select the optimal number of frozen layers and the optimal structural parameters according to the prediction accuracy of the model. The structural parameters refer to settings such as the learning rate, activation functions and optimizer of the model, and the weight parameters refer to the biases and weight values of the model. The principle is that the first layers of a trained model are already good at capturing the feature relationships of the input data; transferring these layers directly saves retraining the model to capture the input features again, and how many layers are actually transferred is determined in the specific application by the accuracy of the output prediction results. The invention can therefore complete transfer learning by freezing part or all of the global life prediction model at the central server end.
Exemplary embodiments: prediction of remaining life of lithium batteries
Lithium batteries of different models have certain similarities and certain differences and can be regarded as a group product. The degradation trends of different battery models vary considerably, collecting lithium battery capacity data is expensive, and different enterprises are generally unwilling to share data because of competition, so the LSTM-based lithium battery life prediction method of the invention is well suited to predicting lithium battery life.
The data set used in the invention was collected through cycle-life experiments on a lithium-ion battery cycle-life test bench provided by a battery R&D enterprise. The data were collected at a temperature of 25 °C using a voltage-limited constant-current charge-discharge mode, and each experiment stopped when the capacity of the lithium battery had degraded to 82% of its initial value.
The data set includes capacity degradation data for 10 different models of lithium-ion battery, 3 of which are used here (groups A, B, C), giving 13 data sets in total, each containing on average 1000 lithium battery capacity data points. The cathode materials are the same across the groups, but the anodes differ. The degradation data of the selected battery models at 25 °C are grouped as shown in Table 1.
Table 1. Battery grouping at 25 °C
The A, B and C lithium batteries above were combined into data sets for model training; the specific combinations are shown in Table 2, 35 combinations in total.
Table 2. Data set groupings
Because several groups of lithium-ion batteries of different models are selected, the failure-process trends differ considerably between groups; the raw degradation data (remaining capacity) of the different battery models are shown in FIG. 6. It can be seen that the failure-process trends of the unprocessed raw data are not the same, which is unfavorable for the convergence of the subsequent central-server global life prediction model.
Because the materials differ between battery models, the lithium battery capacity data must be normalized during the cycle-life test processing. Since no distance metric or covariance calculation is involved here, the simpler min-max normalization is adopted. The failure threshold is set to 82% of the initial capacity of the lithium battery, the initial capacity is mapped to 1 and the failure threshold to 0, and the sample data X = (x_1, x_2, ..., x_n) are normalized as below. The normalized lithium battery sample data at 25 °C are shown in FIG. 7; the normalization gives the samples the same distribution, which helps improve the accuracy of the life prediction results.
x' = (x − x_fail) / (x_0 − x_fail), where x_0 is the initial capacity and x_fail = 0.82·x_0 is the failure threshold.
As can be seen from FIG. 7, the normalized data curve still shows spike-like fluctuations, so the battery data must also be smoothed; the locally weighted regression algorithm is selected for this. FIG. 8 compares a lithium battery sample before and after smoothing: the black curve is plotted from the raw degradation data and the blue curve from the smoothed data. The raw-data curve shows burr-like fluctuations, while the smoothed curve, after locally weighted regression, removes the noise interference while staying close to the trend of the raw-data curve, which improves the accuracy of the prediction results.
After normalization and smoothing, the data set is divided; the hold-out method is used directly because the amount of sample data in each group is sufficient. Here 60% of the data set is used as the training set and the remaining data as the test set.
For local prediction at the single product ends, each single product trains a local life prediction model on its own data and then uploads the model's weight parameters to the central server end. The local prediction model at each single product end is an LSTM model, and all three single product ends use the same model structure: two LSTM layers, two Dropout layers to prevent overfitting, and a top prediction output layer. The activation function of the first two LSTM layers is tanh (hyperbolic tangent), the activation function of the final output Dense layer is linear, and the Dropout rate is set to 0.3. The Dropout rate was chosen through repeated trials: it is increased when the model overfits and decreased when it underfits, and the model fits best at 0.3. The model structure is shown in FIG. 9. During model training, the batch size batch_size = 32, the input data length of a single sample is 50 and the output length is 1 (i.e., the 51st data point is predicted from the previous 50), the optimizer is rmsprop, the learning rate is 0.001, and the number of training epochs is 20.
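A Keras-style sketch of this local model and its training setup is given below. The number of LSTM units is not stated in the text, so 64 is an assumed value, and the linear degradation series is only a placeholder for real capacity data.

```python
# Sketch of the single-product-end local model: LSTM + Dropout + LSTM + Dropout + Dense.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_local_model(window=50, units=64, dropout_rate=0.3):
    model = keras.Sequential([
        layers.LSTM(units, activation="tanh", return_sequences=True, input_shape=(window, 1)),
        layers.Dropout(dropout_rate),
        layers.LSTM(units, activation="tanh"),
        layers.Dropout(dropout_rate),
        layers.Dense(1, activation="linear"),       # top prediction output layer
    ])
    model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss="mse")
    return model

def make_windows(series, window=50):
    """Predict the 51st point from the previous 50 points."""
    x = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return x[..., None], y

series = np.linspace(1.0, 0.0, 1000)                # placeholder degradation curve
x, y = make_windows(series)
model = build_local_model()
model.fit(x, y, batch_size=32, epochs=20, verbose=0)
```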
Life prediction is performed with this model on the A, B and C lithium battery data acquired at 25 °C, and the accuracy of the prediction results is recorded; degradation data of one battery model is input at a time, and some of the results are shown in FIG. 10. The prediction quality can be judged from the overlap of the two curves; overall it is acceptable from the plots, although there are a few cases with large prediction errors, such as B_2006 and C_3006.
The model at the central server is the final global prediction model; so that the global model predicts well on the data of every single product, the global life prediction model is also built with an LSTM network.
To facilitate the final convergence of the global model and the processing of the parameters (such as weights) transmitted by the single devices, the global model adopts the same structure as the single-device-end prediction model: two LSTM layers, two Dropout layers to prevent overfitting, and a top prediction output layer. During model training, the batch size batch_size = 32, the single-sample data length is 50, the optimizer is Adam, the learning rate is 0.001, and the number of training epochs is 20.
According to the disclosure of the invention, the weight parameters of the single product ends are first aggregated and processed, and it is then considered which part is selected as the initialization parameters of the single-product-end prediction models for the next iteration.
First, the aggregation of the single-product-end weight parameters is introduced: the central server further processes the weights received from the single products, weighting the weight values with the loss values as described above. As shown in formula (2), w_i denotes the weights of the i-th product-end prediction model, loss_i denotes the loss value of the i-th product-end prediction model, and w denotes the weights of the central-server prediction model. The w computed by the central server is then sent to the single product ends to update their weights; in practice it was found that setting the number of communication rounds between the single device ends and the central server end to 20 already gives a good prediction performance.
w = Σ_{i=1}^{k} (loss_i / Σ_{j=1}^{k} loss_j) · w_i        (2)
The lithium battery data of the 1 group-A sample, 7 group-B samples and 5 group-C samples are matched and combined, giving 35 combinations; the 3 samples in each combination are used as single-device data to train the local prediction models at the single product ends, yielding the weight parameters w_i and loss values loss_i. The processed weight w is obtained by the weighted-average processing, and the weight parameters are then issued to each single product to update its model.
The choice of which part to use as the local prediction model initialization parameters for the next iteration is now described; three schemes are considered in total, shown in FIG. 11, and a sketch of the corresponding layer-freezing step follows this paragraph. In scheme one, the structure of the previous round's local life prediction model is adopted directly and kept unchanged, the weight parameters and structural parameters of the first four layers are frozen, the weights of the remaining layers are randomized, and the next round of training is performed, with the state data of the previous 50 time steps as input and the state data of the next time step as output. In scheme two, the structure of the previous round's local life prediction model is adopted and kept unchanged, the weight parameters and structural parameters of the third and fourth layers are frozen, and the weights of the remaining layers are randomized to obtain the new model. In scheme three, the structure of the previous round's local life prediction model is adopted and kept unchanged, the weight parameters and structural parameters of the first two layers are frozen, and the weights of the remaining layers are randomized to obtain the new model.
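The sketch below illustrates the layer-freezing step, assuming a Keras Sequential model with the five layers listed above; scheme one freezes layer indices 0-3, while schemes two and three would pass (2, 3) and (0, 1) respectively.

```python
# Sketch of scheme one: same structure, first four layers (LSTM, Dropout, LSTM, Dropout)
# keep their transferred weights and are frozen; the remaining layers are re-initialized.
from tensorflow import keras

def prepare_next_round_model(previous_model, frozen_indices=(0, 1, 2, 3)):
    model = keras.models.clone_model(previous_model)   # same structure, freshly initialized weights
    model.build(previous_model.input_shape)
    for i, layer in enumerate(model.layers):
        if i in frozen_indices:
            layer.set_weights(previous_model.layers[i].get_weights())  # copy transferred weights
            layer.trainable = False                                     # keep them fixed
    model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss="mse")
    return model

# next_model = prepare_next_round_model(model)   # 'model' from the previous training round
```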
Following scheme one (the structure of the previous round's local model kept unchanged, the first four layers frozen, the remaining weights randomized, and the next round of training performed), the average accuracy of the prediction results is 95.48%; some of the prediction results are plotted in FIG. 12.
Following scheme two (the previous round's model structure kept unchanged, the third and fourth layers frozen, the remaining weights randomized to obtain the new model), the average accuracy of the prediction results is 95.55%, a slight improvement over scheme one; some of the prediction results are plotted in FIG. 13.
Following scheme three (the previous round's model structure kept unchanged, the first two layers frozen, the remaining weights randomized to obtain the new model), the average accuracy of the prediction results is 95.72%, an improvement over scheme two; some of the prediction results are plotted in FIG. 14.
The prediction accuracies of the central-server global model and of the single-product-end local models are summarized in the following table. The average accuracy of the single-product-end local model predictions is 95.17%, while the average accuracies of the central-server global model predictions under the three schemes are 95.48%, 95.55% and 95.72%; the global model therefore improves on the local prediction models to a certain extent, and scheme three is finally selected as the transfer learning strategy based on its accuracy.
Table 5.3. Prediction accuracy statistics
In this embodiment, the LSTM model and the construction method disclosed in the invention are used to predict the remaining life of the lithium battery samples; from the plots and the recorded accuracies of the prediction results, it is clear that using the LSTM-based prediction model effectively improves the accuracy of lithium battery remaining life prediction.
The invention studies an LSTM method for predicting the life of group products and provides a concrete implementation for the problem of group-product remaining life prediction: a local life prediction model is first trained at each single product end, the parameters of the local models are then processed, and the single product ends transfer part of the processed parameters into the next iteration, finally yielding the global life prediction model.
The existing FedAvg algorithm is improved at the central-server global life prediction model: the weight coefficients are assigned not according to the amount of data at each product end but according to the loss value of each product-end life prediction model, with larger loss values receiving correspondingly larger coefficients, which improves the accuracy of the global life prediction model.
In conclusion, although the invention has been described with reference to the embodiments shown in the drawings, equivalent or alternative means may be used without departing from the scope of the claims. The components described and illustrated herein are merely examples of systems/devices and methods that may be used to implement the disclosed embodiments and may be replaced with other devices and components without departing from the scope of the claims.

Claims (10)

1. An LSTM-based lithium battery life prediction method, comprising:
acquiring a capacity degradation data set of a lithium battery;
preprocessing the capacity degradation data set, wherein the preprocessing comprises the following steps:
data normalization processing, wherein the data normalization method is min-max normalization:
dividing the data set into a training set and a testing set before model training;
an LSTM-based residual life prediction model is constructed, which is characterized in that,
the residual life prediction model is based on a neural network RNN for processing time series data and a long-short-term memory (LSTM) network, and the LSTM is used for eliminating gradient explosion and gradient disappearance problems in the RNN network;
after constructing a residual life prediction model based on LSTM, further constructing three lithium battery local life prediction models and a central server-side global life prediction model, and wherein;
the LSTM is of a network structure, and any LSTM unit comprises a forgetting gate, an input gate and an output gate;
the three lithium battery local life prediction models have the same structure, and the model structure comprises two LSTM layers, two Dropout layers and a top prediction output layer, wherein the two Dropout layers prevent overfitting;
and finally, sending the training set into the constructed lithium battery local life prediction model and the central server-side global life prediction model, performing model training, and outputting the predicted residual life of the lithium battery.
2. The LSTM-based lithium battery life prediction method of claim 1, wherein the activation function of the two LSTM layers is the hyperbolic tangent (tanh) function, the activation function of the final output Dense layer is linear, and the Dropout rate is set to 0.3.
3. The LSTM based lithium battery life prediction method of claim 1, wherein in the forgetting gate, the following operations are performed:
information from the previous time step is selectively discarded before being passed on: h_{t-1} and x_t are substituted into the following formula to compute a vector with values in [0, 1], which indicates how much of the cell state C_{t-1} is retained or discarded; 0 means nothing is retained and 1 means everything is retained;
f_t = σ(W_f · [h_{t-1}, x_t] + b_f).
4. The LSTM-based lithium battery life prediction method according to claim 3, wherein it is further decided what new information to add to the cell state: i_t is the weight coefficient of the updated information, obtained by substituting h_{t-1} and x_t into the first formula below; the activation function tanh is then applied to h_{t-1} and x_t to generate the new candidate state vector C̃_t:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
5. The LSTM-based lithium battery life prediction method of claim 4, wherein the state information is updated by the following formula:
C_t = f_t * C_{t-1} + i_t * C̃_t
and the output of the cell is also determined from h_{t-1} and x_t: first, h_{t-1} and x_t are substituted into the formula for o_t below to obtain the judging condition; then C_t is passed through the tanh activation function to compute a vector with values in [-1, 1], which is multiplied by the judging condition to obtain the final output, wherein:
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t * tanh(C_t).
6. The LSTM-based lithium battery life prediction method according to any one of claims 1 to 5, wherein during model training, the batch size batch_size = 32, the single-sample data length is 50, the optimizer Adam is selected, the learning rate is 0.001, and the number of training epochs is 20.
7. The LSTM-based lithium battery life prediction method according to any one of claims 1 to 5, wherein after model training, each local lithium battery life prediction model sends its weight coefficients and loss value to the central-server global life prediction model, and the central server further processes the weights received from the lithium battery product ends by weighting the weight values.
8. The LSTM based lithium battery life prediction method of claim 7, wherein the weight value is determined by:
w = Σ_{i=1}^{k} (loss_i / Σ_{j=1}^{k} loss_j) · w_i
wherein w_i denotes the weights of the i-th product-end prediction model, loss_i denotes the loss value of the i-th product-end prediction model, and w denotes the weights of the central-server prediction model; the calculated w is then sent by the central server to the single product ends to update their weights, and the number of communication rounds between the single device ends and the central server end is set to 20.
9. The LSTM-based lithium battery life prediction method of claim 8, wherein after model training, the initialization parameters of the local prediction model for the next iteration of the local lithium battery life prediction model are determined by the following method: the structure of the previous round's local life prediction model is adopted directly and kept unchanged, the weight parameters and structural parameters of the first four layers are frozen, the weights of the remaining layers are randomized, the next round of training is carried out, the state data of the previous 50 time steps are input, and the state data of the next time step is output.
10. The LSTM based lithium battery life prediction method of claim 9, wherein an average accuracy of the prediction result is 95.48% by using the initialization parameter determination method.
CN202110778444.8A 2021-07-09 2021-07-09 Lithium battery life prediction method based on LSTM Active CN113536671B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110778444.8A CN113536671B (en) 2021-07-09 2021-07-09 Lithium battery life prediction method based on LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110778444.8A CN113536671B (en) 2021-07-09 2021-07-09 Lithium battery life prediction method based on LSTM

Publications (2)

Publication Number Publication Date
CN113536671A CN113536671A (en) 2021-10-22
CN113536671B true CN113536671B (en) 2023-05-30

Family

ID=78098242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110778444.8A Active CN113536671B (en) 2021-07-09 2021-07-09 Lithium battery life prediction method based on LSTM

Country Status (1)

Country Link
CN (1) CN113536671B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114879050A (en) * 2022-06-14 2022-08-09 山东大学 Intelligent and rapid power battery service life testing method and system based on cloud edge cooperation
CN115600512B (en) * 2022-12-01 2023-04-14 深圳先进技术研究院 Tool life prediction method based on distributed learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188920A (en) * 2019-04-26 2019-08-30 华中科技大学 A kind of lithium battery method for predicting residual useful life
WO2020191801A1 (en) * 2019-03-27 2020-10-01 东北大学 Lithium ion battery remaining life prediction method based on wolf pack optimization lstm network
CN111999648A (en) * 2020-08-20 2020-11-27 浙江工业大学 Lithium battery residual life prediction method based on long-term and short-term memory network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200011932A1 (en) * 2018-07-05 2020-01-09 Nec Laboratories America, Inc. Battery capacity fading model using deep learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020191801A1 (en) * 2019-03-27 2020-10-01 东北大学 Lithium ion battery remaining life prediction method based on wolf pack optimization lstm network
CN110188920A (en) * 2019-04-26 2019-08-30 华中科技大学 A kind of lithium battery method for predicting residual useful life
CN111999648A (en) * 2020-08-20 2020-11-27 浙江工业大学 Lithium battery residual life prediction method based on long-term and short-term memory network

Also Published As

Publication number Publication date
CN113536671A (en) 2021-10-22

Similar Documents

Publication Publication Date Title
CN113536670B (en) Federal migration learning method for predicting residual life of group product
CN111539515B (en) Complex equipment maintenance decision method based on fault prediction
CN109146246B (en) Fault detection method based on automatic encoder and Bayesian network
CN113536671B (en) Lithium battery life prediction method based on LSTM
CN112990556A (en) User power consumption prediction method based on Prophet-LSTM model
CN105096053A (en) Health management decision-making method suitable for complex process system
CN113722985B (en) Method and system for evaluating health state and predicting residual life of aero-engine
CN111461551B (en) Deep learning and SPC criterion-based electric submersible pump fault early warning method
CN113449919B (en) Power consumption prediction method and system based on feature and trend perception
CN111709577B (en) RUL prediction method based on long-range correlation GAN-LSTM
CN112784920A (en) Cloud-side-end-coordinated dual-anti-domain self-adaptive fault diagnosis method for rotating part
CN112989976A (en) Equipment failure mode prediction method based on double-depth learning model
CN116595319A (en) Prediction method and system applied to rail transit motor health state evaluation
CN113469013B (en) Motor fault prediction method and system based on transfer learning and time sequence
Ke et al. State of health estimation of lithium ion battery with uncertainty quantification based on Bayesian deep learning
CN117420442A (en) MNQ-LSTM-based battery remaining life prediction method
CN116933003A (en) Remaining service life prediction method of unmanned aerial vehicle engine based on DaNet
CN113468720B (en) Service life prediction method for digital-analog linked random degradation equipment
CN116020879A (en) Technological parameter-oriented strip steel hot continuous rolling space-time multi-scale process monitoring method and device
CN113569384A (en) Digital-analog-linkage-based online adaptive prediction method for residual service life of service equipment
CN115456073B (en) Long-short-term memory-based modeling analysis method for generating type countermeasure network model
Nguyen et al. Investigation on the capacity of deep learning to handle uncertainties in remaining useful life prediction
CN117891214B (en) Method and device for monitoring and fault diagnosis of hot continuous rolling process under Yun Bianduan cooperation
EP4328684A1 (en) Predicting a batch quality value by a machine learning model for a dedicated batch of material in a production line
Zhang et al. Anomaly Detection for Aircraft Based on Multivariate LSTM and Predictive Residual Test

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant