CN117151243A

CN117151243A - Training method, prediction method, device and medium for low-voltage prediction model of storage battery

Info

Publication number: CN117151243A
Application number: CN202311066165.4A
Authority: CN
Inventors: 岳楷岚; 李俊杰; 吴上波
Original assignee: Thalys Automobile Co ltd
Current assignee: Thalys Automobile Co ltd
Priority date: 2023-08-23
Filing date: 2023-08-23
Publication date: 2023-12-01

Abstract

The application relates to a training method, a prediction method, a device and a medium of a low-voltage prediction model of a storage battery, wherein the training method comprises the steps of obtaining a plurality of pieces of sample data according to historical message information of a vehicle, and determining a time interval according to sampling moments of two adjacent pieces of sample data; when the voltage value of the current sample data is smaller than a low voltage threshold value, determining that the data tag of the current sample data is a fault tag; when the data tag of the current sample data is a fault tag and the time interval between the previous sample data and the current sample data is smaller than or equal to a time threshold value, determining the data tag of the previous sample data as the fault tag; based on the sample data with the data tag, a low-voltage prediction model of the storage battery is obtained through training so as to predict the low-voltage fault of the storage battery according to the current message information.

Description

Training method, prediction method, device and medium for low-voltage prediction model of storage battery

Technical Field

The application relates to the technical field of vehicle monitoring, in particular to a training method, a prediction method, a device and a medium of a low-voltage prediction model of a storage battery.

Background

The automobile storage battery is a part of a new energy automobile electrical system and mainly used for supplying power to equipment such as a low-voltage electrical system and the like. In actual use, the storage battery can have low-voltage faults due to various reasons, and at the moment, the situations of prohibiting high-voltage, prohibiting starting of the range extender and even power interruption can occur to the vehicle, so that the vehicle experience and the driving safety are affected.

At present, machine learning is used to predict vehicle faults. Supervised learning requires labeled training data, but the volume of training data is huge, so labeling the fault data within it is very laborious.

Disclosure of Invention

Based on this, a training method, a prediction method, a device and a medium for a low-voltage prediction model of a storage battery are provided, which solve the prior-art problem that labeling training data is difficult.

In one aspect, a training method for a low-voltage prediction model of a storage battery is provided, including:

acquiring a plurality of pieces of sample data according to historical message information of a vehicle, wherein each piece of sample data at least comprises a voltage value of a storage battery and a sampling moment of the voltage value, and the sample data also comprises training characteristics;

determining a time interval according to sampling moments of two adjacent pieces of sample data;

determining a data tag of the sample data, comprising: when the voltage value of the current sample data is smaller than a low voltage threshold value, determining that the data tag of the current sample data is a fault tag; and when the data tag of the current sample data is a fault tag and the time interval between the previous sample data and the current sample data is smaller than or equal to a time threshold value, determining that the data tag of the previous sample data is a fault tag, otherwise, determining that the data tag is a normal tag;

and training to obtain a low-voltage prediction model of the storage battery based on the sample data with the data tag so as to predict the low-voltage fault of the storage battery according to the current message information.

In one embodiment, the obtaining the plurality of pieces of sample data of the storage battery according to the historical message information of the vehicle includes:

determining the power supply state of a direct current conversion module in the historical message information;

and, when the power supply state is invalid, determining that the storage battery state in the historical message information is a discharge state, so as to acquire the sample data from the historical message information in the discharge state.

In one embodiment, the training to obtain a battery low voltage prediction model based on the sample data with the data tag includes:

inputting the sample data into an initial model, wherein the initial model comprises a plurality of classification trees in series;

the loss of output is obtained according to the following mathematical expression:

wherein $L_m$ is the output loss, $N$ is the number of samples in the sample data, $F_{k-1}(x_i, W_{k-1})$ is the predicted value for input sample data $x_i$ given by the initial model consisting of the first $k-1$ classification trees with model parameters $W_{k-1}$, $L(y_i, F_{k-1}(x_i, W_{k-1}))$ is the error between the true value $y_i$ and the predicted value, $\lambda$ is the regularization coefficient, and the term it multiplies is the regularization term;

$\alpha_i$ is a weighting coefficient, and:

wherein $y_i = 1$ indicates that the data tag of input $x_i$ is a fault tag, $y_i = 0$ indicates that the data tag of input $x_i$ is a normal tag, and $c$ is a constant greater than 1.

Based on the minimization of the loss, parameters of the initial model are adjusted to obtain the battery low voltage predictive model.
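The equation images for the loss and the weighting coefficient are not present in this text. One form that is consistent with the symbol definitions above is sketched below. It is an assumption, not the expression as filed: in particular, whether the sum is normalized by $N$, the exact form of the regularization term, and the weight of 1 for normal samples are all guesses.

```latex
L_m = \sum_{i=1}^{N} \alpha_i \, L\bigl(y_i,\, F_{k-1}(x_i, W_{k-1})\bigr) + \lambda \lVert \omega \rVert_2^2,
\qquad
\alpha_i =
\begin{cases}
c, & y_i = 1 \\
1, & y_i = 0
\end{cases}
\quad (c > 1)
```

Under this reading, each fault sample contributes $c$ times as much to the loss as a normal sample, which is how a larger weight on the rare class would counteract sample imbalance.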

obtaining a first training set according to feature dimension processing of the sample data, wherein the first training set comprises:

and according to the mutual exclusion characteristics in the sample data, carrying out mutual exclusion characteristic binding to obtain dense characteristics so as to reduce the number of the characteristics in the sample data.

obtaining a second training set according to sample dimension processing of the sample data, wherein the second training set comprises:

obtaining a gradient value corresponding to each sample data according to the loss;

sequencing the sample data according to the absolute value of the gradient value, and determining large gradient sample data and small gradient sample data according to a preset proportion threshold value;

and reserving the large gradient sample data to obtain a first subset, randomly sampling the small gradient sample data according to a preset sampling proportion to obtain a second subset, and combining the first subset and the second subset to obtain the second training set.

traversing the characteristics in the sample data of the father node of the classification tree, and constructing a histogram according to the values of the characteristics;

traversing each bucket of the histogram to obtain a splitting gain corresponding to each bucket;

the bucket value corresponding to the maximum splitting gain is determined as the optimal splitting point of the parent node.

In one embodiment, the obtaining the splitting gain corresponding to each bucket includes:

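the splitting gain is obtained according to the following mathematical expression:

$$\tilde{V}_j(d) = \frac{1}{n}\left(\frac{\left(\sum_{x_i \in A_l} g_i + \frac{1-a}{b}\sum_{x_i \in B_l} g_i\right)^2}{n_l^j(d)} + \frac{\left(\sum_{x_i \in A_r} g_i + \frac{1-a}{b}\sum_{x_i \in B_r} g_i\right)^2}{n_r^j(d)}\right)$$

wherein $A_l = \{x_i \in A : x_{ij} \le d\}$, $A_r = \{x_i \in A : x_{ij} > d\}$, $B_l = \{x_i \in B : x_{ij} \le d\}$, $B_r = \{x_i \in B : x_{ij} > d\}$;

wherein $x_{ij}$ is the value of the $j$-th feature of sample data $x_i$, $d$ is the value of the splitting point, $a$ is the preset proportion threshold value, $b$ is the preset sampling proportion, $g_i$ is the negative gradient corresponding to sample data $x_i$, $n$ is the total number of samples of the node before splitting, $A$ is the first subset, $B$ is the second subset, $n_l^j(d)$ is the number of samples of the left child node divided by the splitting point, $n_r^j(d)$ is the number of samples of the right child node divided by the splitting point, and $\tilde{V}_j(d)$ is the splitting gain.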

In yet another aspect, a method for predicting low voltage of a battery is provided, including:

obtaining input data according to the current message information, wherein the input data comprises prediction characteristics;

invoking a pre-trained low-voltage prediction model of the storage battery to input the input data into the low-voltage prediction model of the storage battery to obtain a low-voltage fault prediction result;

the storage battery low-voltage prediction model is obtained through training by the training method.

In still another aspect, a training device for a low-voltage prediction model of a storage battery is provided, including:

the extraction module is used for acquiring a plurality of pieces of sample data according to historical message information of the vehicle, wherein each piece of sample data at least comprises a voltage value of the storage battery and a sampling moment of the voltage value, and the sample data also comprises training characteristics;

the calculation module is used for determining a time interval according to sampling moments of two adjacent pieces of sample data;

the labeling module is used for determining the data label of the sample data and comprises the following steps: when the voltage value of the current sample data is smaller than a low voltage threshold value, determining that the data tag of the current sample data is a fault tag; and when the data tag of the current sample data is a fault tag and the time interval between the previous sample data and the current sample data is smaller than or equal to a time threshold value, determining that the data tag of the previous sample data is a fault tag, otherwise, determining that the data tag is a normal tag;

and the training module is used for training and obtaining a low-voltage prediction model of the storage battery based on the sample data with the data tag.

There is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of training a battery low voltage prediction model or performs the method of battery low voltage prediction.

According to the training method of the low-voltage prediction model of the storage battery, sample data are extracted from historical message information. When the voltage value of a piece of sample data is smaller than the low voltage threshold value, that sample data is given a fault tag. After the current sample data has been given a fault tag, the sample data are traced back: if the time interval between the previous sample data and the current sample data is smaller than or equal to the time threshold value, the previous sample data is also given a fault tag. Because the training method uses this backtracking approach, the sample data at and before the occurrence of a fault can be labeled as fault data in a single traversal.

Drawings

FIG. 1 is a flow chart of a method for training a low-voltage predictive model of a battery in one embodiment;

FIG. 2 is a schematic diagram of the connection of a battery, a DCDC module, and a low voltage load;

FIG. 3 is a graph of battery voltage versus time for one embodiment;

FIG. 4 is a schematic diagram of model training in one embodiment;

FIG. 5 is a schematic diagram of mutually exclusive feature bundling in one embodiment;

FIG. 6 is a schematic diagram of a histogram of a constructed feature in one embodiment;

FIG. 7 is a flow chart of a method of battery low voltage prediction in one embodiment;

FIG. 8 is a schematic flow diagram of model creation and online detection in one embodiment;

FIG. 9 is a block diagram of the structure of a training device for a low-voltage prediction model of a battery in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The storage battery plays a vital role in the normal operation of a new energy automobile. It mainly supplies power to the low-voltage electrical system and to equipment such as that used for engine starting and acceleration. Because the storage battery is repeatedly charged and discharged, low-voltage faults inevitably occur for various reasons. When one occurs, the vehicle may be prohibited from switching on high voltage, prohibited from starting the engine, or may lose power, which seriously affects the user experience and driving safety.

In the prior art, the fault monitoring of the storage battery of the vehicle adopts a real-time monitoring mode, namely, adopts a scheme of alarming when the monitored voltage is lower than a certain threshold value, and is processed through post maintenance and planned maintenance, but the early warning capability of the mode on faults is poor.

By analyzing a large amount of historical data, machine learning can identify patterns and abnormal behavior related to faults and discover potential signs of a fault in advance. Data annotation is essential to machine learning: it is key to building a high-quality training data set and to improving the performance of the learning algorithm. However, when a huge amount of training data is required, the annotation workload is also huge, labeling the fault data becomes very laborious, and the cost of machine learning inevitably increases.

The application provides a training method of a low-voltage prediction model of a storage battery, which adopts a backtracking mode to label sample data so as to train and obtain the model.

As shown in fig. 1, the training method includes the steps of:

step 101, acquiring a plurality of pieces of sample data according to historical message information of a vehicle.

Illustratively, the historical message information consists of CAN (Controller Area Network) messages of the vehicle. Because CAN messages are highly reliable, fast to transmit, flexible, and able to carry a large amount of data, the historical data of the vehicle is stored in a data warehouse in the form of CAN messages. To obtain specific signal fields, the historical CAN messages are first parsed and cleaned to obtain the training features related to the storage battery over a certain period of time. The training features include the storage battery voltage value, the storage battery SOC (State of Charge), the storage battery SOH (State of Health), signal fields related to the power battery system, signal fields related to the DCDC (direct current converter) system, time signal fields, basic state signal fields of the vehicle, and the like.

It should be noted that the battery is also in a low power discharge state when the vehicle is in a dormant state, but no CAN signal is reported at this time, and therefore, the above data does not include the state of the battery in the dormant state of the vehicle, thereby reducing a part of data which is not highly relevant to the use of the vehicle.

A new energy automobile can use automatic power replenishment, in which the power battery charges the storage battery. The connection of the storage battery, the DCDC module and the low-voltage load is shown in fig. 2. According to the relationship among the DCDC module, the low-voltage storage battery and the low-voltage load, the storage battery has 3 states:

discharge state: the DCDC module is not activated, and the electric energy of the low-voltage load is provided by the storage battery; this scenario mainly occurs while the vehicle is sleeping, and low-voltage faults mainly occur in this scenario;

charging state: the DCDC module is activated; the vehicle is not in use but has been woken up by automatic power replenishment. In this scenario the storage battery is being charged, and the low-voltage load is effectively powered directly by the DCDC module;

standby state: the DCDC module is activated, the low-voltage storage battery supplies essentially no power externally, and the entire low-voltage load is powered directly by the DCDC module; this scenario mainly occurs when the vehicle is driving normally or is in high-voltage standby.

According to the above working principle, a low-voltage fault can occur only while the storage battery is discharging.

In order to further reduce the data volume, the embodiment firstly checks the power supply state of the direct current conversion module in the historical message information; and determining that the storage battery is in a discharge state in the historical message information according to an invalid power supply state (namely, the DCDC module is not activated), and screening the historical message information in the discharge state for acquiring the sample data.
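The screening step above can be sketched as a simple filter over parsed messages. This is a minimal illustration, not the implementation used in the application; the field name `dcdc_supply_valid` and the record layout are hypothetical, since the real signal names depend on the vehicle's CAN database.

```python
from typing import Dict, List

# Hypothetical field name for the DCDC power-supply state; the real signal
# name depends on the vehicle's CAN database.
DCDC_STATE_FIELD = "dcdc_supply_valid"


def select_discharge_records(records: List[Dict]) -> List[Dict]:
    """Keep parsed messages whose DCDC power-supply state is explicitly invalid.

    An invalid state means the DCDC module is not activated, so the storage
    battery is discharging. Records with a missing state are dropped.
    """
    return [r for r in records if r.get(DCDC_STATE_FIELD) is False]


records = [
    {"vin": "V1", "ts": 0, "voltage": 12.6, "dcdc_supply_valid": True},
    {"vin": "V1", "ts": 60, "voltage": 12.1, "dcdc_supply_valid": False},
    {"vin": "V1", "ts": 120, "voltage": 11.8, "dcdc_supply_valid": False},
    {"vin": "V1", "ts": 180, "voltage": 11.7},  # state missing: dropped
]
discharge = select_discharge_records(records)
print([r["ts"] for r in discharge])  # prints [60, 120]
```

Dropping records whose state is missing is a design choice of this sketch; the text does not say how such records are handled.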

Step 102, determining a time interval according to sampling moments of two adjacent pieces of sample data.

The historical message information comprises the voltage value of the storage battery and the sampling time of that voltage value. The sample data are sorted in reverse chronological order, and the time interval between each piece of data and the next piece of data is calculated by iterating over them.
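As a small sketch of this step, the function below sorts timestamps newest-first and pairs each one with the gap to the next (older) entry. The function name and the use of plain numeric timestamps in seconds are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def intervals_to_previous(timestamps: List[float]) -> List[Tuple[float, Optional[float]]]:
    """Sort newest-first and pair each timestamp with the gap to the next (older) one."""
    ordered = sorted(timestamps, reverse=True)
    result = []
    for i, t in enumerate(ordered):
        # The oldest sample has no predecessor, so its gap is None.
        gap = t - ordered[i + 1] if i + 1 < len(ordered) else None
        result.append((t, gap))
    return result


pairs = intervals_to_previous([100.0, 160.0, 40.0, 400.0])
print(pairs)
```

In practice the same computation would be done separately for each vehicle, so that intervals are never taken across two different vehicles.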

Step 103, data labeling, namely determining a data tag of the sample data.

For the battery state, the data tag is either a fault tag, indicating that a low-voltage fault has occurred, or a normal tag, indicating that no low-voltage fault has occurred.

As shown in fig. 3, the horizontal axis represents time and the vertical axis represents the voltage value. The three dashed lines perpendicular to the vertical axis represent, from top to bottom, the normal voltage value of the storage battery, the automatic power replenishment voltage threshold, and the low-voltage fault voltage threshold. The dashed lines perpendicular to the horizontal axis separate different time periods (with a large time gap between adjacent segments). There are 4 segments of 3 types: normal discharge/automatic power replenishment, DCDC power supply, low-voltage fault, and DCDC power supply.

When automatic power replenishment works normally, a low-voltage fault is unlikely. Therefore, when a low-voltage fault occurs, the vehicle is usually in the discharge state and in a special condition in which, for various reasons, automatic power replenishment cannot work normally.

Because of the limited capacity of the storage battery itself, the vehicle typically remains continuously in the discharge state while awake for only about 30 minutes. Over such a short time, the state of the systems other than the storage battery, such as the DCDC power supply system, can be regarded as approximately unchanged; that is, the whole of segment 3 has a similar vehicle state, and this whole region can be labeled as a low-voltage fault. A model trained on data labeled in this way can therefore predict a low-voltage fault some time before the fault occurs.

The sample data is considered to be low voltage fault data when it meets any of the following conditions:

condition 1: the voltage value of the sample data is less than the low voltage threshold (i.e., the low voltage fault voltage threshold in fig. 3);

condition 2: the data at time $t_i$ is marked as fault data and $t_i - t_{i-1} \le S$, where $S$ is the time threshold; in this case the data at time $t_{i-1}$ is also marked as fault data.

Illustratively, as for the sample data shown in table 1, the sample data are grouped by vin, ordered by timestamp, with a low voltage threshold of 10.5V and a time threshold of 50000s.

Table 1:

According to this labeling scheme, the data preceding a fault can be labeled by backtracking, and labeling requires only a single traversal, which greatly reduces the workload.
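The two conditions and the backtracking pass can be sketched as follows for the samples of one vehicle. The thresholds are the example values from the text (10.5 V and 50000 s); the record keys `ts`, `voltage` and `label` and the sample values are hypothetical.

```python
from typing import Dict, List

LOW_VOLTAGE_THRESHOLD = 10.5  # volts, example value from the text
TIME_THRESHOLD = 50000        # seconds, example value from the text


def label_samples(samples: List[Dict]) -> List[Dict]:
    """Label one vehicle's samples in a single newest-to-oldest pass.

    A sample gets label 1 (fault) if its voltage is below the threshold
    (condition 1), or if the sample that follows it in time is a fault and
    lies within the time threshold (condition 2). Otherwise it gets 0.
    """
    # Copy each record so the caller's data is left untouched.
    ordered = sorted((dict(s) for s in samples), key=lambda s: s["ts"], reverse=True)
    later = None  # the sample that follows the current one in time
    for cur in ordered:
        below = cur["voltage"] < LOW_VOLTAGE_THRESHOLD
        propagated = (
            later is not None
            and later["label"] == 1
            and later["ts"] - cur["ts"] <= TIME_THRESHOLD
        )
        cur["label"] = 1 if (below or propagated) else 0
        later = cur
    return ordered[::-1]  # back to chronological order


samples = [
    {"ts": 0, "voltage": 12.4},
    {"ts": 100000, "voltage": 12.0},
    {"ts": 120000, "voltage": 11.2},
    {"ts": 130000, "voltage": 10.2},  # below 10.5 V: fault by condition 1
    {"ts": 400000, "voltage": 12.5},
]
labeled = label_samples(samples)
print([s["label"] for s in labeled])  # prints [0, 1, 1, 1, 0]
```

The fault tag propagates backwards from the sample at 130000 s to the two samples before it, and stops at the first sample because the gap of 100000 s exceeds the time threshold.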

Step 104, training to obtain a low-voltage prediction model of the storage battery based on the sample data with the data tag.

In step 104, a machine learning algorithm analyzes a large amount of data to discover patterns and rules, and uses them to make predictions or decisions. After the sample data have been prepared, appropriate features are selected to describe the data. Features are representative attributes of the data that can be used to distinguish between different classes or patterns; in general, one piece of sample data includes multiple features. A suitable machine learning model is then selected and trained on the existing sample data; the aim of training is to learn accurate rules and correlations from the data by adjusting the parameters of the model. The model is then evaluated on new data that did not take part in training to assess its performance and accuracy, and training continues until the use requirements are met.

Machine learning approaches mainly include decision trees, random forests, artificial neural networks, Bayesian learning and the like. In this embodiment, the labeled sample data form a binary classification data set with many features and a large data volume, and some fields of the CAN signals are likely to contain null values. The application therefore adopts the LightGBM (Light Gradient Boosting Machine) algorithm, which can handle null values and performs well, to train the classification model.

LightGBM is a gradient boosting framework. In this embodiment, CART classification trees are used as the LightGBM base learners. Each base learner makes a prediction for the input sample data, and the predictions are added together to obtain the final output. Equivalently, the k-th base learner is used to fit the residual of the previous k-1 base learners: in each iteration a residual tree is fitted and added to the current model. After a certain number of CART trees have been fitted, each tree is given an initial weight coefficient, and the sum of the values predicted for a sample by all the trees is the final predicted value for that sample. During training, the weight vector that minimizes the loss is computed, and the initial weight coefficients are replaced with this optimal solution.
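The additive structure described above can be illustrated with a toy ensemble of one-split trees whose outputs are summed. This is only a sketch: the feature names, thresholds and leaf values are invented, and mapping the summed score to a probability with a sigmoid is an assumption that matches a logarithmic loss for binary classification.

```python
import math
from typing import Callable, Dict, List

Tree = Callable[[Dict[str, float]], float]


def stump(feature: str, threshold: float, left: float, right: float) -> Tree:
    """A one-split tree: returns `left` if x[feature] <= threshold, else `right`."""
    def predict(x: Dict[str, float]) -> float:
        return left if x[feature] <= threshold else right
    return predict


def predict_fault_probability(trees: List[Tree], x: Dict[str, float]) -> float:
    """Sum the outputs of all base learners, then squash the score into (0, 1)."""
    raw_score = sum(tree(x) for tree in trees)
    return 1.0 / (1.0 + math.exp(-raw_score))


# Hypothetical trees; a real model would learn these values from data.
trees = [stump("voltage", 11.0, 1.2, -1.0), stump("soc", 30.0, 0.8, -0.5)]
p_low = predict_fault_probability(trees, {"voltage": 10.8, "soc": 25.0})
p_ok = predict_fault_probability(trees, {"voltage": 12.5, "soc": 80.0})
print(round(p_low, 3), round(p_ok, 3))
```

Each additional tree only has to correct what the earlier trees got wrong, which is why the final prediction is a plain sum over the trees.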

The specific flow of model training is shown in fig. 4. First, feature selection is performed on the input data set to obtain an optimal fault data set. To speed up model training with large data volumes, the Exclusive Feature Bundling (EFB) technique is used to reduce the number of features in the sample data, the Gradient-based One-Side Sampling (GOSS) technique is used to reduce the number of samples, and histogram subtraction is used to reduce computation and memory usage.

Taking exclusive feature bundling as an example, the EFB algorithm addresses the sparsity of high-dimensional data by bundling mutually exclusive features.

In the sample data obtained by parsing the message signals, there are, from the feature perspective, sparse features containing a large number of 0 elements and, from the sample perspective, features that never take effect at the same time, namely mutually exclusive features. As shown in fig. 5, the mutually exclusive features are bundled so that several sparse features are merged into one dense feature. This reduces the feature dimension and the unnecessary computation on 0 values without losing information, which speeds up the algorithm. The data set obtained after exclusive feature bundling is the first training set.
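The bundling idea can be sketched for two features that have already been binned to non-negative integers. The function names and the sample values are hypothetical, and this strict version refuses to bundle features that conflict in even one sample, which is a simplification.

```python
from typing import List, Sequence


def are_exclusive(f1: Sequence[int], f2: Sequence[int]) -> bool:
    """Two features are mutually exclusive if they are never non-zero in the same sample."""
    return all(a == 0 or b == 0 for a, b in zip(f1, f2))


def bundle(f1: Sequence[int], f2: Sequence[int]) -> List[int]:
    """Merge two exclusive, non-negative integer (binned) features into one dense feature.

    Values of f2 are shifted by max(f1) so the two original value ranges do
    not overlap and each original value can still be recovered.
    """
    if not are_exclusive(f1, f2):
        raise ValueError("features conflict; they cannot be bundled losslessly")
    offset = max(f1)
    return [a if a != 0 else (b + offset if b != 0 else 0) for a, b in zip(f1, f2)]


f1 = [0, 3, 0, 1, 0]
f2 = [2, 0, 0, 0, 4]
bundled = bundle(f1, f2)
print(bundled)  # prints [5, 3, 0, 1, 7]
```

In the bundled feature, values 1 to 3 come from `f1` and values above 3 come from `f2`, so a tree split on the single dense feature can still separate the two original features.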

For single-side gradient sampling (GOSS), the model is first used to predict a value for each piece of sample data, the loss is calculated from the predicted values, and the sample gradients are then obtained by taking the derivative of the loss function with respect to the predicted value. The samples are sorted in descending order of the absolute value of the gradient. The top a × 100% of the sorted sample data are kept as the large gradient sample data, and the remaining (1 - a) × 100% of the samples, the small gradient sample data, are randomly sampled according to the preset sampling proportion b × 100%.

The first subset, consisting of the large gradient sample data, and the second subset, sampled from the small gradient sample data, are combined to obtain the second training set, which is input to the model.
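The sampling step can be sketched as below. The function name, the example gradients and the fixed random seed are illustrative. One assumption is flagged in the code: the sampling proportion `b` is taken as a fraction of the total sample count, as in the original GOSS description, rather than of the small-gradient remainder.

```python
import random
from typing import List, Tuple


def goss_sample(gradients: List[float], a: float, b: float,
                seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (indices of large-gradient samples, indices of sampled small-gradient samples)."""
    n = len(gradients)
    # Sort sample indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_n = round(a * n)
    large, rest = order[:top_n], order[top_n:]
    # Assumption: b is a fraction of the total sample count n.
    k = min(round(b * n), len(rest))
    small = random.Random(seed).sample(rest, k)
    return large, small


gradients = [0.9, -0.1, 0.05, -0.8, 0.2, 0.01, -0.3, 0.15, 0.02, -0.07]
large_idx, small_idx = goss_sample(gradients, a=0.2, b=0.3)
second_training_set = large_idx + small_idx  # first subset + second subset
print(large_idx, sorted(small_idx))
```

All large-gradient samples are kept because they are the ones the current model fits worst, while the well-fitted samples are only represented by a random subset.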

During model training, histogram subtraction is used to reduce the amount of calculation. In LightGBM, before each node of a base learner is split, a histogram is built for each feature of the sample data belonging to that node. The values from the histogram are substituted into the gain formula to calculate the gain of splitting each feature at each candidate value, the split point with the maximum gain among all leaf nodes is taken as the optimal split point, and the process recurses until the stopping condition of the base learner is met.

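Specifically, in this embodiment, the variance gain is used as the gain formula, and the splitting gain is obtained according to the following variance gain:

$$\tilde{V}_j(d) = \frac{1}{n}\left(\frac{\left(\sum_{x_i \in A_l} g_i + \frac{1-a}{b}\sum_{x_i \in B_l} g_i\right)^2}{n_l^j(d)} + \frac{\left(\sum_{x_i \in A_r} g_i + \frac{1-a}{b}\sum_{x_i \in B_r} g_i\right)^2}{n_r^j(d)}\right)$$

wherein $A_l = \{x_i \in A : x_{ij} \le d\}$, $A_r = \{x_i \in A : x_{ij} > d\}$, $B_l = \{x_i \in B : x_{ij} \le d\}$, $B_r = \{x_i \in B : x_{ij} > d\}$;

wherein $x_{ij}$ is the value of the $j$-th feature of sample data $x_i$, $d$ is the value of the splitting point, $a$ is the preset proportion threshold value, $b$ is the preset sampling proportion, $g_i$ is the negative gradient corresponding to sample data $x_i$, $A$ is the first subset, $B$ is the second subset, $n_l^j(d)$ is the number of samples of the left child node divided by the splitting point, $n_r^j(d)$ is the number of samples of the right child node divided by the splitting point, $n$ is the total number of samples of the node before splitting, and $\tilde{V}_j(d)$ is the splitting gain.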

In single-side gradient sampling, only the small gradient samples are subsampled, which changes the original data distribution. Therefore, in the mathematical expression, the negative gradients $g_i$ of the loss function for the sampled small gradient samples are multiplied by the coefficient $\frac{1-a}{b}$ to maintain the data distribution.
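To make the role of the compensation coefficient concrete, the sketch below evaluates a variance gain for one feature and one candidate split point, scaling the gradients of the sampled small-gradient subset by (1 - a) / b. The function name and the data are hypothetical, and counting the child-node sizes over the samples passed in (the union of the two subsets) is an assumption of this sketch.

```python
from typing import Sequence


def goss_variance_gain(x_a: Sequence[float], g_a: Sequence[float],
                       x_b: Sequence[float], g_b: Sequence[float],
                       d: float, a: float, b: float) -> float:
    """Estimated variance gain of splitting one feature at threshold d.

    x_a, g_a: feature values and negative gradients of the large-gradient subset A.
    x_b, g_b: the same for the sampled small-gradient subset B.
    """
    w = (1.0 - a) / b  # compensates for subsampling the small-gradient data
    left_sum = (sum(g for x, g in zip(x_a, g_a) if x <= d)
                + w * sum(g for x, g in zip(x_b, g_b) if x <= d))
    right_sum = (sum(g for x, g in zip(x_a, g_a) if x > d)
                 + w * sum(g for x, g in zip(x_b, g_b) if x > d))
    all_x = list(x_a) + list(x_b)
    n_left = sum(1 for x in all_x if x <= d)
    n_right = len(all_x) - n_left
    if n_left == 0 or n_right == 0:
        return 0.0  # a split that leaves one side empty is useless
    return (left_sum ** 2 / n_left + right_sum ** 2 / n_right) / (n_left + n_right)


x_a, g_a = [1.0, 4.0], [2.0, -2.0]
x_b, g_b = [2.0, 5.0], [0.5, -0.5]
gain_mid = goss_variance_gain(x_a, g_a, x_b, g_b, d=3.0, a=0.5, b=0.25)
gain_low = goss_variance_gain(x_a, g_a, x_b, g_b, d=1.5, a=0.5, b=0.25)
print(gain_mid, gain_low)
```

With a = 0.5 and b = 0.25 the coefficient is 2, so each sampled small-gradient sample stands in for two of the original ones.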

In this embodiment, the histogram is used to find the optimal split point of the classification tree so as to maximize the splitting gain. The histogram algorithm assigns the values of a feature to bins, dividing the data into discrete intervals, and then traverses the discretized data to find the optimal split point. Compared with the traditional approach of traversing all the data of a node, the histogram approach only needs to traverse the bins of the histogram.

Illustratively, each feature in the sample data corresponding to the parent node (the node to be split) is traversed, the feature values are classified into bins according to a certain width to obtain a histogram, and as shown in fig. 6, each bin corresponds to a plurality of sample data.

Traversing each bucket of the histogram, for example, for the bucket [ T2, T3 ], taking the value T2 of the bucket into a gain formula, obtaining the splitting gain corresponding to the bucket, and taking the value of the bucket with the maximum splitting gain as the optimal splitting point of the father node.
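The bucket traversal can be sketched as follows: gradients are accumulated per bucket once, and then only the bucket boundaries are tried as split points. For brevity this sketch uses a plain variance gain without the GOSS weighting, and it sends samples in buckets below the candidate boundary to the left child; both choices, the function name and the data are assumptions.

```python
import bisect
from typing import List, Optional, Sequence, Tuple


def best_split_from_histogram(values: Sequence[float], gradients: Sequence[float],
                              edges: List[float]) -> Tuple[Optional[float], float]:
    """Find the bucket boundary with the largest gain.

    edges are sorted boundaries [T0, T1, ..., Tm]; bucket k covers [T_k, T_{k+1}).
    """
    n_bins = len(edges) - 1
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        # Clamp so values outside the edge range fall into the first/last bucket.
        k = min(max(bisect.bisect_right(edges, v) - 1, 0), n_bins - 1)
        grad_sum[k] += g
        count[k] += 1
    total_g, total_n = sum(grad_sum), sum(count)
    best_edge, best_gain = None, float("-inf")
    left_g, left_n = 0.0, 0
    for k in range(1, n_bins):  # only interior boundaries are candidates
        left_g += grad_sum[k - 1]
        left_n += count[k - 1]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        gain = (left_g ** 2 / left_n + right_g ** 2 / right_n) / total_n
        if gain > best_gain:
            best_edge, best_gain = edges[k], gain
    return best_edge, best_gain


values = [0.5, 1.5, 2.5, 3.5, 1.2, 3.8]
grads = [1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
split, gain = best_split_from_histogram(values, grads, edges=[0.0, 1.0, 2.0, 3.0, 4.0])
print(split, gain)
```

The loop over candidates touches each bucket once, so its cost depends on the number of buckets rather than on the number of samples in the node.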

In the actual tree construction process, LightGBM first computes the histogram of the leaf node with the smaller histogram, and then obtains the histogram of the leaf node with the larger histogram by subtracting it from the histogram of the parent node, so that the histogram of the sibling leaf is obtained at very little cost.
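Histogram subtraction itself is a bucket-by-bucket difference. In the sketch below each bucket holds a pair (gradient sum, sample count); this representation and the numbers are illustrative assumptions.

```python
from typing import List, Tuple

Histogram = List[Tuple[float, int]]  # per bucket: (gradient sum, sample count)


def subtract_histograms(parent: Histogram, child: Histogram) -> Histogram:
    """Derive the sibling's histogram as parent minus the already-built child."""
    return [(pg - cg, pn - cn) for (pg, pn), (cg, cn) in zip(parent, child)]


parent_hist = [(1.0, 1), (2.0, 2), (-1.0, 1), (-2.0, 2)]
small_child = [(1.0, 1), (0.5, 1), (0.0, 0), (-2.0, 1)]  # built by scanning its samples
large_child = subtract_histograms(parent_hist, small_child)
print(large_child)  # prints [(0.0, 0), (1.5, 1), (-1.0, 1), (0.0, 1)]
```

Only the smaller child's samples are scanned; the larger child costs one subtraction per bucket, regardless of how many samples it contains.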

In one embodiment, low-voltage fault data are quite rare in vehicle operation data, so low-voltage faults are not recognized well enough. At the same time, the wide variety of vehicle operating conditions increases the diversity of the samples, which can cause the model to overfit.

In this embodiment, the problem of overfitting is avoided by improving the loss function used for model training, specifically:

for the kth classification tree, the original loss function is:

in the original loss function, $N$ is the total number of samples; $F_{k-1}(x_i; W_{k-1})$ is the predicted value for input sample data $x_i$ given by the initial model consisting of the first $k-1$ classification trees with model parameters (or weight coefficients) $W_{k-1}$; the model parameters $W_{k-1}$ comprise the parameters $\omega_1, \ldots, \omega_{k-1}$ of the first $k-1$ trees; and $L(y_i, F_{k-1}(x_i; W_{k-1}))$ is a function describing the error between the true value $y_i$ and the predicted value of a single sample. In this embodiment, the logarithmic loss function is used for LightGBM.

It should be noted that $L_0$ is the total loss over the samples, and the derivative of this total loss with respect to the predicted value is the gradient of the sample data.

In this embodiment, an L2 regularization term is introduced into the loss function to reduce the overfitting, and a higher weight is given to the unstable samples to address the sample imbalance problem.

The improved loss function of the k-th tree is as follows:

$\alpha_i$ is the weighting coefficient, obtained as follows:

In the improved loss function, the term multiplied by $\lambda$ is the L2 regularization term, $\lambda$ is the regularization coefficient, $\omega$ is a parameter of the model, and $M$ is the number of classification trees.

Based on minimization of the loss, the parameters at which the value of the loss function is minimal are selected and used to replace the parameters of the initial model.
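A weighted, regularized logarithmic loss of the kind described can be sketched as below. The exact expression is not present in this text, so the form used here is an assumption: fault samples are weighted by `c`, normal samples by 1, the loss is summed rather than averaged, and the L2 term is the sum of squared parameters times `lam`. All names and default values are hypothetical.

```python
import math
from typing import Sequence


def weighted_log_loss(y_true: Sequence[int], p_pred: Sequence[float],
                      omega: Sequence[float], c: float = 5.0, lam: float = 0.1) -> float:
    """Class-weighted log loss plus an L2 penalty on the model parameters."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        alpha = c if y == 1 else 1.0  # assumed weighting: c > 1 for fault samples
        total += alpha * -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total + lam * sum(w * w for w in omega)


loss = weighted_log_loss([1, 0], [0.5, 0.5], omega=[1.0, 2.0], c=3.0, lam=0.1)
print(loss)
```

With these inputs the fault sample contributes 3 ln 2, the normal sample ln 2, and the penalty 0.1 × (1 + 4), so a missed fault costs three times as much as a false alarm of the same confidence.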

In the scheme, model training is carried out based on the sample data extracted from the historical message information, the sample data is marked in a backtracking mode, and the data in a certain time before a fault can be found out and marked as fault data through one-time traversal, so that the efficiency of data marking is greatly improved.

On the other hand, because the message information contains many null values and sparse data, the LightGBM algorithm, which can handle a large number of null values, is adopted. The EFB and GOSS techniques are used to process the samples and reduce the amount of computation, and the histogram approach discretizes the data to speed up training. At the same time, the loss function of the original algorithm is improved accordingly to avoid overfitting.

It should be understood that, although the steps in the flowchart of fig. 1 are shown in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times; these sub-steps or stages are also not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, a method for predicting the low voltage of the storage battery is provided. It can predict a low-voltage fault of the storage battery from a single piece of message information and does not require a time-series approach that can only make a prediction from a sequence of data, which improves the timeliness of prediction.

As shown in fig. 7, the battery low-voltage prediction method includes the steps of:

Step 201, input data is obtained according to the current message information.

After training, the LightGBM model is saved as a model file and deployed to a cloud platform in that form. Real-time CAN messages are parsed to obtain the features required for model prediction. The prediction features comprise signal fields highly correlated with low-voltage prediction of the storage battery, such as the battery voltage value, the battery SOC, the battery SOH, signal fields related to the power battery system, signal fields related to the DCDC system, time signal fields, and basic vehicle state signal fields.

Step 202, a pre-trained low-voltage prediction model of the storage battery is called, and the input data is input into the model to obtain a low-voltage fault prediction result in numerical form.
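Steps 201 and 202 can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the feature names in `FEATURES`, the `alarm` cut-off and the `predict` interface are assumptions made for the example. A LightGBM booster loaded from the saved model file would be one candidate for `model`, given rows in an array form it accepts.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical feature names; the real signal fields come from the CAN message definitions.
FEATURES = ["battery_voltage", "battery_soc", "battery_soh", "dcdc_state", "vehicle_state"]


def build_input(message: Dict[str, Optional[float]]) -> List[float]:
    """Step 201: turn one parsed message into a fixed-order feature row.

    Missing or null signals become NaN, which tree models such as LightGBM
    can treat as missing values.
    """
    row = []
    for name in FEATURES:
        value = message.get(name)
        row.append(math.nan if value is None else float(value))
    return row


def predict_low_voltage(model, message: Dict[str, Optional[float]],
                        alarm: float = 0.5) -> Tuple[float, bool]:
    """Step 202: score a single message; no sequence of earlier messages is needed.

    `model` is any object with predict(rows) returning one score per row.
    `alarm` is an assumed cut-off for turning the numerical result into a flag.
    """
    score = float(model.predict([build_input(message)])[0])
    return score, score >= alarm
```

Because each call scores exactly one message, the same function could run in the cloud or at the vehicle end; only the source of `message` changes.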

In the above mode, the model is deployed in the cloud, low-voltage faults can be predicted in real time from vehicle network data at low computing cost, and the prediction result is pushed to the service system in real time. In another embodiment, the model is deployed at the vehicle end by means of edge computing; supported by richer state information and a higher signal sampling frequency, it can predict low-voltage faults more accurately and quickly, and the prediction result is sent to the vehicle controller to execute related operations, thereby reducing the probability of a low-voltage fault.

In summary, fig. 8 shows the process of the low-voltage prediction model of the storage battery of the present application, from data labeling through model training to model use. In both training and use, data is obtained by parsing messages, and only data from the discharging process of the storage battery is selected for training or prediction, which reduces the amount of data.

In one embodiment, as shown in fig. 9, there is provided a training apparatus of a low-voltage prediction model of a storage battery, including: an extraction module 301, a calculation module 302, a labeling module 303, and a training module 304, wherein:

the extracting module 301 is configured to obtain a plurality of pieces of sample data according to historical message information of a vehicle, where each piece of sample data at least includes a voltage value of the storage battery and a sampling time of the voltage value, and the sample data further includes a training feature;

a calculating module 302, configured to determine a time interval according to sampling moments of two adjacent pieces of the sample data;

a labeling module 303, configured to determine a data tag of the sample data, including: when the voltage value of the current sample data is smaller than a low voltage threshold value, determining that the data tag of the current sample data is a fault tag; and when the data tag of the current sample data is a fault tag and the time interval between the previous sample data and the current sample data is smaller than or equal to a time threshold value, determining that the data tag of the previous sample data is a fault tag, and otherwise determining that it is a normal tag;

the training module 304 is configured to train and obtain a low-voltage prediction model of the storage battery based on the sample data with the data tag.
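The labeling rule applied by the labeling module can be sketched as a single reverse pass over time-ordered samples. This is a sketch under assumptions: the function name, the 0/1 label encoding and the example thresholds are illustrative, and the rule is read literally, so a fault tag keeps propagating backwards for as long as consecutive samples are no further apart than the time threshold.

```python
from typing import List, Sequence


def backtrack_labels(times: Sequence[float], volts: Sequence[float],
                     low_voltage: float, max_gap: float) -> List[int]:
    """Label samples in one backward traversal (1 = fault tag, 0 = normal tag).

    times must be sorted in ascending order and aligned with volts.
    """
    labels = [0] * len(times)
    for i in range(len(times) - 1, -1, -1):
        # Rule 1: a voltage below the threshold is a fault sample.
        if volts[i] < low_voltage:
            labels[i] = 1
        # Rule 2: if the current sample is a fault and the previous sample
        # is close enough in time, the previous sample is a fault as well.
        if labels[i] == 1 and i > 0 and times[i] - times[i - 1] <= max_gap:
            labels[i - 1] = 1
    return labels


# Example: the last sample drops below 11 V; only the sample 10 s earlier
# is within the 30 s window, because the gap before that one is 80 s.
example = backtrack_labels([0, 10, 20, 100, 110],
                           [12.5, 12.4, 12.3, 12.2, 10.5],
                           low_voltage=11.0, max_gap=30.0)
```

Traversing from the newest sample to the oldest means each sample is visited once, which is what allows the pre-fault data to be found without a second pass.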

In one embodiment, the extraction module 301 of the training device determines the battery state from the power supply state of the DC conversion module in the historical message information. Specifically, when the DC conversion module is inactive, the storage battery is in a discharging state, and the sample data is obtained from the historical message information recorded in the discharging state.
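A minimal sketch of this screening step is given below. The field name `dcdc_active` is hypothetical; it stands for whatever signal in the parsed message reports the power supply state of the DC conversion module.

```python
from typing import Dict, Iterable, List


def discharge_samples(messages: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Keep only messages recorded while the DC conversion module is inactive.

    Messages in which the (hypothetical) dcdc_active field is missing are
    dropped, since the battery state cannot be determined for them.
    """
    return [m for m in messages if m.get("dcdc_active") is False]
```

Filtering before labeling and training keeps only the discharging-state data, which is where the reduction in data volume comes from.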

In one embodiment, the training module 304 constructs a regularization term for the light gradient boosting machine (LightGBM) model, obtains the overall loss of model training from the regularization term and the error between the true value and the predicted value, and adjusts the model parameters by minimizing the overall loss to obtain the low-voltage prediction model of the storage battery. Specifically, the initial model comprises a plurality of classification trees connected in series, and the output loss is obtained during training according to the following mathematical expression:

α_i is a weighting coefficient, and:

After the loss function L_m has been constructed, the model is trained iteratively on the sample data, wherein the iterative training includes processing of the sample data.

Illustratively, the processing of sample data includes reducing the feature dimension by exclusive feature bundling (EFB) and reducing the number of samples by gradient-based one-side sampling (GOSS).
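The idea behind feature bundling can be illustrated with a simplified greedy sketch. It is not the bundling procedure of LightGBM itself, and all names are assumptions: features that are almost never non-empty in the same sample are placed in one bundle, so that the bundle can later be treated as a single feature.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple


def bundle_exclusive_features(columns: Dict[str, Sequence[Optional[float]]],
                              max_conflicts: int = 0) -> List[List[str]]:
    """Greedily group features whose non-empty rows (almost) never overlap.

    A cell counts as empty when it is None or 0. Two features conflict on a
    row when both are non-empty there.
    """
    active: Dict[str, Set[int]] = {
        name: {i for i, v in enumerate(col) if v is not None and v != 0}
        for name, col in columns.items()
    }
    # Place the densest features first; sorted() is stable for ties.
    order = sorted(columns, key=lambda name: len(active[name]), reverse=True)
    bundles: List[Tuple[List[str], Set[int]]] = []
    for name in order:
        rows = active[name]
        for names, used in bundles:
            if len(used & rows) <= max_conflicts:
                names.append(name)
                used.update(rows)  # rows now occupied by this bundle
                break
        else:
            bundles.append(([name], set(rows)))
    return [names for names, _ in bundles]
```

With sparse message data, many signal fields are empty in most samples, so bundles tend to be large and the number of features the trees have to scan drops accordingly.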

When gradient-based one-side sampling is performed, a gradient value is obtained for each piece of sample data from the loss; the sample data is sorted by the absolute value of the gradient, and large-gradient sample data and small-gradient sample data are determined according to a preset proportion threshold; the large-gradient sample data is retained to obtain a first subset, the small-gradient sample data is randomly sampled at a preset sampling proportion to obtain a second subset, and the first subset and the second subset are merged into the training set that is input to the model.
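The sampling procedure just described can be sketched as follows. The sketch assumes, as in the published GOSS formulation, that both proportions are fractions of the total sample count and that the sampled small-gradient data is re-weighted by (1 - a) / b so that its total contribution is preserved; the function and parameter names are illustrative.

```python
import random
from typing import List, Sequence, Tuple


def goss_sample(grads: Sequence[float], a: float = 0.2, b: float = 0.1,
                seed: int = 0) -> Tuple[List[int], List[float]]:
    """Return the indices of the selected samples and a weight for each.

    a: proportion threshold for large-gradient samples (all are kept).
    b: sampling proportion for the remaining small-gradient samples.
    """
    n = len(grads)
    # Sort sample indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(grads[i]), reverse=True)
    top_n = int(a * n)
    rand_n = min(int(b * n), n - top_n)
    large = order[:top_n]                                      # first subset
    small = random.Random(seed).sample(order[top_n:], rand_n)  # second subset
    # Assumed re-weighting: compensate for the small-gradient data left out.
    weights = [1.0] * len(large) + [(1.0 - a) / b] * len(small)
    return large + small, weights
```

Samples with large gradients are the ones the current model fits worst, so keeping all of them while thinning out the rest reduces the training set without discarding the most informative data.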

In one embodiment, the training module 304 discretizes the features of the sample data of parent nodes of the classification tree, constructs a histogram of the features, traverses each bucket of the histogram, and determines the optimal segmentation features and optimal segmentation points for node splitting.

The splitting gain is obtained when traversing the histogram according to the following mathematical expression:

wherein A_l = {x_i ∈ A : x_ij ≤ d}, A_r = {x_i ∈ A : x_ij > d}, B_l = {x_i ∈ B : x_ij ≤ d}, and B_r = {x_i ∈ B : x_ij > d}; x_ij is the value of sample data x_i on feature j; d is the value of the splitting point; a is the preset proportion threshold; b is the preset sampling proportion; g_i is the negative gradient corresponding to sample data x_i; n is the total number of samples in the node before splitting; A is the first subset; and B is the second subset. The remaining terms of the expression denote, respectively, the number of samples in the left child node produced by the splitting point, the number of samples in the right child node produced by the splitting point, and the splitting gain.
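The expression itself is not reproduced in this text. The definitions above coincide with those of the estimated variance gain in the published GOSS formulation of LightGBM; assuming the application follows that formulation, the splitting gain for feature j at splitting point d would take the form below. The symbols n_l^j(d), n_r^j(d) and \tilde{V}_j(d) for the left child count, the right child count and the gain are borrowed from that formulation and are an assumption here.

```latex
\tilde{V}_j(d) = \frac{1}{n}\left(
  \frac{\left(\sum_{x_i \in A_l} g_i + \frac{1-a}{b}\sum_{x_i \in B_l} g_i\right)^{2}}{n_l^{j}(d)}
  +
  \frac{\left(\sum_{x_i \in A_r} g_i + \frac{1-a}{b}\sum_{x_i \in B_r} g_i\right)^{2}}{n_r^{j}(d)}
\right)
```

The factor (1 - a) / b scales up the gradients of the sampled small-gradient subset B so that they stand in for all of the small-gradient data that was not sampled.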

Node splitting is then performed using the optimal segmentation feature and the optimal segmentation point, that is, the feature and point that maximize the splitting gain.
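The histogram traversal can be sketched for a single feature as below. This is a simplified illustration under assumptions: equal-width buckets are used, the gradients passed in are taken to be already weighted, and the gain is the sum of squared gradient totals divided by the sample counts of the two children, normalized by the node size. Right-hand statistics are obtained by subtracting the left-hand totals from the node totals rather than by rescanning the data.

```python
from typing import List, Optional, Sequence, Tuple


def best_split_from_histogram(values: Sequence[float], grads: Sequence[float],
                              n_bins: int = 4) -> Tuple[Optional[float], float]:
    """Return (split threshold, gain) for one feature, or (None, -inf)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    grad_sum: List[float] = [0.0] * n_bins
    count: List[int] = [0] * n_bins
    # Build the histogram: one pass over the samples.
    for v, g in zip(values, grads):
        k = min(int((v - lo) / width), n_bins - 1)
        grad_sum[k] += g
        count[k] += 1
    n, total_g = len(values), sum(grad_sum)
    best_thr, best_gain = None, float("-inf")
    g_left, n_left = 0.0, 0
    # Traverse bucket boundaries; values below the boundary go left.
    for k in range(n_bins - 1):
        g_left += grad_sum[k]
        n_left += count[k]
        n_right = n - n_left
        if n_left == 0 or n_right == 0:
            continue
        g_right = total_g - g_left  # obtained by subtraction
        gain = (g_left ** 2 / n_left + g_right ** 2 / n_right) / n
        if gain > best_gain:
            best_thr, best_gain = lo + (k + 1) * width, gain
    return best_thr, best_gain
```

Scanning bucket boundaries instead of every distinct feature value is what makes the split search cost depend on the number of buckets rather than on the number of samples.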

For specific limitations on the training device of the low-voltage prediction model of the storage battery, reference may be made to the above limitations on the training method of the low-voltage prediction model of the storage battery, which are not repeated here. All or part of the modules in the training device may be implemented by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements a training method for a low-voltage prediction model of a storage battery provided by an embodiment of the present application, for example, the following steps:

In one embodiment, the computer program when executed by the processor further performs the steps of:

α_i is a weighting coefficient, and:

Alternatively, the instructions included in the computer program may be loaded by a processor of a network device to execute the method for predicting low voltage of the storage battery provided by the embodiment of the present application, for example: obtaining input data according to current message information, wherein the input data includes a prediction feature; and calling a pre-trained low-voltage prediction model of the storage battery and inputting the input data into the model to obtain a low-voltage fault prediction result.

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples illustrate only a few embodiments of the application. They are described in relative detail but are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the concept of the application, all of which fall within its scope of protection. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. The training method of the low-voltage prediction model of the storage battery is characterized by comprising the following steps of:

2. The method for training a low-voltage prediction model of a battery according to claim 1, wherein the obtaining a plurality of pieces of sample data of the battery according to the historical message information of the vehicle comprises:

3. The method for training a low-voltage predictive model of a battery according to any one of claims 1-2, wherein the training to obtain a low-voltage predictive model of a battery based on the sample data with the data tag comprises:

α_i is a weighting coefficient, and:

4. The method for training a low-voltage predictive model of a battery according to claim 3, wherein the training to obtain the low-voltage predictive model of a battery based on the sample data with the data tag comprises:

5. The method for training a low-voltage predictive model of a battery according to claim 3, wherein the training to obtain the low-voltage predictive model of a battery based on the sample data with the data tag comprises:

6. The method for training a low-voltage predictive model of a battery according to claim 3, wherein the training to obtain the low-voltage predictive model of a battery based on the sample data with the data tag comprises:

7. The method for training a low-voltage predictive model of a battery according to claim 6, wherein said obtaining a split gain for each bucket comprises:

8. A method for predicting low voltage of a battery, comprising:

the low-voltage prediction model of the storage battery is obtained by training the training method according to any one of claims 1-7.

9. A training device for a low-voltage prediction model of a storage battery, comprising:

10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the training method of the model of any one of claims 1 to 7, or performs the battery low-voltage prediction method of claim 8.