CN112782589A - Vehicle-mounted fuel cell remote fault classification diagnosis method and device and storage medium
- Publication number
- CN112782589A (application number CN202110104677.XA)
- Authority
- CN
- China
- Prior art keywords
- fuel cell
- data
- vehicle
- xgboost
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/36—Arrangements for testing, measuring or monitoring the electrical condition of accumulators or electric batteries, e.g. capacity or state of charge [SoC]
- G01R31/367—Software therefor, e.g. for battery testing using modelling or look-up tables
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/005—Testing of electric installations on transport means
- G01R31/006—Testing of electric installations on transport means on road vehicles, e.g. automobiles or trucks
- G01R31/007—Testing of electric installations on transport means on road vehicles, e.g. automobiles or trucks using microprocessors or computers
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/36—Arrangements for testing, measuring or monitoring the electrical condition of accumulators or electric batteries, e.g. capacity or state of charge [SoC]
- G01R31/3644—Constructional arrangements
- G01R31/3648—Constructional arrangements comprising digital calculation means, e.g. for performing an algorithm
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/36—Arrangements for testing, measuring or monitoring the electrical condition of accumulators or electric batteries, e.g. capacity or state of charge [SoC]
- G01R31/371—Arrangements for testing, measuring or monitoring the electrical condition of accumulators or electric batteries, e.g. capacity or state of charge [SoC] with remote indication, e.g. on external chargers
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/36—Arrangements for testing, measuring or monitoring the electrical condition of accumulators or electric batteries, e.g. capacity or state of charge [SoC]
- G01R31/378—Arrangements for testing, measuring or monitoring the electrical condition of accumulators or electric batteries, e.g. capacity or state of charge [SoC] specially adapted for the type of battery or accumulator
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/285—Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Computational Biology (AREA)
- Combustion & Propulsion (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Hardware Design (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Chemical & Material Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Fuel Cell (AREA)
Abstract
The invention discloses a vehicle-mounted fuel cell remote fault classification diagnosis method, a device and a storage medium. The data collected from the fuel cell while the vehicle is running have a severely uneven sample distribution, so random undersampling is applied to avoid overfitting to the majority-class samples and underfitting of the minority-class samples, which provides more effective and more representative data samples for the subsequent classification algorithm. XGBoost integrates multiple CART models into a strong classifier, which generalizes better than a single classifier. XGBoost training supports parallelism at feature granularity, which speeds up model training. In fuel cell fault classification, the severe imbalance of the application data makes the majority class easy to overfit during training; the XGBoost algorithm therefore uses several strategies, such as regularization, shrinkage and column sampling, to prevent overfitting, and it also optimizes the loss function, so the model fits with higher precision.
Description
Technical Field
The invention relates to the technical field of automotive battery big data, and in particular to a vehicle-mounted fuel cell remote fault classification diagnosis method and device based on XGBoost (eXtreme Gradient Boosting), and a storage medium.
Background
Under the dual challenges of resources and the environment, the development of the modern automotive industry has accelerated the technological development and commercial application of fuel cells. Proton exchange membrane fuel cells (PEMFCs), with advantages such as high energy conversion efficiency, fast start-up, low operating temperature and environmental friendliness, are becoming indispensable components of future electric vehicles. However, factors such as the manufacturing process, vehicle operating conditions, driving behavior, the operating environment during use, and aging and abuse often cause faults in fuel cell vehicles, such as single-cell overvoltage, overvoltage of the on-board energy storage device, temperature difference, high battery temperature, and mismatch of the rechargeable energy storage system. Vehicle-mounted fuel cell faults should not be underestimated: a minor fault irreversibly damages the cell, a serious fault affects the operating control strategy of the vehicle, and in extreme cases a catastrophic accident can result.
At present, relatively few methods exist for fault classification of vehicle-mounted proton exchange membrane fuel cells (PEMFCs). Classifying fuel cell faults from real-time data collected during vehicle operation is still largely at the exploratory stage, and there is no mature fault classification method or standard. The main problems of the fault classification models in common use are as follows:
1) The fuel cell fault classification techniques available for reference are mainly classification algorithms based on expert models. Classifying fuel cell faults with an expert model requires a great deal of professional knowledge, such as electrochemistry, fluid mechanics and vehicle dynamics, and a fuel cell vehicle has a complex structure and many functions, so the modeling process is complicated. Although expert-model fault classification algorithms offer advantages such as high precision, their robustness is poor and they place relatively high demands on the personnel involved. Moreover, an expert model is built from existing knowledge, so potentially dangerous faults are difficult to discover, and hidden hazards to battery safety and vehicle safety remain;
2) Data-driven algorithms overfit the majority-class samples and underfit the minority-class samples: the model is biased toward the majority class, so classification accuracy is high for majority-class samples but low for minority-class samples;
3) The data sets used for modeling fuel cell faults are generally laboratory data sets; data collected from actual vehicle operation are rarely used to model fault classification diagnosis for vehicle fuel cells.
Therefore, the prior art still needs to be improved and developed.
Disclosure of Invention
The invention aims to provide a vehicle-mounted fuel cell remote fault classification diagnosis method, a device and a storage medium, which classify fuel cell faults by combining resampling of the data set with the XGBoost classification algorithm, improving the safety and maintainability of fuel cell vehicles.
The technical scheme of the invention is as follows: a vehicle-mounted fuel cell remote fault classification diagnosis method, specifically comprising the following steps:
acquiring operating data of the fuel cell vehicle;
preprocessing the operating data of the fuel cell vehicle to obtain training set input vectors and test set input vectors;
inputting the training set input vectors into an XGBoost classifier to train a classification model, obtaining an XGBoost fault classification model;
and inputting the test set input vectors into the XGBoost fault classification model for testing, obtaining the classification result of the test set, namely the battery fault alarm level.
In the vehicle-mounted fuel cell remote fault classification diagnosis method, the operating data of the fuel cell vehicle are collected during operation by a large number of state monitoring sensors installed on the fuel cell vehicle.
In the vehicle-mounted fuel cell remote fault classification diagnosis method, preprocessing the operating data of the fuel cell vehicle to obtain the training set input vectors and the test set input vectors comprises the following steps:
S21: cleaning the operating data of the fuel cell vehicle;
S22: undersampling the cleaned operating data of the fuel cell vehicle;
S23: normalizing the undersampled operating data of the fuel cell vehicle;
S24: dividing the normalized operating data of the fuel cell vehicle into training set input vectors and test set input vectors in a given proportion.
In the vehicle-mounted fuel cell remote fault classification diagnosis method, S21 specifically comprises: deleting, from the operating data of the fuel cell vehicle, data whose validity does not meet the requirements; and deleting data whose relevance to fuel cell faults does not meet the requirements.
In the vehicle-mounted fuel cell remote fault classification diagnosis method, when the training set input vectors are input into the XGBoost classifier to train the classification model and obtain the XGBoost fault classification model, the training parameters are tuned during training with a grid search automatic tuning algorithm.
In the vehicle-mounted fuel cell remote fault classification diagnosis method, the training parameters comprise the number of iterations, the learning rate, the regularization coefficient, gamma, min_child_weight and max_depth.
The vehicle-mounted fuel cell remote fault classification diagnosis method further comprises: comprehensively evaluating the battery fault alarm level classification using evaluation indexes.
In the vehicle-mounted fuel cell remote fault classification diagnosis method, the evaluation indexes comprise accuracy, precision, recall and F1.
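As an illustration of the four evaluation indexes, the following minimal Python sketch computes accuracy over all classes and precision, recall and F1 for one chosen alarm level. The function name `evaluate` and the one-vs-rest treatment of a single alarm level are assumptions made for this example; the source does not state how the indexes are averaged across alarm levels.

```python
def evaluate(y_true, y_pred, positive):
    """Return (accuracy, precision, recall, f1), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Empty denominators are reported as 0.0 (an assumption of this sketch).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```

On an imbalanced data set, accuracy alone can look high even when a rare alarm level is never predicted, which is why precision, recall and F1 are reported alongside it.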
A vehicle-mounted fuel cell remote fault classification diagnosis device, comprising:
a data acquisition module, used for acquiring operating data of the fuel cell vehicle;
a data preprocessing module, used for preprocessing the operating data of the fuel cell vehicle to obtain training set input vectors and test set input vectors;
a model training module, which inputs the training set input vectors into an XGBoost classifier to train a classification model, obtaining an XGBoost fault classification model;
and a battery fault alarm level classification module, which inputs the test set input vectors into the XGBoost fault classification model for testing, obtaining the classification result of the test set, namely the battery fault alarm level.
A storage medium having stored therein a computer program which, when run on a computer, causes the computer to perform any of the methods described above.
The invention has the following beneficial effects. By providing a vehicle-mounted fuel cell remote fault classification diagnosis method, device and storage medium, the invention acquires various data of the vehicle-mounted proton exchange membrane fuel cell while the vehicle is running; these data have a severely uneven sample distribution. The XGBoost used in the scheme integrates multiple CART models into a strong classifier, which generalizes better than a single classifier. XGBoost training supports parallelism at feature granularity, which speeds up model training. In fuel cell fault classification, the severe imbalance of the application data makes the majority class easy to overfit during training; the XGBoost algorithm therefore uses several strategies, such as regularization, shrinkage and column sampling, to prevent overfitting, and it also optimizes the loss function, so the model fits with higher precision. The scheme classifies fuel cell faults by combining resampling of the data set with the XGBoost classification algorithm, so that fuel cell faults can be classified accurately on the imbalanced data set and the safety and maintainability of fuel cell vehicles are improved.
Drawings
FIG. 1 is a flow chart of the steps of the vehicle-mounted fuel cell remote fault classification diagnosis method of the present invention.
FIG. 2 is a schematic diagram of the implementation principle of the XGBoost algorithm in the present invention.
FIG. 3 is a schematic diagram of the classification model loss calculation process in the present invention.
FIG. 4 is a schematic diagram of the vehicle-mounted fuel cell remote fault classification diagnosis device of the present invention.
FIG. 5 is a schematic diagram of a terminal in the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
As shown in FIG. 1, a vehicle-mounted fuel cell fault classification diagnosis method uses data collected during operation of a fuel cell vehicle (including parameters such as the current, voltage and power of the fuel cell, the temperature and pressure of the hydrogen system, the vehicle speed, and the current, voltage and insulation resistance of the motor). A resampling algorithm processes the collected imbalanced data to obtain a new data set, which prevents extreme sample imbalance from degrading the subsequent classification. The XGBoost algorithm then scores the resampled data set with multiple weak classifiers and combines them into a strong classifier, so that common fuel cell faults are classified by fault alarm level from the collected vehicle operating data. The method specifically comprises the following steps:
S1: Acquire operating data of the fuel cell vehicle.
During operation of the fuel cell vehicle, a large number of state monitoring sensors installed on the vehicle collect operating data, including fuel cell system data (such as the current, voltage and power of the fuel cell), driving behavior information (such as the vehicle speed), position information, and power system data (such as the temperature and pressure of the hydrogen system and the current, voltage and insulation resistance of the motor). After collection, the terminal data collectors (that is, the state monitoring sensors) send the operating data to a system database for storage, so that the data can be queried and exported directly from the database when needed for battery fault classification.
S2: Preprocess the operating data of the fuel cell vehicle to obtain training set input vectors and test set input vectors. A large amount of process data is generated and stored over the whole driving process of the fuel cell vehicle. To manage the many different types of parameters, the acquired operating data must be preprocessed, specifically as follows:
2.1. Clean the operating data of the fuel cell vehicle.
(2.1.1) Check the validity of the operating data and delete meaningless data and data containing too many missing values;
(2.1.2) Delete parameters that have little relation to fuel cell faults (such as positioning state, longitude and latitude, vehicle identification, data acquisition time and fault indication).
After data cleaning, only the data shown in Table 1 are retained (the "Field" column lists the fields that are kept during cleaning whenever they appear in the data, all other fields being discarded; the "Description" column states what each field contains):
Field | Description
totalCurrent | Total current of the power battery
totalVoltage | Total voltage of the power battery
soc | Power battery state of charge (SOC)
vehicleStatus | Vehicle state. 1: started; 2: engine off; 3: other states
fuelBatteryCurrent | Fuel cell current
fuelBatteryVoltage | Fuel cell voltage
fuelBatteryPower | Fuel cell power
hydrogenMaxPressure | Maximum hydrogen pressure
hydrogenSystemMaxTemperature | Maximum temperature in the hydrogen system
vehicleSpeed | Vehicle speed
motorStatus | Motor state. 1: consuming power; 2: generating power; 3: off; 4: ready
motorVoltage | Drive motor voltage
motorBusCurrent | DC bus current of the motor controller
insulationResistance | Insulation resistance (kΩ)
alarmLevel | Alarm level
Table 1. Data retained after cleaning
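The cleaning step can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the camelCase spellings of the field names are reconstructed from Table 1, and the function `clean` and its `max_missing` threshold are hypothetical, since the source does not say how many missing values count as "too many".

```python
# Field names follow Table 1; the camelCase spellings are an assumption.
RETAINED_FIELDS = [
    "totalCurrent", "totalVoltage", "soc", "vehicleStatus",
    "fuelBatteryCurrent", "fuelBatteryVoltage", "fuelBatteryPower",
    "hydrogenMaxPressure", "hydrogenSystemMaxTemperature", "vehicleSpeed",
    "motorStatus", "motorVoltage", "motorBusCurrent",
    "insulationResistance", "alarmLevel",
]


def clean(records, max_missing=0):
    """Keep only the Table 1 fields and drop records with too many missing values.

    `max_missing` is a hypothetical threshold; the source does not give one.
    """
    cleaned = []
    for record in records:
        # Fields outside Table 1 (longitude, vehicle identification, ...) are discarded.
        row = {field: record.get(field) for field in RETAINED_FIELDS}
        missing = sum(1 for value in row.values() if value is None)
        if missing <= max_missing:
            cleaned.append(row)
    return cleaned
```

Each record is assumed to be a dictionary keyed by field name, as it might be after export from the system database.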
2.2. Undersample the cleaned data. (Undersampling is a method for alleviating class imbalance: it discards samples from the classes that have many samples in the training set.)
In general, the raw data are collected while the vehicle is running. Most of the records are collected during normal driving, and fault records account for only a small proportion. Sampling only a certain proportion of the normal data effectively reduces the share of normal samples and thereby relieves the severe imbalance of the sample data. For example, after cleaning, the total sample size is 306694, of which 299394 records are normal (about 97.62%), 5213 are type 1 faults (about 1.70%), 360 are type 2 faults (about 0.12%) and 1727 are type 3 faults (about 0.56%), so the sample distribution is severely uneven. The majority-class samples are randomly undersampled, with the sampling ratio set to 2:1.
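The class shares quoted above and the random undersampling step can be reproduced with a short Python sketch. Two points are assumptions made for illustration: label 0 is used for normal data, and the 2:1 ratio is read as "keep two normal samples for every fault sample", which the source does not spell out. The function name `undersample` is hypothetical.

```python
import random


def undersample(samples, labels, majority_label, ratio=2.0, seed=0):
    """Randomly discard majority-class samples so that at most
    ratio * (number of all other samples) of them remain."""
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    keep_n = min(len(majority), int(ratio * len(minority)))
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    kept = sorted(rng.sample(majority, keep_n) + minority)
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Class counts quoted in the text (0 = normal, 1 to 3 = fault types).
counts = {0: 299394, 1: 5213, 2: 360, 3: 1727}
total = sum(counts.values())
shares = {level: round(100 * n / total, 2) for level, n in counts.items()}
```

Under the assumed reading of the ratio, the 7300 fault samples would be kept in full and 14600 of the 299394 normal samples would be retained.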
2.3. Normalize each of the 14 feature dimensions in the undersampled data (the 14 items in Table 1 other than the alarm level), then divide the normalized 14-dimensional data into training set input vectors and test set input vectors in a 7:3 ratio (each input vector contains 14 dimensions).
The 14-dimensional parameters include the fuel cell current, fuel cell voltage, fuel cell power, hydrogen system temperature, hydrogen system pressure, vehicle speed, motor current, motor voltage, insulation resistance, and so on.
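A 7:3 split can be sketched as below. Shuffling before the split and the fixed seed are assumptions of this example; the source states only the ratio. The function name `split_train_test` is hypothetical.

```python
import random


def split_train_test(vectors, train_fraction=0.7, seed=0):
    """Shuffle the input vectors and split them into training and test sets."""
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)  # shuffling is an assumption
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With ten input vectors this yields seven for training and three for testing.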
The collected data have many feature attributes with different dimensions (units and scales). To eliminate the influence of these differing dimensions on the subsequent modeling, the collected data are normalized with max-min normalization, using the formula:
$$x^{*} = \frac{x - \min}{\max - \min}$$
where $\max$ is the maximum value of the sample data, $\min$ is the minimum value of the sample data, $x$ is the data to be normalized, and $x^{*}$ is the data after normalization.
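Applied to one feature column, max-min normalization can be sketched as follows. The handling of a constant column (where the maximum equals the minimum and the formula is undefined) is an assumption of this example.

```python
def min_max_normalize(column):
    """Scale one feature column to [0, 1] with x* = (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: the formula is undefined, so map to 0.0 (an assumption).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]
```

For example, a column holding 2.0, 4.0 and 6.0 is mapped to 0.0, 0.5 and 1.0.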
S3: Input the training set input vectors into an XGBoost classifier to train a classification model, tuning the training parameters with a grid search automatic tuning algorithm (a parameter optimization routine available in Python, a computer programming language) to improve the XGBoost algorithm, and obtain the XGBoost fault classification model after automatic parameter tuning.
The training parameters include the number of iterations, the learning rate (learning_rate), the regularization coefficient, gamma, min_child_weight and max_depth. For gamma: a node is split only when the split reduces the value of the loss function, and gamma specifies the minimum loss reduction required for a node to split. The final parameters are set as: n_estimators (the number of trees built in the model) = 500, learning_rate = 0.3, gamma = 0, max_depth = 4; all remaining parameters use their default values.
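The idea of grid search can be illustrated with a small, library-free Python sketch that tries every parameter combination and keeps the best-scoring one. The candidate values in `param_grid` are hypothetical, since the source reports only the final settings, and `score_fn` stands in for training an XGBoost classifier and returning a validation score, which is not shown here.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in `param_grid` and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # e.g. a cross-validated score (assumption)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values; the source reports only the final settings.
param_grid = {
    "n_estimators": [100, 300, 500],
    "learning_rate": [0.1, 0.3],
    "gamma": [0, 1],
    "max_depth": [3, 4, 6],
}
```

The cost grows with the product of the list lengths (36 combinations here), so grids are usually kept coarse.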
XGBoost is an optimized Boosting algorithm that integrates multiple weak classifiers into a strong classifier. The XGBoost algorithm fits the residual of the previous decision tree by continually generating new decision trees, and the accuracy improves as the number of iterations increases. The implementation principle of the XGBoost algorithm is shown in FIG. 2:
After preprocessing, the fuel cell vehicle operating data form the training set $D = \{(x_i, y_i)\}_{i=1}^{N}$ ($N$ is the number of samples). The XGBoost algorithm trains $K$ CARTs (classification and regression trees, hereinafter "trees") to form a set $F = \{f_1, f_2, \ldots, f_K\}$ ($f_k$ denotes the $k$-th tree). According to the split points of the attributes, each tree assigns every input sample to a leaf node, and each leaf node corresponds to a classification score $w$. Given a classification sample $x_i$ (that is, an input vector of the training set), the classification result for the sample is the sum of the classification scores from all trees. The classification model can be defined as:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F \tag{1}$$
where $F$ is the set of all CARTs, $\hat{y}_i$ is the classification result for sample $x_i$, $f_k(x_i)$ is the classification score of the leaf node that sample $x_i$ reaches in the $k$-th tree, and $K$ is the number of CARTs.
The objective function of the classification model is:

$$Obj = \sum_{i=1}^{N} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) \qquad (2)$$

wherein Obj is the total objective function, $l(y_i, \hat{y}_i)$ is the error function between the label $y_i$ and the classification result $\hat{y}_i$, and N is the number of samples. In each iteration the existing classification model is kept unchanged, and a new function is added to the classification model to correct the result. One function corresponds to one CART (i.e., $f_k(x_i)$ is the output of the k-th of the K CARTs for the input vector $x_i$). The newly generated CART fits the residual of the previous prediction, and the iterative process can be expressed as:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i), \quad t = 1, 2, \ldots, K \qquad (3)$$

wherein $\hat{y}_i^{(0)}$ is the initial value of the classification result for sample $x_i$, $\hat{y}_i^{(t)}$ is the classification result for sample $x_i$ after t iterations, and t is the number of iterations.
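The iterative residual fitting described above can be illustrated with a deliberately simplified sketch. It is plain gradient boosting with a squared-error loss and depth-1 trees on a single feature, not XGBoost itself; the function names, the toy data, and the initial prediction of 0 are assumptions made for the example.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, left score, right score)


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Fit a depth-1 regression tree to the residuals by least squares."""
    best_sse, best_stump = None, None
    for thr in sorted(set(xs))[1:]:  # skip the smallest value: its left side is empty
        left = [r for x, r in zip(xs, residuals) if x < thr]
        right = [r for x, r in zip(xs, residuals) if x >= thr]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best_sse is None or sse < best_sse:
            best_sse, best_stump = sse, (thr, lv, rv)
    if best_stump is None:  # only one distinct x value: fall back to a constant
        mean = sum(residuals) / len(residuals)
        best_stump = (xs[0], mean, mean)
    return best_stump


def stump_predict(stump: Stump, x: float) -> float:
    thr, lv, rv = stump
    return lv if x < thr else rv


def boost(xs: List[float], ys: List[float], rounds: int = 5, learning_rate: float = 0.3):
    """Each round adds one tree fitted to the residual of the current prediction."""
    preds = [0.0] * len(xs)  # initial prediction, assumed to be 0 here
    history = []             # squared error after each round
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)))
    return preds, history


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
preds, history = boost(xs, ys)
print([round(e, 4) for e in history])  # training error shrinks round by round
```

Earlier trees are never modified; every round only adds a new tree, scaled by the learning rate, on top of the previous prediction.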
wherein $\sum_{k=1}^{K} \Omega(f_k)$ is the regularization term of the classification model, i.e., the total complexity of the K CARTs, with

$$\Omega(f_k) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 \qquad (4)$$

wherein T is the number of leaf nodes of the k-th tree (the number of leaf nodes differs from tree to tree); $w_j$ is the score of the j-th leaf node of the k-th tree; $\gamma T$ is the L1-type penalty term, which controls the number of leaf nodes; and $\frac{1}{2} \lambda \sum_{j} w_j^2$ is the L2-type penalty term, which keeps the leaf scores from becoming too large (the leaf nodes and their scores change every time a tree is updated). The goal of regularization is to select a simple prediction function and thereby prevent the classification model from overfitting. Each iteration updates the objective function of the classification model to:
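$$Obj^{(t)} = \sum_{i=1}^{N} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + C \qquad (5)$$

wherein $Obj^{(t)}$ is the objective function of the classification model after the t-th iteration, t is the number of iterations, $f_t(x_i)$ is the output for sample $x_i$ of the tree generated at the t-th iteration, and C collects the regularization terms of the earlier trees, which are constant at iteration t.
Expanding equation (5) to second order by Taylor's formula gives:

$$Obj^{(t)} \approx \sum_{i=1}^{N} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) + C$$

in the formula, $g_i$ is the first derivative of the loss function and $h_i$ is the second derivative of the loss function:

$$g_i = \partial_{\hat{y}_i^{(t-1)}} l\left(y_i, \hat{y}_i^{(t-1)}\right), \quad h_i = \partial^2_{\hat{y}_i^{(t-1)}} l\left(y_i, \hat{y}_i^{(t-1)}\right)$$

wherein $\partial$ denotes partial differentiation. XGBoost iteratively generates base learners from the first derivative $g_i$ and the second derivative $h_i$, and adds each new learner to update the model.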
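As a worked instance of the first and second derivatives, the sketch below assumes a binary logistic loss on a raw score; the text does not state which loss function is used, so this choice, like the function names, is an assumption. For this loss the derivatives have the closed forms g = p - y and h = p(1 - p), where p is the sigmoid of the raw score, and the example checks them against finite differences.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(y: float, raw_score: float) -> float:
    """Binary cross-entropy on a raw (pre-sigmoid) score; assumed loss."""
    p = sigmoid(raw_score)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def grad_hess(y: float, raw_score: float):
    """First and second derivative of the loss with respect to the raw score."""
    p = sigmoid(raw_score)
    return p - y, p * (1.0 - p)


y, raw = 1.0, 0.0
g, h = grad_hess(y, raw)

# Finite-difference check of both derivatives.
eps = 1e-4
g_num = (logistic_loss(y, raw + eps) - logistic_loss(y, raw - eps)) / (2 * eps)
h_num = (logistic_loss(y, raw + eps) - 2 * logistic_loss(y, raw)
         + logistic_loss(y, raw - eps)) / eps ** 2
print(g, h, round(g_num, 6), round(h_num, 4))
```

Only these two numbers per sample are needed to build the next tree, which is why the second-order expansion lets the same tree-building code serve different loss functions.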
In the CART construction process, each time a split is added to an existing leaf, whether to add the node can be decided dynamically during tree construction according to the splitting gain:

$$Gain = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma$$

wherein Gain is the splitting gain (the cut point of a feature is selected by comparing splitting gains); $G_L$, $H_L$ and $G_R$, $H_R$ are the sums of the first derivatives $g_i$ and second derivatives $h_i$ over the samples falling in the left and right subtrees respectively; $\frac{G_L^2}{H_L + \lambda}$ is the score of the left subtree (a binary tree is a tree structure in which each node has at most two subtrees, the left subtree and the right subtree); $\frac{G_R^2}{H_R + \lambda}$ is the score of the right subtree; $\frac{(G_L + G_R)^2}{H_L + H_R + \lambda}$ is the score obtained without splitting; $\gamma$ is the L1-type penalty coefficient, which controls the number of leaf nodes; and $\lambda$ is the L2-type penalty coefficient, which keeps the leaf scores from becoming too large.
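The split selection described above can be sketched as follows, using the standard XGBoost gain expression built from sums of first and second derivatives. The function names, the single-feature setting, and the toy gradient values are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def structure_score(G: float, H: float, lam: float) -> float:
    """Score of a node from its summed first (G) and second (H) derivatives."""
    return G * G / (H + lam)


def split_gain(G_L: float, H_L: float, G_R: float, H_R: float,
               lam: float = 1.0, gamma: float = 0.0) -> float:
    """Left score + right score - unsplit score, halved, minus the gamma penalty."""
    return 0.5 * (structure_score(G_L, H_L, lam)
                  + structure_score(G_R, H_R, lam)
                  - structure_score(G_L + G_R, H_L + H_R, lam)) - gamma


def best_split(xs: List[float], g: List[float], h: List[float],
               lam: float = 1.0, gamma: float = 0.0) -> Optional[Tuple[float, float]]:
    """Try every cut point of one feature and return (threshold, gain) of the best."""
    best = None
    G, H = sum(g), sum(h)
    for thr in sorted(set(xs))[1:]:  # samples with x < thr go to the left subtree
        G_L = sum(gi for x, gi in zip(xs, g) if x < thr)
        H_L = sum(hi for x, hi in zip(xs, h) if x < thr)
        gain = split_gain(G_L, H_L, G - G_L, H - H_L, lam, gamma)
        if best is None or gain > best[1]:
            best = (thr, gain)
    return best


xs = [1.0, 2.0, 3.0, 4.0]
g = [-0.5, -0.5, 0.5, 0.5]   # toy first derivatives
h = [0.25, 0.25, 0.25, 0.25]  # toy second derivatives
threshold, gain = best_split(xs, g, h)
print(threshold, round(gain, 4))

# With a larger gamma the same split no longer pays off (gain below zero).
_, penalized_gain = best_split(xs, g, h, gamma=1.0)
print(penalized_gain < 0)
```

A split is worth adding only when its gain is positive, so gamma acts directly as the minimum loss reduction required to split a node.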
During the training process, the classification model repeatedly calculates the node loss in order to select the leaf node split with the largest gain. The loss calculation process of the classification model is shown in FIG. 3.
S4: inputting the test set input vector into the XGBoost fault classification model for testing to obtain the classification result of the test set, namely the battery fault alarm grade.
S5: comprehensively evaluating the battery fault alarm grade using evaluation indexes such as accuracy (the proportion of correctly predicted samples among all samples), precision (the proportion of correctly predicted positive samples among all samples predicted as positive), recall (recall = correctly classified positive samples / all actual positive samples), and F1 (the F score, F1 = 2PR/(P + R), where P is precision and R is recall).
In the test, the classification accuracy is 99.80%, the precision is 96.72%, the recall is 88.24%, F1 is 92.05%, and Cohen's kappa (a statistical coefficient that measures classification consistency) is 95.65%. Compared with classification models such as a convolutional neural network (CNN), a long short-term memory (LSTM) network and a support vector machine (SVM), the XGBoost classification algorithm used in this technical scheme significantly improves every index.
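The evaluation indexes named above can be computed from a confusion matrix. The sketch below assumes a binary setting (1 = fault, 0 = normal) for brevity, whereas the alarm grades in the scheme may form more than two classes; the function name and the toy labels are hypothetical and unrelated to the reported results.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, F1 and Cohen's kappa for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Agreement expected by chance, from the marginal class frequencies.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On imbalanced data, accuracy alone can look high even when few faults are caught, which is why precision, recall, F1 and kappa are reported alongside it.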
This technical scheme uses advanced sensor technology to acquire various parameters of the vehicle-mounted proton exchange membrane fuel cell while the vehicle is running, such as power battery system data, fuel cell system data, driving behavior data, vehicle position information and vehicle alarm information. Most of these data are recorded during normal operation; fault data make up a very small proportion (< 1%), so a conventional classifier is biased toward the majority class. XGBoost is an optimization of the Boosting algorithm that integrates multiple CART models into a strong classifier and generalizes better than a single classifier. XGBoost training supports parallelism at feature granularity, which speeds up model training. In fuel cell fault classification, the data fed to the XGBoost algorithm are severely imbalanced and the majority class is easily over-fitted during training; the XGBoost algorithm counters over-fitting with several strategies such as regularization, shrinkage and column sampling. The XGBoost algorithm also optimizes the loss function, which raises the fitting accuracy of the model. Therefore, to classify fuel cell faults accurately in an imbalanced data set, the scheme combines resampling of the data set with the XGBoost classification algorithm, improving the safety and maintainability of the fuel cell vehicle.
The scheme provides a method that classifies fuel cell faults by resampling (namely undersampling) the data set and applying the XGBoost classification algorithm, so that fuel cell faults are accurately classified and diagnosed from collected fuel cell vehicle driving data whose distribution is imbalanced. This improves the safety and maintainability of the fuel cell vehicle, reveals weak links in the system, supports optimization of the battery management system, and fundamentally improves system stability.
The scheme analyzes a large amount of data acquired during the operation of proton exchange membrane fuel cell vehicles. Taking the characteristics of the acquired data into account, it resamples the imbalanced sample data, applies the XGBoost algorithm to fit the relationship between battery faults and vehicle operation data, and evaluates the classification result with several evaluation indexes. Compared with other fault classification algorithms, the scheme improves both the efficiency and the accuracy of fault classification: random undersampling is applied because the sample distribution of the data set is severely uneven, which avoids over-fitting the majority class and under-fitting the minority classes and provides more effective and representative data samples for the subsequent classification algorithm; the method can analyze the relationship between fuel cell vehicle operation data and battery faults, which helps protect the safe operation of fuel cell vehicles; by adopting the ensemble-classifier approach, the scheme significantly improves accuracy, precision, recall, F1 and other evaluation indexes relative to other fault classification algorithms, and in particular improves the classification accuracy of the minority classes; and the XGBoost-based fault classification algorithm is also more computationally efficient.
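The preprocessing chain (cleaning, random undersampling, normalization, train/test division) can be sketched as below. This is a minimal illustration under stated assumptions: the cleaning rule (drop rows with missing or non-finite values), the 1:1 undersampling ratio, min-max scaling, the 80/20 split and all function names are hypothetical choices, and the data are synthetic.

```python
import math
import random
from typing import List, Optional, Tuple

Row = Tuple[List[Optional[float]], int]  # (features, label); 0 = normal, 1 = fault


def clean(rows: List[Row]) -> List[Row]:
    """Assumed cleaning rule: drop rows with missing or non-finite values."""
    return [(x, y) for x, y in rows
            if all(v is not None and math.isfinite(v) for v in x)]


def random_undersample(rows: List[Row], ratio: float = 1.0, seed: int = 0) -> List[Row]:
    """Keep all fault rows and a random subset of normal rows (ratio per fault row)."""
    rng = random.Random(seed)
    faults = [r for r in rows if r[1] != 0]
    normals = [r for r in rows if r[1] == 0]
    keep = min(len(normals), int(round(ratio * len(faults))))
    sampled = faults + rng.sample(normals, keep)
    rng.shuffle(sampled)
    return sampled


def min_max_normalize(rows: List[Row]) -> List[Row]:
    """Scale every feature to [0, 1]; constant features map to 0."""
    n = len(rows[0][0])
    lows = [min(r[0][j] for r in rows) for j in range(n)]
    highs = [max(r[0][j] for r in rows) for j in range(n)]
    return [([(x[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
              for j in range(n)], y) for x, y in rows]


def split(rows: List[Row], train_fraction: float = 0.8):
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Synthetic data: 990 normal rows, 10 fault rows (about 1%), one invalid row.
rng = random.Random(42)
raw = [([rng.uniform(200, 400), rng.uniform(40, 90)], 0) for _ in range(990)]
raw += [([rng.uniform(100, 200), rng.uniform(90, 120)], 1) for _ in range(10)]
raw.append(([None, 50.0], 0))

balanced = random_undersample(clean(raw))
train, test = split(min_max_normalize(balanced))
print(len(balanced), len(train), len(test))
```

The steps run in the order the method lists them, so normalization happens before the split; the fixed seed only makes the random sampling repeatable.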
As shown in fig. 4, a vehicle-mounted fuel cell remote fault classification diagnosis apparatus includes:
the data acquisition module 101 is used for acquiring the data of the operation of the fuel cell vehicle;
the data preprocessing module 102 is used for preprocessing the data of the fuel cell vehicle operation to obtain a training set input vector and a testing set input vector;
the model training module 103 is used for inputting the training set input vector into the XGBoost classifier to train the classification model and obtain an XGBoost fault classification model;
and the battery fault alarm grade classification module 104 is used for inputting the test set input vector into the XGBoost fault classification model for testing to obtain the classification result of the test set, namely the battery fault alarm grade.
Referring to fig. 5, an embodiment of the present invention further provides a terminal. As shown, the terminal 300 includes a processor 301 and a memory 302. The processor 301 is electrically connected to the memory 302. The processor 301 is a control center of the terminal 300, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal and processes data by running or calling a computer program stored in the memory 302 and calling data stored in the memory 302, thereby performing overall monitoring of the terminal 300.
In this embodiment, the processor 301 in the terminal 300 loads instructions corresponding to one or more processes of the computer program into the memory 302, and the processor 301 runs the computer program stored in the memory 302 to implement the following functions: acquiring fuel cell vehicle operation data; preprocessing the fuel cell vehicle operation data to obtain a training set input vector and a test set input vector; inputting the training set input vector into an XGBoost classifier to train a classification model and obtain an XGBoost fault classification model; and inputting the test set input vector into the XGBoost fault classification model for testing to obtain the classification result of the test set, namely the battery fault alarm grade.
Memory 302 may be used to store computer programs and data. The memory 302 stores computer programs containing instructions executable in the processor. The computer program may constitute various functional modules. The processor 301 executes various functional applications and data processing by calling a computer program stored in the memory 302.
An embodiment of the present application provides a storage medium storing a computer program. When executed by a processor, the computer program performs the method in any optional implementation of the foregoing embodiment to implement the following functions: acquiring fuel cell vehicle operation data; preprocessing the fuel cell vehicle operation data to obtain a training set input vector and a test set input vector; inputting the training set input vector into an XGBoost classifier to train a classification model and obtain an XGBoost fault classification model; and inputting the test set input vector into the XGBoost fault classification model for testing to obtain the classification result of the test set, namely the battery fault alarm grade. The storage medium may be implemented by any type of volatile or nonvolatile storage device or a combination thereof, such as a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic memory, a flash memory, a magnetic disk, or an optical disk.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (10)
1. A vehicle-mounted fuel cell remote fault classification diagnosis method is characterized by comprising the following steps:
acquiring data of the operation of the fuel cell vehicle;
preprocessing the data of the fuel cell vehicle operation to obtain a training set input vector and a testing set input vector;
inputting the training set input vector into an XGBoost classifier to train a classification model to obtain an XGBoost fault classification model;
and inputting the input vector of the test set into the XGBoost fault classification model for testing to obtain a classification result of the test set, namely a battery fault alarm grade.
2. The method for remote fault classification and diagnosis of the vehicle-mounted fuel cell according to claim 1, wherein the data of the operation of the fuel cell vehicle is acquired by a plurality of state monitoring sensors arranged on the fuel cell vehicle.
3. The remote fault classification diagnosis method for the vehicle-mounted fuel cell according to claim 1, wherein the data for the operation of the fuel cell vehicle is preprocessed to obtain a training set input vector and a testing set input vector, and the method specifically comprises the following steps:
S21: cleaning the data of the fuel cell vehicle operation;
S22: performing undersampling processing on the cleaned data of the fuel cell vehicle operation;
S23: normalizing the data of the fuel cell vehicle operation after undersampling;
S24: dividing the normalized data of the fuel cell vehicle operation into a training set input vector and a testing set input vector according to a certain proportion.
4. The remote fault classification diagnosis method for the vehicle-mounted fuel cell according to claim 3, characterized in that S21 specifically comprises: deleting, from the data of the fuel cell vehicle operation, data whose validity does not meet the requirement; and deleting, from the data of the fuel cell vehicle operation, data whose relevance to fuel cell faults does not meet the requirement.
5. The vehicle-mounted fuel cell remote fault classification diagnosis method according to claim 1, characterized in that, in inputting the training set input vector into the XGBoost classifier to train the classification model and obtain the XGBoost fault classification model, the training parameters are tuned during training in combination with a grid-search automatic tuning algorithm.
6. The remote fault classification diagnosis method for the vehicle-mounted fuel cell according to claim 5, characterized in that the training parameters comprise iteration number, learning rate, regularization index, gamma, min_child_weight and max_depth.
7. The on-vehicle fuel cell remote fault classification diagnosis method according to claim 1, characterized by further comprising the steps of: and comprehensively evaluating the battery fault alarm level by adopting an evaluation index.
8. The remote fault classification diagnosis method for the vehicle-mounted fuel cell according to claim 7, characterized in that the evaluation indexes include accuracy, precision, recall and F1.
9. A vehicle-mounted fuel cell remote failure classification diagnosis device, characterized by comprising:
the data acquisition module is used for acquiring the data of the fuel cell vehicle operation;
the data preprocessing module is used for preprocessing the data of the fuel cell vehicle operation to obtain a training set input vector and a testing set input vector;
the model training module inputs the training set input vector into an XGBoost classifier to train a classification model to obtain an XGBoost fault classification model;
and the battery fault alarm grade classification module inputs the input vector of the test set into the XGBoost fault classification model for testing to obtain a classification result of the test set, namely the battery fault alarm grade.
10. A storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110104677.XA CN112782589A (en) | 2021-01-26 | 2021-01-26 | Vehicle-mounted fuel cell remote fault classification diagnosis method and device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112782589A true CN112782589A (en) | 2021-05-11 |
Family
ID=75757931
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110104677.XA Pending CN112782589A (en) | 2021-01-26 | 2021-01-26 | Vehicle-mounted fuel cell remote fault classification diagnosis method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112782589A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115542172A (en) * | 2022-12-01 | 2022-12-30 | 湖北工业大学 | Power battery fault detection method, system, device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109840541A (en) * | 2018-12-05 | 2019-06-04 | 国网辽宁省电力有限公司信息通信分公司 | A kind of network transformer Fault Classification based on XGBoost |
CN110705657A (en) * | 2019-11-21 | 2020-01-17 | 北京交通大学 | Mode identification fault diagnosis method of proton exchange membrane fuel cell system |
CN110986407A (en) * | 2019-11-08 | 2020-04-10 | 杭州电子科技大学 | Fault diagnosis method for centrifugal water chilling unit |
CN111812535A (en) * | 2020-06-30 | 2020-10-23 | 南京林业大学 | Power battery fault diagnosis method and system based on data driving |
CN112214369A (en) * | 2020-10-23 | 2021-01-12 | 华中科技大学 | Hard disk fault prediction model establishing method based on model fusion and application thereof |
Non-Patent Citations (2)
Title |
---|
何龙: "《深入理解XGBoost高效机器学习算法与进阶》", 31 January 2020 * |
王青天 等: "《Python金融大数据风控建模实践》", 31 May 2020 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210511 |