CN109829236A - A compressor fault diagnosis method based on XGBoost feature extraction
- Publication number
- CN109829236A (application number CN201910100466.1A)
- Authority
- CN
- China
- Prior art keywords
- failure
- fault
- tree
- sample
- splitting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a compressor fault diagnosis method based on XGBoost feature extraction. First, the loss function of the XGBoost algorithm is customized according to the fault data and fault types, and fault splitting trees are constructed iteratively. Second, the leaf-node position index vector of each sample in the fault splitting trees is extracted and re-encoded as features, yielding an intelligent representation of the hidden fault information. Then, based on this representation matrix, fault prediction models are built with an SVM and with a neural network, respectively, to perform predictive diagnosis of multiple fault modes. The method fully mines the hidden fault feature information in the data, which raises the accuracy of fault diagnosis and prediction.
Description
Technical field
The present invention relates to fault diagnosis methods based on machine learning, and more particularly to a compressor fault diagnosis method based on XGBoost feature extraction.
Background art
For a fault diagnosis model, the quality of feature extraction largely determines the model's performance, which makes it a key link in fault diagnosis. Feature extraction is the process of mining the most representative information from fault data, and of deeply mining the implicit information in that data. Poor fault features not only reduce the efficiency of the algorithm but also lower the model's fault prediction accuracy. Studying effective feature extraction methods is therefore essential.
At present, most of the literature divides fault diagnosis methods into methods based on analytical models, methods based on empirical knowledge, and data-driven methods. Because modern systems are multivariable, strongly coupled and nonlinear, building a system model is extremely difficult, so diagnosis based on analytical models or empirical knowledge performs poorly. Data-driven methods have therefore attracted growing attention and become a hot topic in fault diagnosis. A data-driven method extracts useful information from the system's process data and diagnoses system faults from that information. Data-driven fault diagnosis is further subdivided into methods based on statistical analysis, methods based on signal processing, and methods based on artificial intelligence. Because there is no simple correspondence between fault types and fault symptoms, AI-based fault diagnosis is better suited to the uncertainty and complexity of such systems. It trains various learning algorithms on normal and fault data from the industrial process in order to diagnose faults. Its technical difficulty lies in mining the important implicit feature information from the monitored fault data so as to characterize the normal and fault operating modes of the system. AI-based fault diagnosis methods include neural networks, support vector machines, extreme learning machines and fuzzy logic. For component degradation and mechanical wear in production systems, some researchers have combined time-frequency analysis with an ANN to detect and diagnose faults in the frequency domain. Others have proposed a least-squares SVM hybrid classifier whose SVM parameters are optimized by particle swarm optimization during training, and applied it to fault diagnosis by dissolved gas analysis of oil-immersed power transformers. A rolling-bearing fault diagnosis technique combining singular value decomposition with the extreme learning machine has also been proposed. Still others select relevant features with a decision tree and fine-tune the network parameters of an adaptive neuro-fuzzy inference network by combining backpropagation with least squares, in order to diagnose faults in induction motors. Meanwhile, addressing the open problems of complex systems, researchers have looked ahead at the field from three angles (adding new information, mining unused implicit information, and applying new mathematical tools) and proposed four research directions for data-driven fault diagnosis: fault diagnosis based on multi-source information fusion, on association analysis, on machine learning, and on time-frequency analysis.
In machine-learning-based fault diagnosis, traditional intelligent learning methods, whether used for classification or regression, are mostly shallow-structure algorithms. Their limitation is a restricted ability to represent complex functions given finite samples and computing units, which constrains their generalization on complex classification problems. How to mine and represent fault features from monitoring data is the research difficulty of such methods; if the implicit information in fault data can be properly extracted and represented, better fault detection and prediction results can be obtained. The machine learning methods currently used in fault diagnosis, such as neural networks and support vector machines, all fit the monitoring data from the standpoint of approximation theory. They suffer from limited approximation accuracy, among other shortcomings, and still cannot fully mine the fault features in the monitoring data.
In view of the above state of research on machine-learning fault diagnosis and its problems, the present invention proposes a compressor fault diagnosis method based on XGBoost feature extraction. First, according to the monitored compressor fault data and fault types, the method customizes an XGBoost loss function suited to the fault diagnosis scenario and uses it to construct the corresponding fault splitting trees. Second, it determines the leaf-node position index of each data sample in the fault splitting trees and builds a position index matrix. Finally, it re-encodes the index matrix in a specific way to extract and intelligently represent the hidden fault information. By constructing fault splitting trees, the method collects statistics on the information shared among the important variables in the data, including the structure and distribution characteristics of the data. At the same time, setting the depth of the splitting trees controls how deeply the hidden feature information in the data is mined. The effectiveness of the feature extraction method is verified on the fault diagnosis of a compressor.
Summary of the invention
In view of the above problems in the prior art, and in order to fully mine the hidden fault feature information in the data and improve fault diagnosis accuracy, the object of the invention is to provide a compressor fault diagnosis method based on XGBoost feature extraction.
The compressor fault diagnosis method based on XGBoost feature extraction comprises the following steps:
1) customizing the loss function of the XGBoost algorithm according to the compressor fault data samples and fault types;
2) iteratively constructing fault splitting trees;
3) extracting the leaf-node position index vector of each fault data sample in the fault splitting trees and re-encoding it as features, to obtain an intelligent representation of the hidden fault information;
4) based on the representation matrix, building fault prediction models with an SVM and with a neural network, respectively, to perform predictive diagnosis of multiple fault modes.
In the compressor fault diagnosis method based on XGBoost feature extraction, customizing the loss function of the XGBoost algorithm in step 1) according to the fault data samples and fault types comprises the following steps:
1.1) For multi-class fault diagnosis, maximum likelihood estimation is introduced. For the given fault data samples and their corresponding fault state classes, the custom loss function $l(y_i, \hat{y}_i)$ of formula (1) is defined, where n is the total number of fault data samples, $y_i$ is the true fault class and $\hat{y}_i$ is the predicted fault class. According to XGBoost theory, the objective function of the fault splitting tree model FM at the t-th iteration can then be expressed as:
$$FM^{(t)} = \sum_{i=1}^{n} l\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + C \tag{2}$$
where $\Omega(f_t)$ is the complexity of the fault splitting tree, $\hat{y}_i^{(t-1)}$ is the fault prediction at iteration t-1, $f_t(x_i)$ is the function added at iteration t, and C is a constant.
A Taylor expansion gives the approximate objective function:
$$FM^{(t)} \approx \sum_{i=1}^{n}\left[l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i)\right] + \Omega(f_t) + C \tag{3}$$
where $g_i = \partial_{\hat{y}_i^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and $h_i = \partial^{2}_{\hat{y}_i^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right)$; $g_i$ and $h_i$ carry the first- and second-derivative information of $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and guide the splitting and construction of the fault splitting tree.
For the fault splitting tree model FM, removing the constant terms $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and C leaves:
$$FM^{(t)} = \sum_{i=1}^{n}\left[g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i)\right] + \Omega(f_t) \tag{4}$$
The function $FM^{(t)}$ is now the final objective function of the fault splitting tree to be trained. Then, according to XGBoost theory, accumulating over the leaf nodes of the fault splitting tree gives:
$$FM^{(t)} = \sum_{j=1}^{T}\left[\Big(\sum_{i \in I_j} g_i\Big) w_j + \tfrac{1}{2}\Big(\sum_{i \in I_j} h_i + \lambda\Big) w_j^{2}\right] + \gamma T \tag{5}$$
where $I_j$ is the set of fault data samples in leaf j, $w_j$ is the score assigned to the fault data samples in leaf node j, T is the number of leaf nodes of the fault splitting tree, and λ and γ control the weights of the corresponding terms. Differentiating the above with respect to $w_j$ and setting the derivative to zero gives the minimum loss of the fault splitting tree structure:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad S = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^{2}}{H_j + \lambda} + \gamma T \tag{6}$$
where $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$, $w_j^{*}$ is the output score of the minimum-loss fault splitting tree structure, and S is the minimum loss.
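The step from the leaf-wise objective to the minimum loss can be written out explicitly. The following is a sketch assuming the standard XGBoost regularizer $\Omega(f_t) = \gamma T + \tfrac{1}{2}\lambda\sum_j w_j^2$ and the shorthand $G_j$, $H_j$ for the per-leaf sums of $g_i$ and $h_i$; both are assumptions drawn from general XGBoost theory rather than from the text of this patent.

$$
\begin{aligned}
FM^{(t)} &= \sum_{j=1}^{T}\Big[G_j w_j + \tfrac{1}{2}(H_j+\lambda)\,w_j^{2}\Big] + \gamma T,
\qquad G_j=\sum_{i\in I_j} g_i,\quad H_j=\sum_{i\in I_j} h_i \\
\frac{\partial FM^{(t)}}{\partial w_j} &= G_j + (H_j+\lambda)\,w_j = 0
\;\;\Longrightarrow\;\; w_j^{*} = -\frac{G_j}{H_j+\lambda} \\
S &= FM^{(t)}\Big|_{w_j = w_j^{*}} = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^{2}}{H_j+\lambda} + \gamma T
\end{aligned}
$$

Because each leaf contributes an independent quadratic in $w_j$ with positive curvature $H_j + \lambda$, the stationary point is a minimum, and a larger λ shrinks every leaf score toward zero.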
In the compressor fault diagnosis method based on XGBoost feature extraction, iteratively constructing the fault splitting trees in step 2) comprises the following steps:
2.1) A loss function $l(y_i, \hat{y}_i)$ is constructed from the fault data samples and fault types, and is used to build the objective function of the fault splitting tree at the t-th iteration;
2.2) Formula (3) is used as the objective function to guide the construction of the fault splitting tree at each step;
2.3) During the splitting of the fault splitting tree, one parameter feature at a time is selected from the samples at the current node, and formula (7) below is used to compute the gain Gain obtained when that parameter is used as the splitting criterion of the leaf node:
$$Gain = \frac{1}{2}\left[\frac{G_L^{2}}{H_L + \lambda} + \frac{G_R^{2}}{H_R + \lambda} - \frac{(G_L + G_R)^{2}}{H_L + H_R + \lambda}\right] \tag{7}$$
where the first term is the information score of the left subtree, the second term is the information score of the right subtree, and the third term is the information score of the current unsplit node; $G_L$ and $H_L$ are the sums of $g_i$ and $h_i$ over the left subtree after splitting, and $G_R$ and $H_R$ are the sums of $g_i$ and $h_i$ over the right subtree after splitting;
2.4) After the gains of all parameters have been computed, the fault parameter used to split the current fault splitting tree is selected according to Gain, and the fault data samples are placed in the corresponding leaf nodes;
2.5) Whether to split the current node is decided by whether Gain is below the set threshold γ. At the same time, whether the fault splitting tree has reached the set maximum depth determines whether its construction is complete and splitting stops. If so, formula (6) is evaluated to obtain the minimum loss of this fault splitting tree; if not, the procedure returns to step 2.3) and the subsequent steps are executed in order.
In the compressor fault diagnosis method based on XGBoost feature extraction, step 3), which extracts the leaf-node position index vector of each fault data sample in the fault splitting trees and re-encodes it to obtain an intelligent representation of the hidden fault information, comprises the following steps:
3.1) Let the fault data sample set be {x1, x2, …, xn}. With K fault splitting trees, each having T leaf nodes, the leaf-node position of the n-th sample in the k-th fault splitting tree is $a_{nk}$, with $a_{nk} \in [1, T]$ and $k \in [1, K]$. The position index vector of a sample therefore has dimension K, and its position indices in different fault splitting trees may be equal or different. The position index vector matrix of all samples is:
$$Z = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nK} \end{bmatrix}$$
where n is the number of samples and K is the number of fault splitting trees;
3.2) In this position index matrix, the numerical distance between samples only reflects a difference of position within the fault splitting trees. The hidden fault information it carries is implicit, and training a fault diagnosis model directly on this matrix would prevent the downstream algorithm from learning useful fault information. To make better use of the information implied by the index positions, the index vectors are re-encoded: the values of the position index feature are first expanded into Euclidean space, giving a new feature set V, where V is the set of all elements in Z. The index vector of each sample is then extended to dimension K × T, in which a position takes the value 1 when the sample has the corresponding value in V and 0 otherwise. This yields the encoded reconstruction matrix of all samples, an n × (K × T) matrix of zeros and ones.
In the compressor fault diagnosis method based on XGBoost feature extraction, step 4), in which SVM and neural-network fault prediction models built on the representation matrix perform predictive diagnosis of multiple fault modes, comprises the following steps:
4.1) Fault data preprocessing: the collected fault data samples are initially disordered and often contain missing values, duplicate values and outliers, whereas constructing the fault splitting trees requires data in numerical matrix form. Depending on the data, preprocessing therefore uses discretization and filling of missing values with the mean or median, to obtain well-formed sample data;
4.2) Fault splitting tree construction:
4.2.1) A loss function $l(y_i, \hat{y}_i)$ is constructed from the fault data samples and fault types, and is used to build the objective function of the fault splitting tree at the t-th iteration;
4.2.2) Formula (5) is used as the objective function to guide the construction of the fault splitting tree at each step;
4.2.3) During tree splitting, one parameter feature at a time is selected from the samples at the current tree node, and formula (7) is used to compute the gain Gain obtained when that parameter is used as the splitting criterion of the leaf node;
4.2.4) After the gains of all parameters have been computed, the fault parameter used to split the current fault tree is selected according to Gain, and the fault data samples are placed in the corresponding leaf nodes;
4.2.5) Whether to split the current tree node is decided by whether Gain is below the set threshold γ. At the same time, whether the fault splitting tree has reached the set maximum depth determines whether its construction is complete and splitting stops. If so, formula (6) is evaluated to obtain the minimum loss of this fault splitting tree; if not, the procedure returns to step 4.2.3) and the subsequent steps are executed in order;
The fault splitting tree with the optimal structure is found through the above steps, giving the optimal number of leaf nodes T and the corresponding leaf-node scores w;
4.3) Leaf-node position index vector extraction: using the constructed fault splitting trees, the leaf-node position of each fault data sample in each tree is extracted to form the corresponding index vector, giving the index vector matrix Z of all fault data samples;
4.4) Index feature set construction: following the method described in step 3.2), the index vector feature space is reconstructed to obtain the new feature set V;
4.5) Encoded matrix generation: the index vector of each fault data sample is extended to dimension K × T and re-encoded by the method described in step 3.2), giving the encoded matrix;
4.6) Fault prediction model building and diagnosis: based on the encoded matrix and the original parameters of the fault samples, fault diagnosis and prediction models are built with an SVM and with a neural network, respectively; the models are used to predict the fault samples to be diagnosed and to output the corresponding diagnosis results.
By adopting the above technique, the invention has the following advantages over the prior art:
1) The invention proposes an XGBoost feature extraction method that mines the fault feature information hidden in fault data and represents it intelligently. It also gives an importance ranking of the fault feature parameters, which serves as a basis for fault location and detection;
2) The invention proposes a compressor fault diagnosis method based on XGBoost feature extraction: the XGBoost loss function is customized for the specific fault diagnosis scenario, fault splitting trees are constructed, the leaf-node position index vector of each sample in the fault splitting tree model is extracted and re-encoded to obtain a representation matrix of the hidden feature information, and fault prediction models are then built with common machine learning algorithms (SVM and neural network) to carry out fault diagnosis;
3) The invention applies the XGBoost hidden-feature extraction method to a compressor fault data set, builds an SVM compressor fault diagnosis model and a neural-network diagnosis model based on XGBoost feature extraction, and compares their accuracy on the fault test set, demonstrating the effectiveness of the diagnosis models obtained by the method.
Brief description of the drawings
Fig. 1 is the flow chart for constructing a fault splitting tree according to the invention;
Fig. 2 is the flow chart for building the compressor fault diagnosis model based on XGBoost feature extraction;
Fig. 3 is the structure of a single splitting tree of the FM fault model on the compressor diagnosis data set;
Fig. 4 is the importance ranking of the fault feature parameters on the compressor diagnosis data set;
Fig. 5 is the decision boundary of the SVM compressor fault model based on XGBoost fault features;
Fig. 6 is the decision boundary of the neural-network compressor fault model based on XGBoost fault features.
Detailed description of the embodiments
The invention is further described below on the basis of a compressor fault diagnosis data set.
As shown in Figs. 1-2, the fault diagnosis method of the invention based on XGBoost feature extraction comprises the following steps:
1) The loss function of the XGBoost algorithm is customized according to the fault data samples and fault types. The function is constructed as follows:
1.1) For multi-class fault diagnosis, maximum likelihood estimation is introduced. For the given fault samples and their corresponding fault state classes, the custom loss function $l(y_i, \hat{y}_i)$ of formula (1) is defined, where n is the total number of fault data samples, $y_i$ is the true fault class and $\hat{y}_i$ is the predicted fault class. Drawing on XGBoost theory, the objective function of the fault splitting tree model FM at the t-th iteration can then be expressed as
$$FM^{(t)} = \sum_{i=1}^{n} l\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + C \tag{2}$$
where $\Omega(f_t)$ is the complexity of the fault splitting tree, $\hat{y}_i^{(t-1)}$ is the fault prediction at iteration t-1, $f_t(x_i)$ is the function added at iteration t, and C is a constant. A Taylor expansion gives the approximate objective function
$$FM^{(t)} \approx \sum_{i=1}^{n}\left[l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i)\right] + \Omega(f_t) + C \tag{3}$$
where $g_i = \partial_{\hat{y}_i^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and $h_i = \partial^{2}_{\hat{y}_i^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right)$; $g_i$ and $h_i$ carry the first- and second-derivative information of $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and guide the splitting and construction of the fault tree. For the fault splitting tree FM, removing the constant terms $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and C leaves
$$FM^{(t)} = \sum_{i=1}^{n}\left[g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i)\right] + \Omega(f_t) \tag{4}$$
The function $FM^{(t)}$ is now the final objective function of the fault splitting tree to be trained. Then, according to XGBoost theory, accumulating over the leaf nodes of the fault splitting tree gives:
$$FM^{(t)} = \sum_{j=1}^{T}\left[\Big(\sum_{i \in I_j} g_i\Big) w_j + \tfrac{1}{2}\Big(\sum_{i \in I_j} h_i + \lambda\Big) w_j^{2}\right] + \gamma T \tag{5}$$
where $I_j$ is the set of fault data samples in leaf j, $w_j$ is the score assigned to the fault data samples in leaf node j, T is the number of leaf nodes of the fault splitting tree, and λ and γ control the weights of the corresponding terms. Differentiating the above with respect to $w_j$ and setting the derivative to zero gives the minimum loss of the fault tree structure:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad S = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^{2}}{H_j + \lambda} + \gamma T \tag{6}$$
where $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$, $w_j^{*}$ is the output score of the minimum-loss fault tree structure, and S is the minimum loss.
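To make formula (6) concrete, the sketch below computes the leaf score and the minimum loss for a single leaf. It is a minimal illustration, not the patent's implementation: the binary logistic loss used to obtain $g_i$ and $h_i$ is an assumption (the custom loss is not given explicitly in this text), and the function names `logistic_grad_hess`, `leaf_weight` and `structure_score` are hypothetical.

```python
import math

def logistic_grad_hess(y_true, raw_score):
    """First and second derivatives of the logistic loss w.r.t. the raw score.

    Assumption: a binary logistic loss stands in for the custom loss,
    whose explicit form is not given in the text.
    """
    p = 1.0 / (1.0 + math.exp(-raw_score))  # predicted probability
    return p - y_true, p * (1.0 - p)        # (g_i, h_i)

def leaf_weight(G, H, lam):
    """Optimal leaf score w_j* = -G_j / (H_j + lambda)."""
    return -G / (H + lam)

def structure_score(leaf_stats, lam, gamma):
    """Minimum loss S = -1/2 * sum_j G_j^2 / (H_j + lambda) + gamma * T."""
    T = len(leaf_stats)
    return -0.5 * sum(G * G / (H + lam) for G, H in leaf_stats) + gamma * T

# Three samples in one leaf, all starting from raw score 0 (p = 0.5).
labels = [1, 1, 0]
stats = [logistic_grad_hess(y, 0.0) for y in labels]
G = sum(g for g, _ in stats)   # -0.5 - 0.5 + 0.5 = -0.5
H = sum(h for _, h in stats)   # 3 * 0.25 = 0.75
w = leaf_weight(G, H, lam=1.0)
S = structure_score([(G, H)], lam=1.0, gamma=0.0)
```

With two positive samples and one negative sample the summed gradient is negative, so the leaf score `w` comes out positive, pushing the prediction toward the majority class; raising `lam` shrinks it.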
2) Fault splitting trees are constructed iteratively, as follows:
2.1) A loss function $l(y_i, \hat{y}_i)$ is constructed from the fault data samples and fault types, and is used to build the objective function of the fault splitting tree at the t-th iteration.
2.2) Formula (3) is used as the objective function to guide the construction of the fault splitting tree at each step.
2.3) During tree splitting, one parameter feature at a time is selected from the samples at the current tree node, and formula (7) below is used to compute the gain Gain obtained when that parameter is used as the splitting criterion of the leaf node.
$$Gain = \frac{1}{2}\left[\frac{G_L^{2}}{H_L + \lambda} + \frac{G_R^{2}}{H_R + \lambda} - \frac{(G_L + G_R)^{2}}{H_L + H_R + \lambda}\right] \tag{7}$$
where the first term is the information score of the left subtree, the second term is the information score of the right subtree, and the third term is the information score of the current unsplit node; $G_L$, $H_L$ and $G_R$, $H_R$ are the sums of $g_i$ and $h_i$ over the left and right subtrees after splitting.
2.4) After the gains of all parameters have been computed, the fault parameter used to split the current fault tree is selected according to Gain, and the fault data samples are placed in the corresponding leaf nodes.
2.5) Whether to split the current tree node is decided by whether Gain is below the set threshold γ. At the same time, whether the fault splitting tree has reached the set maximum depth determines whether its construction is complete and splitting stops. If so, formula (6) is evaluated to obtain the minimum loss of this fault splitting tree; if not, the procedure returns to step 2.3) and the subsequent steps are executed in order.
During the iterative computation, the algorithm continually enumerates split structures; each time it attempts to split a leaf node, it computes the gain Gain of the splitting parameter so as to distill the statistics of the important information. The above procedure finds a fault splitting tree of optimal structure, and evaluating formula (6) gives the corresponding minimum loss S and the scores $w_j$ of the current structure; the leaf nodes of the tree then contain the most, and the most important, hidden fault feature information.
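Steps 2.3) to 2.5) can be sketched for a single feature as follows. This is an illustrative sketch under stated assumptions: the gain is taken to be the standard XGBoost three-term expression (left score plus right score minus unsplit score, with λ in each denominator), candidate thresholds are midpoints between consecutive sorted feature values, and the split with the largest gain is kept. The names `split_gain` and `best_split` are hypothetical.

```python
def split_gain(GL, HL, GR, HR, lam):
    """Gain of a split: left score + right score - unsplit score, halved."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR))

def best_split(values, grads, hess, lam, gamma):
    """Scan thresholds on one feature; return (gain, threshold) or None.

    A split is kept only if its gain reaches the threshold gamma.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    G_total, H_total = sum(grads), sum(hess)
    GL = HL = 0.0
    best = None
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        GL += grads[i]
        HL += hess[i]
        if values[i] == values[nxt]:
            continue  # cannot split between equal feature values
        gain = split_gain(GL, HL, G_total - GL, H_total - HL, lam)
        if gain >= gamma and (best is None or gain > best[0]):
            best = (gain, 0.5 * (values[i] + values[nxt]))
    return best

# Toy node: low feature values carry negative gradients, high ones positive.
feature = [0.1, 0.2, 0.8, 0.9]
g = [-0.5, -0.5, 0.5, 0.5]
h = [0.25, 0.25, 0.25, 0.25]
result = best_split(feature, g, h, lam=1.0, gamma=0.1)
no_split = best_split(feature, g, h, lam=1.0, gamma=1.0)
```

In this toy node the best threshold falls between 0.2 and 0.8, where the gradient sign changes. With the stricter threshold `gamma=1.0` no candidate qualifies and the node stays a leaf, which is the stopping behavior described in step 2.5).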
3) The leaf-node position index vector of each fault data sample in the fault trees is extracted and re-encoded as features to obtain an intelligent representation of the hidden fault information, as follows:
3.1) Let the fault data sample set be {x1, x2, …, xn}. With K splitting trees, each having T leaf nodes, the leaf-node position of the n-th sample in the k-th tree is $a_{nk}$, with $a_{nk} \in [1, T]$ and $k \in [1, K]$. The position index vector of a sample therefore has dimension K, and its position indices in different splitting trees may be equal or different. The position index vector matrix of all samples is
$$Z = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nK} \end{bmatrix}$$
where n is the number of samples and K is the number of splitting trees.
3.2) In this position index matrix, the numerical distance between samples only reflects a difference of position within the splitting trees. The hidden fault information it carries is implicit, and training a fault diagnosis model directly on this matrix would prevent the downstream algorithm from learning useful fault information. To make better use of the information implied by the index positions, the index vectors are re-encoded: the values of the position index feature are first expanded into Euclidean space, giving a new feature set V, where V is the set of all elements in Z. The index vector of each sample is then extended to dimension K × T, in which a position takes the value 1 when the sample has the corresponding value in V and 0 otherwise. This yields the encoded reconstruction matrix of all samples, an n × (K × T) matrix of zeros and ones.
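The re-encoding of step 3.2) amounts to one-hot encoding each tree's leaf index. A minimal sketch follows; the column layout (tree k occupying columns k·T to k·T + T - 1) and the function name `encode_leaf_indices` are assumptions made for illustration, since the text does not fix the ordering inside the K × T vector.

```python
def encode_leaf_indices(Z, T):
    """One-hot encode a leaf index matrix.

    Z is an n x K matrix (list of lists) with entries a_nk in 1..T.
    Each row becomes a 0/1 vector of length K * T in which tree k
    contributes a single 1 at offset k * T + (a_nk - 1).
    """
    encoded = []
    for row in Z:
        vec = [0] * (len(row) * T)
        for k, leaf in enumerate(row):
            if not 1 <= leaf <= T:
                raise ValueError(f"leaf index {leaf} outside 1..{T}")
            vec[k * T + leaf - 1] = 1
        encoded.append(vec)
    return encoded

# Two samples, K = 2 trees, T = 3 leaves per tree.
Z = [[1, 3],
     [2, 3]]
E = encode_leaf_indices(Z, T=3)
# E == [[1, 0, 0, 0, 0, 1],
#       [0, 1, 0, 0, 0, 1]]
```

The two samples land in different leaves of the first tree but share leaf 3 of the second, so their encoded rows differ in the first block and agree in the second. Unlike the raw indices, the encoding does not suggest that leaf 3 is "further" from leaf 1 than leaf 2 is.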
4) Based on the representation matrix, fault prediction models are built with an SVM and with a neural network to perform predictive diagnosis of multiple fault modes, specifically as follows:
4.1) Fault data preprocessing: the collected fault sample data are initially disordered and often contain missing values, duplicate values and outliers, whereas constructing the fault splitting trees requires data in numerical matrix form. Depending on the data, preprocessing therefore uses discretization and filling of missing values with the mean or median, to obtain well-formed sample data.
4.2) Fault splitting tree construction:
4.2.1) A loss function $l(y_i, \hat{y}_i)$ is constructed from the fault data samples and fault types, and is used to build the objective function of the fault splitting tree at the t-th iteration;
4.2.2) Formula (5) is used as the objective function to guide the construction of the fault splitting tree at each step;
4.2.3) During tree splitting, one parameter feature at a time is selected from the samples at the current tree node, and formula (7) is used to compute the gain Gain obtained when that parameter is used as the splitting criterion of the leaf node;
4.2.4) After the gains of all parameters have been computed, the fault parameter used to split the current fault tree is selected according to Gain, and the samples are placed in the corresponding leaf nodes;
4.2.5) Whether to split the current tree node is decided by whether Gain is below the set threshold γ. At the same time, whether the fault splitting tree has reached the set maximum depth determines whether its construction is complete and splitting stops. If so, formula (6) is evaluated to obtain the minimum loss of this fault splitting tree; if not, the procedure returns to step 4.2.3) and the subsequent steps are executed in order;
The fault splitting tree with the optimal structure is found through the above steps, giving the optimal number of leaf nodes T and the corresponding leaf-node scores w;
4.3) Leaf-node position index vector extraction: using the constructed fault splitting trees, the leaf-node position of each fault sample in each splitting tree is extracted to form the corresponding index vector, giving the index vector matrix Z of all samples.
4.4) Index feature set construction: following the method described in 3.2), the index vector feature space is reconstructed to obtain the new feature set V.
4.5) Encoded matrix generation: the index vector of each sample is extended to dimension K × T and re-encoded by the method described in 3.2), giving the encoded matrix.
4.6) Fault prediction model building and diagnosis: based on the encoded matrix and the original parameters of the fault samples, fault diagnosis and prediction models are built with an SVM and with a neural network. The models are used to predict the fault samples to be diagnosed and to output the corresponding diagnosis results.
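The preprocessing of step 4.1) can be illustrated with a small sketch. It covers only two of the operations named there, removing duplicate rows and filling missing values with the column median; discretization and outlier handling are left out. Representing a missing value as `None` and the function name `preprocess` are assumptions made for this example.

```python
import statistics

def preprocess(rows):
    """Drop exact duplicate rows, then fill missing values (None) per
    column with the median of the observed values in that column."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    n_cols = len(unique[0]) if unique else 0
    for c in range(n_cols):
        observed = [r[c] for r in unique if r[c] is not None]
        if not observed:
            continue  # nothing to impute from; leave the column as is
        fill = statistics.median(observed)
        for r in unique:
            if r[c] is None:
                r[c] = fill
    return unique

# Hypothetical two-column sample: one duplicate row and two missing values.
raw = [[50.0, 3.0], [50.0, 3.0], [60.0, None], [None, 5.0], [70.0, 4.0]]
clean = preprocess(raw)
```

Duplicates are removed before imputation so that repeated rows do not bias the median. The result is a purely numerical matrix, which is the form the tree construction of step 4.2) requires.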
Embodiment:
A fault diagnosis model is built from a compressor fault data set collected from the air separation equipment of an enterprise. The fault data comprise training data and test data. The attributes include the operating frequency of the motor f1, the number of motor component supports at measurement time f3, the pre-measured value f5, the code of each motor component f9, the motor running speed f11, whether a filter is installed f16 and the filter direction f23, among others, together with the fault class label class. The fault parameters are named with letters and digits as {class, f1, f2, …, f48}, and the fault class of each sample is denoted by a number: {1: shaft misalignment, 2: loose mechanical parts, 3: bearing fault, 4: connecting rod fault, 5: piston fault, 6: valve plate fault, 7: motor winding fault, 8: sliding vane damage, 9: rotor imbalance, 10: oil film oscillation, 11: impeller fouling}. Following the procedure of Fig. 2, the fault diagnosis model is built from the fault training data and its accuracy is then verified on the test data. Fig. 3 shows a single splitting tree of the fault model on this data set. Table 1 gives the prediction accuracy of the SVM fault diagnosis model based on XGBoost fault features for the eleven fault types.
Table 1: Diagnosis results of the SVM compressor model based on XGBoost fault features
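A per-fault-type accuracy of the kind reported in such a table can be computed as below. This sketch assumes that "prediction accuracy" means, for each true fault class, the fraction of its test samples that were predicted correctly; the labels are made up for illustration and are not the patent's results, and `per_class_accuracy` is a hypothetical name.

```python
from collections import defaultdict

def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples for each true class."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in sorted(total)}

# Made-up labels for illustration only; these are not the patent's results.
y_true = [1, 1, 2, 2, 2, 3]
y_pred = [1, 2, 2, 2, 3, 3]
acc = per_class_accuracy(y_true, y_pred)
```

Reporting accuracy per class rather than overall keeps a rare fault type from being hidden by the frequent ones.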
Fig. 4 shows the importance ranking of the fault feature parameters. Fig. 5 shows the decision boundary of the SVM compressor fault model based on XGBoost fault feature extraction. Table 2 lists the structure and parameters of the neural network model. The network trains 198,283 parameters in total and contains multiple hidden layers; each layer uses the rectified linear unit (ReLU) activation function, and the outputs of each hidden layer pass through Dropout, which effectively reduces the number of parameters updated in each training step and improves the generalization ability of the model.
Table 2: Structure and parameters of the neural network model
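The two building blocks named for the hidden layers, ReLU and Dropout, can be sketched in a few lines. This is a generic illustration, not the network of Table 2: the dropout rate of 0.5 is an arbitrary assumption (the text gives none), and the "inverted dropout" variant shown here, which rescales surviving units during training, is one common formulation.

```python
import random

def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale survivors by 1 / (1 - rate); identity otherwise."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]

rng = random.Random(0)
hidden = relu([-1.0, 0.5, 2.0, -0.2])
train_out = dropout(hidden, rate=0.5, rng=rng)
eval_out = dropout(hidden, rate=0.5, rng=rng, training=False)
```

At prediction time the layer passes its input through unchanged, so `eval_out` equals `hidden`. During training each unit is either zeroed or doubled, which keeps the expected activation the same in both modes.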
Table 3 gives the prediction accuracy of the neural-network fault diagnosis model based on XGBoost fault features for the eleven fault types. Fig. 6 shows the decision boundary of the neural-network compressor model based on XGBoost fault feature extraction.
Table 3: Diagnosis results of the neural-network compressor model based on XGBoost fault features
The modeling experiments show that the XGBoost-based feature extraction method deeply mines and represents the hidden fault patterns in the fault data, which raises the prediction and recognition accuracy of the downstream models. It also gives the importance ranking of each feature parameter when the system equipment fails, which can serve as an effective basis for fault location and detection.
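One way such an importance ranking could be produced is to accumulate, per feature, the gain of every split that uses it. The text does not say which importance measure was used, so the total-gain measure below is an assumption, as are the split records and the function name `rank_features`.

```python
from collections import defaultdict

def rank_features(splits):
    """Rank features by total split gain, highest first.

    `splits` is an iterable of (feature_name, gain) pairs, one per
    internal node across all trees.
    """
    totals = defaultdict(float)
    for name, gain in splits:
        totals[name] += gain
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical split records; names follow the f1..f48 convention.
splits = [("f11", 0.5), ("f1", 0.25), ("f11", 0.25), ("f23", 0.125)]
ranking = rank_features(splits)
```

A feature that is split on often and with high gain rises to the top of the list, which is what makes the ranking usable as a hint for where to look when locating a fault.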
Claims (5)
1. A compressor fault diagnosis method based on XGBoost feature extraction, characterized by comprising the following steps:
1) customizing the loss function of the XGBoost algorithm according to the compressor fault data samples and fault types;
2) iteratively constructing fault split trees;
3) extracting the leaf node position index vector of each fault data sample in the fault split trees and performing feature coding
reconstruction to obtain an intelligent characterization of the hidden fault information;
4) based on the characterization matrix, establishing fault prediction models with the SVM and neural network algorithms respectively,
realizing predictive diagnosis of multiple fault modes.
2. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 1, characterized in that,
in step 1), the loss function of the XGBoost algorithm is customized according to the fault data samples and fault types using
the following steps:
1.1) for multi-class fault diagnosis, maximum likelihood estimation is introduced; for the given fault data samples and their
corresponding fault state classes, the customized loss function is as follows:
where $n$ denotes the total number of fault data samples, $y_i$ the true fault class, and $\hat{y}_i$ the predicted fault class.
According to XGBoost theory, the objective function of the fault split tree model FM at the $t$-th iteration can then be expressed as:
where $\Omega(f_t)$ denotes the complexity of the fault split tree, $\hat{y}_i^{(t-1)}$ the fault prediction at iteration $t-1$,
$f_t(x_i)$ the function at iteration $t$, and $C$ a constant;
a Taylor expansion gives the approximate objective function:
where $g_i$ and $h_i$ carry the first- and second-order derivative information of the loss and guide the splitting and
construction of the fault split tree;
for the fault split tree model FM, removing the constant terms, namely the loss already incurred at iteration $t-1$ and $C$, still leaves:
the function $FM^{(t)}$ at this point is the final objective function of the fault split tree to be trained; then, according to
XGBoost theory, summing over the leaf nodes of the fault split tree turns it into:
where $I_j$ is the set of fault data samples on leaf $j$, $w_j$ the score assigned to the fault data samples of that leaf node,
and $T$ the number of leaf nodes of the fault split tree; $\lambda$ and $\gamma$ control the weight of the corresponding terms;
differentiating the above expression with respect to $w_j$ and setting the derivative to zero gives the minimum loss of the
fault split tree structure:
where $w_j^{*}$ denotes the output score of the minimum-loss fault split tree structure and $S$ the minimum loss.
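The formulas of this claim are not reproduced in this text. As an aid to reading, the block below sketches the forms they would take under the standard XGBoost derivation, using the symbols defined in the claim. The per-sample loss $l$, the sums $G_j$ and $H_j$, the explicit regularizer $\Omega(f_t)=\gamma T+\tfrac12\lambda\sum_j w_j^2$, and the numbering (1) to (6) are assumptions inferred from the later references to formulas (3), (5), (6) and (7); the patent's own customized loss may differ in detail.

```latex
\begin{align}
L &= \sum_{i=1}^{n} l\bigl(y_i,\hat{y}_i\bigr) \tag{1}\\
FM^{(t)} &= \sum_{i=1}^{n} l\bigl(y_i,\hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + C \tag{2}\\
FM^{(t)} &\approx \sum_{i=1}^{n}\Bigl[\,l\bigl(y_i,\hat{y}_i^{(t-1)}\bigr) + g_i f_t(x_i) + \tfrac12 h_i f_t^{2}(x_i)\Bigr] + \Omega(f_t) + C \tag{3}\\
g_i &= \partial_{\hat{y}_i^{(t-1)}}\, l\bigl(y_i,\hat{y}_i^{(t-1)}\bigr), \qquad
h_i = \partial^{2}_{\hat{y}_i^{(t-1)}}\, l\bigl(y_i,\hat{y}_i^{(t-1)}\bigr) \notag\\
FM^{(t)} &\approx \sum_{i=1}^{n}\Bigl[\,g_i f_t(x_i) + \tfrac12 h_i f_t^{2}(x_i)\Bigr] + \Omega(f_t) \tag{4}\\
FM^{(t)} &= \sum_{j=1}^{T}\Bigl[\,G_j w_j + \tfrac12\bigl(H_j+\lambda\bigr) w_j^{2}\Bigr] + \gamma T,
\qquad G_j=\sum_{i\in I_j} g_i,\quad H_j=\sum_{i\in I_j} h_i \tag{5}\\
w_j^{*} &= -\frac{G_j}{H_j+\lambda}, \qquad
S = -\frac12\sum_{j=1}^{T}\frac{G_j^{2}}{H_j+\lambda} + \gamma T \tag{6}
\end{align}
```

Formula (6) follows from (5) because each leaf contributes an independent quadratic in $w_j$, whose minimum is reached at $w_j^{*}$.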
3. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 2, characterized in that
the iterative construction of the fault split tree in step 2) proceeds as follows:
2.1) a loss function is constructed according to the fault data samples and fault types, and this loss function is used to
construct the objective function of the fault split tree at the t-th iteration;
2.2) formula (3) is used as the objective function to guide the construction of the fault split tree at each step;
2.3) during the splitting of the fault split tree, one parameter feature at a time is chosen from the samples of the current
fault split tree node, and formula (7) below is used to compute the gain information Gain obtained when that parameter serves
as the splitting criterion of the leaf node;
where the first term is the information score of the left subtree, the second term is the information score of the right subtree,
and the third term is the information score without splitting; G_L and H_L are the sums of g_i and h_i over the left subtree
after splitting, and G_R and H_R are the corresponding sums over the right subtree;
2.4) once all parameters have been evaluated, the fault parameter with the largest Gain is selected to split the current
fault split tree, and the fault data samples are placed into the corresponding leaf nodes;
2.5) whether to split the current fault split tree node is decided by whether Gain is less than the set threshold γ; at the
same time, whether the fault split tree has reached the set maximum tree depth determines whether its construction is
complete and splitting stops; if so, formula (6) is evaluated to obtain the minimum loss function value of this fault split tree;
if not, the method returns to step 2.3) and executes the subsequent steps in order.
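The split search of steps 2.3) to 2.5) can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes the usual XGBoost form of the gain, Gain = 1/2 [G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)], which matches the three terms described for formula (7), and it compares that value with the threshold γ as in step 2.5). The function names and default values are hypothetical.

```python
def leaf_score(G, H, lam):
    """Information score G^2 / (H + lambda) of one node."""
    return G * G / (H + lam)


def split_gain(G_L, H_L, G_R, H_R, lam):
    """Left score + right score - unsplit score, halved (assumed form of formula (7))."""
    return 0.5 * (leaf_score(G_L, H_L, lam)
                  + leaf_score(G_R, H_R, lam)
                  - leaf_score(G_L + G_R, H_L + H_R, lam))


def best_split(values, g, h, lam=1.0):
    """Scan all thresholds of one feature; return (gain, threshold) or None."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    G, H = sum(g), sum(h)
    G_L = H_L = 0.0
    best = None
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        G_L += g[i]
        H_L += h[i]
        if values[i] == values[nxt]:
            continue  # no threshold separates identical feature values
        gain = split_gain(G_L, H_L, G - G_L, H - H_L, lam)
        if best is None or gain > best[0]:
            best = (gain, (values[i] + values[nxt]) / 2)
    return best


def should_split(gain, gamma):
    """Step 2.5): split only when the gain is not below the threshold gamma."""
    return gain >= gamma


# Toy node: gradients g and Hessians h of four samples on one feature.
values = [1.0, 2.0, 10.0, 11.0]
g = [-1.0, -1.0, 1.0, 1.0]
h = [1.0, 1.0, 1.0, 1.0]
gain, threshold = best_split(values, g, h, lam=1.0)
```

In a full tree builder this scan would be repeated for every parameter feature, and the feature with the largest gain would define the split, as in step 2.4).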
4. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 3, characterized in that
step 3), extracting the leaf node position index vector of each fault data sample in the fault split trees and performing
feature coding reconstruction to obtain an intelligent characterization of the hidden fault information, comprises the following steps:
3.1) let the fault data sample set be {x1, x2, …, xn}; with K fault split trees, each having T leaf nodes, the leaf node
position of the n-th sample in the k-th fault split tree is a_nk, with a_nk ∈ [1, T] and k ∈ [1, K]; the position index
vector of one sample then has dimension K, the position indices in different fault split trees may be equal or different,
and the position index vector matrix Z of all samples is:
$$Z = \begin{pmatrix} a_{11} & \cdots & a_{1K} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nK} \end{pmatrix}$$
where n denotes the number of samples and K the number of fault split trees;
3.2) in this position index matrix, the numerical difference between samples only reflects the difference of their positions
in the fault split trees, so the hidden fault information it contains remains implicit; training a fault diagnosis model
directly on this vector matrix would prevent the subsequent algorithm models from learning effective fault information;
to make better use of the information implied by the index positions, the index vectors are coding-reconstructed: the values
of the position index features are first expanded into Euclidean space to obtain a new feature set V, where V is the set of
all elements in Z; each sample index vector is then extended to dimension K × T, and a position in the new vector takes the
value 1 when the corresponding value of the set V is present in the sample and 0 otherwise; an example of the resulting
coding reconstruction matrix of all samples is:
5. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 4, characterized in that
step 4), based on the characterization matrix, establishes fault prediction models using the SVM and neural network algorithms
to realize predictive diagnosis of multiple fault modes, using the following steps:
4.1) fault data preprocessing: the collected fault data samples are initially disordered and often contain missing values,
duplicate values, and outliers, whereas constructing the fault split tree requires data in numerical matrix form; therefore,
depending on the specifics of the data, the data are preprocessed by discretization and by filling missing values with the
mean or median, yielding regular sample data;
4.2) fault split tree construction:
4.2.1) a loss function is constructed according to the fault data samples and fault types, and this loss function is used to
construct the objective function of the fault split tree at the t-th iteration;
4.2.2) formula (5) is used as the objective function to guide the construction of the fault split tree at each step;
4.2.3) during the splitting of the tree, one parameter feature at a time is chosen from the samples of the current tree node,
and formula (7) is used to compute the gain information Gain obtained when that parameter serves as the splitting criterion of the leaf node;
4.2.4) once all parameters have been evaluated, the fault parameter with the largest Gain is selected to split the current
fault tree, and the fault data samples are placed into the corresponding leaf nodes;
4.2.5) whether to split the current tree node is decided by whether Gain is less than the set threshold γ; at the same time,
whether the fault split tree has reached the set maximum tree depth determines whether its construction is complete and
splitting stops; if so, formula (6) is evaluated to obtain the minimum loss function value of this fault split tree; if not,
the method returns to step 4.2.3) and executes the subsequent steps in order;
the fault split tree of optimal structure is found through the above steps, yielding the optimal number of leaf nodes T and
the corresponding leaf node scores w;
4.3) leaf node position index vector extraction: using the constructed fault split trees, the leaf node position of each fault
data sample in each split tree is extracted to form the corresponding index vector, giving the index vector matrix Z of all
fault data samples;
4.4) index feature set construction: following the method described in step 3.2), the index vector feature space is
reconstructed to obtain the new feature set V;
4.5) coding matrix generation: each fault data sample index vector is extended to dimension K × T and coding-reconstructed
following the method described in step 3.2), giving the coding matrix;
4.6) fault prediction model establishment and diagnosis: based on the coding matrix and the original parameters of the fault
samples, fault diagnosis and prediction models are established with the SVM and neural network algorithms respectively; the
models are used to predict the fault samples to be diagnosed and output the corresponding diagnosis results.
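The preprocessing of step 4.1) can be illustrated with a small sketch that removes duplicate samples and fills missing values with the column median. It is only one possible reading of that step: the representation of a missing value as `None`, the function name `preprocess`, and the toy data are assumptions, and discretization and outlier handling are left out.

```python
from statistics import median


def preprocess(rows):
    """Drop exact duplicate rows, then fill None entries with the column median."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    if not unique:
        return []
    for c in range(len(unique[0])):
        observed = [r[c] for r in unique if r[c] is not None]
        fill = median(observed)  # raises StatisticsError if a column is entirely missing
        for r in unique:
            if r[c] is None:
                r[c] = fill
    return unique


# Toy sensor readings: one duplicated row and one missing value.
raw = [[1.0, None],
       [1.0, None],
       [3.0, 4.0],
       [5.0, 8.0]]
clean = preprocess(raw)
```

The cleaned numerical matrix is the form required for constructing the fault split trees in step 4.2).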
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910100466.1A CN109829236B (en) | 2019-01-31 | 2019-01-31 | XGboost feature extraction-based compressor fault diagnosis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109829236A true CN109829236A (en) | 2019-05-31 |
CN109829236B CN109829236B (en) | 2023-04-18 |
Family
ID=66862139
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910100466.1A Active CN109829236B (en) | 2019-01-31 | 2019-01-31 | XGboost feature extraction-based compressor fault diagnosis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109829236B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106524411A (en) * | 2016-11-02 | 2017-03-22 | 王华勤 | Control method for fault diagnosis of suspension-type air conditioner |
CN109190670A (en) * | 2018-08-02 | 2019-01-11 | 大连理工大学 | A kind of charging pile failure prediction method based on expansible boosted tree |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||