CN109829236A - Compressor fault diagnosis method based on XGBoost feature extraction - Google Patents

Compressor fault diagnosis method based on XGBoost feature extraction

Info

Publication number
CN109829236A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910100466.1A
Other languages
Chinese (zh)
Other versions
CN109829236B (en
Inventor
姜少飞
李治
邬天骥
李吉泉
彭翔
景立挺
许青青
高启龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910100466.1A
Publication of CN109829236A
Application granted
Publication of CN109829236B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses a compressor fault diagnosis method based on XGBoost feature extraction. First, the loss function of the XGBoost algorithm is customized according to the fault data and fault types, and fault split trees are constructed iteratively. Second, the leaf-node position index vector of each sample in the fault split trees is extracted and re-encoded as features, yielding an intelligent characterization of the implicit fault information. Then, based on this characterization matrix, fault prediction models are built with an SVM and with a neural network, respectively, to realize predictive diagnosis of multiple fault modes. The method fully mines the implicit fault feature information in the data and thereby improves the accuracy of fault diagnosis and prediction.

Description

Compressor fault diagnosis method based on XGBoost feature extraction
Technical field
The present invention relates to fault diagnosis methods based on machine learning, and more particularly to a compressor fault diagnosis method based on XGBoost feature extraction.
Background art
For a fault diagnosis model, the quality of feature extraction largely determines model performance; it is the key link in fault diagnosis. Feature extraction is the process of mining the most representative information from fault data, and also a process of deeply mining the information implicit in that data. Poor fault features not only reduce the running efficiency of an algorithm but also lower the model's fault prediction accuracy. Studying effective feature extraction methods is therefore essential.
At present, most of the literature divides fault diagnosis methods into methods based on analytical models, methods based on empirical knowledge, and data-driven methods. Because modern systems are multivariable, strongly coupled and nonlinear, system models are extremely difficult to build, so methods based on analytical models or empirical knowledge perform poorly, while data-driven methods have attracted growing attention and become a research focus in fault diagnosis. Data-driven methods extract useful information from system process data and diagnose system faults from that information. They are further subdivided into fault diagnosis methods based on statistical analysis, on signal processing, and on artificial intelligence. Since there is no simple correspondence between fault types and fault symptoms, fault diagnosis based on artificial intelligence is better suited to the uncertainty and complexity of such systems. It trains various learning algorithms on normal data and fault data from the industrial process in order to diagnose faults. Its technical difficulty lies in mining the implicit important feature information from the monitored fault data so as to characterize the normal and fault modes of system operation.
Fault diagnosis methods based on artificial intelligence include neural networks, support vector machines, extreme learning machines, and fuzzy logic. For component degradation and mechanical wear in production systems, some researchers have used time-frequency analysis together with an ANN to detect and diagnose faults in the frequency domain. Others have proposed a least-squares SVM hybrid classifier and, while training it, optimized the SVM parameters with particle swarm optimization to diagnose faults from the dissolved gas analysis of oil-immersed power transformers. A rolling bearing fault diagnosis technique combining singular value decomposition with the extreme learning machine has also been proposed. Still others have selected relevant features with a decision tree and fine-tuned the network parameters of an adaptive neuro-fuzzy inference network with back-propagation and least squares in order to diagnose induction motor faults. Meanwhile, addressing the open problems in complex systems, researchers have surveyed the field from three angles, namely adding new information, mining unused implicit information, and using new mathematical tools, and have put forward four research prospects for data-driven fault diagnosis: fault diagnosis based on multi-source information fusion, on association analysis, on machine learning, and on time-frequency analysis.
In fault diagnosis based on machine learning, traditional intelligent learning methods, whether for classification or regression, are mostly shallow-structure algorithms. Their limitation is that, with finite samples and computing units, their ability to represent complex functions is limited, which restricts their generalization on complex classification problems. How to mine and represent fault features from monitoring data is the research difficulty of such methods; if the implicit information in fault data can be reasonably extracted and characterized, better fault detection and prediction results can be obtained. The machine learning methods currently used in fault diagnosis, such as neural networks and support vector machines, all fit the monitoring data from the viewpoint of approximation theory; they suffer from insufficient approximation accuracy and still cannot fully mine the fault features in the monitoring data.
In view of the above state of research and problems of machine-learning-based fault diagnosis, the present invention proposes a compressor fault diagnosis method based on XGBoost feature extraction. The method first customizes, from the monitored compressor fault data and fault types, an XGBoost loss function suited to the fault diagnosis scenario, and uses it to build the corresponding fault split trees. Second, it determines the leaf-node position index of each data sample in the fault split trees and creates the position index matrix. Finally, it re-encodes the index matrix in a specific way, achieving feature extraction and intelligent characterization of the implicit fault information. By building fault split trees, the method gathers statistics on the important information among the variables in the data, including the data structure and distribution characteristics. By setting the depth of the split trees, it mines the hidden feature information in the data in depth. The validity of the feature extraction method is verified on the fault diagnosis of a compressor.
Summary of the invention
In view of the above problems in the prior art, and in order to fully mine the implicit fault feature information in the data and improve fault diagnosis accuracy, the object of the present invention is to provide a compressor fault diagnosis method based on XGBoost feature extraction.
The compressor fault diagnosis method based on XGBoost feature extraction comprises the following steps:
1) customizing the loss function of the XGBoost algorithm according to the compressor fault data samples and fault types;
2) iteratively constructing fault split trees;
3) extracting the leaf-node position index vector of each fault data sample in the fault split trees and re-encoding it as features, to obtain an intelligent characterization of the implicit fault information;
4) based on this characterization matrix, building fault prediction models with an SVM and with a neural network, respectively, to realize predictive diagnosis of multiple fault modes.
In the compressor fault diagnosis method based on XGBoost feature extraction, step 1), customizing the loss function of the XGBoost algorithm according to the fault data samples and fault types, proceeds as follows:
1.1) For multi-class fault diagnosis, maximum likelihood estimation is introduced. For the given fault data samples and their corresponding fault state classes, the loss function is customized as a sum of per-sample likelihood terms:
$$L = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) \tag{1}$$
where $n$ is the total number of fault data samples, $y_i$ is the true fault class, and $\hat{y}_i$ is the predicted fault class. According to XGBoost theory, the objective function of the fault split tree model FM at the $t$-th iteration can then be written as:
$$FM^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + C \tag{2}$$
where $\Omega(f_t)$ is the complexity of the fault split tree, $\hat{y}_i^{(t-1)}$ is the fault prediction at iteration $t-1$, $f_t(x_i)$ is the function added at iteration $t$, and $C$ is a constant.
A second-order Taylor expansion gives the approximate objective function:
$$FM^{(t)} \approx \sum_{i=1}^{n}\left[l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega(f_t) + C \tag{3}$$
where $g_i = \partial_{\hat{y}_i^{(t-1)}} l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and $h_i = \partial^2_{\hat{y}_i^{(t-1)}} l\left(y_i, \hat{y}_i^{(t-1)}\right)$ carry the first- and second-derivative information of the loss and guide the splitting of the fault split tree.
For the fault split tree model FM, removing the constant terms $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and $C$ leaves:
$$FM^{(t)} = \sum_{i=1}^{n}\left[g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega(f_t) \tag{4}$$
This function $FM^{(t)}$ is the final objective function of the fault split tree to be trained. Following XGBoost theory, with the standard complexity term $\Omega(f_t) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$, accumulating the samples by leaf node turns it into:
$$FM^{(t)} = \sum_{j=1}^{T}\left[\left(\sum_{i \in I_j} g_i\right) w_j + \frac{1}{2}\left(\sum_{i \in I_j} h_i + \lambda\right) w_j^2\right] + \gamma T \tag{5}$$
where $I_j$ is the set of fault data samples on leaf $j$, $w_j$ is the score that leaf node $j$ assigns to its fault data samples, $T$ is the number of leaf nodes of the fault split tree, and $\lambda$ and $\gamma$ control the weights of the corresponding terms. Differentiating with respect to $w_j$ and setting the derivative to zero gives the minimum loss of the fault split tree structure:
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad S = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^2}{H_j + \lambda} + \gamma T \tag{6}$$
where $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$, $w_j^{*}$ is the output score of the minimum-loss fault split tree structure, and $S$ is the minimum loss.
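The quantities $g_i$ and $h_i$ depend on the concrete loss, which is not fully legible in this text. As a minimal sketch, assuming a softmax cross-entropy loss (a common likelihood-based choice for multi-class problems, not necessarily the patent's own formula (1)), they could be computed as follows. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def grad_hess(scores, true_class):
    """Per-class first and second derivatives of the negative log-likelihood.

    Assumption: softmax cross-entropy loss; the patent's loss may differ.
    g_c = p_c - 1[c == true_class], h_c = p_c * (1 - p_c) (diagonal term).
    """
    probs = softmax(scores)
    g = [p - (1.0 if c == true_class else 0.0) for c, p in enumerate(probs)]
    h = [p * (1.0 - p) for p in probs]
    return g, h

# Three fault classes, uniform initial scores, true class is index 1.
g, h = grad_hess([0.0, 0.0, 0.0], true_class=1)
```

The gradients sum to zero across classes, and the true class receives a negative gradient, so the next tree is pushed to raise its score.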
In the compressor fault diagnosis method based on XGBoost feature extraction, step 2), iteratively constructing the fault split trees, proceeds as follows:
2.1) Construct the loss function $l\left(y_i, \hat{y}_i\right)$ from the fault data samples and fault types, and use it to build the objective function of the fault split tree at the $t$-th iteration;
2.2) Use formula (3) as the objective function to guide the construction of the fault split tree at each step;
2.3) During splitting, select the parameter features in the samples of the current tree node one at a time, and use formula (7) to compute the gain Gain obtained when that parameter is the splitting criterion of the leaf node:
$$Gain = \frac{1}{2}\left[\frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda}\right] - \gamma \tag{7}$$
where the first term is the information score of the left subtree, the second term is the information score of the right subtree, and the third term is the information score of the current unsplit node; $G_L$ and $H_L$ are the sums of $g_i$ and $h_i$ over the left subtree after the split, and $G_R$ and $H_R$ are the sums of $g_i$ and $h_i$ over the right subtree after the split;
2.4) After the gain has been computed for all parameters, select the fault parameter indicated by Gain to split the current fault split tree, and place the fault data samples in the corresponding leaf nodes;
2.5) Decide whether to split the current node according to whether Gain is below the preset threshold $\gamma$; at the same time, decide whether construction of the fault split tree is complete, and splitting should stop, according to whether the tree has reached the preset maximum depth. If so, evaluate formula (6) to obtain the minimum loss value of this fault split tree; if not, return to step 2.3) and carry out the subsequent steps in order.
In the compressor fault diagnosis method based on XGBoost feature extraction, step 3), extracting the leaf-node position index vector of each fault data sample in the fault split trees and re-encoding it to obtain an intelligent characterization of the implicit fault information, comprises the following steps:
3.1) Let the fault data sample set be $\{x_1, x_2, \ldots, x_n\}$. With $K$ fault split trees, each having $T$ leaf nodes, the leaf-node position of the $n$-th sample in the $k$-th fault split tree is $a_{nk}$, with $a_{nk} \in [1, T]$ and $k \in [1, K]$. The position index vector of one sample therefore has dimension $K$, and its entries for different fault split trees may be equal or different. The position index matrix of all samples is:
$$Z = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nK} \end{bmatrix}$$
where $n$ is the number of samples and $K$ is the number of fault split trees;
3.2) In this position index matrix, the distance between different samples reflects only the difference of their positions within the fault split trees. The fault information it contains is implicit, and training a fault diagnosis model directly on this matrix would prevent the subsequent algorithm from learning effective fault information. To make better use of the information implied by the index positions, the index vectors are re-encoded: the values of the position index features are first expanded into Euclidean space, giving a new feature set $V$, the set of all elements in $Z$; the index vector of each sample is then extended to dimension $K \times T$, where an entry of the new vector is 1 if the sample takes the corresponding value in $V$ and 0 otherwise. This yields the encoded reconstruction matrix of all samples, of size $n \times (K \times T)$.
In the compressor fault diagnosis method based on XGBoost feature extraction, step 4), realizing predictive diagnosis of multiple fault modes with SVM and neural network fault prediction models based on the characterization matrix, proceeds as follows:
4.1) Fault data preprocessing: the collected fault data samples are initially disordered and often contain missing values, duplicate values and outliers, whereas building fault split trees requires data in numerical matrix form. Depending on the data, preprocessing therefore uses data discretization and filling of missing values with the mean or median, so as to obtain regular sample data;
4.2) Fault split tree construction:
4.2.1) Construct the loss function $l\left(y_i, \hat{y}_i\right)$ from the fault data samples and fault types, and use it to build the objective function of the fault split tree at the $t$-th iteration;
4.2.2) Use formula (5) as the objective function to guide the construction of the fault split tree at each step;
4.2.3) During splitting, select the parameter features in the samples of the current tree node one at a time, and use formula (7) to compute the gain Gain obtained when that parameter is the splitting criterion of the leaf node;
4.2.4) After the gain has been computed for all parameters, select the fault parameter indicated by Gain to split the current fault split tree, and place the fault data samples in the corresponding leaf nodes;
4.2.5) Decide whether to split the current node according to whether Gain is below the preset threshold $\gamma$; at the same time, decide whether construction of the fault split tree is complete, and splitting should stop, according to whether the tree has reached the preset maximum depth. If so, evaluate formula (6) to obtain the minimum loss value of this fault split tree; if not, return to step 4.2.3) and carry out the subsequent steps in order;
The above steps find the fault split tree with the optimal structure and give the optimal number of leaf nodes $T$ and the corresponding leaf node scores $w$;
4.3) Leaf-node position index vector extraction: using the fault split trees that have been built, extract the leaf-node position of each fault data sample in each split tree to form its index vector, and obtain the index vector matrix $Z$ of all fault data samples;
4.4) Index feature set construction: following the method described in step 3.2), reconstruct the feature space of the index vectors to obtain the new feature set $V$;
4.5) Encoded matrix generation: extend the index vector of each fault data sample to dimension $K \times T$ and re-encode it by the method described in step 3.2) to obtain the encoded matrix;
4.6) Fault prediction model building and diagnosis: based on the encoded matrix and the original parameters of the fault samples, build fault diagnosis and prediction models with an SVM and with a neural network, respectively; use the models to predict the fault samples to be diagnosed and output the corresponding diagnostic results.
By adopting the above technique, the invention has the following advantages over the prior art:
1) The invention proposes an XGBoost feature extraction method that mines the fault feature information implicit in fault data and characterizes it intelligently, and at the same time gives an importance ranking of the fault feature parameters as a basis for fault location and detection;
2) The invention proposes a compressor fault diagnosis method based on XGBoost feature extraction: the XGBoost loss function is customized for the specific fault diagnosis scenario, fault split trees are built, the leaf-node position index vector of each sample in the fault split tree model is extracted and re-encoded to obtain the characterization matrix of the hidden feature information, and fault prediction models are then built with the conventional machine learning algorithms SVM and neural network to diagnose faults;
3) The invention applies the XGBoost hidden feature extraction method to a compressor fault data set, builds an SVM compressor fault diagnosis model and a neural network diagnosis model based on XGBoost feature extraction, and compares the accuracy of the respective models on the fault test set, demonstrating the validity of the diagnosis models obtained by the method.
Brief description of the drawings
Fig. 1 is a flowchart of fault split tree construction according to the invention;
Fig. 2 is a flowchart of building the compressor fault diagnosis model based on XGBoost feature extraction;
Fig. 3 shows the structure of a single split tree of the FM fault model on the compressor diagnosis data set;
Fig. 4 shows the importance ranking of the fault feature parameters on the compressor diagnosis data set;
Fig. 5 shows the decision boundary of the SVM compressor fault model based on XGBoost fault features;
Fig. 6 shows the decision boundary of the neural network compressor fault model based on XGBoost fault features.
Specific embodiment
The invention is further described below on the basis of a compressor fault diagnosis data set.
As shown in Figs. 1 and 2, the fault diagnosis method based on XGBoost feature extraction comprises the following steps:
1) Customize the loss function of the XGBoost algorithm according to the fault data samples and fault types. The loss function is constructed as follows:
1.1) For multi-class fault diagnosis, maximum likelihood estimation is introduced. For the given fault samples and their corresponding fault state classes, the loss function is customized as a sum of per-sample likelihood terms:
$$L = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) \tag{1}$$
where $n$ is the total number of fault data samples, $y_i$ is the true fault class, and $\hat{y}_i$ is the predicted fault class. Drawing on XGBoost theory, the objective function of the fault split tree model FM at the $t$-th iteration can then be written as
$$FM^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) + C \tag{2}$$
where $\Omega(f_t)$ is the complexity of the fault split tree, $\hat{y}_i^{(t-1)}$ is the fault prediction at iteration $t-1$, $f_t(x_i)$ is the function added at iteration $t$, and $C$ is a constant. A second-order Taylor expansion gives the approximate objective function
$$FM^{(t)} \approx \sum_{i=1}^{n}\left[l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega(f_t) + C \tag{3}$$
where $g_i = \partial_{\hat{y}_i^{(t-1)}} l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and $h_i = \partial^2_{\hat{y}_i^{(t-1)}} l\left(y_i, \hat{y}_i^{(t-1)}\right)$ carry the first- and second-derivative information of the loss and guide the splitting of the fault tree. For the fault split tree FM, removing the constant terms $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ and $C$ leaves
$$FM^{(t)} = \sum_{i=1}^{n}\left[g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega(f_t) \tag{4}$$
This function $FM^{(t)}$ is the final objective function of the fault split tree to be trained. Following XGBoost theory, with the standard complexity term $\Omega(f_t) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$, accumulating the samples by leaf node turns it into:
$$FM^{(t)} = \sum_{j=1}^{T}\left[\left(\sum_{i \in I_j} g_i\right) w_j + \frac{1}{2}\left(\sum_{i \in I_j} h_i + \lambda\right) w_j^2\right] + \gamma T \tag{5}$$
where $I_j$ is the set of fault data samples on leaf $j$, $w_j$ is the score that leaf node $j$ assigns to its fault data samples, $T$ is the number of leaf nodes of the fault split tree, and $\lambda$ and $\gamma$ control the weights of the corresponding terms. Differentiating with respect to $w_j$ and setting the derivative to zero gives the minimum loss of the fault tree structure
$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad S = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^2}{H_j + \lambda} + \gamma T \tag{6}$$
where $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$, $w_j^{*}$ is the output score of the minimum-loss fault tree structure, and $S$ is the minimum loss.
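The closed-form leaf scores and minimum loss of step 1.1) follow the standard XGBoost result ($w_j^{*} = -G_j/(H_j+\lambda)$, $S = -\frac{1}{2}\sum_j G_j^2/(H_j+\lambda) + \gamma T$). The following is a minimal sketch of that computation, assuming the per-sample derivatives and the leaf assignment are already known; the function name and the toy numbers are illustrative only.

```python
def leaf_weights_and_score(g, h, leaf_of_sample, num_leaves, lam=1.0, gamma=0.0):
    """Return (optimal leaf scores w*, minimum loss S) for a fixed tree structure.

    g, h            -- first and second derivatives of the loss, one per sample
    leaf_of_sample  -- 0-based leaf index of each sample (the sets I_j)
    lam, gamma      -- regularization weights (lambda and gamma in the text)
    """
    G = [0.0] * num_leaves
    H = [0.0] * num_leaves
    for gi, hi, j in zip(g, h, leaf_of_sample):
        G[j] += gi  # G_j: sum of g_i over samples on leaf j
        H[j] += hi  # H_j: sum of h_i over samples on leaf j
    weights = [-Gj / (Hj + lam) for Gj, Hj in zip(G, H)]
    score = -0.5 * sum(Gj * Gj / (Hj + lam) for Gj, Hj in zip(G, H))
    score += gamma * num_leaves
    return weights, score

# Toy data: four samples, two leaves (made-up derivative values).
g = [0.5, 0.5, -1.0, -1.0]
h = [1.0, 1.0, 1.0, 1.0]
w, S = leaf_weights_and_score(g, h, [0, 0, 1, 1], num_leaves=2, lam=1.0, gamma=0.1)
```

A lower $S$ means a better structure, which is why the split search compares candidate structures through differences of these scores.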
2) Iteratively construct the fault split trees, as follows:
2.1) Construct the loss function $l\left(y_i, \hat{y}_i\right)$ from the fault data samples and fault types, and use it to build the objective function of the fault split tree at the $t$-th iteration.
2.2) Use formula (3) as the objective function to guide the construction of the fault split tree at each step.
2.3) During splitting, select the parameter features in the samples of the current tree node one at a time, and use formula (7) to compute the gain Gain obtained when that parameter is the splitting criterion of the leaf node.
$$Gain = \frac{1}{2}\left[\frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda}\right] - \gamma \tag{7}$$
where the first term is the information score of the left subtree, the second term is the information score of the right subtree, and the third term is the information score of the current unsplit node; $G_L$, $H_L$ and $G_R$, $H_R$ are the sums of $g_i$ and $h_i$ over the left and right subtrees after the split.
2.4) After the gain has been computed for all parameters, select the fault parameter indicated by Gain to split the current fault tree, and place the fault data samples in the corresponding leaf nodes.
2.5) Decide whether to split the current node according to whether Gain is below the preset threshold $\gamma$; at the same time, decide whether construction of the fault split tree is complete, and splitting should stop, according to whether the tree has reached the preset maximum depth. If so, evaluate formula (6) to obtain the minimum loss value of this fault split tree; if not, return to step 2.3) and carry out the subsequent steps in order.
During the iteration the algorithm keeps enumerating split structures; each time it attempts to split a leaf node, it computes the gain Gain of the candidate split parameter and thereby distills the important statistical information. The above procedure finds a fault split tree with the optimal structure, and formula (6) gives the corresponding minimum loss value $S$ and the structure scores $w_j$; the leaf nodes of the tree then contain the most, and the most important, implicit fault feature information.
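The split search of steps 2.3) to 2.5) can be sketched for a single feature as below. This is an illustrative exact greedy scan using the standard XGBoost gain expression; the threshold placement at midpoints and all names are assumptions, not details given in the patent.

```python
def split_gain(GL, HL, GR, HR, lam=1.0, gamma=0.0):
    """Gain of a split: left score + right score - unsplit score, minus gamma."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma

def best_split(values, g, h, lam=1.0, gamma=0.0):
    """Scan the thresholds of one feature; return (best gain, threshold).

    Returns (None, None) if all feature values are identical.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    G_total, H_total = sum(g), sum(h)
    GL = HL = 0.0
    best_gain, best_thr = None, None
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        GL += g[i]
        HL += h[i]
        if values[i] == values[nxt]:
            continue  # cannot place a threshold between equal values
        gain = split_gain(GL, HL, G_total - GL, H_total - HL, lam, gamma)
        if best_gain is None or gain > best_gain:
            best_gain = gain
            best_thr = 0.5 * (values[i] + values[nxt])  # midpoint (assumption)
    return best_gain, best_thr

# Toy feature with made-up derivatives; the sign of g changes after 2.0.
values = [1.0, 2.0, 3.0, 4.0]
g = [0.5, 0.5, -1.0, -1.0]
h = [1.0, 1.0, 1.0, 1.0]
gain, thr = best_split(values, g, h, lam=1.0, gamma=0.1)
```

In a full tree builder this scan would be repeated for every parameter feature, and the node would be split only if the best gain passes the threshold test and the depth limit has not been reached.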
3) Extract the leaf-node position index vector of each fault data sample in the fault split trees and re-encode it as features, to obtain an intelligent characterization of the implicit fault information, comprising the following steps:
3.1) Let the fault data sample set be $\{x_1, x_2, \ldots, x_n\}$. With $K$ split trees, each having $T$ leaf nodes, the leaf-node position of the $n$-th sample in the $k$-th tree is $a_{nk}$, with $a_{nk} \in [1, T]$ and $k \in [1, K]$. The position index vector of one sample therefore has dimension $K$, and its entries for different split trees may be equal or different. The position index matrix of all samples is
$$Z = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nK} \end{bmatrix}$$
where $n$ is the number of samples and $K$ is the number of split trees.
3.2) In this position index matrix, the distance between different samples reflects only the difference of their positions within the split trees. The fault information it contains is implicit, and training a fault diagnosis model directly on this matrix would prevent the subsequent algorithm from learning effective fault information. To make better use of the information implied by the index positions, the index vectors are re-encoded: the values of the position index features are first expanded into Euclidean space, giving a new feature set $V$, the set of all elements in $Z$; the index vector of each sample is then extended to dimension $K \times T$, where an entry of the new vector is 1 if the sample takes the corresponding value in $V$ and 0 otherwise. This yields the encoded reconstruction matrix of all samples, of size $n \times (K \times T)$.
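The re-encoding in step 3.2) amounts to a one-hot encoding of each tree's leaf position. A minimal sketch, assuming 1-based leaf positions and the same number of leaves $T$ in every tree as the text does (the function name is hypothetical):

```python
def encode_leaf_indices(Z, num_leaves):
    """One-hot encode a leaf position index matrix.

    Z[i][k] is the 1-based leaf position of sample i in tree k (1..num_leaves).
    Returns an n x (K * num_leaves) matrix of 0/1 entries: block k of each row
    has a single 1 at the leaf that the sample reaches in tree k.
    """
    encoded = []
    for row in Z:
        vec = [0] * (len(row) * num_leaves)
        for k, leaf in enumerate(row):
            if not 1 <= leaf <= num_leaves:
                raise ValueError("leaf position out of range: %r" % (leaf,))
            vec[k * num_leaves + (leaf - 1)] = 1
        encoded.append(vec)
    return encoded

# Three samples, K = 2 trees, T = 3 leaves per tree (made-up positions).
Z = [[1, 3], [2, 3], [1, 1]]
E = encode_leaf_indices(Z, num_leaves=3)
```

After encoding, two samples that land in different leaves of a tree are equally far apart regardless of the leaf numbering, which removes the spurious ordering that raw index values would impose on a distance-based model such as an SVM.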
4) Based on the characterization matrix, build fault prediction models with an SVM and with a neural network to realize predictive diagnosis of multiple fault modes, specifically as follows:
4.1) Fault data preprocessing: the collected fault sample data are initially disordered and often contain missing values, duplicate values and outliers, whereas building fault split trees requires data in numerical matrix form. Depending on the data, preprocessing therefore uses data discretization and filling of missing values with the mean or median, so as to obtain regular sample data.
4.2) Fault split tree construction:
4.2.1) Construct the loss function $l\left(y_i, \hat{y}_i\right)$ from the fault data samples and fault types, and use it to build the objective function of the fault split tree at the $t$-th iteration;
4.2.2) Use formula (5) as the objective function to guide the construction of the fault split tree at each step;
4.2.3) During splitting, select the parameter features in the samples of the current tree node one at a time, and use formula (7) to compute the gain Gain obtained when that parameter is the splitting criterion of the leaf node;
4.2.4) After the gain has been computed for all parameters, select the fault parameter indicated by Gain to split the current fault tree, and place the samples in the corresponding leaf nodes;
4.2.5) Decide whether to split the current node according to whether Gain is below the preset threshold $\gamma$; at the same time, decide whether construction of the fault split tree is complete, and splitting should stop, according to whether the tree has reached the preset maximum depth. If so, evaluate formula (6) to obtain the minimum loss value of this fault split tree; if not, return to step 4.2.3) and carry out the subsequent steps in order;
The above steps find the fault split tree with the optimal structure and give the optimal number of leaf nodes $T$ and the corresponding leaf node scores $w$;
4.3) Leaf-node position index vector extraction: using the fault split trees that have been built, extract the leaf-node position of each fault sample in each split tree to form its index vector, and obtain the index vector matrix $Z$ of all samples.
4.4) Index feature set construction: following the method described in 3.2), reconstruct the feature space of the index vectors to obtain the new feature set $V$.
4.5) Encoded matrix generation: extend the index vector of each sample to dimension $K \times T$ and re-encode it by the method described in 3.2) to obtain the encoded matrix.
4.6) Fault prediction model building and diagnosis: based on the encoded matrix and the original parameters of the fault samples, build fault diagnosis and prediction models with an SVM and with a neural network. Use the models to predict the fault samples to be diagnosed and output the corresponding diagnostic results.
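The preprocessing named in step 4.1) is described only in general terms. As one possible reading, the sketch below removes exact duplicate rows and fills missing entries with the column median; discretization and outlier handling are left out, and the representation of missing values as `None` is an assumption.

```python
from statistics import median

def preprocess(rows):
    """Drop exact duplicate rows, then fill None entries with the column median.

    Assumes every column has at least one observed (non-None) value.
    """
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:  # keep only the first copy of each row
            seen.add(key)
            unique.append(list(row))
    for col in range(len(unique[0])):
        observed = [r[col] for r in unique if r[col] is not None]
        fill = median(observed)
        for r in unique:
            if r[col] is None:
                r[col] = fill
    return unique

# Made-up measurements: one duplicate row and one missing value.
raw = [[1.0, 10.0], [1.0, 10.0], [3.0, None], [5.0, 30.0]]
clean = preprocess(raw)
```

The median is used here rather than the mean because it is less sensitive to the outliers that the text says raw fault data often contain; the patent allows either.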
Embodiment:
Based on from compressor fault data set building fault diagnosis model in certain enterprise's air separation equipment.The fault data packet Containing training data and test data, wherein there is the operating frequency f of motor comprising attribute1, measurement when motor component support quantity f3, preparatory measured value f5, each component of motor component encode f9, motor running speed f11, whether filter f is installed16And filtering Device direction f23Deng there are also fault category class.With fault parameter title of the alphabet registration in, respectively { class, f1, f2,…f48, with the corresponding fault category of digital representation sample, respectively 1: axis misalignment, 2: mechanical part loosens, and 3: bearing Failure, 4: connecting rod failure, 5: piston failure, 6: valve block failure, 7: machine winding failure, 8: slide plate damage, 9: rotor unbalance, 10: film shocks, 11: impeller incrustation }.In the way of Fig. 2, fault diagnosis model is constructed with failure training data, then with survey Try the accuracy of data verification model.Fig. 3 is based on single division tree graph of the fault model under this data set.Table 1 be based on The SVM fault diagnosis model of XGBoost fault signature is to a kind of this ten predictablity rate of fault type.
SVM algorithm compressor model diagnostic result of the table 1 based on XGBoost fault signature
Fig. 4 is Fault characteristic parameters importance ranking figure.Fig. 5 is the SVM compressor extracted based on XGBoost fault signature Fault model decision boundary figure.Table 2 is the structure and parameter table of neural network model, which trains 198283 altogether A parameter, and contain multiple hidden layers, every layer using correct linear unit activating function (Rectified-Linear Unit, RLU), and the output unit of each hidden layer is handled by Dropout, effectively reduces the total parameter of model training and raising model is general Change ability.
Table 2. Structure and parameters of the neural network model
Table 3 gives the prediction accuracy of the neural network fault diagnosis model based on XGBoost fault features for these eleven fault types. Fig. 6 shows the decision boundary of the neural network compressor model built on the fault features extracted by XGBoost.
Table 3. Diagnostic results of the neural network compressor model based on XGBoost fault features
The modeling experiments show that the XGBoost-based feature extraction method deeply mines and characterizes the hidden fault pattern information in the fault data, which raises the prediction and recognition accuracy of the subsequent algorithm models. In addition, the importance ranking of the feature parameters obtained for a system or equipment fault can serve as an effective basis for fault location and detection.

Claims (5)

1. A compressor fault diagnosis method based on XGBoost feature extraction, characterized by comprising the following steps:
1) customizing the loss function of the XGBoost algorithm according to the compressor fault data samples and the fault types;
2) iteratively constructing fault split trees;
3) extracting the leaf-node position index vector of each fault data sample in the fault split trees and reconstructing it by feature encoding, to obtain an intelligent characterization of the hidden fault information;
4) based on the characterization matrix, establishing fault prediction models with the SVM and neural network algorithms respectively, to realize predictive diagnosis of multiple fault modes.
2. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 1, characterized in that in step 1) the loss function of the XGBoost algorithm is customized according to the fault data samples and the fault types by the following steps:
1.1) for multi-class fault diagnosis, maximum likelihood estimation is introduced; for given fault data samples and their corresponding fault state classes, the customized loss function is:
where n is the total number of fault data samples, $y_i$ is the true fault class, and $\hat{y}_i$ is the predicted fault class; according to XGBoost theory, the objective function of the fault split tree model FM at the t-th iteration can then be expressed as:
where $\Omega(f_t)$ is the complexity of the fault split tree, $\hat{y}_i^{(t-1)}$ is the fault prediction at iteration t-1, $f_t(x_i)$ is the function added at iteration t, and C is a constant;
Applying a Taylor expansion gives the approximate objective function:
where $g_i$ and $h_i$ carry the first- and second-order derivative information of the loss function and guide the splitting and construction of the fault split tree;
For the fault split tree model FM, after removing the constant loss term of iteration t-1 and C, the objective becomes:
The function $FM^{(t)}$ is now the final objective function of the fault split tree to be trained; then, according to XGBoost theory, it can be rewritten as a sum over the leaf nodes of the fault split tree:
where $I_j$ is the set of fault data samples on leaf j, $w_j$ is the score of the fault data samples on that leaf node, and T is the number of leaf nodes of the fault split tree; λ and γ control the weights of the corresponding terms; differentiating the above expression with respect to $w_j$ and setting the derivative to zero gives the minimum loss of the fault split tree structure:
where $w_j^*$ is the output score of the minimum-loss fault split tree structure and S is the minimum loss.
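For orientation, the derivation described in claim 2 matches the standard XGBoost objective. The block below is a sketch under that assumption: the customized loss of step 1.1) is kept as a generic twice-differentiable function $l$, the complexity term $\Omega$ is taken in its usual XGBoost form, and the leaf sums $G_j$, $H_j$ are names introduced here. None of these choices is confirmed by the claim text itself.

```latex
\begin{align*}
FM^{(t)} &= \sum_{i=1}^{n} l\bigl(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + C
  && \text{objective at iteration } t \\
FM^{(t)} &\approx \sum_{i=1}^{n} \Bigl[\, l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)
  + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Bigr] + \Omega(f_t) + C
  && \text{second-order Taylor expansion} \\
g_i &= \frac{\partial\, l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial\, \hat{y}_i^{(t-1)}},
  \qquad
h_i = \frac{\partial^{2} l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial \bigl(\hat{y}_i^{(t-1)}\bigr)^{2}} \\
FM^{(t)} &= \sum_{i=1}^{n} \Bigl[\, g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Bigr] + \Omega(f_t)
  && \text{constants removed} \\
\Omega(f_t) &= \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^{2} \\
FM^{(t)} &= \sum_{j=1}^{T} \Bigl[\, G_j w_j + \tfrac{1}{2} (H_j + \lambda)\, w_j^{2} \Bigr] + \gamma T,
  \qquad G_j = \sum_{i \in I_j} g_i,\quad H_j = \sum_{i \in I_j} h_i \\
w_j^{*} &= -\frac{G_j}{H_j + \lambda},
  \qquad
S = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T
\end{align*}
```

Setting the derivative of the leaf-wise form with respect to $w_j$ to zero gives $G_j + (H_j + \lambda) w_j = 0$, which is where $w_j^{*}$ comes from; substituting it back yields $S$.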
3. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 2, characterized in that the iterative construction of the fault split trees in step 2) comprises the following steps:
2.1) constructing the loss function according to the fault data samples and the fault types, and using the loss function to build the objective function of the fault split tree at the t-th iteration;
2.2) using formula (3) as the objective function to guide the construction of the fault split tree at each step;
2.3) during splitting of the fault split tree, selecting the parameter features one at a time from the samples of the current fault split tree node, and using formula (7) below to calculate the gain Gain of that parameter as the leaf-node splitting criterion;
where the first term is the information score of the left subtree, the second term is the information score of the right subtree, and the third term is the information score without splitting; $G_L$ and $H_L$ are the sums of $g_i$ and $h_i$ over the left subtree after splitting, and $G_R$ and $H_R$ are the sums of $g_i$ and $h_i$ over the right subtree after splitting;
2.4) after the gain of every parameter has been calculated, selecting the corresponding fault parameter according to Gain to split the current fault split tree, and placing the fault data samples in the corresponding leaf nodes;
2.5) deciding whether to split the current fault split tree node according to whether Gain is less than the set threshold γ; at the same time, judging from whether the fault split tree has reached the set maximum tree depth whether its construction is complete and splitting should stop; if so, evaluating formula (6) to obtain the minimum loss function value of this fault split tree; if not, returning to step 2.3) and executing the subsequent steps in order.
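Steps 2.3) to 2.5) can be sketched as an exhaustive search for the best split of one node. The sketch assumes the standard XGBoost gain, Gain = ½[G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ, which agrees with the three terms described in step 2.3) but is not quoted from the claim. The function names and the toy data are hypothetical, and subtracting γ and requiring a positive result is used here as an equivalent way of comparing the raw gain with the threshold γ.

```python
from typing import List, Optional, Tuple


def node_score(G: float, H: float, lam: float) -> float:
    """Information score G^2 / (H + lambda) of one node."""
    return G * G / (H + lam)


def split_gain(GL: float, HL: float, GR: float, HR: float,
               lam: float, gamma: float) -> float:
    """Left score + right score - unsplit score, halved, minus gamma."""
    return 0.5 * (node_score(GL, HL, lam) + node_score(GR, HR, lam)
                  - node_score(GL + GR, HL + HR, lam)) - gamma


def best_split(X: List[List[float]], g: List[float], h: List[float],
               lam: float = 1.0, gamma: float = 0.0
               ) -> Optional[Tuple[float, int, float]]:
    """Try every feature and threshold; return (gain, feature, threshold)
    of the best split, or None when no split has positive gain."""
    n = len(X)
    G, H = sum(g), sum(h)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        GL = HL = 0.0
        for pos in range(n - 1):
            i, nxt = order[pos], order[pos + 1]
            GL += g[i]
            HL += h[i]
            if X[i][j] == X[nxt][j]:
                continue  # cannot split between equal feature values
            gain = split_gain(GL, HL, G - GL, H - HL, lam, gamma)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, j, 0.5 * (X[i][j] + X[nxt][j]))
    return best


# Toy node: feature 0 separates the gradients, feature 1 is constant.
X = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]
g = [-1.0, -1.0, 1.0, 1.0]
h = [1.0, 1.0, 1.0, 1.0]
result = best_split(X, g, h, lam=1.0, gamma=0.0)
```

A full tree builder would call `best_split` recursively on the two child sample sets until it returns `None` or the maximum depth is reached.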
4. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 3, characterized in that extracting the leaf-node position index vector of each fault data sample in the fault split trees and reconstructing it by feature encoding in step 3), to obtain the intelligent characterization of the hidden fault information, comprises the following steps:
3.1) let the fault data sample set be {x1, x2, …, xn}; given K fault split trees, each with T leaf nodes, the leaf-node position of the n-th sample in the k-th fault split tree is $a_{nk}$, with $a_{nk} \in [1, T]$ and $k \in [1, K]$; the position index vector of a sample therefore has dimension K, and its position indices in different fault split trees may be the same or different; the position index vector matrix of all samples is:
$$Z = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nK} \end{bmatrix}$$
where n is the number of samples and K is the number of fault split trees;
3.2) in this position index matrix, the distance between different samples only reflects the difference between their positions in the fault split trees, and the hidden fault information it contains remains implicit; training a fault diagnosis model directly on this vector matrix would prevent the subsequent algorithm models from learning effective fault information; to make better use of the information implied by the index positions, the index vectors are re-encoded: the values of the position index features are first expanded into Euclidean space to obtain a new feature set V, where V is the set of all elements in Z; the index vector of each sample is then extended to dimension K × T; when a value in the set V occurs in the sample, the corresponding position in the new vector is 1, otherwise it is 0; an example of the resulting encoded reconstruction matrix of all samples is:
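The encoding of step 3.2) amounts to one-hot encoding each tree's leaf index and concatenating the K blocks. A minimal sketch, assuming leaf positions are numbered 1..T in every tree as in step 3.1); the function name `encode_leaf_indices` and the sample matrix are hypothetical:

```python
from typing import List


def encode_leaf_indices(Z: List[List[int]], T: int) -> List[List[int]]:
    """One-hot encode an n x K leaf-index matrix (entries in 1..T)
    into an n x (K*T) 0/1 matrix, one block of T columns per tree."""
    encoded = []
    for row in Z:
        vec = [0] * (len(row) * T)
        for k, leaf in enumerate(row):
            if not 1 <= leaf <= T:
                raise ValueError(f"leaf index {leaf} outside 1..{T}")
            vec[k * T + (leaf - 1)] = 1  # block k, position leaf-1
        encoded.append(vec)
    return encoded


# Three samples, K = 2 trees, T = 3 leaves per tree.
Z = [[1, 3],
     [2, 3],
     [1, 1]]
Z_encoded = encode_leaf_indices(Z, T=3)
```

Each encoded row contains exactly K ones, so two samples now differ by the number of trees in which they fall into different leaves rather than by the arbitrary numeric gap between leaf labels. With the xgboost Python package, leaf indices of this kind can be obtained with `Booster.predict(..., pred_leaf=True)`; those are node ids rather than values in 1..T, so they would need remapping before using this sketch.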
5. The compressor fault diagnosis method based on XGBoost feature extraction according to claim 4, characterized in that step 4), establishing fault prediction models based on the characterization matrix with the SVM and neural network algorithms to realize predictive diagnosis of multiple fault modes, adopts the following steps:
4.1) fault data preprocessing: the collected fault data samples are initially disordered and often contain missing values, duplicate values and outliers, whereas constructing the fault split trees requires data in numerical matrix form; therefore, depending on the actual data, the data are preprocessed by data discretization and by filling missing values with the mean or median, to obtain regular sample data;
4.2) fault split tree construction:
4.2.1) constructing the loss function according to the fault data samples and the fault types, and using the loss function to build the objective function of the fault split tree at the t-th iteration;
4.2.2) using formula (5) as the objective function to guide the construction of the fault split tree at each step;
4.2.3) during tree splitting, selecting the parameter features one at a time from the samples of the current tree node, and using formula (7) to calculate the gain Gain of that parameter as the leaf-node splitting criterion;
4.2.4) after the gain of every parameter has been calculated, selecting the corresponding fault parameter according to Gain to split the current fault tree, and placing the fault data samples in the corresponding leaf nodes;
4.2.5) deciding whether to split the current tree node according to whether Gain is less than the set threshold γ; at the same time, judging from whether the fault split tree has reached the set maximum tree depth whether its construction is complete and splitting should stop; if so, evaluating formula (6) to obtain the minimum loss function value of this fault split tree; if not, returning to step 4.2.3) and executing the subsequent steps in order;
The fault split tree with the optimal structure is found through the above steps, yielding the optimal number of leaf nodes T and the corresponding leaf-node scores w;
4.3) leaf-node position index vector extraction: using the established fault split trees, extracting the leaf-node position of each fault data sample in the split trees to form the corresponding index vector, and obtaining the index vector matrix Z of all fault data samples;
4.4) index feature set construction: reconstructing the index vector feature space by the method described in step 3.2) to obtain the new feature set V;
4.5) encoding matrix generation: extending the index vector of each fault data sample to dimension K × T and re-encoding it by the method described in step 3.2) to obtain the encoding matrix;
4.6) fault prediction model establishment and diagnosis: based on the encoding matrix and the original parameters of the fault samples, establishing fault diagnosis and prediction models with the SVM and neural network algorithms respectively; using the models to predict the fault samples to be diagnosed and output the corresponding diagnosis results.
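The preprocessing of step 4.1) can be sketched as follows, covering only duplicate removal and median filling of missing values; discretization and outlier handling are left out. The function name `preprocess`, the use of `None` to mark a missing value, and the sample rows are assumptions made for illustration.

```python
from statistics import median
from typing import List, Optional

Row = List[Optional[float]]


def preprocess(rows: List[Row]) -> List[List[float]]:
    """Drop exact duplicate rows, then fill each missing value (None)
    with the median of the observed values in its column."""
    seen = set()
    unique: List[Row] = []
    for row in rows:
        key = tuple(row)
        if key not in seen:  # keep the first copy, preserve order
            seen.add(key)
            unique.append(row)
    if not unique:
        return []
    medians = []
    for j in range(len(unique[0])):
        observed = [r[j] for r in unique if r[j] is not None]
        # median() raises StatisticsError if a column has no observed value
        medians.append(median(observed))
    return [[medians[j] if v is None else float(v)
             for j, v in enumerate(r)] for r in unique]


# Hypothetical samples: [motor frequency, filter installed].
raw = [[50.0, 1.0], [50.0, 1.0], [None, 0.0], [60.0, None], [70.0, 1.0]]
clean = preprocess(raw)
```

The result is a purely numerical matrix, which is the form step 4.1) requires before the fault split trees are built. Replacing `median` with `statistics.mean` gives the mean-filling variant named in the same step.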
CN201910100466.1A 2019-01-31 2019-01-31 XGboost feature extraction-based compressor fault diagnosis method Active CN109829236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910100466.1A CN109829236B (en) 2019-01-31 2019-01-31 XGboost feature extraction-based compressor fault diagnosis method

Publications (2)

Publication Number Publication Date
CN109829236A true CN109829236A (en) 2019-05-31
CN109829236B CN109829236B (en) 2023-04-18

Family

ID=66862139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910100466.1A Active CN109829236B (en) 2019-01-31 2019-01-31 XGboost feature extraction-based compressor fault diagnosis method

Country Status (1)

Country Link
CN (1) CN109829236B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106524411A (en) * 2016-11-02 2017-03-22 王华勤 Control method for fault diagnosis of suspension-type air conditioner
CN109190670A (en) * 2018-08-02 2019-01-11 大连理工大学 A kind of charging pile failure prediction method based on expansible boosted tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant