CN113551904A - Gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning - Google Patents

Publication number
CN113551904A
Authority
CN
China
Prior art keywords
model
layer
fault
machine learning
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110722255.9A
Other languages
Chinese (zh)
Other versions
CN113551904B (en)
Inventor
蔡志强
陈秋安
司书宾
段锋
孟学煜
张帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202110722255.9A
Publication of CN113551904A
Application granted
Publication of CN113551904B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00 Testing of machine parts
    • G01M13/02 Gearings; Transmission mechanisms
    • G01M13/021 Gearings
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00 Testing of machine parts
    • G01M13/02 Gearings; Transmission mechanisms
    • G01M13/028 Acoustic or vibration analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

The invention relates to a multi-type concurrent fault diagnosis method for a gearbox based on hierarchical machine learning. Building on traditional machine learning, it proposes a new hierarchical machine learning model with two layers. The first layer is a traditional machine learning model with a simple structure; it identifies single fault types whose features are easy to distinguish and passes the multi-type concurrent fault samples that it cannot identify accurately to the second layer, which then classifies them correctly. The second layer builds its classification model with an extreme learning machine, a single-hidden-layer feedforward neural network that replaces the gradient computation of back-propagation in a traditional neural network with a least-squares fit, so the model parameters can be adjusted quickly. Diagnosing faults with hierarchical machine learning improves the accuracy of fault identification and greatly improves training efficiency.

Description

Gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning
Technical Field
The invention belongs to the field of fault diagnosis of rotary mechanical equipment, and particularly relates to a multi-type concurrent fault diagnosis method for a gearbox based on hierarchical machine learning.
Background
The gearbox is the most widely used rotating mechanical device in industrial equipment and, as an important transmission component, plays a particularly significant role in aviation equipment. Research on gearbox fault diagnosis can improve gearbox reliability and help guarantee the normal operation of equipment. Rotating machinery usually runs under complex conditions, and its performance degrades gradually during long-term continuous operation, so faults occur easily; these reduce product quality or stop production and, in severe cases, cause heavy property loss and casualties. The gearbox plays a key role in power transmission and motion conversion and is the most vulnerable part of rotating machinery. According to statistics, 70% of rotating machinery failures are directly related to gearboxes. Research on condition monitoring and fault diagnosis of gearboxes, which finds hidden faults in operating equipment promptly and accurately, is therefore of great significance for ensuring equipment safety and reducing accidents.
Three modeling approaches currently dominate gearbox fault diagnosis and reliability research: mechanism-based modeling, knowledge-based modeling, and data-driven modeling. Mechanism-based modeling requires a thorough understanding of the internal structure and mechanism of the equipment; the resulting model has a clear physical meaning and therefore extends well. However, when the equipment structure is too complex and complete information about its internal mechanism cannot be obtained, building an accurate model is very difficult. In addition, mechanism models always rest on many simplifications and assumptions, so the model output deviates from the actual result. Knowledge-based modeling does not require a complete understanding of the internal mechanism of complex equipment, and the resulting models are easy to understand but less general. A knowledge model is built by accumulating a large amount of production experience and process knowledge, so when a previously unseen fault occurs, the lack of corresponding experience and knowledge leads to missed alarms and false alarms. Data-driven modeling needs neither an accurate model nor prior knowledge of the equipment; it performs fault diagnosis, reliability analysis and similar tasks directly on the collected equipment data, and machine learning algorithms are its typical representative. With the rise of artificial intelligence and the massive accumulation of equipment data in recent years, data-driven methods have attracted more and more researchers and have gradually become the first choice in gearbox fault diagnosis.
Chinese patent publication No. CN108918137A discloses a "gearbox fault diagnosis device based on an improved WPA-BP neural network and method thereof". It builds a BP neural network from experimental data acquired from the gearbox under various working conditions, optimizes the weights and thresholds of the network with a wolf pack algorithm (WPA), and trains the optimized network to judge the fault type of the gearbox during operation, obtaining a good diagnosis and identification result. However, current fault diagnosis methods based on neural networks and heuristic parameter optimization have the following drawbacks: they require many iterations, so modeling efficiency is low; heuristic algorithms easily fall into local optima, so the results are unstable; and they address only the diagnosis of various single-type faults and cannot handle complex multi-type concurrent fault scenarios.
Disclosure of Invention
The technical problem solved by the invention is as follows: to overcome the low modeling efficiency of the prior art and its inability to diagnose multi-type concurrent faults effectively, the invention provides a gearbox multi-type concurrent fault diagnosis method based on hierarchical machine learning. Building on traditional machine learning, the method proposes a new hierarchical machine learning model with two layers. The first layer is a traditional machine learning model with a simple structure; it identifies single fault types whose features are easy to distinguish and passes the multi-type concurrent fault samples that it cannot identify accurately to the second layer, which then classifies them correctly. The second layer builds its classification model with an extreme learning machine, a single-hidden-layer feedforward neural network that replaces the gradient computation of back-propagation in a traditional neural network with a least-squares fit, so the model parameters can be adjusted quickly. Diagnosing faults with hierarchical machine learning improves the accuracy of fault identification and greatly improves training efficiency.
The technical scheme of the invention is as follows: a multi-type concurrent fault diagnosis method for a gearbox based on hierarchical machine learning, comprising the following steps:
step 1: constructing a gearbox fault data set, comprising the following sub-steps:
step 1.1: building a gearbox fault diagnosis experiment platform and collecting raw feature data with sensors, the raw feature data comprising the rotating speed, rotational acceleration and displacement of the gearbox;
step 1.2: preprocessing the raw feature data obtained in step 1.1 and constructing a fault data set for training machine learning models; labeling the samples as normal samples or fault samples, and randomly dividing the data set into a training set and a test set in a fixed proportion so that the trained models can be evaluated;
step 2: establishing, from the data processed in step 1, a traditional machine learning binary classification model for gearbox state identification, and then establishing multi-classification models, which form the first-layer classification model;
step 3: evaluating the multi-classification models obtained in step 2 and filtering out the sample data that are not correctly classified, comprising the following sub-steps:
step 3.1: establishing an evaluation system for the multi-classification models of step 2, that is, comparing the fault diagnosis performance of each multi-classification algorithm of step 2 using the confusion matrix, classification accuracy, classification recall and classification precision as evaluation indexes;
step 3.2: plotting each evaluation index of step 3.1 for visual analysis, checking the fault diagnosis precision for each category, and taking the model with the highest precision as the optimal model; filtering out the fault-type samples whose precision in the optimal model is below a precision threshold;
step 4: establishing a fault diagnosis model, which is the second-layer model, from the fault samples filtered out in step 3, comprising the following sub-steps:
step 4.1: performing diagnosis with an extreme learning machine, which comprises three layers: an input layer, a hidden layer and an output layer; suppose there are $N$ training samples $(x_i, y_i)$, where $x_i = [x_{i1}, x_{i2}, \dots, x_{in}] \in R^n$ is an $n$-dimensional input sample and $y_i = [y_{i1}, y_{i2}, \dots, y_{im}]^T \in R^m$ is the corresponding $m$-dimensional output; the goal of learning is to find the relationship between the input and the output. Assuming that the hidden layer has $L$ nodes and the activation function is $g(\cdot)$, the expression of the extreme learning machine is:
Figure BDA0003137201540000041
where $\beta_j$ is the output weight of the $j$-th hidden-layer node, $\omega_j$ is the weight vector connecting the input nodes to the $j$-th hidden-layer node, and $b_j$ is the bias of the $j$-th hidden-layer node;
the output of the extreme learning machine is finally obtained as:
Figure BDA0003137201540000042
step 4.2: introducing a kernel matrix into the output of step 4.1, $K = HH^T$; the output of the kernel extreme learning machine is:
Figure BDA0003137201540000043
step 5: combining the higher-precision diagnosis results that the multi-classification model of step 2 obtains for some fault types with the diagnosis results that the extreme learning machine of step 4 obtains for the fault types filtered out in step 3, to obtain the final diagnosis result.
The further technical scheme of the invention is as follows: the data set comprises a training set and a testing set, wherein the training set accounts for 70% and the testing set accounts for 30%.
The further technical scheme of the invention is as follows: in step 2, logistic regression is used to identify the state of the gearbox, and the logistic regression expression is as follows:
Figure BDA0003137201540000044
where $g(z)$ denotes the activation function, $x$ is the input vector, $w$ is the weight vector, and $b$ is the bias. For convenience of presentation, $w$ and $b$ are merged into $\theta$, which gives:
Figure BDA0003137201540000045
Then the binary classification model is used to identify the samples in a fault state, and each fault sample is labeled with its specific fault type; after this division, multi-classification models for fault diagnosis are established with several different algorithms.
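The two logistic regression equation images referred to above are not reproduced in this text. As a sketch only, assuming the standard logistic regression form and the symbols already defined ($g$, $x$, $w$, $b$, $\theta$), the expressions would read:

```latex
g(z) = \frac{1}{1 + e^{-z}}, \qquad z = w^{T}x + b
h_{\theta}(x) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}}
```

Here $h_{\theta}(x)$ is an assumed name for the model output; a sample would be assigned to the fault state when this value exceeds a chosen decision threshold such as 0.5.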
The further technical scheme of the invention is as follows: multi-classification models for fault diagnosis are established using a support vector machine and a neural network;
the support vector machine expression is as follows:
Figure BDA0003137201540000051
the neural network expression is as follows: suppose the neural network has $K$ layers ($K > 1$) and the numbers of nodes from the input layer to the output layer are $m_0, m_1, \dots, m_K$, so that the dimension of the input vector is $m_0$ and the dimension of the output vector is $m_K$. The output vectors of the layers of the network are expressed as follows:
an input layer:
Figure BDA0003137201540000052
hidden layer one:
Figure BDA0003137201540000053
hidden layer two:
Figure BDA0003137201540000054
an output layer:
Figure BDA0003137201540000055
weight matrix and offset vector for each layer
Figure BDA0003137201540000056
Then, the output of the network is
Figure BDA0003137201540000057
Effects of the invention
The invention has the technical effects that:
The gearbox multi-type concurrent fault diagnosis method based on hierarchical machine learning provided by the invention builds its hierarchical machine learning model mainly from traditional machine learning algorithms and the extreme learning machine, and analyzes and summarizes the model's performance with several evaluation indexes, so the modeling and diagnosis processes are simple and efficient.
The method can distinguish the normal state of the gearbox from fault states and can diagnose the type of an identified fault. Because it relies on a hierarchical machine learning model that is simple and easy to implement, it completes fault diagnosis accurately and efficiently; its structure and reasoning are clear, and it meets the expected target, particularly for complex multi-type concurrent fault diagnosis problems.
The method makes effective use of gearbox test data for fault diagnosis and obtains the best-performing model by evaluating the diagnosis results of different models. In addition, the final result is a specific multi-type concurrent fault diagnosis, which supports scientific, quantitative evaluation of the running state and performance of the gearbox and is of great significance for improving overall gearbox reliability.
Drawings
Fig. 1 is a flowchart of a fault diagnosis algorithm based on hierarchical machine learning.
FIG. 2 is a schematic diagram of a model and data change flow of a hierarchical machine learning algorithm in an embodiment.
Fig. 3 is a diagram illustrating a fault diagnosis result based on a conventional machine learning algorithm in an embodiment.
FIG. 4 is a diagram illustrating ROC results for various algorithms as a second layer model in an embodiment.
FIG. 5 is a diagram illustrating the final effect of the hierarchical machine learning algorithm in the embodiment.
Detailed Description
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
With reference to figures 1-5 of the drawings,
the technical scheme adopted by the invention for solving the technical problems is as follows:
a gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning is characterized by comprising the following steps:
step 1, constructing a gearbox fault data set;
First, an experimental platform is set up to collect the necessary raw feature data, and the feature variables that best reflect the performance of the rotating mechanical component are extracted according to actual conditions. The raw data are then preprocessed, including missing-value handling, data standardization and feature dimension reduction, to construct a fault data set for training machine learning models. Normal and fault samples are labeled, and the data set is divided into a training set and a test set so that the trained models can be evaluated.
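The splitting and standardization described in this step can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the random seed and the toy data are assumptions made here, and missing-value handling and dimension reduction are left out for brevity.

```python
import random
import statistics

def train_test_split(samples, labels, train_ratio=0.7, seed=0):
    """Shuffle sample indices and cut them at train_ratio."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = round(train_ratio * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([samples[i] for i in train], [labels[i] for i in train],
            [samples[i] for i in test], [labels[i] for i in test])

def fit_scaler(rows):
    """Column means and standard deviations, computed on the training set only."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, stds

def apply_scaler(rows, means, stds):
    """Z-score every feature with the given statistics."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

# Synthetic stand-in data (not from the experiment): 10 samples, 2 features.
X = [[float(i), float(2 * i + 1)] for i in range(10)]
y = [0] * 5 + [1] * 5
X_train, y_train, X_test, y_test = train_test_split(X, y)
means, stds = fit_scaler(X_train)
X_train_s = apply_scaler(X_train, means, stds)
X_test_s = apply_scaler(X_test, means, stds)
print(len(X_train_s), len(X_test_s))  # 7 3
```

Fitting the scaler on the training portion and reusing its statistics on the test portion keeps test information out of training, which is a common design choice rather than something the text prescribes.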
Step 2, establishing a multi-type concurrent fault diagnosis model based on traditional machine learning
A traditional machine learning model for gearbox state identification and fault diagnosis is established from the data collected in step 1. It identifies the samples in a fault state and assigns each fault sample a specific type, such as pitting, tooth breakage or wear, each of which represents a different gear fault. To select the best-performing classification model, multi-classification models for fault diagnosis are built with a support vector machine, an artificial neural network, a random forest, a gradient boosting tree and other algorithms.
Step 3, evaluating the first-layer algorithm model and filtering out sample data that are not correctly classified.
a. First, an evaluation system is established for the fault diagnosis models of step 2. The models are evaluated with indexes such as the confusion matrix, classification accuracy and classification recall, and the fault diagnosis performance of each algorithm as the first-layer model is compared.
b. The results of the first-layer model are analyzed visually, and the fault diagnosis precision for each category is checked. A threshold on fault diagnosis precision is set according to actual production requirements, and the fault-type samples that fall below this threshold in each model are filtered out for processing in the subsequent steps.
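The filtering rule in item b can be sketched as below. The sketch assumes that the per-category precision is measured as the fraction of each true class that is classified correctly (the diagonal of the confusion matrix divided by the row sum); the function names, the threshold and the confusion matrix are invented for illustration.

```python
def per_class_accuracy(conf):
    """conf[i][j] = number of samples with true class i predicted as class j."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(conf)]

def classes_to_filter(conf, labels, threshold):
    """Return the class labels whose per-class accuracy is below the threshold."""
    acc = per_class_accuracy(conf)
    return [lab for lab, a in zip(labels, acc) if a < threshold]

# Hypothetical 3-class confusion matrix, invented for illustration only.
conf = [
    [58, 1, 1],    # true class "A": easy to separate
    [2, 40, 18],   # true class "B": often confused with "C"
    [1, 20, 39],   # true class "C": often confused with "B"
]
hard = classes_to_filter(conf, ["A", "B", "C"], threshold=0.9)
print(hard)  # ['B', 'C']
```

Samples belonging to the returned classes would then be handed to the second-layer model.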
Step 4, establishing a fault diagnosis model according to the fault sample filtered in the step 3
After the original fault data set has been processed by the first-layer model, only a small portion of the samples remain incorrectly classified, and these can be diagnosed with an extreme learning machine. The extreme learning machine comprises three layers: an input layer, a hidden layer and an output layer. Suppose there are $N$ training samples $(x_i, y_i)$, where $x_i = [x_{i1}, x_{i2}, \dots, x_{in}] \in R^n$ is an $n$-dimensional input sample and $y_i = [y_{i1}, y_{i2}, \dots, y_{im}]^T \in R^m$ is the corresponding $m$-dimensional output; the goal of learning is to find the relationship between the input and the output. Assuming that the hidden layer has $L$ nodes and the activation function is $g(\cdot)$, the expression of the extreme learning machine is:
Figure BDA0003137201540000081
where $\beta_j$ is the output weight of the $j$-th hidden-layer node, $\omega_j$ is the weight vector connecting the input nodes to the $j$-th hidden-layer node, and $b_j$ is the bias of the $j$-th hidden-layer node.
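The equation image for expression (1) is not reproduced in this text. As a sketch, assuming the standard extreme learning machine formulation and the symbols defined above, it would read:

```latex
\sum_{j=1}^{L} \beta_j \, g(\omega_j \cdot x_i + b_j) = y_i, \qquad i = 1, 2, \dots, N \tag{1}
```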
The extreme learning machine is a special feedforward neural network whose hidden-layer weights and biases are assigned randomly or manually and are never updated. Only the output weights are computed during learning, so no iterative back-propagation is needed and efficiency is greatly improved. The training-set error of the extreme learning machine can in theory be made arbitrarily close to 0, so as the second-layer model it offers good precision and efficiency on fault data sets with a small sample size. Equation (1) can be written compactly as:
$H\beta = Y$ (2)
where $H \in R^{N \times L}$ is the hidden-layer output matrix and $Y$ is the true label matrix.
Figure BDA0003137201540000082
Figure BDA0003137201540000083
The goal of the extreme learning machine is to minimize the training error, which can be expressed as the following optimization problem:
Figure BDA0003137201540000084
where $C$ is the regularization factor and $\theta$ is the training error; the problem is converted into the following dual optimization problem:
Figure BDA0003137201540000085
where $\alpha_i$ is the Lagrange multiplier of the $i$-th training sample. The partial derivatives of $L_D$ with respect to $\beta$, $\theta_i$ and $\alpha_i$ are computed and set to 0, that is:
Figure BDA0003137201540000091
The output weight can then be derived as:
Figure BDA0003137201540000092
where $I$ is the identity matrix. The final output of the extreme learning machine is:
Figure BDA0003137201540000093
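The training procedure described above can be sketched in pure Python. The sketch assumes the standard regularized solution $\beta = H^T (I/C + HH^T)^{-1} Y$ and a sigmoid activation; the function names, the toy data and the values of `L` and `C` are illustrative assumptions, not values from the patent.

```python
import math
import random

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def hidden(x, W, b):
    """Hidden-layer row h(x) with a sigmoid activation g."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(wj, x)) + bj)))
            for wj, bj in zip(W, b)]

def train_elm(X, y, n_classes, L=20, C=1000.0, seed=0):
    rng = random.Random(seed)
    # Hidden weights and biases are drawn once and never updated.
    W = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(L)]
    b = [rng.uniform(-1, 1) for _ in range(L)]
    H = [hidden(x, W, b) for x in X]                      # N x L
    Y = [[1.0 if lab == k else 0.0 for k in range(n_classes)] for lab in y]
    N = len(X)
    # A = I/C + H H^T  (N x N)
    A = [[sum(u * v for u, v in zip(H[i], H[j])) + (1.0 / C if i == j else 0.0)
          for j in range(N)] for i in range(N)]
    alpha = solve(A, Y)                                   # (I/C + H H^T)^{-1} Y
    # beta = H^T alpha  (L x m): the only trained parameters
    beta = [[sum(H[i][j] * alpha[i][k] for i in range(N)) for k in range(n_classes)]
            for j in range(L)]
    return W, b, beta

def predict_elm(x, model):
    W, b, beta = model
    h = hidden(x, W, b)
    scores = [sum(hj * bj[k] for hj, bj in zip(h, beta)) for k in range(len(beta[0]))]
    return scores.index(max(scores))

# Toy two-class data, invented for illustration: two well-separated clusters.
X = [[-1.0, -0.9], [-0.9, -1.1], [-1.1, -1.0], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]
model = train_elm(X, y, n_classes=2)
print([predict_elm(x, model) for x in X])
```

No gradient step appears anywhere: training reduces to one linear solve, which is where the efficiency claimed in the text comes from.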
Although the random initialization of the extreme learning machine avoids back-propagation, it weakens the robustness and generalization ability of the model. To improve the stability and generalization ability of the extreme learning machine, a kernel method is introduced: the kernel method treats the hidden-layer mapping as an unknown feature mapping and realizes the mapping from the input space to the feature space. The kernel matrix is defined as:
$K = HH^T$ (10)
The output of the kernel extreme learning machine is:
Figure BDA0003137201540000094
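The equation images in this derivation are not reproduced in this text. As a hedged reconstruction, assuming the standard regularized extreme learning machine solution and the symbols already defined ($H$, $Y$, $C$, $I$, $K$), the output weights, the extreme learning machine output and the kernel extreme learning machine output would take the following form. The notation $h(x)$ for the hidden-layer row vector of an input $x$ and $K(x, x_i)$ for the kernel function evaluated between $x$ and training sample $x_i$ is assumed here:

```latex
\begin{aligned}
\beta &= H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} Y \\
f(x) &= h(x)\,\beta = h(x)\,H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} Y \\
f(x) &= \begin{bmatrix} K(x, x_1) & \cdots & K(x, x_N) \end{bmatrix}
        \left(\frac{I}{C} + K\right)^{-1} Y
\end{aligned}
```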
Although the kernel extreme learning machine performs better than the extreme learning machine, introducing the kernel adds model parameters; a relatively optimal solution can be found by tuning these parameters with a grid search strategy.
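A kernel extreme learning machine with grid search can be sketched as follows. The sketch assumes a Gaussian kernel $\exp(-\gamma \lVert a - b \rVert^2)$, the standard solution $(I/C + K)^{-1} Y$, and selection by accuracy on a held-out validation set; the function names, the parameter grids and the XOR-style toy data are illustrative assumptions.

```python
import math

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def gaussian_kernel(a, b, gamma):
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))

def train_kelm(X, Y, C, gamma):
    """Return alpha = (I/C + K)^{-1} Y for the kernel matrix K of X."""
    N = len(X)
    A = [[gaussian_kernel(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0)
          for j in range(N)] for i in range(N)]
    return solve(A, Y)

def predict_kelm(x, X, alpha, gamma):
    """Score each class with [K(x, x_1), ..., K(x, x_N)] alpha and take the argmax."""
    k = [gaussian_kernel(x, xi, gamma) for xi in X]
    scores = [sum(ki * a[c] for ki, a in zip(k, alpha)) for c in range(len(alpha[0]))]
    return scores.index(max(scores))

def grid_search(X_tr, y_tr, X_val, y_val, Cs, gammas, n_classes):
    """Try every (C, gamma) pair and keep the one with the best validation accuracy."""
    Y = [[1.0 if lab == c else 0.0 for c in range(n_classes)] for lab in y_tr]
    best = None
    for C in Cs:
        for gamma in gammas:
            alpha = train_kelm(X_tr, Y, C, gamma)
            hits = sum(predict_kelm(x, X_tr, alpha, gamma) == lab
                       for x, lab in zip(X_val, y_val))
            acc = hits / len(y_val)
            if best is None or acc > best[0]:
                best = (acc, C, gamma)
    return best

# XOR-style toy data (invented): not linearly separable, handled by the kernel.
X_tr = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
y_tr = [0, 0, 1, 1]
X_val = [[0.1, 0.1], [0.9, 0.9], [0.1, 0.9], [0.9, 0.1]]
y_val = [0, 0, 1, 1]
best = grid_search(X_tr, y_tr, X_val, y_val, Cs=[1.0, 100.0],
                   gammas=[0.1, 1.0, 10.0], n_classes=2)
print(best)
```

In practice cross-validation over the training set could replace the single validation split used here.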
Step 5, obtaining a final level machine learning model and evaluating the effect of the model
The results of steps 2, 3 and 4 are combined to obtain the final gearbox multi-type concurrent fault diagnosis result of the hierarchical machine learning model. Hierarchical machine learning still essentially solves a multi-classification problem, so it is evaluated with multi-classification indexes such as the overall confusion matrix, the overall classification accuracy and the fault diagnosis accuracy of each type, and the overall classification recall and the fault diagnosis recall of each type.
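One way to combine the two layers at prediction time is sketched below. It assumes that a sample is routed to the second layer whenever the first layer assigns it to one of the filtered classes; this routing rule, the function names and the threshold-based stand-in classifiers are assumptions made for illustration.

```python
def hierarchical_predict(x, first_layer, second_layer, hard_labels):
    """Use layer 1 first; defer to layer 2 when layer 1 picks a hard class."""
    label = first_layer(x)
    if label in hard_labels:
        return second_layer(x)
    return label

# Hypothetical stand-in classifiers: simple threshold rules on one feature.
def first_layer(x):
    if x < 1.0:
        return 1          # e.g. the normal state
    if x < 2.0:
        return 2          # an easily separated single fault
    return 4              # layer 1 cannot separate labels 4 and 6 reliably

def second_layer(x):
    return 4 if x < 3.0 else 6

preds = [hierarchical_predict(x, first_layer, second_layer, {4, 6})
         for x in (0.5, 1.5, 2.5, 3.5)]
print(preds)  # [1, 2, 4, 6]
```

The combined predictions can then be scored with the same multi-classification indexes as any single classifier.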
The embodiment is a gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning.
Referring to fig. 1 to 5, the method for diagnosing multiple types of concurrent faults of a gearbox based on hierarchical machine learning according to the present embodiment is applied to a gearbox to perform fault diagnosis, and includes the following specific steps:
step 1, constructing a gearbox fault data set. The specific mode is as follows:
In this embodiment, experimental data from the QPZZ-II rotating machinery vibration analysis and fault diagnosis test platform are used. The system can quickly simulate various states and vibrations of rotating machinery and supports comparative analysis and diagnosis of these states. It is widely used for research, teaching, product development and personnel training in universities, industrial and mining enterprises, and research institutes. The Japan International Cooperation Agency has long used a similar platform to train senior international equipment diagnosis engineers, with good results. The system is used here for a gearbox simulation experiment; the gearbox consists of a large gear with 75 teeth and a pinion with 55 teeth, both with a module of 2. The simulated gearbox operation covers one normal state and five fault states, the latter comprising three single fault types and two concurrent fault types. The three single faults are large-gear pitting, large-gear tooth breakage and pinion wear; the two concurrent faults are large-gear pitting plus pinion wear and large-gear tooth breakage plus pinion wear. To build machine learning models that diagnose the gearbox state, 8 feature variables, denoted F1 to F8, are collected in the experiment; the Pearson correlation coefficients among them are shown in Table 1.
TABLE 1 relationships between characteristic variables
Figure BDA0003137201540000101
Figure BDA0003137201540000111
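A Pearson correlation matrix of the kind summarized in Table 1 can be computed as in the sketch below. The feature values are synthetic and the function names are assumptions; the sketch only illustrates the calculation, not the experimental data.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(columns):
    """Pairwise Pearson coefficients between all feature columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Synthetic feature columns standing in for F1, F2, F3 (invented values).
F1 = [1.0, 2.0, 3.0, 4.0]
F2 = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated with F1
F3 = [4.0, 3.0, 2.0, 1.0]   # perfectly anti-correlated with F1
for row in correlation_matrix([F1, F2, F3]):
    print([round(v, 3) for v in row])
```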
A total of 2100 valid samples were collected in the experiment: 1000 normal samples, 200 large-gear pitting samples, 200 large-gear tooth breakage samples, 300 pinion wear samples, 200 large-gear pitting plus pinion wear samples, and 200 large-gear tooth breakage plus pinion wear samples. The whole data set is divided into a training set (70%) and a test set (30%), and a label value is assigned to the samples of each state; the data distribution is shown in Table 2.
Table 2 data set state distributions and label conditions
Figure BDA0003137201540000112
Step 2, according to the data collected in step 1, a support vector machine, a random forest, a gradient boosting decision tree and an artificial neural network are each used as classifiers to diagnose gearbox faults. Each model is trained on the training set, and its results on the test set are used to measure its performance. The diagnostic results of each model on the test set are shown in Tables 3 to 6.
TABLE 3 diagnosis of random forest on test set
Figure BDA0003137201540000121
TABLE 4 diagnostic results of support vector machines on test sets
Figure BDA0003137201540000122
TABLE 5 diagnostic results of gradient boosting decision Tree on test set
Figure BDA0003137201540000123
Figure BDA0003137201540000131
TABLE 6 diagnosis of artificial neural networks on test set
Figure BDA0003137201540000132
Step 3, evaluating the first-layer algorithm model and filtering out sample data that are not correctly classified. The specific procedure is as follows:
a. First, the performance of each classifier is evaluated from the diagnosis results of step 2. Four commonly used multi-classification evaluation indexes are used: Average Accuracy (AA), Macro Precision (MP), Macro Recall (MR) and Macro F1-Measure (MF). They are defined as follows:
Figure BDA0003137201540000133
Figure BDA0003137201540000134
Figure BDA0003137201540000135
Figure BDA0003137201540000136
where n represents the number of classes, which is 6 in this experiment, TP indicates that both the true label and the predicted label are positive classes, TN indicates that both the true label and the predicted label are negative classes, FP indicates that the true label is a negative class and the predicted label is a positive class, and FN indicates that the true label is a positive class and the predicted label is a negative class. The relationship between them is shown in table 7:
TABLE 7 relationship of TP, TN, FP, FN
Figure BDA0003137201540000141
According to Tables 3 to 7 and formulas (1) to (4), the scores of each algorithm on the four indexes and its classification accuracy on each category can be obtained, as shown in Fig. 3.
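The four indexes can be computed from a confusion matrix as sketched below. Because the formula images are not reproduced in this text, the sketch assumes the commonly used definitions: AA is the mean one-vs-rest accuracy, MP and MR are the unweighted means of per-class precision and recall, and MF is the harmonic mean of MP and MR. The function names and the confusion matrix are invented for illustration.

```python
def one_vs_rest_counts(conf, k):
    """TP, TN, FP, FN for class k; conf[i][j] = true class i predicted as j."""
    n = len(conf)
    total = sum(map(sum, conf))
    tp = conf[k][k]
    fn = sum(conf[k]) - tp
    fp = sum(conf[i][k] for i in range(n)) - tp
    tn = total - tp - fn - fp
    return tp, tn, fp, fn

def macro_metrics(conf):
    """Return (AA, MP, MR, MF) under the assumed standard definitions."""
    n = len(conf)
    acc = prec = rec = 0.0
    for k in range(n):
        tp, tn, fp, fn = one_vs_rest_counts(conf, k)
        acc += (tp + tn) / (tp + tn + fp + fn)
        prec += tp / (tp + fp) if tp + fp else 0.0
        rec += tp / (tp + fn) if tp + fn else 0.0
    aa, mp, mr = acc / n, prec / n, rec / n
    mf = 2 * mp * mr / (mp + mr) if mp + mr else 0.0
    return aa, mp, mr, mf

# Hypothetical 2-class confusion matrix, invented for illustration only.
conf = [[8, 2],
        [1, 9]]
aa, mp, mr, mf = macro_metrics(conf)
print(round(aa, 4), round(mp, 4), round(mr, 4), round(mf, 4))
```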
b. The third graph comprises two parts, wherein the larger part can see the scores of the algorithms on four evaluation indexes, and the smaller part can see the classification accuracy of the algorithms on each type of sample. It can be seen from the small graph that the accuracy of the samples labeled 4 and 6 is much lower than that of the other classes, which indicates that the features of the two classes of fault samples are similar, so that the classifier cannot make a correct judgment. In order to improve the overall fault diagnosis accuracy, the two types of sample data should be filtered out and sent to the next layer of model for further diagnosis.
Step 4: Establish a fault diagnosis model from the fault samples filtered out in step 3. The specific procedure is as follows:
The filtered-out samples are the fault samples labeled 4 and 6, so the second-layer model actually performs a binary classification task. There are 300 samples labeled 4 and 200 samples labeled 6, and they are likewise divided into a 70% training set and a 30% test set. In this experiment, four binary classification models are used: logistic regression (LR), an extreme learning machine (ELM), a kernel extreme learning machine (KELM), and a kernel extreme learning machine whose parameters are optimized with a grid search strategy (GKELM). A Gaussian kernel is selected as the kernel function in this example. The diagnostic results of each model on the test set are shown in Tables 8 to 11.
TABLE 8 diagnostic results of LR on test set
Figure BDA0003137201540000151
TABLE 9 diagnostic results of ELM on test set
Figure BDA0003137201540000152
TABLE 10 diagnostic results of KELM on test set
Figure BDA0003137201540000153
TABLE 11 diagnostic results of GKELM on test set
Figure BDA0003137201540000154
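A kernel extreme learning machine with grid search over its parameters, as used for GKELM above, might be sketched as follows. The sketch uses the commonly cited KELM solution (kernel matrix plus I/C, solved against ±1 targets) and is an assumption about the form of the model, not the patented code. The class name `KELM`, the parameter grids, the validation split taken from the training set and the synthetic two-cluster data are all hypothetical; only the 300/200 class sizes and the 70%/30% split come from the text.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

class KELM:
    """Kernel extreme learning machine (hypothetical minimal version)."""
    def __init__(self, C=1.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X_train = X
        self.classes = np.unique(y)
        # One column per class, +1 for the true class and -1 otherwise.
        T = np.where(y[:, None] == self.classes[None, :], 1.0, -1.0)
        K = gaussian_kernel(X, X, self.gamma)
        self.alpha = np.linalg.solve(K + np.eye(len(X)) / self.C, T)
        return self

    def predict(self, X):
        scores = gaussian_kernel(X, self.X_train, self.gamma) @ self.alpha
        return self.classes[np.argmax(scores, axis=1)]

def grid_search(X_fit, y_fit, X_val, y_val, Cs, gammas):
    """Return (best validation accuracy, (C, gamma)) over the grid."""
    best_acc, best_params = -1.0, None
    for C in Cs:
        for gamma in gammas:
            pred = KELM(C, gamma).fit(X_fit, y_fit).predict(X_val)
            acc = float(np.mean(pred == y_val))
            if acc > best_acc:
                best_acc, best_params = acc, (C, gamma)
    return best_acc, best_params

# Synthetic stand-in data: 300 samples labeled 4 and 200 labeled 6.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (300, 2)), rng.normal(5.0, 1.0, (200, 2))])
y = np.array([4] * 300 + [6] * 200)
idx = rng.permutation(len(X))
train, test = idx[:350], idx[350:]        # 70% / 30% split
fit_idx, val_idx = train[:280], train[280:]  # validation part for the grid search
X_test, y_test = X[test], y[test]

Cs, gammas = [1.0, 10.0, 100.0], [0.01, 0.1, 1.0]
val_acc, best_params = grid_search(X[fit_idx], y[fit_idx], X[val_idx], y[val_idx], Cs, gammas)
model = KELM(*best_params).fit(X[train], y[train])
test_acc = float(np.mean(model.predict(X_test) == y_test))
print(best_params, test_acc)
```

Selecting the parameters on a validation part of the training set, rather than on the test set, keeps the reported test accuracy free of selection bias; whether the experiment did this is not stated in the text.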
Binary classifiers are generally evaluated with an ROC curve. ROC stands for receiver operating characteristic curve; it is also called a sensitivity curve, because every point on the curve reflects the same sensitivity: all points are responses to the same signal stimulus, obtained under different decision criteria. The curve plots the false alarm probability (false positive rate, FPR) on the horizontal axis against the hit probability (true positive rate, TPR) on the vertical axis, using the results obtained under different decision criteria for a given stimulus condition. The ROC curves of the four binary classifiers in this experiment are shown in Fig. 4, where:
Figure BDA0003137201540000161
Figure BDA0003137201540000162
The AUC value in the legend is the area enclosed by each algorithm's curve and the x-axis. By the properties of the ROC curve, the larger the area, the better the classifier performs, and it can be seen that GKELM works best as the second-layer model. To explore the effect of hierarchical machine learning when the second-layer model is GKELM, the samples filtered out by each first-layer model are processed by the second-layer GKELM model; the specific diagnosis results for each category are shown in Tables 12 to 15.
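For illustration, the ROC points and the AUC can be computed from classifier scores as in the following sketch. It uses the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN) and the trapezoidal rule for the area; the function names and the six toy samples are hypothetical and are not taken from the experiment.

```python
import numpy as np

def roc_points(y_true, scores):
    """FPR and TPR obtained by lowering the decision threshold step by step."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    y_sorted, s_sorted = y_true[order], scores[order]
    tp = np.cumsum(y_sorted)      # positives accepted at each threshold
    fp = np.cumsum(~y_sorted)     # negatives accepted at each threshold
    # Keep one point per distinct score so that ties are handled together.
    last = np.r_[np.nonzero(np.diff(s_sorted))[0], len(s_sorted) - 1]
    tpr = np.r_[0.0, tp[last] / y_sorted.sum()]
    fpr = np.r_[0.0, fp[last] / (~y_sorted).sum()]
    return fpr, tpr

def auc(fpr, tpr):
    """Area between the ROC curve and the x-axis (trapezoidal rule)."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example: 1 marks the positive class, scores are classifier outputs.
y_toy = [1, 1, 0, 1, 0, 0]
s_toy = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_points(y_toy, s_toy)
area = auc(fpr, tpr)
print(area)
```

The area equals the fraction of positive/negative pairs in which the positive sample receives the higher score, which is why a larger value indicates a better classifier.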
TABLE 12 RFC-GKELM diagnostic results
Figure BDA0003137201540000163
TABLE 13 SVC-GKELM diagnostic results
Figure BDA0003137201540000164
Figure BDA0003137201540000171
TABLE 14 GBDT-GKELM diagnostic results
Figure BDA0003137201540000172
TABLE 15 ANN-GKELM diagnostic results
Figure BDA0003137201540000173
Step 5: Combine steps 2, 3 and 4 to obtain the final hierarchical machine learning model and evaluate its effect. The specific procedure is as follows:
Step 2 trains a plurality of models, step 3 compares the first-layer algorithm models, and step 4 classifies the filtered samples and compares the second-layer algorithm models. The hierarchical machine learning model in this example therefore comprises two layers: the first layer diagnoses the four gearbox states 1, 2, 3 and 5, and the second layer diagnoses the two states 4 and 6. In this experiment each layer includes a plurality of algorithms. Although the models within each layer have been compared with one another and the best one selected, the interaction between the layers has not been considered; to determine the best model combination, the diagnosis results of all model combinations are compared, as shown in Table 16. Table 16 shows that the hierarchical machine learning model combining SVC and GKELM works best.
TABLE 16 Combined diagnostic results for all models
Figure BDA0003137201540000181
Figure BDA0003137201540000191
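The way the two layers are combined at prediction time can be sketched as follows. The routing rule is an assumption made for illustration: every sample that the first layer assigns to one of the confusable classes (4 or 6) is re-diagnosed by the second layer, and all other first-layer predictions are kept. The function name `hierarchical_predict` and the two stand-in predictors are hypothetical; in practice they would be the trained first-layer and second-layer models.

```python
import numpy as np

def hierarchical_predict(first_layer, second_layer, X, confusable=(4, 6)):
    """Two-layer diagnosis: re-classify samples the first layer labels as confusable."""
    X = np.asarray(X)
    y = np.asarray(first_layer(X)).copy()
    mask = np.isin(y, confusable)        # samples passed on to the second layer
    if mask.any():
        y[mask] = second_layer(X[mask])
    return y

# Stand-in predictors (hypothetical): the first layer reads the label from
# column 0, the second layer separates 4 from 6 using column 1.
first = lambda X: X[:, 0].astype(int)
second = lambda X: np.where(X[:, 1] > 0.5, 6, 4)

X_demo = np.array([[1, 0.0], [4, 0.0], [4, 1.0], [6, 1.0], [5, 0.0]])
y_demo = hierarchical_predict(first, second, X_demo)
print(y_demo)
```

Because the function only takes two prediction callables, every pairing of a first-layer and a second-layer model can be evaluated with the same routine when comparing combinations.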

Claims (4)

1. A multi-type concurrent fault diagnosis method for a gearbox based on hierarchical machine learning is characterized by comprising the following steps:
step 1: constructing a gearbox fault data set, comprising the sub-steps of:
step 1.1: building a gearbox fault diagnosis experiment platform and collecting raw feature data with sensors, wherein the raw feature data comprise the rotating speed, the rotational acceleration and the displacement of the gearbox;
step 1.2: preprocessing the raw feature data obtained in the previous step, and constructing a fault data set for training a machine learning model; at the same time, dividing the samples into normal samples and fault samples, and randomly dividing the data set into a training set and a test set in proportion so as to evaluate the effect of the trained model;
step 2: establishing a traditional machine learning binary classification model for the state identification of the gearbox according to the data processed in step 1; then establishing a multi-classification model, which is the first-layer classification model;
step 3: evaluating the multi-classification model obtained in step 2 and filtering out the sample data that are not correctly classified, comprising the following sub-steps:
step 3.1: establishing an evaluation system for the multi-classification models of step 2, namely comparing the fault diagnosis performance of each multi-classification algorithm in step 2 by using a confusion matrix, classification accuracy, classification recall and classification precision as evaluation indexes;
step 3.2: displaying each evaluation index of step 3.1 graphically for visual analysis, checking the fault diagnosis precision for each category, and taking the model with the highest precision as the optimal model; filtering out the samples of the fault types whose precision in the optimal model is below a precision threshold;
step 4: establishing a fault diagnosis model, which is the second-layer model, from the fault samples filtered out in step 3, comprising the following sub-steps:
step 4.1: diagnosing with an extreme learning machine, wherein the extreme learning machine comprises three layers, namely an input layer, a hidden layer and an output layer; suppose there are $N$ training samples $(x_i, y_i)$, wherein $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}] \in \mathbb{R}^n$ is an n-dimensional input sample and $y_i = [y_{i1}, y_{i2}, \ldots, y_{im}]^T \in \mathbb{R}^m$ is the corresponding m-dimensional output; the goal of learning is to find the relationship between the input and the output; assuming that the hidden layer has $L$ nodes and the activation function is $g(\cdot)$, the expression of the extreme learning machine is:
Figure FDA0003137201530000021
wherein $\beta_j$ is the output weight of the j-th hidden-layer node, $\omega_j$ is the weight vector connecting the input nodes and the j-th hidden node, and $b_j$ is the bias of the j-th hidden-layer node;
the output of the extreme learning machine is finally obtained as follows:
Figure FDA0003137201530000022
step 4.2: introducing a kernel matrix into the output of step 4.1: $K = HH^T$; the output of the kernel extreme learning machine is:
Figure FDA0003137201530000023
step 5: combining the higher-precision diagnosis results for part of the types, obtained by the multi-classification model in step 2, with the diagnosis results for the types filtered out in step 3, obtained by the extreme learning machine in step 4, to obtain the final diagnosis result.
2. The method of claim 1, wherein the data set comprises a training set and a testing set, wherein the training set accounts for 70% and the testing set accounts for 30%.
3. The method for diagnosing multiple types of concurrent faults of a gearbox through brand-new hierarchical machine learning as claimed in claim 1, wherein in step 2 a logistic regression method is adopted for the state recognition of the gearbox, and the expression of the logistic regression is as follows:
Figure FDA0003137201530000024
where $g(z)$ represents the activation function, $x$ is the input vector, $w$ is the weight vector, and $b$ is the bias vector. For convenience of presentation, $w$ and $b$ are replaced by $\theta$, and the result is as follows:
Figure FDA0003137201530000031
Then, identifying the samples in the fault state by using a binary classification model, and dividing the specific type of each fault sample; after the division, a multi-classification model of fault diagnosis is established by adopting a plurality of different algorithms.
4. The method for diagnosing the multiple types of the concurrent faults of the gearbox based on the hierarchical machine learning as claimed in claim 3, wherein a multi-classification model for fault diagnosis is established by using a support vector machine and a neural network,
the support vector machine expression is as follows:
Figure FDA0003137201530000032
the neural network expression is as follows: suppose the number of layers of the neural network is $K$ ($K > 1$) and the numbers of nodes from the input layer to the output layer are $m_0, m_1, \ldots, m_K$, so that the dimension of the input vector is $m_0$ and the dimension of the output vector is $m_K$. The output vectors of each layer of the network are respectively expressed as follows:
an input layer:
Figure FDA0003137201530000033
hidden layer one:
Figure FDA0003137201530000034
hidden layer two:
Figure FDA0003137201530000035
an output layer:
Figure FDA0003137201530000036
weight matrix and offset vector for each layer
Figure FDA0003137201530000037
Then, the output of the network is
Figure FDA0003137201530000038
CN202110722255.9A 2021-06-29 2021-06-29 Gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning Active CN113551904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110722255.9A CN113551904B (en) 2021-06-29 2021-06-29 Gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning

Publications (2)

Publication Number Publication Date
CN113551904A true CN113551904A (en) 2021-10-26
CN113551904B CN113551904B (en) 2023-06-30

Family

ID=78131027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110722255.9A Active CN113551904B (en) 2021-06-29 2021-06-29 Gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning

Country Status (1)

Country Link
CN (1) CN113551904B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117909837A (en) * 2024-03-15 2024-04-19 山东科技大学 Gear fault diagnosis method driven by data and knowledge in cooperation
CN117909837B (en) * 2024-03-15 2024-05-31 山东科技大学 Gear fault diagnosis method driven by data and knowledge in cooperation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107688825A (en) * 2017-08-03 2018-02-13 华南理工大学 A kind of follow-on integrated weighting extreme learning machine sewage disposal failure examines method
CN108918137A (en) * 2018-06-08 2018-11-30 华北水利水电大学 Fault Diagnosis of Gear Case devices and methods therefor based on improved WPA-BP neural network
US20200271720A1 (en) * 2020-05-09 2020-08-27 Hefei University Of Technology Method for diagnosing analog circuit fault based on vector-valued regularized kernel function approximation
CN112945552A (en) * 2021-02-04 2021-06-11 常州大学 Gear fault diagnosis method based on variable node double-hidden-layer limit learning machine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
占健;吴斌;王加祥;余建波;: "基于OS-ELM的风机关键机械部件故障诊断方法", 机械制造, no. 04 *

Also Published As

Publication number Publication date
CN113551904B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
CN109685366A (en) Equipment health state evaluation method based on mutation data
CN109655259B (en) Compound fault diagnosis method and device based on deep decoupling convolutional neural network
CN113255848B (en) Water turbine cavitation sound signal identification method based on big data learning
CN111967486A (en) Complex equipment fault diagnosis method based on multi-sensor fusion
CN106371427A (en) Industrial process fault classification method based on analytic hierarchy process and fuzzy fusion
CN106355030A (en) Fault detection method based on analytic hierarchy process and weighted vote decision fusion
CN113762329A (en) Method and system for constructing state prediction model of large rolling mill
CN114295377B (en) CNN-LSTM bearing fault diagnosis method based on genetic algorithm
CN114676742A (en) Power grid abnormal electricity utilization detection method based on attention mechanism and residual error network
CN109240276B (en) Multi-block PCA fault monitoring method based on fault sensitive principal component selection
CN109298633A (en) Chemical production process fault monitoring method based on adaptive piecemeal Non-negative Matrix Factorization
CN114444582A (en) Mechanical equipment fault diagnosis method based on convolutional neural network and Bayesian network
WO2019178930A1 (en) Fault diagnosis method for mechanical device
CN115099260A (en) Online monitoring mechanical fault real-time diagnosis method for double-screw oil transfer pump
CN114429152A (en) Rolling bearing fault diagnosis method based on dynamic index antagonism self-adaption
CN115096627A (en) Method and system for fault diagnosis and operation and maintenance in manufacturing process of hydraulic forming intelligent equipment
CN115221973A (en) Aviation bearing fault diagnosis method based on enhanced weighted heterogeneous ensemble learning
CN115757103A (en) Neural network test case generation method based on tree structure
CN111497868A (en) Automobile sensor fault classification method based on BN-LSTM network
WO2022188425A1 (en) Deep learning fault diagnosis method integrating prior knowledge
CN113110398B (en) Industrial process fault diagnosis method based on dynamic time consolidation and graph convolution network
CN117032165A (en) Industrial equipment fault diagnosis method
CN116109039A (en) Data-driven anomaly detection and early warning system
CN113551904B (en) Gear box multi-type concurrent fault diagnosis method based on hierarchical machine learning
CN114741876B (en) Intelligent inspection method for tower crane

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant