CN114239384A - Rolling bearing fault diagnosis method based on nonlinear measurement prototype network - Google Patents

Rolling bearing fault diagnosis method based on nonlinear measurement prototype network

Info

Publication number
CN114239384A
Authority
CN
China
Prior art keywords
feature
attention
prototype
fault diagnosis
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111429337.0A
Other languages
Chinese (zh)
Inventor
苏祖强
吴然然
韩冷
张小龙
姜维龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202111429337.0A
Publication of CN114239384A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Molecular Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)

Abstract

The invention relates to the technical field of simulation analysis, and in particular to a rolling bearing fault diagnosis method based on a nonlinear measurement prototype network. The method comprises constructing a cascade attention prototype nonlinear metric network, training the constructed network for classification, preprocessing the data to be diagnosed, inputting the data into the trained cascade attention prototype nonlinear metric network, and outputting a diagnosis result. The invention extracts feature maps through the prototype calculation module and computes the class prototypes from the support set features; in the cascade attention module, the query sample features are spliced with each class prototype one by one, and the long-distance correlation of the spliced samples is extracted through the cascade attention mechanism; finally, the features extracted by the cascade attention module are input into the nonlinear metric module, thereby achieving accurate and effective bearing fault diagnosis under small-sample conditions.

Description

Rolling bearing fault diagnosis method based on nonlinear measurement prototype network
Technical Field
The invention relates to the technical field of simulation analysis, in particular to a rolling bearing fault diagnosis method based on a nonlinear measurement prototype network.
Background
The rolling bearing is one of the most critical components of large rotating machinery. It is easily damaged after long-term operation in harsh environments, which can cause the whole unit to malfunction and lead to huge economic losses or casualties. Accurate and intelligent fault diagnosis of rolling bearings is therefore of great importance to both industry and academia.
Rolling bearing fault diagnosis methods based on deep learning have developed rapidly in recent years. They diagnose and identify faults from vibration signals by exploiting the strong feature dimensionality reduction and pattern recognition capabilities of neural networks. Compared with traditional diagnosis algorithms, deep learning extracts high-dimensional, nonlinear abstract features more effectively and recognizes patterns more accurately, without manual feature extraction. Deep learning methods such as the autoencoder (AE), the deep belief network (DBN), the convolutional neural network (CNN) and the deep residual network (DRN) have been widely used for rolling bearing fault diagnosis when sufficient labeled samples are available, and have performed well. However, the success of these methods depends largely on a large amount of labeled data. In practical industrial scenarios, rolling bearings operate normally for most of their life cycle, so it is difficult to obtain sufficient labeled fault samples directly. The scarcity of labeled fault samples causes fault diagnosis models built with conventional deep learning methods to overfit, to lack robustness and to diagnose faults with low accuracy. Research on fault diagnosis models for rolling bearings with few labeled fault samples is therefore of great engineering significance.
Disclosure of Invention
To address the problem in the prior art that deep-learning-based fault diagnosis methods struggle to achieve satisfactory recognition results when labeled samples are scarce, the invention provides a rolling bearing fault diagnosis method based on a nonlinear measurement prototype network.
Further, the cascade attention prototype nonlinear metric network comprises a sample set division module, a prototype calculation module, a cascade attention mechanism learning module and a nonlinear metric strategy classification training module, wherein:
the sample set division module divides the sample set into a support set and a query set;
the divided data sets are input into the prototype calculation module to obtain the feature map of each sample, and the class prototypes are computed from the feature maps of the support set;
the feature map of each query set sample is spliced with the prototype of every class one by one, and the cascade attention mechanism learning module extracts the long-distance correlation of the spliced samples;
the long-distance correlation extracted by the cascade attention mechanism learning module is input into the nonlinear metric strategy classification training module for classification training.
Further, inputting the divided data sets into the prototype calculation module to obtain the feature map of each sample means using a feature extractor
Figure BDA0003379652850000021
to embed each sample x_i of the sample set L into a feature space, represented as:
Figure BDA0003379652850000022
For the class-c fault, the prototype P_c is generated from the support set S as:
Figure BDA0003379652850000023
where y_i denotes the label of the i-th sample in the support set S.
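The embedding and prototype expressions above survive only as image placeholders. A hedged reconstruction, assuming the standard prototypical-network formulation, is given below. The symbol f_φ for the feature extractor and the symbol S_c for the subset of S labeled c are assumptions introduced for this sketch and are not confirmed by the text.

```latex
\[
  x_i \;\mapsto\; f_{\varphi}(x_i), \qquad x_i \in L
\]
\[
  P_c \;=\; \frac{1}{|S_c|} \sum_{(x_i,\, y_i) \in S,\; y_i = c} f_{\varphi}(x_i)
\]
```

Under this reading, each class prototype is simply the mean embedded feature of the support samples of that class.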
Further, the cascade attention mechanism learning module comprises a channel attention submodule and a spatial attention submodule, and extracting the long-distance correlation of the spliced sample comprises:
the cascade attention mechanism learning module convolves the input spliced sample to extract a feature F;
the feature F is input into the channel attention submodule and the spatial attention submodule respectively, wherein the channel attention submodule adaptively adjusts the feature values across channels and establishes channel dependencies to obtain a channel attention feature F_c';
the spatial attention submodule focuses on the position information of the target sample in the input feature map to obtain a spatial attention feature F_s';
the channel attention feature F_c' and the spatial attention feature F_s' are fused, and the fused feature information is added to the input feature F to obtain the long-distance correlation of the spliced sample.
Further, the channel attention submodule comprises a global average pooling layer, a first convolution block and a second convolution block, each convolution block consisting of a convolution layer, a BN layer and an activation function. The feature F is passed through the cascaded global average pooling layer, first convolution block and second convolution block to extract a channel information structure S; the matrix product of the feature F and the channel information structure S is then added to the feature F to form the output of the channel attention submodule.
Further, the channel attention feature F_c' is represented as:
where
Figure BDA0003379652850000031
is the channel attention feature map obtained by global average pooling of the feature F; W_1 and W_2 are the weights of the convolution layers in the first and second convolution blocks, respectively; σ(·) is the sigmoid activation function; γ(·) is the ReLU activation function; and
Figure BDA0003379652850000032
and
Figure BDA0003379652850000033
denote the matrix multiplication operation and the addition operation, respectively.
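The expression for F_c' itself is not present in the extracted text. Based only on the symbol definitions above and on the description of the channel attention submodule, one plausible form is sketched below. The symbol F_avg^c for the pooled feature map and the exact order of composition are assumptions.

```latex
\[
  S = \sigma\!\left( W_2\, \gamma\!\left( W_1 F_{\mathrm{avg}}^{c} \right) \right),
  \qquad
  F_c' = F \oplus \left( F \otimes S \right)
\]
```

This matches the stated behavior: the pooled feature passes through two convolution blocks to give S, F is multiplied by S, and the product is added back to F.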
Further, the spatial attention submodule comprises a third convolution block and a global average pooling layer, the third convolution block consisting of a convolution layer and a BN layer. The feature F is passed through the cascaded third convolution block and global average pooling layer to extract a spatial information structure S'; the product of the spatial information structure S' and the input feature F is added to the feature F to obtain the spatial attention feature F_s'.
Further, the spatial attention feature F_s' is represented as:
Figure BDA0003379652850000034
where
Figure BDA0003379652850000035
is the average pooling of the feature F along its channel dimension; W_3 denotes the weight of the convolution layer in the convolution block; σ(·) is the sigmoid activation function; and
Figure BDA0003379652850000036
and
Figure BDA0003379652850000037
denote the matrix multiplication operation and the addition operation, respectively.
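The expression for F_s' survives only as an image placeholder. A hedged sketch that follows the symbol definitions above is given below. The symbol F_avg^s for the channel-averaged feature map is an assumption, and the text does not settle whether the convolution is applied before or after the channel-wise pooling.

```latex
\[
  S' = \sigma\!\left( W_3 F_{\mathrm{avg}}^{s} \right),
  \qquad
  F_s' = F \oplus \left( F \otimes S' \right)
\]
```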
Further, the convolution block that the cascade attention mechanism learning module uses to convolve the input spliced sample comprises a convolution layer, a pooling layer, a BN layer and an activation function.
The invention extracts feature maps through the prototype calculation module and computes the class prototypes from the support set features; in the cascade attention module, the query sample features are spliced with each class prototype one by one, and the long-distance correlation of the spliced samples is extracted through the cascade attention mechanism; finally, the features extracted by the cascade attention module are input into the nonlinear metric module, thereby achieving accurate and effective bearing fault diagnosis under small-sample conditions.
Drawings
FIG. 1 is a flow chart of an embodiment of the rolling bearing fault diagnosis method based on a nonlinear metric prototype network according to the present invention;
FIG. 2 is a schematic diagram of the nonlinear metric prototype network architecture of the present invention;
FIG. 3 is a diagram of the prototype network architecture;
FIG. 4 is a schematic diagram of the linear metric structure of the prototype network;
FIG. 5 is a schematic diagram of the nonlinear metric structure of the present invention;
FIG. 6 is a schematic diagram of the cascade attention mechanism of the present invention;
FIG. 7 is a schematic diagram of the rolling bearing vibration signal collected by the MFS experimental apparatus in state a;
FIG. 8 is a schematic diagram of the rolling bearing vibration signal collected by the MFS experimental apparatus in state b;
FIG. 9 is a schematic diagram of the rolling bearing vibration signal collected by the MFS experimental apparatus in state c;
FIG. 10 is a schematic diagram of the rolling bearing vibration signal collected by the MFS experimental apparatus in state d;
FIG. 11 is a schematic diagram of the rolling bearing vibration signal collected by the MFS experimental apparatus in state e;
FIG. 12 is a schematic comparison of the diagnostic accuracy of different fault diagnosis and identification methods;
FIG. 13 is a schematic diagram of the confusion matrix output by the WDCNN fault diagnosis and identification method;
FIG. 14 is a schematic diagram of the confusion matrix output by the SiaNet fault diagnosis and identification method;
FIG. 15 is a schematic diagram of the confusion matrix output by the RelaNet fault diagnosis and identification method;
FIG. 16 is a schematic diagram of the confusion matrix output by the ProNet fault diagnosis and identification method;
FIG. 17 is a schematic diagram of the confusion matrix output by the NM-ProNet fault diagnosis and identification method;
FIG. 18 is a schematic diagram of the confusion matrix output by the CANM-ProNet fault diagnosis and identification method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a rolling bearing fault diagnosis method based on a nonlinear measurement prototype network, which comprises constructing a cascade attention prototype nonlinear metric network, training the constructed network for classification, preprocessing the data to be diagnosed, inputting the data into the trained cascade attention prototype nonlinear metric network, and outputting a diagnosis result.
As shown in FIG. 1, the invention comprises two parts, nonlinear metric prototype network training and fault diagnosis and identification, specifically:
1. Training the nonlinear metric prototype network: the limited labeled sample set is divided into a training set and a test set, and the training set is further divided into a support set and a query set. The samples are mapped into an embedding space by the prototype network, and the class prototypes are computed from the support set. The query samples are spliced with the class prototypes one by one in the embedding space and fed into the cascade attention module to extract non-local information. Finally, the nonlinear metric module measures the similarity between each sample and each prototype to improve fault diagnosis performance. Following these steps, all parameters of the nonlinear metric prototype network are initialized, and training samples are fed in to optimize the model parameters by gradient descent;
2. Fault diagnosis of the rolling bearing with small samples: the vibration data of the rolling bearing to be diagnosed is preprocessed; the preprocessed data is input into the nonlinear metric prototype network trained in part 1; and the trained network outputs the fault diagnosis result.
The bearing fault diagnosis model based on the nonlinear metric prototype network performs the following operations:
S11, dividing the sample set. Before the division, the original vibration signal samples covering C fault classes are normalized into standardized one-dimensional samples, and the limited labeled fault samples are divided into a support set
Figure BDA0003379652850000051
and a query set
Figure BDA0003379652850000052
which together form the training sample set L of the nonlinear metric prototype network used in S12 to S14;
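Step S11 can be illustrated with a minimal sketch. The text does not specify the normalization, so zero-mean, unit-variance scaling is assumed here; the helper names `standardize` and `split_support_query` and the toy data are hypothetical.

```python
import random
from statistics import mean, pstdev


def standardize(signal):
    """Zero-mean, unit-variance scaling of one 1-D vibration sample (assumed normalization)."""
    mu, sd = mean(signal), pstdev(signal)
    return [(v - mu) / sd for v in signal] if sd > 0 else [0.0 for _ in signal]


def split_support_query(samples, labels, n_support, seed=0):
    """Per class, put n_support samples in the support set and the rest in the query set."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    support, query = [], []
    for y, xs in sorted(by_class.items()):
        xs = xs[:]
        rng.shuffle(xs)
        support += [(x, y) for x in xs[:n_support]]
        query += [(x, y) for x in xs[n_support:]]
    return support, query


# Toy data: 12 short "signals" spread over 3 classes (4 per class).
samples = [standardize([float(i + k) for k in range(8)]) for i in range(12)]
labels = [i % 3 for i in range(12)]
support, query = split_support_query(samples, labels, n_support=2)
print(len(support), len(query))  # 6 6
```

Splitting per class keeps the support set balanced, which is what an N-way K-shot episode requires.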
S12, extracting feature maps in the prototype calculation module from the divided data sets, and computing the class prototypes from the support set features; namely:
the embedded prototype module of the nonlinear metric prototype network uses a feature extractor
Figure BDA0003379652850000053
to embed each training sample x_i of the sample set L into a feature space:
Figure BDA0003379652850000061
For the class-c fault, the prototype P_c is generated from the support set S:
Figure BDA0003379652850000062
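The prototype computation in S12 can be sketched as a per-class mean of support features. This assumes the standard prototypical-network definition, since the expression itself is only an image placeholder; the function name `class_prototypes` and the toy feature vectors are hypothetical, while the labels IF and BF are fault types named later in the text.

```python
def class_prototypes(support_features, support_labels):
    """Mean feature vector per class over the support set."""
    sums, counts = {}, {}
    for feat, label in zip(support_features, support_labels):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for k, v in enumerate(feat):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


feats = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]]  # already-embedded support samples
labs = ["IF", "IF", "BF"]
protos = class_prototypes(feats, labs)
print(protos)  # {'IF': [2.0, 3.0], 'BF': [10.0, 0.0]}
```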
The prototype network measures the similarity between a sample and a class prototype linearly: a fixed, predefined metric (for example the Euclidean distance) directly computes the distance between features. This requires the feature extractor to produce clearly discriminative features as prototype representations, yet highly recognizable fault features are difficult to extract from mechanical vibration signals when labeled samples are few. Moreover, a fixed linear metric cannot learn the nonlinear relationships between complex signals, so its diagnostic performance drops considerably. To overcome these shortcomings, a learnable nonlinear classifier replaces the fixed linear metric of the prototype network: the class prototypes are spliced with the query sample features, a nonlinear metric is learned by a nonlinear neural network, and each batch of spliced samples receives a similarity score to complete sample class identification.
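For contrast, the fixed linear metric that this paragraph criticizes can be sketched as follows. This is a minimal illustration of a Euclidean-distance classifier with no learnable parameters in the metric, not the patent's code; `linear_metric_predict` and the toy prototypes are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linear_metric_predict(query_feat, prototypes):
    """Softmax over negative Euclidean distances to each class prototype."""
    classes = list(prototypes)
    logits = [-euclidean(query_feat, prototypes[c]) for c in classes]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = {c: e / total for c, e in zip(classes, exps)}
    return max(probs, key=probs.get), probs


pred, probs = linear_metric_predict([2.5, 3.0], {"IF": [2.0, 3.0], "BF": [10.0, 0.0]})
print(pred)  # IF
```

Because the distance function is fixed, all of the discriminative burden falls on the feature extractor, which is the weakness the nonlinear metric is meant to address.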
S13, based on the computed prototypes, splicing the features of each query sample with the prototypes of all classes one by one, and extracting the long-distance correlation of the spliced samples with the cascade attention module.
A query sample is spliced with the prototype features of the C classes, and each spliced sample l(x_i) is input into the convolution block of the cascade attention module, which performs preliminary feature extraction on the spliced features to obtain a feature map
Figure BDA0003379652850000063
where H × W × C denotes the height, width and number of channels of the feature map, respectively. In the cascade attention module, the feature map
Figure BDA0003379652850000064
flows into the channel attention module and the spatial attention module respectively. The channel attention module adaptively adjusts the feature values across channels and establishes channel dependencies to obtain the channel attention feature F_c'; the spatial attention module focuses on the position information of the target sample in the input feature map and ignores unimportant target features to obtain the spatial attention feature F_s'; finally, the channel attention feature F_c' and the spatial attention feature F_s' are fused, and the fused feature information is added to the input feature F to extract the important features of the spliced sample.
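The splicing step at the start of S13 can be sketched as a concatenation of the query feature with each class prototype. Flat feature vectors are assumed here for simplicity; the axis along which the actual feature maps are concatenated is not stated in the text, and `splice_with_prototypes` is a hypothetical helper.

```python
def splice_with_prototypes(query_feat, prototypes):
    """Return one spliced vector per class: query features followed by that class prototype."""
    return {c: list(query_feat) + list(p) for c, p in prototypes.items()}


spliced = splice_with_prototypes([0.1, 0.2], {"IF": [2.0, 3.0], "BF": [10.0, 0.0]})
print(spliced["IF"])  # [0.1, 0.2, 2.0, 3.0]
```

Each spliced vector has twice the feature dimension of its inputs, so one query sample yields C spliced samples, one per class.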
(1) The channel attention module. The channel attention module first applies global average pooling to compress the feature F along the spatial dimension, aggregating the spatial information of the feature map to generate a channel attention feature map
Figure BDA0003379652850000065
Two convolution blocks then extract the nonlinear relationship between channels: the channel dimension is first reduced and then restored, and an activation function yields the channel attention weight S. The internal structure of the channel attention network is shown in FIG. 6, where CPBA denotes a convolution layer, pooling layer, BN layer and activation function, and CBA denotes a convolution layer, BN layer and activation function. The input feature F is then matrix-multiplied by the channel information structure S, and the result is fused with F to obtain the channel attention weighted feature
Figure BDA0003379652850000071
The final output of the channel attention module is:
Figure BDA0003379652850000072
where
Figure BDA0003379652850000073
is the channel attention feature map obtained by global average pooling of F; W_1 and W_2 denote the weights of the two convolutions in the CBA blocks; σ(·) and γ(·) are the sigmoid and ReLU activation functions, respectively; and
Figure BDA0003379652850000074
and
Figure BDA0003379652850000075
denote the matrix multiplication operation and the addition operation, respectively.
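The channel attention computation described above can be sketched in plain Python. Several simplifications are assumptions: the feature map is stored as C channels by N positions, the two convolutions act on the pooled vector as plain matrix products, the BN layers are omitted, and the multiplication by S is applied as a per-channel scaling. The weights are zeros only so that the output can be checked by hand.

```python
import math


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def matvec(w, v):
    """Matrix-vector product, standing in for a convolution on the pooled vector."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def channel_attention(feature, w1, w2):
    """feature: C channels x N positions. Returns F + F * S, with one weight S per channel."""
    pooled = [sum(ch) / len(ch) for ch in feature]     # global average pooling
    s = sigmoid(matvec(w2, relu(matvec(w1, pooled))))  # reduce, then restore the channel dim
    return [[x + x * s_c for x in ch] for ch, s_c in zip(feature, s)]


F = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0], [5.0, 5.0]]  # C = 4, N = 2
W1 = [[0.0] * 4 for _ in range(2)]                    # 4 -> 2 channels
W2 = [[0.0] * 2 for _ in range(4)]                    # 2 -> 4 channels
out = channel_attention(F, W1, W2)
print(out[0])  # [1.5, 4.5]
```

With zero weights every channel weight is sigmoid(0) = 0.5, so each value is scaled by 1.5; trained weights would instead emphasize informative channels.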
(2) The spatial attention module. The spatial attention module first applies one convolution layer to extract information from the feature F, and the output features of the convolution layer are fused across channels to obtain a spatial attention feature map
Figure BDA0003379652850000076
An activation function is then used to obtain the spatial attention weight S'. The spatial attention network structure is shown in FIG. 6, where CB denotes a convolution layer and BN layer. The input feature F is then matrix-multiplied by the spatial information structure S', and the result is fused with F to obtain the spatial attention weighted feature
Figure BDA0003379652850000077
The final output of the spatial attention module is:
Figure BDA0003379652850000078
where
Figure BDA0003379652850000079
is the average pooling of F along its channel dimension; W_3 denotes the weight of the convolution with a 7 × 7 kernel in the CB block; σ(·) is the sigmoid activation function; and
Figure BDA00033796528500000710
and
Figure BDA00033796528500000711
denote the matrix multiplication operation and the addition operation, respectively.
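A comparable sketch for the spatial attention branch follows. The assumptions are: a C channels by N positions layout, a 1-D kernel of length 7 standing in for the 7 × 7 convolution, pooling applied before the convolution (the text is ambiguous about the order), and no BN layer. The zero kernel is chosen only to make the output easy to verify.

```python
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(signal))]


def spatial_attention(feature, kernel):
    """feature: C channels x N positions. Returns F + F * S', with one weight S' per position."""
    n_ch = len(feature)
    pooled = [sum(ch[i] for ch in feature) / n_ch for i in range(len(feature[0]))]
    s = [1.0 / (1.0 + math.exp(-v)) for v in conv1d_same(pooled, kernel)]
    return [[x + x * s_i for x, s_i in zip(ch, s)] for ch in feature]


F = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]  # C = 2, N = 3
out = spatial_attention(F, kernel=[0.0] * 7)
print(out[0])  # [1.5, 3.0, 4.5]
```

Here the attention weights vary by position rather than by channel, which is the complementary view to the channel branch.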
S14, inputting the features extracted by the attention module into the nonlinear metric module to achieve effective few-shot learning (FSL) bearing fault diagnosis. The spliced sample is passed to the nonlinear metric module
Figure BDA00033796528500000712
and, after a series of successive network-layer mappings, the module outputs through softmax C scalars V_{j,r} with values between 0 and 1:
Figure BDA00033796528500000713
V_{j,r} represents the similarity between the query sample
Figure BDA00033796528500000714
and the prototype P_c of a given class, that is, the probability that the query sample
Figure BDA00033796528500000715
belongs to that class. The difference between the linear and nonlinear metric modes based on the prototype network is shown in FIGS. 4 and 5.
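The scoring step in S14 can be sketched with a tiny learnable scorer. The one-hidden-layer network and the hand-set weights below are assumptions made purely for illustration; the actual layer structure of the nonlinear metric module is not given in the extracted text, and `nonlinear_metric` and `similarity_scores` are hypothetical names.

```python
import math


def nonlinear_metric(spliced, w_hidden, w_out):
    """Score one spliced vector with a one-hidden-layer network (ReLU activation)."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, spliced))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def similarity_scores(spliced_by_class, w_hidden, w_out):
    """Softmax over the per-class scores: C values in (0, 1) that sum to 1."""
    classes = list(spliced_by_class)
    raw = [nonlinear_metric(spliced_by_class[c], w_hidden, w_out) for c in classes]
    m = max(raw)  # subtract the maximum for numerical stability
    exps = [math.exp(r - m) for r in raw]
    total = sum(exps)
    return {c: e / total for c, e in zip(classes, exps)}


spliced_by_class = {"IF": [1.0, 1.0, 1.0, 1.0], "BF": [1.0, 1.0, 0.0, 0.0]}
scores = similarity_scores(spliced_by_class, w_hidden=[[1.0, 1.0, 1.0, 1.0]], w_out=[1.0])
print(max(scores, key=scores.get))  # IF
```

Unlike a fixed distance, the weights `w_hidden` and `w_out` would be updated during training, so the similarity function itself is learned.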
To improve the accuracy of the classifier, the network model is trained by minimizing the classification loss between the query samples and the class prototypes of the support set, with the mean square error as the loss function. From the similarity probability value V_{j,r} output above, the label of the query sample
Figure BDA0003379652850000081
and the label of the class prototype
Figure BDA0003379652850000082
the mean square error L_MSE is computed:
Figure BDA0003379652850000083
Finally, the network model is trained by minimizing the above expression:
Figure BDA0003379652850000084
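The mean square error objective can be sketched as follows, assuming a target of 1 when the query label matches the prototype label and 0 otherwise. Whether the original expression sums or averages over the query-prototype pairs cannot be recovered from the placeholder; averaging is assumed here, and `mse_loss` is a hypothetical name.

```python
def mse_loss(similarities, query_labels, classes):
    """similarities[j][r] is the score V_{j,r} of query j against the prototype of classes[r]."""
    total, count = 0.0, 0
    for scores, label in zip(similarities, query_labels):
        for score, cls in zip(scores, classes):
            target = 1.0 if label == cls else 0.0  # 1 for the matching class, else 0
            total += (score - target) ** 2
            count += 1
    return total / count


loss = mse_loss([[0.9, 0.1], [0.2, 0.8]], ["IF", "BF"], ["IF", "BF"])
print(round(loss, 4))  # 0.025
```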
After a class prototype is spliced with the feature map of a query sample, the feature dimension of the spliced sample is doubled. If the spliced sample is input directly into the nonlinear metric network, its long-distance correlation cannot be captured because of the limited receptive field. A cascade attention mechanism is therefore used to extract the long-distance correlation of the spliced sample, so that the nonlinear metric module can better extract the nonlinear relationship between the sample and the prototype.
The fault diagnosis and identification process comprises the following steps:
S21, processing the data to be diagnosed and identified;
S22, inputting the data to be diagnosed and identified into the nonlinear metric prototype network, which outputs the fault diagnosis result.
The nonlinear metric prototype network is a small-sample supervised learning model consisting mainly of a prototype calculation module, a cascade attention module and a nonlinear metric module.
To verify the effectiveness of the disclosed fault diagnosis and identification method, a comparison test is carried out using vibration signals from a Machine Fault Simulator (MFS). The experiment simulates 5 health states of the rolling bearing and collects the Y-axis vibration signal of the belt-end bearing of the simulator at a rotating frequency of 44 Hz, with a sampling frequency of 10240 Hz. The data for each health state is collected 6 times. After the vibration signals of the five states are obtained, they are preprocessed: each vibration record of length 102400 is divided into 25 samples of 4096 data points, so the number of samples per class is 25 × 6 = 150. The original vibration waveforms of the five states are shown in FIGS. 7 to 11, and their health states are listed in Table 1 below. To reflect fault diagnosis with few labeled samples in practical applications, all samples are randomly divided into a training set and a test set, with 20% used as training samples and the remaining 80% as test samples. Four training-set sizes are used, with 5, 10, 15 and 20 labeled samples per class, referred to as 5-way 5/10/15/20-shot. For each shot setting, six methods are shown from left to right: WDCNN, SiaNet, RelaNet, ProNet, NM-ProNet and the invention (CANM-ProNet).
TABLE 1 health status of rolling bearings
Figure BDA0003379652850000091
TABLE 2 Description of the data set
Figure BDA0003379652850000092
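The sample counts stated in the experiment description can be checked with a few lines of arithmetic. All numbers come from the text; only the variable names are introduced here.

```python
signal_length = 102400      # data points per vibration record
points_per_sample = 4096    # data points per segmented sample
records_per_state = 6       # each health state is collected 6 times
train_fraction = 0.20       # 20% training, 80% testing

samples_per_record = signal_length // points_per_sample
samples_per_class = samples_per_record * records_per_state
train_per_class = round(samples_per_class * train_fraction)
test_per_class = samples_per_class - train_per_class
print(samples_per_record, samples_per_class, train_per_class, test_per_class)
# 25 150 30 120
```

The 120 test samples per class agree with the confusion-matrix discussion later in the description.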
Based on the above sample set, the method proposed by the present invention (CANM-ProNet) is compared with five other methods: WDCNN; the Siamese network (SiaNet) described in "Siamese Neural Networks for One-shot Image Recognition", 32nd International Conference on Machine Learning, 2015; the relation network (RelaNet) described in "Learning to Compare: Relation Network for Few-Shot Learning", CVPR 2018; the prototypical network (ProNet) described in "Prototypical Networks for Few-shot Learning", NIPS 2017; and the present method without cascade attention (NM-ProNet). To keep the experiment fair, all six methods use the same feature extractor and the same hyper-parameters and receive the same training and test samples for each batch of data, and the experiment is repeated on ten batches of data for a comprehensive evaluation.
The structural parameters of the feature extractor are shown in Table 3 below:
TABLE 3 parameters of the network layer
Figure BDA0003379652850000101
In order to reduce the influence of randomness of experimental data on experimental results, ten random experiments were performed, and the experimental results are shown in fig. 12 and table 4 below:
TABLE 4 Comparative experimental results
Figure BDA0003379652850000111
As is apparent from the table, the fault diagnosis performance of the conventional deep learning method WDCNN is not ideal, mainly because the true distribution of data in a high-dimensional space cannot be sufficiently reflected in the case of a small amount of training data. However, with the increase of training samples, the accuracy of the WDCNN is greatly improved, and when 20 training samples are used in each class, the fault diagnosis recognition rate is higher than that of 5 samples in each class by about 24%. In the table, the three FSL methods, i.e., SiaNet, proset and RelaNet, have significantly improved diagnostic performance compared to the conventional deep learning method, because SiaNet, RelaNet and proset all acquire knowledge from small samples through similarity calculation and class expansion. Of these three FSL methods, ProNet's overall recognition is best, with the average increase rates of WDCNN being about 7%, 6% and 4% in the cases of 5-shot, 10-shot, 15-shot and 20-shot, respectively. This shows that ProNet can better improve the classification accuracy of the rolling bearing by prototype fitting of the data distribution center. In addition, it can be found that the improvement precision of the FSL method gradually decreases with the increase of training samples. In the improved method based on the prototype network, NM-ProNet uses a nonlinear measurement strategy to judge whether spliced samples belong to the same class. 
When the number of training samples per class increases from 5 to 20, the recognition accuracy of NM-ProNet rises from 81% to 91%, which is about 16%, 11%, 5% and 5% higher than the original ProNet. This shows that, in the prototype network, using a nonlinear metric strategy instead of a linear metric greatly improves fault diagnosis performance. The main reason is that a fixed similarity metric function has no learnable parameters with which to update the network model, so overfitting occurs easily, whereas the nonlinear metric uses a learnable similarity metric function and therefore classifies better. However, as the feature dimension of the spliced sample grows, the nonlinear metric module cannot capture long-distance correlations because of its limited receptive field, which hampers the extraction of fault features from complex vibration signals. Comparing NM-ProNet and CANM-ProNet in the table, the 5-shot, 10-shot, 15-shot and 20-shot results improve by about 3%, 4% and 2%, which shows that applying an attention module to the spliced sample to obtain long-distance correlations makes the features better suited to the nonlinear metric and improves fault diagnosis performance. The proposed CANM-ProNet therefore achieves the best test classification among the compared methods.
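The contrast between a fixed metric and a learnable one can be sketched as follows. This is an illustration under stated assumptions: the description implies a convolution-based nonlinear metric module, whereas the stand-in here is a tiny fully connected scorer, and the names `euclidean_score`, `nonlinear_score` and the weights `w1`, `b1`, `w2`, `b2` are hypothetical.

```python
import math

def euclidean_score(query, prototype):
    """Fixed metric: negative squared Euclidean distance (no trainable parameters)."""
    return -sum((q - p) ** 2 for q, p in zip(query, prototype))

def nonlinear_score(query, prototype, w1, b1, w2, b2):
    """Learnable metric: score the spliced (concatenated) pair with a small network.

    w1/b1 form a hidden layer with ReLU and w2/b2 an output unit with sigmoid;
    in training these weights would be updated, unlike the fixed metric above.
    """
    pair = query + prototype  # splice the query feature with the prototype
    hidden = [max(0.0, sum(w * x for w, x in zip(row, pair)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))  # similarity in (0, 1)
```

With all weights at zero the learnable score is 0.5 for every pair, which reflects that it carries no information until it has been trained.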
To compare the per-class classification of the methods in more detail, FIGS. 13 to 18 show the confusion matrices of the results of the WDCNN, SiaNet, RelaNet, ProNet, NM-ProNet and CANM-ProNet methods at 5-shot. There are 120 test samples per class and 5 fault types in total. Looking at the overall classification, the errors of all the methods are mainly concentrated on IF and BF; the original signal waveforms of these two classes, shown in FIGS. 7 to 11, are somewhat similar, which makes them hard to distinguish completely. As can be seen from FIG. 13, WDCNN recognizes poorly: more than half of the BF samples are misclassified, and most of the wrong labels are IF. This indicates that, with only 5 labeled samples per class for network training, WDCNN cannot learn the sample features well enough to classify them and is therefore not suitable for fault diagnosis with small samples. As can be seen from FIGS. 14 to 16, the classification results of the three FSL methods improve on WDCNN and differ little from one another, which indicates that all three are effective for fault diagnosis under FSL. As shown in FIG. 18, the method proposed herein clearly improves classification over the comparison methods, better separates IF and BF, and also improves on the method of FIG. 17 to a certain extent. This is because the proposed method judges the similarity of the spliced sample better through the nonlinear metric and, at the same time, uses the cascade attention to compute the long-distance correlation of the spliced sample and so obtain more discriminative features, giving higher recognition accuracy for every class. Compared with the other methods, the proposed CANM-ProNet therefore achieves the best small-sample fault diagnosis accuracy.
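For readers who want to reproduce this kind of per-class analysis, a confusion matrix and the per-class recognition rate can be computed as in the sketch below. The labels and predictions are toy values invented for illustration, not results from the experiments; only the class names IF and BF come from the text.

```python
def confusion_matrix(true_labels, pred_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, pred_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was labeled correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Toy example (hypothetical data): two BF samples are mistaken for IF.
true_labels = ["IF", "IF", "BF", "BF", "BF"]
pred_labels = ["IF", "BF", "IF", "IF", "BF"]
matrix = confusion_matrix(true_labels, pred_labels, ["IF", "BF"])
recalls = per_class_recall(matrix)
```

Off-diagonal entries in the BF row that fall in the IF column correspond to the BF-as-IF confusion discussed above.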
For application scenarios in which labeled fault data are scarce, the invention provides an improved FSL method for rolling bearing fault diagnosis, called the cascade attention and nonlinear metric improved prototype network (CANM-ProNet). First, the prototype calculation module extracts feature maps of the support set and the query set and calculates the prototypes from the feature maps of the support set. The query feature map is then concatenated with each prototype, and a cascade attention module is introduced to extract non-local information from the concatenated features. Finally, a nonlinear metric module measures the similarity between the samples and the prototypes more accurately to improve fault diagnosis performance. Extensive experiments show that this method is more effective than the other methods when few fault samples are available.
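The cascade attention step summarized above can be sketched in plain Python. This is a simplified illustration, not the patented implementation, and it rests on several assumptions: the feature F is a list of channels, each a list of values; the BN layers are omitted; the convolutions are reduced to a matrix product (`w1`, `w2`) and a pointwise scale (`w3`); and the "information fusion" of the two attention features is taken to be an element-wise sum. All function and parameter names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(w, v):
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def channel_attention(f, w1, w2):
    """Gate each channel: sigmoid(W2 relu(W1 GAP(F))) * F + F."""
    pooled = [sum(ch) / len(ch) for ch in f]            # global average pooling
    hidden = [max(0.0, h) for h in matvec(w1, pooled)]  # first block (ReLU)
    gate = [sigmoid(g) for g in matvec(w2, hidden)]     # second block (sigmoid)
    return [[g * x + x for x in ch] for g, ch in zip(gate, f)]

def spatial_attention(f, w3):
    """Gate each position: sigmoid(w3 * mean over channels) * F + F."""
    n_ch = len(f)
    pooled = [sum(ch[t] for ch in f) / n_ch for t in range(len(f[0]))]
    gate = [sigmoid(w3 * p) for p in pooled]            # pointwise convolution
    return [[g * x + x for g, x in zip(gate, ch)] for ch in f]

def cascade_attention(f, w1, w2, w3):
    """Fuse both attention features (assumed: sum) and add the input feature F."""
    f_c = channel_attention(f, w1, w2)
    f_s = spatial_attention(f, w3)
    return [[x + c + s for x, c, s in zip(ch, ch_c, ch_s)]
            for ch, ch_c, ch_s in zip(f, f_c, f_s)]
```

With all weights set to zero both gates equal 0.5, so each attention branch returns 1.5 F and the fused output is 4 F, which is a convenient sanity check for the wiring.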
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A rolling bearing fault diagnosis method based on a nonlinear measurement prototype network, characterized by comprising the steps of constructing a cascade attention prototype nonlinear measurement network, carrying out classification training on the constructed network, carrying out data processing on the data to be diagnosed, inputting the processed data into the trained cascade attention prototype nonlinear measurement network, and outputting a diagnosis result.
2. The rolling bearing fault diagnosis method based on the nonlinear measurement prototype network according to claim 1, wherein the cascade attention prototype nonlinear measurement network comprises a sample set division module, a prototype calculation module, a cascade attention mechanism learning module and a nonlinear measurement strategy classification training module, wherein:
dividing the sample set into a support set and a query set by using a sample set dividing module;
inputting the divided data sets into a prototype calculation module to obtain feature maps corresponding to the samples in the data sets, and calculating class prototypes from the feature maps of the support set;
splicing the feature map of each query set sample with the prototypes of all categories one by one, and extracting the long-distance correlation of the spliced samples by adopting a cascade attention mechanism learning module;
and inputting the long-distance correlation extracted by the cascade attention mechanism learning module into a nonlinear measurement strategy classification training module for classification training.
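The prototype calculation and splicing steps of claim 2 can be sketched as follows. This is a minimal illustration that assumes each feature map has already been flattened to a list of numbers and that splicing means concatenation; the function names `class_prototypes` and `splice` are hypothetical.

```python
def class_prototypes(support_features, support_labels):
    """Average the support-set features of each class to obtain its prototype."""
    sums, counts = {}, {}
    for feat, y in zip(support_features, support_labels):
        if y not in sums:
            sums[y] = [0.0] * len(feat)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], feat)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def splice(query_feature, prototypes):
    """Concatenate one query feature with the prototype of every class."""
    return {y: query_feature + proto for y, proto in prototypes.items()}

# Toy support set (hypothetical values): two samples of class "a", one of class "b".
prototypes = class_prototypes([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]], ["a", "a", "b"])
spliced = splice([5.0, 6.0], prototypes)
```

Each spliced entry would then pass through the cascade attention module and the nonlinear metric module to yield one similarity score per class.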
3. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 2, characterized in that inputting the divided data sets into the prototype calculation module to obtain the feature maps corresponding to the samples in the data sets comprises using a feature extractor
Figure FDA0003379652840000011
to embed each sample x_i in the sample set L into a feature space, represented as:
Figure FDA0003379652840000012
for a fault of class c, the prototype P_c is generated from the support set S as:
Figure FDA0003379652840000013
wherein y_i represents the label of the i-th sample in the support set S.
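The formulas of claim 3 appear only as image references in this text. As a hedged reconstruction, assuming the standard prototypical-network definitions, they can be written as below. The extractor symbol $f_\varphi$, the embedding symbol $z_i$ and the class subset $S_c$ are assumptions and are not taken from the original images.

```latex
z_i = f_\varphi(x_i), \qquad
P_c = \frac{1}{|S_c|} \sum_{(x_i, y_i) \in S_c} f_\varphi(x_i), \qquad
S_c = \{ (x_i, y_i) \in S : y_i = c \}
```

Under this reading, the prototype of class c is simply the mean of the embedded support samples whose label $y_i$ equals c.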
4. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 1, wherein the cascade attention mechanism learning module comprises a channel attention submodule and a space attention submodule, and the extracting of the long-distance correlation of the spliced sample comprises:
the cascade attention mechanism learning module performs convolution on the input spliced sample and extracts a feature F;
inputting the feature F into the channel attention submodule and the space attention submodule respectively, wherein the channel attention submodule adaptively adjusts the feature values among channels, establishes a channel dependency relationship and obtains a channel attention feature F_c';
the space attention submodule focuses on the position information of the target sample in the input feature map to obtain a space attention feature F_s';
performing information fusion on the channel attention feature F_c' and the space attention feature F_s', and adding the fused feature information to the input feature F to obtain the long-distance correlation of the spliced sample.
5. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 4, wherein the channel attention submodule comprises a global average pooling layer, a first convolution block and a second convolution block, each convolution block being composed of a convolution layer, a BN layer and an activation function; the feature F is input into the cascaded global average pooling layer, first convolution block and second convolution block to extract a channel information structure S, and the matrix product of the channel information structure S and the feature F is added to the feature F to give the output of the channel attention submodule.
6. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 5, wherein the channel attention feature F_c' is represented as:
Figure FDA0003379652840000021
wherein
Figure FDA0003379652840000022
is the channel attention feature map obtained by global average pooling of the feature F; W_1 and W_2 are the weights of the convolution layers in the first convolution block and the second convolution block, respectively; σ(·) is the sigmoid activation function; γ(·) is the ReLU activation function; and
Figure FDA0003379652840000023
and
Figure FDA0003379652840000024
denote a matrix multiplication operation and an addition operation, respectively.
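The formula of claim 6 appears only as an image reference here. Based on the definitions given in claims 5 and 6, one plausible reconstruction is shown below. The symbol $F_{avg}$ for the globally pooled feature and the operator symbols $\otimes$ and $\oplus$ are assumptions, since the original images are not reproduced.

```latex
F_c' = \Bigl( \sigma\bigl( W_2 \, \gamma( W_1 F_{avg} ) \bigr) \otimes F \Bigr) \oplus F
```

Here $\sigma(W_2\,\gamma(W_1 F_{avg}))$ plays the role of the channel information structure S of claim 5, which scales the feature F before the residual addition.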
7. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 4, wherein the space attention submodule comprises a third convolution block and a global average pooling layer, the third convolution block being composed of a convolution layer and a BN layer; the feature F is input into the cascaded third convolution block and global average pooling layer to extract a spatial information structure S', the spatial information structure S' is multiplied by the input feature F, and the result is added to the feature F to obtain the space attention feature F_s'.
8. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 7, wherein the space attention feature F_s' is represented as:
Figure FDA0003379652840000031
wherein
Figure FDA0003379652840000032
is the average pooling of the feature F along its channel dimension; W_3 represents the weight of the convolution layer in the convolution block; σ(·) is the sigmoid activation function; and
Figure FDA0003379652840000033
and
Figure FDA0003379652840000034
denote a matrix multiplication operation and an addition operation, respectively.
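The formula of claim 8 likewise appears only as an image reference. A plausible reconstruction from the definitions in claims 7 and 8 is given below. The symbol $F_{avg}^{s}$ for the channel-wise average of F is an assumption, and so is the order of operations: claim 7 lists the convolution block before the pooling layer, while the wording of claim 8 suggests the convolution weight is applied to the pooled feature, which is the reading shown here.

```latex
F_s' = \Bigl( \sigma\bigl( W_3 F_{avg}^{s} \bigr) \otimes F \Bigr) \oplus F
```

Under this reading, $\sigma(W_3 F_{avg}^{s})$ corresponds to the spatial information structure S' of claim 7.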
9. The rolling bearing fault diagnosis method based on the nonlinear metric prototype network according to claim 7, wherein the convolution block adopted when the cascade attention mechanism learning module performs convolution on the input spliced sample comprises a convolution layer, a pooling layer, a BN layer and an activation function.
CN202111429337.0A 2021-11-29 2021-11-29 Rolling bearing fault diagnosis method based on nonlinear measurement prototype network Pending CN114239384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111429337.0A CN114239384A (en) 2021-11-29 2021-11-29 Rolling bearing fault diagnosis method based on nonlinear measurement prototype network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111429337.0A CN114239384A (en) 2021-11-29 2021-11-29 Rolling bearing fault diagnosis method based on nonlinear measurement prototype network

Publications (1)

Publication Number Publication Date
CN114239384A true CN114239384A (en) 2022-03-25

Family

ID=80751625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111429337.0A Pending CN114239384A (en) 2021-11-29 2021-11-29 Rolling bearing fault diagnosis method based on nonlinear measurement prototype network

Country Status (1)

Country Link
CN (1) CN114239384A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115452957A (en) * 2022-09-01 2022-12-09 北京航空航天大学 Small sample metal damage identification method based on attention prototype network
CN115452957B (en) * 2022-09-01 2024-04-12 北京航空航天大学 Small sample metal damage identification method based on attention prototype network
CN116051911A (en) * 2023-03-29 2023-05-02 北京大学 Small sample bearing vibration image data fault diagnosis method based on uncertainty learning

Similar Documents

Publication Publication Date Title
CN112069921A (en) Small sample visual target identification method based on self-supervision knowledge migration
CN111273623B (en) Fault diagnosis method based on Stacked LSTM
CN115018021B (en) Machine room abnormity detection method and device based on graph structure and abnormity attention mechanism
CN114239384A (en) Rolling bearing fault diagnosis method based on nonlinear measurement prototype network
CN111368920A (en) Quantum twin neural network-based binary classification method and face recognition method thereof
CN105809672A (en) Super pixels and structure constraint based image's multiple targets synchronous segmentation method
CN105046714A (en) Unsupervised image segmentation method based on super pixels and target discovering mechanism
CN115761900B (en) Internet of things cloud platform for practical training base management
CN112766283B (en) Two-phase flow pattern identification method based on multi-scale convolution network
CN111783336B (en) Uncertain structure frequency response dynamic model correction method based on deep learning theory
CN112784921A (en) Task attention guided small sample image complementary learning classification algorithm
Nero et al. Concept recognition in production yield data analytics
Du et al. Convolutional neural network-based data anomaly detection considering class imbalance with limited data
CN115980560A (en) CNN-GRU-based high-voltage circuit breaker mechanical fault diagnosis system, method and equipment
CN111695611A (en) Bee colony optimization kernel extreme learning and sparse representation mechanical fault identification method
Chang et al. Blind image quality assessment by visual neuron matrix
CN114926702B (en) Small sample image classification method based on depth attention measurement
CN116188445A (en) Product surface defect detection and positioning method and device and terminal equipment
CN113392191B (en) Text matching method and device based on multi-dimensional semantic joint learning
CN114627496A (en) Robust pedestrian re-identification method based on depolarization batch normalization of Gaussian process
CN114553790A (en) Multi-mode feature-based small sample learning Internet of things traffic classification method and system
CN114841266A (en) Voltage sag identification method based on triple prototype network under small sample
CN111652246B (en) Image self-adaptive sparsization representation method and device based on deep learning
CN114187272A (en) Industrial part surface defect detection method based on deep learning
CN109782156B (en) Analog circuit fault diagnosis method based on artificial immune diagnosis network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination