CN117113078A - Small sample bearing fault mode identification method and system based on multi-source data integration - Google Patents


Info

Publication number
CN117113078A
Authority
CN
China
Prior art keywords
sample
domain
source
source data
target domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310995855.1A
Other languages
Chinese (zh)
Inventor
刘环宇
绳远远
王传朋
张化良
李君宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202310995855.1A priority Critical patent/CN117113078A/en
Publication of CN117113078A publication Critical patent/CN117113078A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00 Testing of machine parts
    • G01M13/04 Bearings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

A small sample bearing fault mode identification method and system based on multi-source data integration, relating to the field of intelligent operation and maintenance and health management of mechanical equipment. The application addresses the problem that, in existing small sample bearing fault mode identification, the scarcity of samples makes identification difficult and inaccurate. The method comprises the following steps: constructing a multi-source data sample according to the public data set; extracting multi-source data sample features and constructing a source domain training set and a target domain training set; combining the target domain training set and the source domain training set to obtain a combined source domain training set; training a base classifier according to the target domain training set and the combined multi-source domain sample set; calculating the inter-domain distribution metric and the sample similarity; constructing a weight matrix; classifying and identifying the target domain test set according to the base classifier to obtain class probabilities; and carrying out weighted integration on the class probabilities according to the weight matrix to obtain a classification result of the target domain test set, completing fault mode identification. The method and the device are applied to the field of fault identification.

Description

Small sample bearing fault mode identification method and system based on multi-source data integration
Technical Field
The application relates to the field of intelligent operation and maintenance and health management of mechanical equipment, in particular to a small sample bearing fault mode identification method based on multi-source data integration.
Background
In fault diagnosis and pattern recognition for the rotating parts of mechanical equipment, the historical fault data available for the part being monitored is usually limited by the complex structure of the part and by the constraints of the operating environment. Bearing fault mode identification typically requires sufficient training data to build an accurate model. In the small sample setting, however, only a limited number of fault samples are available for training, and the number of samples per fault mode may be unbalanced. Training is therefore difficult, because the model may be unable to adequately learn and represent the different fault types, and it may recognize minority-class fault modes poorly because the samples of rare fault types are too few to learn from. Bearing fault monitoring typically relies on vibration signals collected by sensors, which may be affected by environmental noise and interference. With few data points, it can be difficult to distinguish fault signals from noise, which degrades fault mode recognition performance. Effective feature selection and extraction also becomes harder: conventional methods may fail to capture the useful information hidden in a small sample, and the extracted features may not fully reflect the differences between fault modes, which degrades model performance. Finally, the small sample problem limits the generalization ability of the model to other bearing systems or practical applications. Because practical applications involve many bearing types and operating conditions, a model may not maintain good performance when transferred from one small sample data set to another.
Disclosure of Invention
To address the problem that, in existing small sample bearing fault mode identification, the scarcity of samples makes identification difficult and inaccurate, the application provides a small sample bearing fault mode identification method based on multi-source data integration, the method comprising:
S1: constructing a multi-source data sample according to the public data set;
S2: extracting multi-source data sample features, and constructing n source domain training sets and 1 target domain training set according to the features;
S3: combining the 1 target domain training set with the n source domain training sets to obtain combined source domain training sets;
S4: training a base classifier according to the target domain training set and the combined multi-source domain sample set;
S5: calculating the inter-domain distribution metric and the sample similarity according to the target domain test set and the combined multi-source domain sample set;
S6: constructing a weight matrix according to the inter-domain distribution metric and the sample similarity;
S7: classifying and identifying the target domain test set to be identified according to the base classifier, and obtaining the class probabilities of the test set samples;
S8: carrying out weighted integration on the class probabilities of the test set samples according to the weight matrix to obtain the classification result of the target domain test set, completing fault mode identification.
Further, there is also provided a preferred manner, wherein the extracting the multi-source data sample feature in step S2 includes:
extracting time domain features, frequency domain features and time-frequency domain features in the multi-source data samples;
the frequency domain features include average frequency, root mean square frequency, frequency center, and root variance frequency.
Further, a preferred manner is provided in which the time-frequency domain characteristics are obtained by an empirical mode decomposition algorithm, and the original vibration signal is decomposed into a set of IMF components by the empirical mode decomposition algorithm.
Further, a preferred manner is also provided, wherein step S3 includes:
$D_{Si} = D_{Si}^{0} \cup D_{Tl}, \quad i = 1, 2, \ldots, n$
wherein $D_{Si}^{0}$ represents the initial source domain training set, $D_{Tl}$ represents the labeled target domain training set, $D_{Si}$ represents the recombined source domain training set, and $n$ is the number of source domain training sets.
Further, a preferred manner is also provided, wherein in the step S4, the base classifier is trained by using an extreme learning machine.
Further, a preferred manner is also provided, wherein the inter-domain distribution metric and the sample similarity in step S5 are specifically:
the inter-domain distribution metric $w_j$ is computed from the maximum mean discrepancy, wherein $X$ and $Y$ represent the different sample sets and MMD denotes the maximum mean discrepancy;
the sample similarity $v_j$ is computed from the cosine similarity, wherein $n_s$ is the number of samples in the source domain, $x_t \in D_T$ is one target domain sample to be classified, and Cosine(·) is the cosine similarity.
Further, a preferred manner is also provided, wherein step S7 specifically defines the classification result, in which k represents a class label and the per-classifier terms represent the classification results of the different base classifiers.
Based on the same inventive concept, the application also provides a small sample bearing fault mode identification system based on multi-source data integration, which comprises:
sample construction unit: configured to construct a multi-source data sample according to the public data set;
feature extraction unit: configured to extract multi-source data sample features and to construct n source domain training sets and 1 target domain training set according to the features;
sample reorganization unit: configured to combine the 1 target domain training set with the n source domain training sets to obtain combined source domain training sets;
training unit: configured to train a base classifier according to the target domain training set and the combined multi-source domain sample set;
calculation unit: configured to calculate the inter-domain distribution metric and the sample similarity according to the target domain test set and the combined multi-source domain sample set;
weight matrix construction unit: configured to construct a weight matrix according to the inter-domain distribution metric and the sample similarity;
multi-source integration unit: configured to classify and identify the target domain test set to be identified according to the base classifier and to obtain the class probabilities of the test set samples;
identification unit: configured to carry out weighted integration on the class probabilities of the test set samples according to the weight matrix, obtain the classification result of the target domain test set, and complete fault mode identification.
Based on the same inventive concept, the application also provides a computer readable storage medium for storing a computer program for executing a small sample bearing fault pattern recognition method based on multi-source data integration as described in any one of the above.
Based on the same inventive concept, the application also provides a computer device, which comprises a memory and a processor, wherein the memory stores a computer program, and when the processor runs the computer program stored in the memory, the processor executes the small sample bearing fault mode identification method based on multi-source data integration.
The application has the advantages that:
The application solves the problem that, in existing small sample bearing fault mode identification, the scarcity of samples makes identification difficult and inaccurate.
The small sample bearing fault mode identification method based on multi-source data integration provided by the application depends less on the historical fault data of the part under test. When very little historical fault data is available, it realizes bearing fault mode identification by combining related data and simulation data from parts of the same field or the same type, and therefore has high practical application value.
In the small sample bearing fault mode identification method based on multi-source data integration provided by the application, the combination weights of the base classifiers trained on the multi-source domain data sets take into account both the inter-domain distribution metric between the data sets and the similarity of individual samples across the different data sets. By introducing the sample similarity matrix, the identification results of the multiple classification models are effectively integrated, which improves the accuracy of target domain sample classification.
According to the small sample bearing fault mode identification method based on multi-source data integration, disclosed by the application, the historical fault data of the bearing to be monitored can be expanded by constructing a multi-source data sample by using the public data set. By introducing multi-source data, including related data of other fields or types and simulation data, more samples can be provided for model training, and the number of samples of the target domain training set is increased, so that accuracy and generalization capability of the model can be improved. The present embodiment uses multi-source data samples to extract features and constructs n source domain training sets and 1 target domain training set. By combining a plurality of source domain training sets with a target domain training set, the combined source domain training set is obtained, and heterogeneity of data sources can be better considered. This helps to improve the adaptability and recognition accuracy of the model to the target domain data.
In the small sample bearing fault mode identification method based on multi-source data integration provided by the application, features are extracted from the multi-source data samples, and a source domain training set and a target domain training set are constructed. Through reasonable feature engineering, the feature selection and extraction process can select discriminative and representative features. This reduces the interference of noise and redundant information with model training and improves the accuracy and robustness of the model.
According to the small sample bearing fault mode identification method based on multi-source data integration, the target domain test set and the combined multi-source domain sample set are combined, and the same mark and label system are used for model training. Therefore, the problem of inconsistent labels between different data sources and data types can be solved, and the consistency of training and evaluation of the model is ensured.
The method and the device are applied to the field of fault identification.
Drawings
FIG. 1 is a flow chart of a small sample bearing failure mode identification method based on multi-source data integration according to an embodiment;
FIG. 2 is a graph showing the average classification accuracy of multiple algorithms for different numbers of training sets of the target domain according to the eleventh embodiment;
FIG. 3 shows the effect of the number of neurons on the average classification accuracy of the proposed algorithm according to the eleventh embodiment.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments.
In a first embodiment, the present embodiment provides a small sample bearing fault mode identification method based on multi-source data integration, the method including:
S1: constructing a multi-source data sample according to the public data set;
S2: extracting multi-source data sample features, and constructing n source domain training sets and 1 target domain training set according to the features;
S3: combining the target domain training set with the n source domain training sets to obtain combined source domain training sets;
S4: training a base classifier according to the target domain training set and the combined multi-source domain sample set;
S5: calculating the inter-domain distribution metric and the sample similarity according to the target domain test set and the combined multi-source domain sample set;
S6: constructing a weight matrix according to the inter-domain distribution metric and the sample similarity;
S7: classifying and identifying the target domain test set to be identified according to the base classifier, and obtaining the class probabilities of the test set samples;
S8: carrying out weighted integration on the class probabilities of the test set samples according to the weight matrix to obtain the classification result of the target domain test set, completing fault mode identification.
The method of the embodiment can expand the historical fault data of the bearing to be monitored by constructing a multi-source data sample by using the public data set. By introducing multi-source data, including related data of other fields or types and simulation data, more samples can be provided for model training, and the number of samples of the target domain training set is increased, so that accuracy and generalization capability of the model can be improved. The present embodiment uses multi-source data samples to extract features and constructs n source domain training sets and 1 target domain training set. By combining a plurality of source domain training sets with a target domain training set, the combined source domain training set is obtained, and heterogeneity of data sources can be better considered. This helps to improve the adaptability and recognition accuracy of the model to the target domain data.
In the embodiment, the multi-source data sample is subjected to feature extraction, and a source domain training set and a target domain training set are constructed. The selection and extraction process of the features can select the features with distinction and representativeness through reasonable feature engineering methods. Thus, the noise and the interference of redundant information on model training can be reduced, and the accuracy and the robustness of the model are improved.
The present embodiment uses the same label and tag system for model training by combining the target domain test set with the combined multi-source domain sample set. Therefore, the problem of inconsistent labels between different data sources and data types can be solved, and the consistency of training and evaluation of the model is ensured.
In this embodiment, using public data sets makes more data samples available and expands the historical fault data. This increases the number of samples the model is trained on and improves its accuracy and generalization ability. Extracting features from the multi-source data samples captures the key information that characterizes the bearing fault mode. Building the features into separate source domain and target domain training sets accounts for the characteristics of the different data sources and the differences between fault modes. Combining the target domain training set with the multiple source domain training sets makes better use of the multi-source information, improves the learning capacity of the model, smooths the data distribution, and improves the adaptability of the model to target domain data. Training the base classifiers on the combined multi-source domain sample sets reduces the dependence on target domain samples and improves generalization to unseen target domain samples. Calculating the inter-domain distribution metric and the sample similarity between the target domain test set and the multi-source domain sample sets gives the degree of association between different samples. Based on these metrics, the weight matrix used in the weighted integration can be adjusted according to the importance of each sample, which improves the stability and accuracy of the model. Classifying the target domain test set with the base classifiers yields the probability of each class, which provides a confidence estimate for each classification and a more complete picture of how well the model discriminates target domain data. Finally, weighting and integrating the class probabilities of the test set samples with the weight matrix takes into account both the importance of the different source domains and the importance of each sample, and yields the final classification result of the target domain test set. This improves the robustness and recognition accuracy of the model, enabling effective identification of the bearing fault mode.
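The weighted integration of step S8 can be illustrated with a minimal sketch. The text does not reproduce how the weight matrix is built from the inter-domain metric and the sample similarity, so the sketch simply assumes a non-negative weight for every (base classifier, test sample) pair is already available and normalizes it per sample; the function name `weighted_integration` and the array layout are hypothetical.

```python
import numpy as np

def weighted_integration(probs, weights):
    """Fuse class probabilities from several base classifiers.

    probs:   array of shape (n_sources, n_samples, n_classes)
    weights: array of shape (n_sources, n_samples), assumed non-negative
    Returns the predicted class index per sample and the fused probabilities.
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Assumption: weights are normalized so that, for each test sample,
    # the weights of all base classifiers sum to 1.
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Weighted sum of class probabilities over the base classifiers.
    fused = np.einsum("js,jsk->sk", weights, probs)
    return fused.argmax(axis=1), fused
```

With two base classifiers and two test samples, `weighted_integration(probs, weights)` returns one label per test sample; a classifier with a larger weight for a given sample contributes more to that sample's fused probabilities.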
In a second embodiment, which further limits the small sample bearing fault mode identification method based on multi-source data integration of the first embodiment, extracting the multi-source data sample features in step S2 includes:
extracting time domain features, frequency domain features and time-frequency domain features in the multi-source data samples;
the frequency domain features include average frequency, root mean square frequency, frequency center, and root variance frequency.
In practical applications, the time domain features are shown in table 1:
TABLE 1 time domain characterization
Wherein $x_i$, $i = 1, 2, \ldots, n$, represents the time series, $n$ is the number of data points, $x_{\max} = \max|x_i|$, and $\overline{|x|}$ represents the absolute value average.
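Table 1 itself is not reproduced here, so the twelve time domain features cannot be listed exactly. As a minimal sketch, the following computes a few statistics that are commonly used for vibration signals and that rely on the quantities defined above (the peak value and the absolute value average); the choice of features and the name `time_domain_features` are assumptions, not the content of Table 1.

```python
import numpy as np

def time_domain_features(x):
    """A few common time domain statistics of a vibration signal x."""
    x = np.asarray(x, dtype=float)
    abs_mean = np.mean(np.abs(x))       # absolute value average
    x_max = np.max(np.abs(x))           # peak value, max|x_i|
    rms = np.sqrt(np.mean(x ** 2))      # root mean square
    return {
        "abs_mean": abs_mean,
        "peak": x_max,
        "rms": rms,
        "crest_factor": x_max / rms,        # peak / RMS
        "impulse_factor": x_max / abs_mean  # peak / absolute value average
    }
```

The ratios are undefined for an all-zero signal, so a real pipeline would need to guard against that case.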
The frequency domain features are shown in Table 2:
TABLE 2 frequency domain characterization
Wherein $p_i$ represents the spectrum of $x_i$, $i = 1, 2, \ldots, N$, and $N$ is the number of spectral lines. $f_i$ represents the i-th spectral line amplitude. $F_{mf}$ represents the vibration energy in the frequency domain, $F_{rmsf}$ and $F_{fc}$ describe the position of the dominant frequency, and $F_{rvf}$ characterizes the degree of concentration or dispersion of the spectral power.
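Because Table 2 is not reproduced, the sketch below uses the definitions of these four features that are common in the bearing diagnosis literature, which is an assumption about what the table contains. It also assumes that $p_i$ is the amplitude of the i-th spectral line and $f_i$ its frequency value, and that the sampling rate `fs` is known; the function name is hypothetical.

```python
import numpy as np

def frequency_domain_features(x, fs):
    """Average frequency, frequency center, RMS frequency, root variance frequency."""
    x = np.asarray(x, dtype=float)
    p = np.abs(np.fft.rfft(x))               # spectrum amplitudes p_i
    f = np.fft.rfftfreq(x.size, d=1.0 / fs)  # frequency of each spectral line f_i
    total = p.sum()
    f_mf = p.mean()                               # F_mf: mean spectral amplitude
    f_fc = (f * p).sum() / total                  # F_fc: frequency center
    f_rmsf = np.sqrt((f ** 2 * p).sum() / total)  # F_rmsf: root mean square frequency
    f_rvf = np.sqrt(((f - f_fc) ** 2 * p).sum() / total)  # F_rvf: root variance frequency
    return f_mf, f_fc, f_rmsf, f_rvf
```

Under these definitions the three location and spread features satisfy $F_{rmsf}^2 = F_{fc}^2 + F_{rvf}^2$, which is a convenient sanity check on an implementation.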
In a third embodiment, which further limits the small sample bearing fault mode identification method based on multi-source data integration of the second embodiment, the time-frequency domain features are obtained by an empirical mode decomposition (EMD) algorithm, which decomposes the original vibration signal into a set of intrinsic mode function (IMF) components.
Specifically, the energies of the first six IMF components are taken as the time-frequency domain features. In the amplitude energy calculation, $E_j$ is the amplitude energy of the j-th IMF, $N$ is the length of the j-th IMF component data, and $H[\cdot]$ denotes the Hilbert transform.
In the present embodiment, by decomposing the original vibration signal into IMF components, vibration characteristics in different frequency ranges can be extracted. Each IMF component may be considered as a vibration mode having a different frequency and amplitude, helping to more fully describe the time-frequency domain characteristics of the vibration signal. The EMD may separate noise and interference in the signal into high frequency IMF components while preserving fault signature information in low frequency IMF components. This helps to improve the accuracy and reliability of failure mode identification. The decomposed IMF components have good interpretability, and each IMF component may be associated with a particular vibration mode or failure mode. This makes the feature analysis of the vibration signal more intuitive and understandable, facilitating the diagnosis of the fault and interpretation of the fault pattern.
According to the method, an original vibration signal is decomposed into a group of IMF components by using an empirical mode decomposition algorithm, so that abundant time-frequency domain characteristic information can be extracted, the accuracy of fault mode identification is enhanced, and independent analysis capability on vibration characteristics of different frequency bands is provided.
When the fault is weak, the statistical features of the time domain and frequency domain signals used in the second embodiment are not enough to fully represent the health state of the component. Therefore, this embodiment uses EMD to extract more feature information of the bearing. EMD characterizes the vibration signal in terms of its time-frequency amplitude distribution, decomposing the original vibration signal into a set of IMFs:
$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$
wherein $c_i(t)$ represents the i-th IMF component of $x(t)$, the components representing signals of different frequency bands from high frequency to low frequency, and $r_n(t)$ represents a residual signal with a monotonic trend. The amplitude energy of each IMF is then calculated from its Hilbert transform, using the symbols $E_j$, $N$ and $H[\cdot]$ defined above.
Typically, the first six IMFs contain almost all of the information in the vibration signal, and the IMF components carry different information under different fault modes. Therefore, in this embodiment, the amplitude energies of the first six IMFs are extracted as the time-frequency domain features of the vibration signal. The second and third embodiments together construct a feature set consisting of twelve time-domain features, four frequency-domain features, and six time-frequency-domain features.
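The energy formula itself is not reproduced in the text, so the following is only a sketch of one plausible reading: the energy of an IMF is taken as the sum of the squared envelope obtained from its Hilbert transform. The IMFs are assumed to have been computed beforehand by some EMD implementation, which is not shown; `analytic_signal` and `imf_energies` are hypothetical helper names.

```python
import numpy as np

def analytic_signal(c):
    """Analytic signal c + j*H[c], computed with the FFT."""
    c = np.asarray(c, dtype=float)
    n = c.size
    spectrum = np.fft.fft(c)
    h = np.zeros(n)
    h[0] = 1.0                      # keep the DC term
    if n % 2 == 0:
        h[n // 2] = 1.0             # keep the Nyquist term
        h[1:n // 2] = 2.0           # double the positive frequencies
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * h)

def imf_energies(imfs, n_keep=6):
    """Amplitude energy of the first n_keep IMFs (assumed: sum of squared envelope)."""
    energies = []
    for c in list(imfs)[:n_keep]:
        envelope = np.abs(analytic_signal(c))   # instantaneous amplitude
        energies.append(float(np.sum(envelope ** 2)))
    return energies
```

For a unit-amplitude cosine spanning a whole number of periods, the envelope is 1 everywhere, so the energy equals the number of samples.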
In a fourth embodiment, which further limits the small sample bearing fault mode identification method based on multi-source data integration of the first embodiment, step S3 includes:
$D_{Si} = D_{Si}^{0} \cup D_{Tl}, \quad i = 1, 2, \ldots, n$
wherein $D_{Si}^{0}$ represents the initial source domain training set, $D_{Tl}$ represents the labeled target domain training set, $D_{Si}$ represents the recombined source domain training set, and $n$ is the number of source domain training sets.
In the fault mode identification process, a base classifier needs to be trained firstly, but under actual working conditions, the sample data containing labels in the target domain are usually less due to the problems of high data acquisition difficulty, high marking cost of the target domain data and the like. Only a small amount of labeled data under the same working condition is adopted to train the model, the accuracy is low, and the classification result is not ideal. The training sample size of the model can be effectively improved by utilizing the data sets in the same field, and the model can effectively utilize the knowledge of the multi-source field samples. Combining tagged target domain with multiple source domain data can improve the classification accuracy of the training model.
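The recombination of step S3 can be sketched as follows, assuming each training set is stored as a feature matrix with a label vector and that all domains share the same feature layout; the name `combine_domains` is hypothetical.

```python
import numpy as np

def combine_domains(source_sets, target_labeled):
    """Append the labeled target samples to every source domain training set.

    source_sets:    list of (X_s, y_s) pairs, one per source domain
    target_labeled: (X_t, y_t) pair holding the few labeled target samples
    Returns one recombined (X, y) pair per source domain.
    """
    X_t, y_t = target_labeled
    combined = []
    for X_s, y_s in source_sets:
        combined.append((np.vstack([X_s, X_t]), np.concatenate([y_s, y_t])))
    return combined
```

Each recombined set then trains its own base classifier, so n source domains yield n classifiers that have all seen the labeled target samples.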
Embodiment five: this embodiment further defines the small sample bearing fault mode identification method based on multi-source data integration according to embodiment one, wherein in step S4 an extreme learning machine (ELM) is used to train the base classifier.
Real-time performance is an important consideration in fault mode identification. ELM training is very efficient: it requires no iterative optimization. The input weights and biases of the hidden layer are initialized randomly, and the output weights are then solved for directly. This makes ELM training very fast and suitable for large-scale fault mode identification problems and real-time application scenarios. Fault mode identification often involves large amounts of sensor data and complex signal information, which makes data processing demanding. By mapping the data to a high-dimensional feature space and applying a nonlinear transformation through the neurons of the random hidden layer, ELM can cope with high-dimensional, complex data and improve fault mode recognition performance. Fault mode identification is also often disturbed by data noise, incomplete information, and sample imbalance. ELM is robust during training and relatively insensitive to noise and incomplete information. In addition, ELM realizes a nonlinear classification decision boundary in the high-dimensional feature space, generalizes well, can generalize effectively to unseen fault modes, and improves the accuracy and robustness of fault mode identification. Because its output weights are computed in a single step rather than through repeated weight adjustment, ELM also has good online learning capability. Actual fault mode identification often has to cope with changing data streams and real-time updates; ELM can adapt quickly to new data samples, enabling dynamic model updates and online learning.
Training the base classifier with an extreme learning machine offers fast training, scalability, the ability to process high-dimensional and complex data, robustness and generalization in fault mode identification, and the ability to meet online learning requirements. These advantages make ELM an efficient way to build fault mode recognition systems and to obtain accurate, fast, and robust fault diagnosis and monitoring capabilities.
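The training procedure described above (random hidden layer, output weights solved in closed form) can be sketched with NumPy. This is a minimal sketch under stated assumptions: the class name `ELMClassifier`, the sigmoid activation, the ridge term `reg`, and the softmax used to turn output scores into class probabilities are illustrative choices and are not taken from the application.

```python
import numpy as np

class ELMClassifier:
    """Minimal extreme learning machine sketch (hypothetical helper class)."""

    def __init__(self, n_hidden=200, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg  # small ridge term for numerical stability (assumption)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        # Input weights and biases are drawn once at random and never updated.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        # Output weights from a single regularized least-squares solve.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict_proba(self, X):
        scores = self._hidden(X) @ self.beta
        scores = scores - scores.max(axis=1, keepdims=True)
        e = np.exp(scores)
        return e / e.sum(axis=1, keepdims=True)  # softmax over scores (assumption)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)

# Toy data: two well-separated classes in four dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(40, 4)), rng.normal(2.0, 1.0, size=(40, 4))])
y = np.array([0] * 40 + [1] * 40)
model = ELMClassifier(n_hidden=50).fit(X, y)
proba = model.predict_proba(X)
acc = float((model.predict(X) == y).mean())
```

Because `fit` involves only one linear solve, retraining on a recombined source domain set is cheap, which is the property the text relies on.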
Embodiment six: this embodiment further defines the small sample bearing fault mode identification method based on multi-source data integration according to embodiment one, wherein the inter-domain distribution metric and the sample similarity in step S5 are specifically:
the inter-domain distribution metric is:
where w_j is the inter-domain distribution metric, X and Y denote different sample sets, and MMD denotes the maximum mean discrepancy;
the sample similarity is:
where v_j is the sample similarity, n_s is the number of samples in the source domain, x_t ∈ D_T is a target domain sample to be classified, and cosine(·) is the cosine similarity.
Conventional inter-domain distribution metrics typically consider only the difference between overall sample distributions and ignore the differences between individual samples. A sample similarity weight based on cosine similarity takes into account the similarity between each sample and the other samples. By considering differences at the level of individual samples, this embodiment evaluates the similarity between source domain data and target domain data more accurately. The cosine-similarity-based sample similarity weight matrix provides an individual weight distribution for each target domain sample to be classified. Computing the cosine similarity between a target domain sample and the source domain samples shows which source domain samples are most similar to it, so that different samples can be given different weights. This individual weighting makes better use of the source domain samples that resemble the target domain data and improves the accuracy of fault mode identification. With small samples, the quantity and diversity of data are limited, and conventional classification methods may suffer from over-fitting or under-fitting. By introducing the similarity relation between target domain and source domain samples, the cosine-similarity-based weight lets each target domain sample make better use of source domain information and captures differences at the individual sample level, which effectively compensates for the shortage of target domain data.
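The two quantities discussed above can be illustrated with a short sketch. The exact formulas of the application are not reproduced here, so the following rests on assumptions: MMD is computed with a linear kernel (the squared distance between feature means), and the sample similarity is taken as the average cosine similarity between one target sample and all n_s samples of a source domain. The function names are hypothetical.

```python
import numpy as np

def linear_mmd(X, Y):
    """Squared MMD with a linear kernel: squared distance between feature means."""
    diff = X.mean(axis=0) - Y.mean(axis=0)
    return float(diff @ diff)

def mean_cosine_similarity(x_t, X_s):
    """Average cosine similarity between one target sample and all source samples."""
    num = X_s @ x_t
    den = np.linalg.norm(X_s, axis=1) * np.linalg.norm(x_t) + 1e-12  # avoid 0/0
    return float(np.mean(num / den))

# Tiny worked example with two orthogonal "domains".
X = np.array([[1.0, 0.0], [1.0, 0.0]])
Y = np.array([[0.0, 1.0], [0.0, 1.0]])
x_t = np.array([1.0, 0.0])

mmd_xy = linear_mmd(X, Y)                 # means differ by (1, -1)
sim_x = mean_cosine_similarity(x_t, X)    # x_t is parallel to every row of X
sim_y = mean_cosine_similarity(x_t, Y)    # x_t is orthogonal to every row of Y
```

A domain-level score such as MMD is the same for every test sample, whereas the cosine score changes with each target sample, which is why the text describes the latter as an individual weight.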
Embodiment seven: this embodiment further defines the small sample bearing fault mode identification method based on multi-source data integration according to embodiment one, wherein step S7 specifically includes:
where the output is the classification result, k denotes a class label, and f_i^{Sj}(x_t) denotes the classification results of the different base classifiers.
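A weighted integration of base classifier outputs can be sketched as below. This is only an illustration under assumptions: the weight matrix is taken to hold one weight per base classifier and per test sample, and the final label is the class k with the largest weighted sum of class probabilities. How the weights are built from the inter-domain metric and the sample similarity is not shown, and the name `weighted_vote` is hypothetical.

```python
import numpy as np

def weighted_vote(probas, weights):
    """Combine class probabilities from several base classifiers.

    probas:  shape (n_classifiers, n_samples, n_classes)
    weights: shape (n_classifiers, n_samples), one weight per classifier and sample
    Returns the index k of the class with the largest weighted score per sample.
    """
    probas = np.asarray(probas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    scores = (weights[:, :, None] * probas).sum(axis=0)
    return scores.argmax(axis=1)

# Two base classifiers, two test samples, two classes (made-up numbers).
probas = [
    [[0.9, 0.1], [0.4, 0.6]],   # classifier trained on source domain 1
    [[0.2, 0.8], [0.3, 0.7]],   # classifier trained on source domain 2
]
weights = [
    [0.9, 0.5],
    [0.1, 0.5],
]
labels = weighted_vote(probas, weights)
```

For the first sample the first classifier dominates (weight 0.9), so its preferred class wins even though the second classifier disagrees.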
Embodiment eight: this embodiment provides a small sample bearing fault mode identification system based on multi-source data integration, the system including:
sample construction unit: configured to construct multi-source data samples from public data sets;
feature extraction unit: configured to extract features of the multi-source data samples and to construct n source domain training sets and 1 target domain training set according to the features;
sample reorganization unit: configured to combine the target domain training set with the n source domain training sets to obtain combined source domain training sets;
training unit: configured to train base classifiers according to the target domain training set and the combined multi-source domain sample set;
calculation unit: configured to calculate the inter-domain distribution metric and the sample similarity according to the target domain test set and the combined multi-source domain sample set;
weight matrix construction unit: configured to construct a weight matrix according to the inter-domain distribution metric and the sample similarity;
multi-source integration unit: configured to classify the target domain test set to be identified with the base classifiers and to obtain the class probabilities of the test set samples;
identification unit: configured to perform weighted integration of the class probabilities of the test set samples according to the weight matrix, obtain the classification result of the target domain test set, and complete the fault mode identification.
Embodiment nine: a computer readable storage medium for storing a computer program, wherein the computer program executes the small sample bearing fault mode identification method based on multi-source data integration according to any one of embodiments one to seven.
Embodiment ten: a computer device including a memory and a processor, the memory storing a computer program, wherein when the processor runs the computer program stored in the memory, the processor executes the small sample bearing fault mode identification method based on multi-source data integration according to any one of embodiments one to seven.
Embodiment eleven: this embodiment is described with reference to fig. 2 and fig. 3. It provides a specific example of the small sample bearing fault mode identification method based on multi-source data integration of embodiment one and also serves to explain embodiments two to seven. Specifically:
In this embodiment, four public data sets from different institutions are selected for cross-domain verification in order to compare the overall performance of the method of the present application. The selected data sets are the JNU, IMS, NUAA, and CWRU bearing data sets. From each data set, 200 groups of samples are selected for each of four fault types (normal state, outer ring fault, inner ring fault, and rolling element fault), giving 800 groups in total. Any one data set is selected as the target domain data set, and the other three serve as source domain data sets.
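The cross-domain protocol above, in which each data set takes a turn as the target domain, can be enumerated with a few lines. This is an illustrative sketch only; the function name `cross_domain_tasks` is hypothetical, and loading the actual data sets is out of scope here.

```python
DATASETS = ["JNU", "IMS", "NUAA", "CWRU"]

def cross_domain_tasks(names):
    """Return (target, sources) pairs: each data set is the target domain once."""
    tasks = []
    for target in names:
        # All remaining data sets act as source domains for this task.
        sources = [n for n in names if n != target]
        tasks.append((target, sources))
    return tasks

tasks = cross_domain_tasks(DATASETS)
for target, sources in tasks:
    print(f"target={target}  sources={sources}")
```

With four data sets this yields four transfer tasks, each with three source domains and therefore three base classifiers to integrate.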
Fig. 2 compares the small sample bearing fault mode identification method based on multi-source data integration with several other algorithms. The ELM-MSDI model is the model built with the method provided by the application. It can be seen that when the target domain training set is small, the method provided by the application achieves the best classification accuracy.
In this example, the number of neurons is the core parameter affecting the classification accuracy of the method provided by the application. To explore the influence of the number of neurons on the proposed model, this embodiment trains models with 20, 30, and 40 sets of target domain data, respectively. The models are trained with the target domain training set and use the proposed multi-source integration strategy; the global average results obtained with different numbers of neurons are shown in fig. 3.
As can be seen from fig. 3, the number of neurons influences the accuracy of the method in the same way whether it is trained with 20, 30, or 40 sets of target domain training data. With 100 neurons the prediction accuracy of the model is relatively low; between 200 and 1000 neurons the number of neurons has no obvious influence on model accuracy, which indicates that the method provided by the application and the constructed model are stable.
This embodiment provides an effective multi-source integration model for the situation, common in actual application scenarios, in which few labeled data sets are available for the same working condition in different fields while related data sets for different working conditions in the same field are easy to obtain. When the combination weights of the classifiers trained on the multi-source domain data sets are calculated, not only the inter-domain distribution metric between data sets but also the similarity of individual samples across data sets is taken into account. Introducing a sample similarity matrix allows the recognition results of the multiple classification models to be integrated effectively, which improves the accuracy of target domain sample classification. Finally, the method is verified on different classification tasks and compared with several groups of different strategies and classification algorithms. The results show that the method provided by the application achieves higher classification accuracy in fault mode identification with a small target domain training set.
It should be noted that, the method and the details thereof provided in the foregoing embodiments may be combined into the apparatus and the device provided in the foregoing embodiments, and are not described in detail herein. Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application. In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/device embodiments described above are merely illustrative, e.g., the division of modules or elements described above is merely a logical functional division, and may be implemented in other ways, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. 
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (10)

1. A small sample bearing failure mode identification method based on multi-source data integration, the method comprising:
S1: constructing a multi-source data sample according to the public data set;
S2: extracting multi-source data sample characteristics, and constructing n source domain training sets and 1 target domain training set according to the characteristics;
S3: combining the target domain training set and n source domain training sets to obtain a combined source domain training set;
S4: training a base classifier according to the target domain training set and the combined multi-source domain sample set;
S5: calculating inter-domain distribution measurement and sample similarity according to the target domain test set and the combined multi-source domain sample set;
S6: constructing a weight matrix according to the inter-domain distribution measurement and the sample similarity;
S7: classifying and identifying the target domain test set to be identified according to the base classifier, and obtaining the class probability of the test set sample;
S8: carrying out weighted integration on the class probability of the test set sample according to the weight matrix to obtain the classification result of the target domain test set, and completing fault mode identification.
2. The method for identifying a small sample bearing failure mode based on multi-source data integration according to claim 1, wherein the extracting multi-source data sample features in step S2 comprises:
extracting time domain features, frequency domain features and time-frequency domain features in the multi-source data samples;
the frequency domain features include average frequency, root mean square frequency, frequency center, and root variance frequency.
3. The method for identifying the small sample bearing fault mode based on multi-source data integration according to claim 2, wherein the time-frequency domain features are obtained by an empirical mode decomposition algorithm, and the original vibration signal is decomposed into a set of IMF components by the empirical mode decomposition algorithm.
4. The method for identifying a small sample bearing failure mode based on multi-source data integration according to claim 1, wherein the step S3 comprises:
wherein the recombined source domain training set D_Si is formed from an initial source domain training set and the labeled target domain training set, and n is the number of source domain training sets.
5. The method for identifying the failure mode of the small sample bearing based on the multi-source data integration according to claim 1, wherein the step S4 is to train a base classifier by using an extreme learning machine.
6. The method for identifying a small sample bearing failure mode based on multi-source data integration according to claim 1, wherein the inter-domain distribution measurement and the sample similarity in step S5 are specifically:
the inter-domain distribution metric is:
wherein w_j is the inter-domain distribution metric, X and Y denote different sample sets, and MMD denotes the maximum mean discrepancy;
the sample similarity is:
wherein v_j is the sample similarity, n_s is the number of samples in the source domain, x_t ∈ D_T is a target domain sample to be classified, and cosine(·) is the cosine similarity.
7. The method for identifying a small sample bearing failure mode based on multi-source data integration according to claim 1, wherein the step S8 specifically comprises:
wherein the output is the classification result, k denotes a class label, and f_i^{Sj}(x_t) denotes the classification results of the different base classifiers.
8. A small sample bearing failure mode recognition system based on multi-source data integration, the system comprising:
sample construction unit: configured to construct a multi-source data sample from the public data set;
feature extraction unit: configured to extract multi-source data sample characteristics, and to construct n source domain training sets and 1 target domain training set according to the characteristics;
sample reorganization unit: configured to combine the target domain training set with the n source domain training sets to obtain a combined source domain training set;
training unit: configured to train a base classifier according to the target domain training set and the combined multi-source domain sample set;
calculation unit: configured to calculate inter-domain distribution measurement and sample similarity according to a target domain test set and the combined multi-source domain sample set;
weight matrix construction unit: configured to construct a weight matrix according to the inter-domain distribution measurement and the sample similarity;
multi-source integration unit: configured to classify and identify a target domain test set to be identified according to the base classifier, and to obtain the class probability of a test set sample;
identification unit: configured to carry out weighted integration on the class probability of the test set sample according to the weight matrix, obtain the classification result of the target domain test set, and complete the fault mode identification.
9. A computer readable storage medium for storing a computer program for executing a small sample bearing failure mode recognition method based on multi-source data integration according to any of claims 1-7.
10. A computer device comprising a memory and a processor, the memory having a computer program stored therein, the processor performing a multisource data integration based small sample bearing failure mode recognition method according to any of claims 1-7 when the processor runs the computer program stored in the memory.
CN202310995855.1A 2023-08-08 2023-08-08 Small sample bearing fault mode identification method and system based on multi-source data integration Pending CN117113078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310995855.1A CN117113078A (en) 2023-08-08 2023-08-08 Small sample bearing fault mode identification method and system based on multi-source data integration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310995855.1A CN117113078A (en) 2023-08-08 2023-08-08 Small sample bearing fault mode identification method and system based on multi-source data integration

Publications (1)

Publication Number Publication Date
CN117113078A true CN117113078A (en) 2023-11-24

Family

ID=88806672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310995855.1A Pending CN117113078A (en) 2023-08-08 2023-08-08 Small sample bearing fault mode identification method and system based on multi-source data integration

Country Status (1)

Country Link
CN (1) CN117113078A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117932230A (en) * 2024-03-21 2024-04-26 北京航空航天大学 Fault diagnosis method based on characteristic decoupling and sample recombination under small sample condition
CN117932230B (en) * 2024-03-21 2024-05-28 北京航空航天大学 Fault diagnosis method based on characteristic decoupling and sample recombination under small sample condition

Similar Documents

Publication Publication Date Title
CN111967294B (en) Unsupervised domain self-adaptive pedestrian re-identification method
Zhang et al. A fault diagnosis method for wind turbines gearbox based on adaptive loss weighted meta-ResNet under noisy labels
CN111238807B (en) Fault diagnosis method for planetary gear box
CN112036301B (en) Driving motor fault diagnosis model construction method based on intra-class feature transfer learning and multi-source information fusion
CN105224872B (en) A kind of user's anomaly detection method based on neural network clustering
CN103728551B (en) A kind of analog-circuit fault diagnosis method based on cascade integrated classifier
CN114048568B (en) Rotary machine fault diagnosis method based on multisource migration fusion shrinkage framework
CN112819059B (en) Rolling bearing fault diagnosis method based on popular retention transfer learning
CN110334764A (en) Rotating machinery intelligent failure diagnosis method based on integrated depth self-encoding encoder
CN111458142A (en) Sliding bearing fault diagnosis method based on generation of countermeasure network and convolutional neural network
CN109086793A (en) A kind of abnormality recognition method of wind-driven generator
CN110375987A (en) One kind being based on depth forest machines Bearing Fault Detection Method
CN111353373A (en) Correlation alignment domain adaptive fault diagnosis method
CN115221930B (en) Fault diagnosis method for rolling bearing
CN111275108A (en) Method for performing sample expansion on partial discharge data based on generation countermeasure network
CN112990259B (en) Early fault diagnosis method for rotary mechanical bearing based on improved transfer learning
CN112507479B (en) Oil drilling machine health state assessment method based on manifold learning and softmax
CN117113078A (en) Small sample bearing fault mode identification method and system based on multi-source data integration
CN109726770A (en) A kind of analog circuit fault testing and diagnosing method
CN115392323A (en) Bearing fault monitoring method and system based on cloud edge cooperation
CN113869451B (en) Rolling bearing fault diagnosis method under variable working condition based on improved JGSA algorithm
Lu et al. A zero-shot intelligent fault diagnosis system based on EEMD
CN118094218A (en) Self-adaptive depth migration fault diagnosis method, system, device and medium
CN117451360A (en) Variable-working-condition small-sample rolling bearing fault diagnosis method and system based on self-attention mechanism
CN116399592A (en) Bearing fault diagnosis method based on channel attention dual-path feature extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination