CN115496108A - Fault monitoring method and system based on manifold learning and big data analysis - Google Patents


Info

Publication number
CN115496108A
Authority
CN
China
Prior art keywords
fault
signal
time domain
detected
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211223483.2A
Other languages
Chinese (zh)
Inventor
彭六保
胡勇
曾志生
邴奇
佟文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Intelligent Control Beijing Monitoring Technology Co ltd
Original Assignee
Aerospace Intelligent Control Beijing Monitoring Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Intelligent Control Beijing Monitoring Technology Co ltd
Priority to CN202211223483.2A
Publication of CN115496108A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning

Abstract

The invention discloses a fault monitoring method and system based on manifold learning and big data analysis, belonging to the technical field of fault data analysis. The method comprises: obtaining historical fault signal samples; statistically analyzing them and calculating their time domain mechanism features to generate a historical signal fault feature set; extracting discriminative features from that set for dimension reduction to generate a historical signal fault feature subset; training on the subset to generate a fault result classifier; collecting vibration signals in real time with a sensor to obtain a signal sample to be detected; statistically analyzing that sample and calculating its time domain mechanism features to generate a to-be-detected signal feature set; extracting discriminative features from that set for dimension reduction to generate a to-be-detected signal feature subset; and applying the fault result classifier to that subset to obtain the fault diagnosis result of the signal sample to be detected.

Description

Fault monitoring method and system based on manifold learning and big data analysis
Technical Field
The invention belongs to the technical field of fault data analysis, and particularly relates to a fault monitoring method and system based on manifold learning and big data analysis.
Background
Mechanical fault diagnosis generally comprises three steps: sensor-based monitoring of state signals, feature extraction based on signal analysis, and fault mode identification based on big data analysis. The common practice today is to acquire operating parameters through online monitoring equipment and to perform manual analysis and diagnosis once a parameter abnormality is found. However, many large devices have complex structures, many wearing parts, many relative movements between structures, and complex stresses on structural parts, so mechanical faults are diverse, strongly correlated with one another, and highly complex. These factors make manual diagnosis difficult: diagnosis is not timely, and the accuracy of the result depends largely on the expertise of the diagnostician. If the components of the diagnostic signal are complex and the diagnostician lacks experience, misjudgment may occur.
Fault diagnosis methods can be classified into model-based methods, signal-based methods, hybrid active methods, and data-driven methods. Data-driven methods build a fault diagnosis model from a large amount of collected historical data. In recent years, the collection of massive fault data has created new opportunities for data-driven fault diagnosis, which is receiving increasing attention from researchers and engineers.
Fault diagnosis of rotating machinery is essentially fault pattern recognition. With the long-term development of data acquisition and data storage technologies, the feature dimension used to describe a fault state keeps increasing, and so does the amount of redundant information, which makes subsequent pattern recognition very difficult.
When extracting vibration signal features, traditional data-driven methods mostly assume a linear structure. In the actual fault diagnosis of mechanical equipment, the more sophisticated the monitoring and control system, the more sensors it has and the more indices characterize the running state of the equipment; data that describe the state with many variables are abstracted as high-dimensional data. High-dimensional data provide extremely rich and detailed information about the state of the equipment, but the feature information is often nonlinear, so directly applying earlier feature extraction methods makes subsequent fault diagnosis very difficult.
Disclosure of Invention
Problems to be solved
Existing feature information is often nonlinear, so directly applying existing feature extraction methods makes subsequent fault diagnosis very difficult. To address this problem, the invention provides a fault monitoring method and system based on manifold learning and big data analysis.
Technical scheme
In order to solve the above problems, the present invention adopts the following technical solutions.
A fault monitoring method based on manifold learning and big data analysis comprises the following steps:
Step 1: acquire historical fault signal samples, preprocess them, statistically analyze them and calculate their time domain mechanism features, and generate a historical signal fault feature set;
Step 2: extract discriminative features from the historical signal fault feature set using the LPP algorithm for dimension reduction, generating a historical signal fault feature subset;
Step 3: apply a Boosting framework algorithm to the historical signal fault feature subset to obtain a training sample subset, train on the training sample subset to generate a base classifier, perform n rounds of training to generate n base classifiers, and perform weighted fusion of the n base classifiers to generate a fault result classifier;
Step 4: collect vibration signals in real time using a sensor to obtain a signal sample to be detected, preprocess it, statistically analyze it and calculate its time domain mechanism features, and generate a to-be-detected signal feature set;
Step 5: extract discriminative features from the to-be-detected signal feature set using the LPP algorithm for dimension reduction, generating a to-be-detected signal feature subset;
Step 6: use the fault result classifier to predict on the to-be-detected signal feature subset and obtain the fault diagnosis result of the signal sample to be detected.
Preferably, the time domain mechanism features include the time domain mean, time domain standard deviation, time domain square root amplitude, time domain root mean square value, time domain peak, time domain waveform index, time domain peak index, time domain pulse index, time domain margin index, time domain skewness, and time domain kurtosis.
Preferably, the time domain mean x̄ is calculated as follows:
Figure BDA0003877919360000032
where x(n) is the time domain sequence of the signal, n = 1, 2, …, N, and N is the number of sample points;
the time domain standard deviation σ_x is calculated as follows:
Figure BDA0003877919360000033
the time domain square root amplitude x_r is calculated as follows:
Figure BDA0003877919360000034
the time domain root mean square value x_rms is calculated as follows:
Figure BDA0003877919360000041
the time domain peak x_p is calculated as follows:
x_p = max|x(n)|
the time domain waveform index W is calculated as follows:
Figure BDA0003877919360000042
the time domain peak index C is calculated as follows:
Figure BDA0003877919360000043
the time domain pulse index I is calculated as follows:
Figure BDA0003877919360000044
the time domain margin index L is calculated as follows:
Figure BDA0003877919360000045
the time domain skewness S is calculated as follows:
Figure BDA0003877919360000046
the time domain kurtosis K is calculated as follows:
Figure BDA0003877919360000047
Further, after the time domain mechanism features are calculated, they are scale-normalized as follows:
Figure BDA0003877919360000051
Figure BDA0003877919360000052
where z is the scale-normalized data, x is the original data, x̄ is the mean of the original data, and δ is the variance of the original data.
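The formula images referenced above are not reproduced in this text. For orientation, the block below lists definitions that are commonly used for these eleven time domain statistics in the vibration-analysis literature, together with the usual z-score form of scale normalization. These are assumptions, not a transcription of the patent's figures: the normalization constants (N versus N − 1), the use of the mean absolute value rather than |x̄| in W and I, and the exact definition of δ may all differ from the original.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{N}\sum_{n=1}^{N} x(n), &
\overline{|x|} &= \frac{1}{N}\sum_{n=1}^{N} |x(n)|,\\
\sigma_x &= \sqrt{\frac{1}{N-1}\sum_{n=1}^{N}\bigl(x(n)-\bar{x}\bigr)^{2}}, &
x_{r} &= \Bigl(\frac{1}{N}\sum_{n=1}^{N}\sqrt{|x(n)|}\Bigr)^{2},\\
x_{rms} &= \sqrt{\frac{1}{N}\sum_{n=1}^{N} x(n)^{2}}, &
x_{p} &= \max_{n}\,|x(n)|,\\
W &= \frac{x_{rms}}{\overline{|x|}}, &
C &= \frac{x_{p}}{x_{rms}},\\
I &= \frac{x_{p}}{\overline{|x|}}, &
L &= \frac{x_{p}}{x_{r}},\\
S &= \frac{\sum_{n=1}^{N}\bigl(x(n)-\bar{x}\bigr)^{3}}{(N-1)\,\sigma_x^{3}}, &
K &= \frac{\sum_{n=1}^{N}\bigl(x(n)-\bar{x}\bigr)^{4}}{(N-1)\,\sigma_x^{4}},\\
z &= \frac{x-\bar{x}}{\delta}.
\end{aligned}
```

The dimensionless indices W, C, I, and L are ratios of amplitude measures, so they do not change when the signal is rescaled, which is one reason they are popular as fault indicators.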
Furthermore, the LPP algorithm extracts discriminative features for dimension reduction by establishing neighborhood information at each sample point and deriving from it a linear transformation that retains the original local information in the reduced-dimension subspace. Let the feature set matrix be X_{m×n}, where m is the number of samples and n is the number of features. First, for each sample point x_i, select its neighborhood points x_j, and on this basis establish a weighted adjacency matrix W = (W_ij)_{m×m}:
Figure BDA0003877919360000054
where σ² is a scale parameter. Once the relations between sample points in the original space are obtained, the weights express the distances between points: a large weight means a short distance, and a small weight means a long distance. Let the linear equidistant transformation from the original high-dimensional sample space to the low-dimensional feature subspace be Y = u^T X; the corresponding objective is as follows:
Figure BDA0003877919360000055
where D is a diagonal matrix,
Figure BDA0003877919360000056
and the objective then becomes:
Figure BDA0003877919360000057
where L = D − W is the Laplacian matrix and W = (W_ij)_{m×m}; the corresponding solution is given by:
X L X^T u = λ X D X^T u
Further, whether x_i and x_j are neighborhood points of each other is judged using the k-neighborhood method: the two points are neighborhood points as long as one of them is among the k points closest to the other.
Furthermore, the Boosting framework algorithm trains on the historical signal fault feature subset and generates the fault result classifier as follows. Let the historical signal fault feature subset contain m training samples (x_1, y_1), (x_2, y_2), …, (x_m, y_m), where x_i ∈ X and y_i ∈ Y = {−1, +1}. Initialize the sample weights D_1(i) = 1/m. For t = 1, …, T, where T is the number of integrated models, train the base classifier h_t on the sample distribution D_t:
h_t: X → {−1, +1}
The weight α_t of base classifier h_t is calculated as follows:
Figure BDA0003877919360000061
where ε_t is the classification error rate of h_t on the distribution D_t;
the sample weights are updated as follows:
Figure BDA0003877919360000062
where Z_t = Σ_i D_t(i) exp(−α_t y_i h_t(x_i));
the fault result classifier is:
Figure BDA0003877919360000063
Still further, the classification error rate ε_t is calculated as follows:
Figure BDA0003877919360000064
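The figures for these four formulas are not reproduced in this text. The normalizer Z_t given above matches the standard discrete AdaBoost scheme, under which the missing formulas would read as follows. This is an inference from Z_t and the surrounding definitions, not a transcription of the figures.

```latex
\begin{aligned}
\varepsilon_t &= \sum_{i=1}^{m} D_t(i)\,\mathbb{1}\!\left[h_t(x_i) \neq y_i\right],\\
\alpha_t &= \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t},\\
D_{t+1}(i) &= \frac{D_t(i)\,\exp\!\bigl(-\alpha_t\,y_i\,h_t(x_i)\bigr)}{Z_t},\\
H(x) &= \operatorname{sign}\!\left(\sum_{t=1}^{T}\alpha_t\,h_t(x)\right).
\end{aligned}
```

Under this scheme a misclassified sample has y_i h_t(x_i) = −1, so its weight grows by a factor of exp(α_t) before normalization, which focuses the next base classifier on the samples that are currently hardest.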
Further, after detection of the signal sample to be detected is finished, its to-be-detected signal feature subset is stored into the historical signal fault feature subset according to the fault diagnosis result, and when a preset time is reached, the fault result classifier is updated according to the updated historical signal fault feature subset.
A fault monitoring system based on manifold learning and big data analysis, comprising:
the data storage module is used for storing the historical fault signal sample, the historical signal fault characteristic set and the historical signal fault characteristic subset and caching the signal sample to be detected, the signal characteristic set to be detected and the signal characteristic subset to be detected;
the statistical calculation module is used for performing statistical analysis and calculation on the historical fault signal sample and the signal sample to be detected to generate a historical signal fault characteristic set and a signal characteristic set to be detected;
the dimension reduction calculation module is used for extracting discriminative features from the historical signal fault feature set and the signal feature set to be detected to perform dimension reduction, and generating a historical signal fault feature subset and a signal feature subset to be detected;
the classification generation module is used for training and calculating the historical signal fault feature subset to generate a fault result classifier;
and the fault diagnosis module is used for predicting and judging the characteristic subset of the signal to be detected by using the fault result classifier to obtain a fault diagnosis result of the signal sample to be detected.
In this fault monitoring method and system based on manifold learning and big data analysis, historical fault signal samples are obtained, their time domain mechanism features are calculated by statistical analysis to generate a historical signal fault feature set, discriminative features are extracted for dimension reduction to generate a historical signal fault feature subset, and training on that subset generates a fault result classifier. A sensor then collects vibration signals in real time to obtain a signal sample to be detected, whose time domain mechanism features are calculated by statistical analysis to generate a to-be-detected signal feature set; discriminative features are extracted from that set for dimension reduction to generate a to-be-detected signal feature subset, and the fault result classifier predicts on that subset to obtain the fault diagnosis result of the signal sample to be detected. This improves the effect of online fault monitoring and analysis.
Advantageous effects
Compared with the prior art, the invention has the beneficial effects that:
(1) Through manifold learning, the method recovers a low-dimensional manifold structure from high-dimensional sampled data, i.e., it finds the low-dimensional manifold in the high-dimensional space and solves the corresponding embedding map to achieve dimension reduction or data visualization; it then uses a big data analysis method based on ensemble learning to improve the effect of online fault monitoring and analysis;
(2) Locality preserving projection in manifold learning attends to the local properties of the manifold of the vibration signal, so the method can extract the most discriminative features for dimension reduction; the manifold learning method extracts the nonlinear features of complex vibration signals and successfully extracts the impact features of fault signals submerged in noise;
(3) The Boosting framework operates on the training sample set to obtain different training sample subsets, and a base classifier is trained on each subset; thus, given n training rounds, n base classifiers are generated, which the Boosting framework algorithm fuses by weighting into the final result classifier. Combining, integrating, or fusing in this way forms a hybrid intelligent fault diagnosis technique that improves the sensitivity, robustness, and accuracy of the diagnosis and prediction system and gives it the ability to solve complex problems.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings show only some embodiments of the present application and should therefore not be considered as limiting its scope; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic representation of the steps of the present invention;
FIG. 2 is a schematic diagram of the system of the present invention;
FIG. 3 is a diagram of the PCA cumulative variance contribution ratio in Example 3 of the present invention;
FIG. 4 is a diagram of the features extracted for three types of samples in Example 3 of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present application, and the components of the embodiments generally described and illustrated in the drawings can be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application, and all other embodiments that can be derived by one of ordinary skill in the art based on the embodiments in the present application without making creative efforts fall within the scope of the claimed application.
Example 1
As shown in FIG. 1, a fault monitoring method based on manifold learning and big data analysis adopts the following steps:
Historical fault signal samples are obtained and preprocessed, and their time domain mechanism features are calculated by statistical analysis to generate a historical signal fault feature set. Discriminative features are extracted from this set using the LPP algorithm for dimension reduction, generating a historical signal fault feature subset. A Boosting framework algorithm is applied to the subset to obtain a training sample subset, on which a base classifier is trained; n rounds of training generate n base classifiers, which are fused by weighting to generate a fault result classifier. A sensor collects vibration signals in real time to obtain a signal sample to be detected, which is preprocessed, and its time domain mechanism features are calculated by statistical analysis to generate a to-be-detected signal feature set. Discriminative features are extracted from this set using the LPP algorithm for dimension reduction, generating a to-be-detected signal feature subset. Finally, the fault result classifier predicts on this subset to obtain the fault diagnosis result of the signal sample to be detected.
First, historical fault signal samples are obtained and preprocessed, and their time domain mechanism features are calculated by statistical analysis to generate a historical signal fault feature set. The time domain mechanism features comprise the time domain mean, time domain standard deviation, time domain square root amplitude, time domain root mean square value, time domain peak, time domain waveform index, time domain peak index, time domain pulse index, time domain margin index, time domain skewness, and time domain kurtosis.
The time domain mean x̄ is calculated as follows:
Figure BDA0003877919360000102
where x(n) is the time domain sequence of the signal, n = 1, 2, …, N, and N is the number of sample points;
the time domain standard deviation σ_x is calculated as follows:
Figure BDA0003877919360000103
the time domain square root amplitude x_r is calculated as follows:
Figure BDA0003877919360000104
the time domain root mean square value x_rms is calculated as follows:
Figure BDA0003877919360000105
the time domain peak x_p is calculated as follows:
x_p = max|x(n)|
the time domain waveform index W is calculated as follows:
Figure BDA0003877919360000111
the time domain peak index C is calculated as follows:
Figure BDA0003877919360000112
the time domain pulse index I is calculated as follows:
Figure BDA0003877919360000113
the time domain margin index L is calculated as follows:
Figure BDA0003877919360000114
the time domain skewness S is calculated as follows:
Figure BDA0003877919360000115
the time domain kurtosis K is calculated as follows:
Figure BDA0003877919360000116
After the time domain mechanism features are calculated, they are scale-normalized as follows:
Figure BDA0003877919360000117
Figure BDA0003877919360000118
where z is the scale-normalized data, x is the original data, x̄ is the mean of the original data, and δ is the variance of the original data.
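As a rough illustration of this feature-extraction step, the following Python sketch computes the eleven statistics for one signal and then normalizes a table of features column by column. The function names `time_domain_features` and `zscore_columns` are hypothetical, and the conventions used (sample standard deviation with N − 1, mean absolute value in the waveform and pulse indices, division by the column standard deviation in the normalization) are assumptions, since the patent's formula images are not reproduced in this text.

```python
import numpy as np

def time_domain_features(x):
    """Return eleven time domain statistics for a 1-D signal x.

    Assumed conventions (not confirmed by the source): N - 1 in the
    standard deviation, skewness and kurtosis, and the mean absolute
    value in the waveform and pulse indices.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    std = x.std(ddof=1)
    abs_mean = np.abs(x).mean()
    x_r = np.sqrt(np.abs(x)).mean() ** 2      # square root amplitude
    x_rms = np.sqrt(np.mean(x ** 2))          # root mean square value
    x_p = np.abs(x).max()                     # peak
    return {
        "mean": mean,
        "std": std,
        "sqrt_amplitude": x_r,
        "rms": x_rms,
        "peak": x_p,
        "waveform_index": x_rms / abs_mean,
        "peak_index": x_p / x_rms,
        "pulse_index": x_p / abs_mean,
        "margin_index": x_p / x_r,
        "skewness": np.sum((x - mean) ** 3) / ((n - 1) * std ** 3),
        "kurtosis": np.sum((x - mean) ** 4) / ((n - 1) * std ** 4),
    }

def zscore_columns(features):
    """Scale-normalize each feature column to zero mean and unit spread."""
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)
    delta = features.std(axis=0, ddof=1)
    delta[delta == 0] = 1.0                   # leave constant columns unscaled
    return (features - mu) / delta
```

Each signal sample yields one row of eleven values; stacking the rows for all samples gives the feature set matrix that is normalized and then passed to the dimension reduction step. An all-zero or constant signal would cause divisions by zero in the ratio features, so a real implementation would need to guard against that case.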
Discriminative features are extracted from the historical signal fault feature set using the LPP algorithm for dimension reduction, generating a historical signal fault feature subset. The LPP algorithm establishes neighborhood information at each sample point and derives from it a linear transformation that retains the original local information in the reduced-dimension subspace. Let the feature set matrix be X_{m×n}, where m is the number of samples and n is the number of features. First, for each sample point x_i, select its neighborhood points x_j. Whether x_i and x_j are neighborhood points of each other is judged using the k-neighborhood method: the two points are neighborhood points as long as one of them is among the k points closest to the other.
On this basis, a weighted adjacency matrix W = (W_ij)_{m×m} is established:
Figure BDA0003877919360000121
where σ² is a scale parameter. Once the relations between sample points in the original space are obtained, the weights express the distances between points: a large weight means a short distance, and a small weight means a long distance. Let the linear equidistant transformation from the original high-dimensional sample space to the low-dimensional feature subspace be Y = u^T X; the corresponding objective is as follows:
Figure BDA0003877919360000122
where D is a diagonal matrix,
Figure BDA0003877919360000123
and the objective then becomes:
Figure BDA0003877919360000124
where L = D − W is the Laplacian matrix and W = (W_ij)_{m×m}; the corresponding solution is given by:
X L X^T u = λ X D X^T u
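The LPP step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `lpp` and the parameters `k`, `sigma2`, and `ridge` are hypothetical, the heat-kernel weight exp(−‖x_i − x_j‖²/σ²) is the commonly used form, samples are stored as rows (so X L X^T in the text appears as `X.T @ L @ X`), and the generalized eigenproblem is reduced to a symmetric one through a Cholesky factor with a small ridge term added for numerical stability.

```python
import numpy as np

def lpp(X, n_components=2, k=5, sigma2=1.0, ridge=1e-9):
    """Locality preserving projection; rows of X are samples (requires k < m)."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # k-neighborhood rule: i and j are neighbors if either one is among
    # the k nearest points of the other.
    nn = d2.copy()
    np.fill_diagonal(nn, -np.inf)                # force each point to sort first
    order = np.argsort(nn, axis=1)[:, 1:k + 1]   # drop the point itself
    adj = np.zeros((m, m), dtype=bool)
    adj[np.repeat(np.arange(m), k), order.ravel()] = True
    adj |= adj.T
    # Heat-kernel weights on neighbor pairs, zero elsewhere.
    W = np.where(adj, np.exp(-d2 / sigma2), 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = X.T @ L @ X
    B = X.T @ D @ X + ridge * np.eye(X.shape[1])
    # Reduce A u = lambda B u to a symmetric problem using B = C C^T.
    C = np.linalg.cholesky(B)
    C_inv = np.linalg.inv(C)
    vals, vecs = np.linalg.eigh(C_inv @ A @ C_inv.T)
    U = C_inv.T @ vecs[:, :n_components]         # eigh sorts eigenvalues ascending
    return U, vals[:n_components]
```

The reduced features are obtained as `X @ U`. The same matrix `U` learned from the historical feature set would normally be reused to project new samples, so that historical and to-be-detected features live in the same subspace.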
A Boosting framework algorithm is applied to the historical signal fault feature subset to obtain a training sample subset, on which a base classifier is trained; n rounds of training generate n base classifiers, which are fused by weighting to generate the fault result classifier. Specifically, let the historical signal fault feature subset contain m training samples (x_1, y_1), (x_2, y_2), …, (x_m, y_m), where x_i ∈ X and y_i ∈ Y = {−1, +1}. Initialize the sample weights D_1(i) = 1/m. For t = 1, …, T, where T is the number of integrated models, train the base classifier h_t on the sample distribution D_t:
h_t: X → {−1, +1}
The weight α_t of base classifier h_t is calculated as follows:
Figure BDA0003877919360000131
where ε_t is the classification error rate of h_t on the distribution D_t, calculated as follows:
Figure BDA0003877919360000132
The sample weights are updated as follows:
Figure BDA0003877919360000133
where Z_t = Σ_i D_t(i) exp(−α_t y_i h_t(x_i)).
The fault result classifier is:
Figure BDA0003877919360000134
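The training loop described above corresponds closely to discrete AdaBoost. The sketch below is one possible reading, using single-feature threshold classifiers (decision stumps) as base classifiers; the patent does not say which base classifier is used, so the stump and all function names here are assumptions made for illustration.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    """Predict +1 where pol * (X[:, j] - thr) >= 0, else -1."""
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)                 # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_fit(X, y, T=10):
    """Train T weighted stumps; labels y must be -1 or +1."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=int)
    m = X.shape[0]
    D = np.full(m, 1.0 / m)                    # D_1(i) = 1/m
    model = []
    for _ in range(T):
        eps, j, thr, pol = fit_stump(X, y, D)
        eps = np.clip(eps, 1e-12, 1 - 1e-12)   # avoid log(0) for perfect stumps
        alpha = 0.5 * np.log((1 - eps) / eps)
        h = stump_predict(X, j, thr, pol)
        D = D * np.exp(-alpha * y * h)         # raise weights of misclassified samples
        D /= D.sum()                           # division by the normalizer Z_t
        model.append((alpha, j, thr, pol))
    return model

def adaboost_predict(model, X):
    """Weighted vote of all base classifiers, H(x) = sign(sum alpha_t h_t(x))."""
    X = np.asarray(X, dtype=float)
    score = sum(a * stump_predict(X, j, thr, pol) for a, j, thr, pol in model)
    return np.where(score >= 0, 1, -1)
```

A call such as `adaboost_predict(adaboost_fit(X_train, y_train, T=20), X_new)` returns −1 or +1 per sample. Since the labels are binary, diagnosing several fault types would require an extension such as one-versus-rest training, which the text above does not describe.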
The method comprises the steps of collecting vibration signals in real time by using a sensor, obtaining a signal sample to be detected, preprocessing the signal sample to be detected, carrying out statistical analysis and calculating time domain mechanism characteristics of the signal sample to be detected, generating a signal characteristic set to be detected, extracting discriminative characteristics from the signal characteristic set to be detected by using an LPP algorithm, reducing dimensions, and generating a signal characteristic subset to be detected.
The fault result classifier then predicts on the to-be-detected signal feature subset to obtain the fault diagnosis result of the signal sample to be detected. After detection of the signal sample is finished, its to-be-detected signal feature subset is stored into the historical signal fault feature subset according to the fault diagnosis result, and when a preset time is reached, the fault result classifier is updated according to the updated historical signal fault feature subset.
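The store-and-update behavior can be sketched as a small helper that buffers diagnosed samples and retrains once a preset interval has elapsed. The class name `PeriodicRetrainer`, the `train_fn` callback, and the injectable `clock` are all hypothetical; the patent does not specify how the preset time is measured or how retraining is triggered.

```python
import time

class PeriodicRetrainer:
    """Buffers diagnosed samples and retrains once a preset interval has elapsed."""

    def __init__(self, train_fn, interval_s, clock=time.monotonic):
        self.train_fn = train_fn          # hypothetical: (features, labels) -> classifier
        self.interval_s = interval_s      # the preset time, in seconds
        self.clock = clock
        self.features, self.labels = [], []
        self.classifier = None
        self.last_update = clock()

    def add(self, feature_subset, diagnosis):
        """Store a diagnosed sample; retrain if the preset time has been reached."""
        self.features.append(feature_subset)
        self.labels.append(diagnosis)
        if self.clock() - self.last_update >= self.interval_s:
            self.classifier = self.train_fn(self.features, self.labels)
            self.last_update = self.clock()
            return True
        return False
```

Note that the stored labels are the classifier's own diagnoses, as the text describes, so any misdiagnosis is fed back into later training; whether the results are reviewed before storage is not stated in the source.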
As can be seen from the above description, in this example historical fault signal samples are obtained, their time domain mechanism features are calculated by statistical analysis to generate a historical signal fault feature set, discriminative features are extracted for dimension reduction to generate a historical signal fault feature subset, and training on that subset generates a fault result classifier. A sensor collects vibration signals in real time to obtain a signal sample to be detected, whose time domain mechanism features are calculated by statistical analysis to generate a to-be-detected signal feature set; discriminative features are extracted from that set for dimension reduction to generate a to-be-detected signal feature subset, and the fault result classifier predicts on that subset to obtain the fault diagnosis result of the signal sample to be detected.
Example 2
In this example, the fault monitoring method based on manifold learning and big data analysis performs online state monitoring on the frequency domain information of rotating equipment, predicts the running trend of the equipment in time, and finds hidden faults. The method uses the discrete Fourier transform to perform spectrum analysis on the vibration signal of the rotating equipment and compares the spectrum with certain fault characteristic parameters; this makes it possible to accurately locate abnormal vibration components, judge the running state and condition of the equipment, and improve its service level.
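The spectrum analysis step can be illustrated with NumPy's FFT routines. The sketch below computes a one-sided amplitude spectrum and reports the strongest frequency component; the function names and the choice to remove the DC offset are assumptions, and the comparison against fault characteristic parameters is left out because the source does not specify them.

```python
import numpy as np

def amplitude_spectrum(signal, fs):
    """One-sided amplitude spectrum of a real signal sampled at fs Hz."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    spectrum = np.fft.rfft(signal - signal.mean())   # remove the DC offset
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = 2.0 * np.abs(spectrum) / n                # scale to sinusoid amplitude
    return freqs, amps

def dominant_frequency(signal, fs):
    """Frequency in Hz of the largest spectral component."""
    freqs, amps = amplitude_spectrum(signal, fs)
    return freqs[np.argmax(amps)]
```

For a one-second record sampled at 1000 Hz the bins are 1 Hz apart, so a 50 Hz component of amplitude 2 shows up as a peak of height 2 in bin 50. Components that fall between bins leak into neighboring bins, which is why a window function is often applied in practice.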
In actual diagnosis, to make the diagnosis accurate and reliable, as much fault sample data as possible is collected to obtain sufficient fault information. While a large amount of data provides useful information, it also makes the data harder to use efficiently: useful knowledge can be buried in redundant data, which makes feature extraction more difficult. Redundant sample data makes the classifier complex and the classification computation heavy, and may reduce classification accuracy.
In fault mode classification, the extracted feature subset should help distinguish the different data classes, and the classification error caused by reducing the data dimension should be kept small. Therefore, a subset of state-sensitive features is selected from the fault feature set, retaining the feature components that contribute most to the fault to be diagnosed. Locality Preserving Projection (LPP) in manifold learning attends to the local properties of the manifold of the vibration signal, so it can extract the most discriminative features for dimension reduction and has a clear advantage in preserving local features. The basic idea of LPP is to first establish neighborhood information at each sample point and then derive from it a linear transformation such that the original local information is preserved in the reduced-dimension subspace.
Let the sample matrix be X_{m×n}, where m is the number of samples and n is the number of features. First, for each sample point x_i, select its neighborhood points x_j. Generally, one of the following two methods is used to determine whether two points are neighborhood points:
ε-neighborhood method: if ||x_i − x_j||² < ε, the two points are neighborhood points;
k-neighborhood method: if either of x_i and x_j is among the k points closest to the other, the two points are neighborhood points.
On the basis of the above-mentioned operation, a weighted adjacency matrix W = (W) ij ) m*m
Figure BDA0003877919360000151
Wherein sigma 2 Obtaining the distance between points through weight after obtaining the relation between sample points in the original space as a proportional parameter, wherein the weight is great, the distance is short, the weight is small, the distance is long, and the linear equidistant transformation from the original high-dimensional sample space to the low-dimensional feature subspace is set to ensure that Y = u T X, which corresponds to the following solution:
Figure BDA0003877919360000152
wherein D is a diagonal matrix,
Figure BDA0003877919360000153
the corresponding solution is then as follows:
Figure BDA0003877919360000154
wherein L = D-W is Laplace matrix, and W = (W) ij ) n*n The corresponding solution is as follows:
XLX T u=λXDX T u
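The LPP steps described above (neighbor selection, heat-kernel weights, Laplacian, generalized eigenproblem) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `lpp`, the parameters `k`, `sigma2` and `reg`, the small ridge term added to keep the right-hand matrix positive definite, and the choice of NumPy are all assumptions. The input is taken as rows of samples, so the products appear as `X.T @ L @ X`, which corresponds to $XLX^T$ with one sample per column.

```python
import numpy as np

def lpp(X, n_components=2, k=5, sigma2=1.0, reg=1e-6):
    """Return (U, eigenvalues); U is (n_features, n_components). X is (m samples, n features)."""
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # k-neighborhood rule: i and j are neighbors if either is among the other's k nearest.
    search = dist2.copy()
    np.fill_diagonal(search, np.inf)          # a point is not its own neighbor
    nearest = np.argsort(search, axis=1)[:, :k]
    adj = np.zeros((m, m), dtype=bool)
    adj[np.repeat(np.arange(m), k), nearest.ravel()] = True
    adj |= adj.T
    # Heat-kernel weights on neighbor pairs, zero elsewhere.
    W = np.where(adj, np.exp(-dist2 / sigma2), 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                                 # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(n)         # ridge term is an assumption for stability
    # Reduce A u = lambda B u to a symmetric standard problem via Cholesky B = C C^T.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)   # eigenvalues in ascending order
    U = C_inv.T @ vecs[:, :n_components]      # directions with the smallest eigenvalues
    return U, vals[:n_components]
```

With a projection `U` learned this way, a feature matrix is reduced with `X @ U`, giving one low-dimensional row per sample.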
The basis of LPP is the construction of a nearest-neighbor graph that models the local structure of the high-dimensional dataset. When the distribution of the detection set is relatively uniform, this graph models the geometry of the dataset well. However, the detection set is generated by random sampling, its distribution is unknown (and the samples are usually insufficient), so there is no guarantee that the local structure of the manifold is always represented accurately. In this case, the resulting transformation matrix may not be optimal, that is, the mapping may contain large deviations.
When a traditional single artificial intelligence method is applied to fault diagnosis of modern large-scale complex equipment, it suffers from defects such as low precision and poor generality. To address these difficulties, different artificial intelligence technologies can be combined, integrated or fused in a certain way to form a hybrid intelligent fault diagnosis technology, which improves the sensitivity, robustness and accuracy of the diagnosis and prediction system and enables it to solve complex problems.
Boosting is a method for improving the accuracy of a weak classification algorithm. The Boosting framework operates on the training sample set to obtain different training sample subsets, and each subset is used to train a base classifier. Thus, once the number n of training rounds is given, n base classifiers are generated, and the Boosting framework algorithm then performs weighted fusion of the n base classifiers to produce the final result classifier. The core idea of the algorithm is to increase the weights of the samples misclassified in the previous round and to reduce the weights of the samples classified correctly in the previous round. Among the n base classifiers, the recognition rate of each single classifier is not necessarily high, but the combined result has a high recognition rate, so the recognition rate of the weak classification algorithm is improved.
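The reweighting idea described above can be illustrated with a small AdaBoost-style sketch for the two-class case with labels in {-1, +1}. This is only an illustration under assumptions: the text does not specify the base classifier, so one-feature threshold classifiers (decision stumps) are assumed here, and all function names are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return the (error, feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[f] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] <= thr else -pol

def adaboost(X, y, rounds):
    """Train `rounds` weighted stumps; returns a list of (alpha, stump) pairs."""
    m = len(X)
    w = [1.0 / m] * m                                  # uniform initial sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-12), 1.0 - 1e-12)   # clamp so the logarithm stays finite
        alpha = 0.5 * math.log((1.0 - err) / err)      # weight of this base classifier
        ensemble.append((alpha, stump))
        # Misclassified samples gain weight, correctly classified ones lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        z = sum(w)
        w = [wi / z for wi in w]                       # renormalize to a distribution
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all base classifiers."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

On a one-dimensional toy set such as `X = [[0], [1], [2], [3], [4], [5]]` with `y = [1, 1, -1, -1, 1, 1]`, no single stump separates the classes, while three boosted stumps do. A three-class problem like the one in Example 3 would additionally need a multi-class scheme such as one-versus-rest, which is not shown here.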
Example 3
The intelligent operation and maintenance big data cloud platform of Aerospace Intelligent Control (Beijing) Monitoring Technology Co., Ltd. is used to collect operating vibration data of a low-pressure pump rotor in real time (8192 sampling points, sampling frequency 1280 Hz). The sample information is as follows:
Figure BDA0003877919360000161
Figure BDA0003877919360000171
Three types of vibration data are included: normal, unbalanced and misaligned. Eleven time-domain mechanism features are extracted from the three types of time-domain vibration data, using the calculation formulas of the 11 time-domain mechanism features given in Embodiment 1.
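As a rough illustration of the feature extraction step, the 11 time-domain features named in the claims could be computed as in the sketch below. The formula images are not reproduced in this text, so the definitions used here are common textbook forms and are assumptions: the standard deviation, skewness and kurtosis use an N - 1 divisor, and the waveform and pulse indexes divide by the mean absolute value. The function name and dictionary keys are hypothetical.

```python
import math

def time_domain_features(x):
    """Compute 11 time-domain statistics of one vibration record x (a sequence of floats)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    square_root_amplitude = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    abs_mean = sum(abs(v) for v in x) / n      # assumed denominator for W and I
    return {
        "mean": mean,
        "std": std,
        "square_root_amplitude": square_root_amplitude,
        "rms": rms,
        "peak": peak,
        "waveform_index": rms / abs_mean,
        "peak_index": peak / rms,
        "pulse_index": peak / abs_mean,
        "margin_index": peak / square_root_amplitude,
        "skewness": sum((v - mean) ** 3 for v in x) / ((n - 1) * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / ((n - 1) * std ** 4),
    }
```

Each vibration record of 8192 points would then yield one 11-element feature vector, and stacking the vectors of all records gives the feature set passed on to normalization and dimension reduction.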
The 11 feature indexes have different dimensions and units, which affects the result of data analysis. To eliminate this influence, the data must be standardized so that the indexes become comparable. After standardization, all indexes are of the same order of magnitude and are suitable for comprehensive comparison and evaluation. The raw data set was normalized to a data set with mean 0 and variance 1.
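A minimal sketch of this standardization, assuming the feature set is stored as a list of rows (one row per sample, one column per feature) and that the population variance is used, might look like this. The function name and the handling of constant columns are assumptions.

```python
import math

def standardize(rows):
    """Scale every feature column to mean 0 and variance 1; returns (scaled, means, stds)."""
    m = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / m for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / m) for c, mu in zip(cols, means)]
    scaled = [
        # A constant column carries no information; map it to 0 instead of dividing by 0.
        [(v - mu) / sd if sd > 0 else 0.0 for v, mu, sd in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds
```

The returned `means` and `stds` can be kept and applied to signals under test, so that new samples are scaled in the same way as the historical data.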
The feature extraction method is the LPP feature extraction method, and the classifier is a Boosting classifier.
The three types of labeled vibration samples are analyzed by PCA, as shown in FIG. 3. The figure shows that the cumulative contribution rate of the first three principal components is less than 90%, so the classes are difficult to distinguish in 3-dimensional space, which indicates that the samples may have a nonlinear structure in the high-dimensional space.
Since the samples do not have a simple linear structure in the original feature space, manifold learning (LPP), which extracts nonlinear features, is more effective than the linear feature extraction method (PCA), as shown in FIG. 4.
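For reference, the cumulative contribution rate mentioned here is usually defined from the eigenvalues of the covariance matrix of the standardized features. The notation below ($\lambda_i$, $\eta_k$) is an assumption, since the text does not give a formula; with 11 features, $n = 11$.

```latex
% Eigenvalues of the feature covariance matrix, sorted in descending order:
%   \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0
% Cumulative contribution rate of the first k principal components:
\eta_k = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{n} \lambda_i}
% The observation in the text corresponds to \eta_3 < 0.9.
```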
To compare the fault diagnosis results, 75% of the samples of each type were randomly selected for training and the remaining 25% for testing. The procedure was repeated 10 times and the results were averaged. The classification accuracy for each type of sample is shown in the following table:
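The evaluation protocol described above (a random 75%/25% split within each class, repeated 10 times and averaged) can be sketched as follows. The function names and the `fit`/`predict` callback interface are hypothetical, and the sketch assumes every class has enough samples to leave at least one in the test part.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac=0.75, rng=random):
    """Split sample indices into train and test lists, class by class."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        idxs = idxs[:]
        rng.shuffle(idxs)
        cut = int(round(train_frac * len(idxs)))
        train += idxs[:cut]
        test += idxs[cut:]
    return train, test

def mean_class_accuracy(X, y, fit, predict, runs=10, seed=0):
    """Average the per-class test accuracy over `runs` random stratified splits."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(runs):
        train, test = stratified_split(y, 0.75, rng)
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits, counts = defaultdict(int), defaultdict(int)
        for i in test:
            counts[y[i]] += 1
            hits[y[i]] += int(predict(model, X[i]) == y[i])
        for lab in counts:
            totals[lab] += hits[lab] / counts[lab]
    return {lab: total / runs for lab, total in totals.items()}
```

Here `fit` would wrap the dimension reduction and classifier training, and `predict` would apply the trained model to one test sample.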
Figure BDA0003877919360000172
Because the samples do not have a simple linear structure in the high-dimensional space, traditional linear feature extraction methods such as PCA (principal component analysis) perform poorly (the average classification accuracy for normal samples is about 82%). In contrast, using the LPP method to extract the nonlinear information of the vibration signals, combined with the ensemble learning method Boosting, effectively improves the fault diagnosis result: the accuracy rates for normal, unbalanced and misaligned samples are 98.8%, 98.7% and 96.6%, respectively, which shows that the method supports better predictive maintenance of equipment.
The method collects the operating vibration data of the low-pressure pump rotor in real time, extracts the time-domain mechanism features and manifold statistical features of the signal, and builds the fault diagnosis prediction model with the Boosting ensemble learning algorithm. The example results show that the method can realize intelligent identification and alarm of equipment faults, reduce a large amount of inspection work such as manual judgment during equipment operation and maintenance, and improve the capability for intelligent management of the equipment over its whole life cycle.
Example 4
As shown in fig. 2, a fault monitoring system based on manifold learning and big data analysis includes:
the data storage module is used for storing the historical fault signal sample, the historical signal fault characteristic set and the historical signal fault characteristic subset and caching the signal sample to be detected, the signal characteristic set to be detected and the signal characteristic subset to be detected;
the statistical calculation module is used for performing statistical analysis and calculation on the historical fault signal sample and the signal sample to be detected to generate a historical signal fault characteristic set and a signal characteristic set to be detected;
the dimension reduction calculation module is used for extracting discriminative features from the historical signal fault feature set and the signal feature set to be detected to perform dimension reduction, and generating a historical signal fault feature subset and a signal feature subset to be detected;
the classification generation module is used for training and calculating the historical signal fault feature subset to generate a fault result classifier;
and the fault diagnosis module is used for predicting and judging the characteristic subset of the signal to be detected by using the fault result classifier to obtain a fault diagnosis result of the signal sample to be detected.
As can be seen from the above description, in this embodiment, the historical fault signal sample, the historical signal fault feature set, and the historical signal fault feature subset are stored in the data storage module, the signal sample to be detected, the signal feature set to be detected, and the signal feature subset to be detected are cached, the statistical calculation module performs statistical analysis and calculation to generate the historical signal fault feature set and the signal feature set to be detected, the dimension reduction calculation module extracts discriminative features to perform dimension reduction to generate the historical signal fault feature subset and the signal feature subset to be detected, the classification generation module performs training calculation to generate the fault result classifier, and the fault diagnosis module performs prediction judgment on the signal feature subset to be detected to obtain the fault diagnosis result of the signal sample to be detected.
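The cooperation of the five modules can be pictured with a small skeleton like the one below. It is only a structural sketch: the class name, field names and callback signatures are hypothetical, and the concrete feature extractor, LPP reducer and Boosting classifier are injected as plain functions. The sketch reuses the reducer fitted on historical data when a new signal is diagnosed, which is a design assumption rather than something stated in the text.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class FaultMonitoringSystem:
    extract_features: Callable   # statistical calculation module
    fit_reducer: Callable        # dimension reduction module (fitting on history)
    reduce: Callable             # dimension reduction module (applying the projection)
    fit_classifier: Callable     # classification generation module
    classify: Callable           # fault diagnosis module
    store: dict = field(default_factory=dict)   # data storage module

    def train(self, signals, labels):
        """Build the classifier from historical fault signal samples."""
        features = [self.extract_features(s) for s in signals]
        reducer = self.fit_reducer(features)
        subset = [self.reduce(reducer, f) for f in features]
        classifier = self.fit_classifier(subset, labels)
        self.store.update(
            history_signals=signals, history_features=features,
            history_subset=subset, labels=labels,
            reducer=reducer, classifier=classifier,
        )

    def diagnose(self, signal):
        """Run one signal sample under test through the stored models."""
        feature = self.extract_features(signal)
        subset = self.reduce(self.store["reducer"], feature)
        self.store["pending"] = (signal, feature, subset)   # cache the sample under test
        return self.classify(self.store["classifier"], subset)
```

Keeping the modules as injected callables lets the storage and control flow stay the same while the feature extraction, reduction and classification methods are exchanged.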
The above examples are merely representative of preferred embodiments of the present invention, and the description thereof is more specific and detailed, but not to be construed as limiting the scope of the present invention. It should be noted that various changes, modifications and substitutions may be made by those skilled in the art without departing from the spirit of the invention, and all are intended to be included within the scope of the invention.

Claims (10)

1. A fault monitoring method based on manifold learning and big data analysis is characterized by comprising the following steps:
step 1: acquiring a historical fault signal sample, preprocessing the historical fault signal sample, performing statistical analysis and calculating time domain mechanism characteristics of the historical fault signal sample, and generating a historical signal fault characteristic set;
step 2: extracting discriminative features from the historical signal fault feature set by using an LPP algorithm for dimension reduction to generate a historical signal fault feature subset;
step 3: training and calculating the historical signal fault feature subset by using a Boosting framework algorithm to obtain a training sample subset, then training on the training sample subset to generate a base classifier, performing n rounds of training to generate n base classifiers, and performing weighted fusion on the n base classifiers to generate a fault result classifier;
step 4: collecting vibration signals in real time by using a sensor to obtain a signal sample to be detected, preprocessing the signal sample to be detected, performing statistical analysis and calculating time domain mechanism characteristics of the signal sample to be detected, and generating a signal characteristic set to be detected;
step 5: extracting discriminative features from the signal characteristic set to be detected by using the LPP algorithm for dimension reduction to generate a signal characteristic subset to be detected;
step 6: and predicting and judging the characteristic subset of the signal to be detected by using a fault result classifier to obtain a fault diagnosis result of the signal sample to be detected.
2. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 1, wherein: the time domain mechanism characteristics comprise a time domain mean value, a time domain standard deviation, a time domain square root amplitude, a time domain root mean square value, a time domain peak value, a time domain waveform index, a time domain peak value index, a time domain pulse index, a time domain margin index, a time domain skewness and a time domain kurtosis.
3. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 2, wherein: the time domain mean value $\bar{x}$ is calculated as follows:
Figure FDA0003877919350000021
wherein $x(n)$ is the time domain sequence of the signal, $n = 1, 2, \ldots, N$, and $N$ is the number of sample points;
the time domain standard deviation $\sigma_x$ is calculated as follows:
Figure FDA0003877919350000022
the time domain square root amplitude $x_r$ is calculated as follows:
Figure FDA0003877919350000023
the time domain root mean square value $x_{rms}$ is calculated as follows:
Figure FDA0003877919350000024
the time domain peak value $x_p$ is calculated as follows:
$$x_p = \max_n |x(n)|$$
the time domain waveform index $W$ is calculated as follows:
Figure FDA0003877919350000025
the time domain peak index $C$ is calculated as follows:
Figure FDA0003877919350000026
the time domain pulse index $I$ is calculated as follows:
Figure FDA0003877919350000031
the time domain margin index $L$ is calculated as follows:
Figure FDA0003877919350000032
the time domain skewness $S$ is calculated as follows:
Figure FDA0003877919350000033
the time domain kurtosis $K$ is calculated as follows:
Figure FDA0003877919350000034
4. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 3, wherein: after the time domain mechanism characteristics are calculated, scale normalization processing is performed on the time domain mechanism characteristics, and the calculation formulas are as follows:
Figure FDA0003877919350000035
Figure FDA0003877919350000036
wherein $z$ is the data after scale normalization, $x$ is the original data, $\bar{x}$ is the mean of the original data, and $\delta$ is the variance of the original data.
5. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 4, wherein: extracting discriminative features for dimension reduction by the LPP algorithm comprises first establishing neighborhood information at each sample point and deriving a linear transformation, so that the original local information is retained in the subspace after dimension reduction; the feature set matrix is set as $X_{m \times n}$, where $m$ is the number of samples and $n$ is the number of features; first, for each sample point $x_i$, its neighborhood points $x_j$ are selected, and on this basis a weighted adjacency matrix $W = (W_{ij})_{m \times m}$ is constructed:
$$W_{ij} = \begin{cases} \exp\left(-\dfrac{\|x_i - x_j\|^2}{\sigma^2}\right), & x_i \text{ and } x_j \text{ are neighbors} \\ 0, & \text{otherwise} \end{cases}$$
wherein $\sigma^2$ is a scale parameter; after the relationships between sample points in the original space are obtained, the weights express the distance between points, a large weight meaning a short distance and a small weight meaning a long distance; the linear transformation from the original high-dimensional sample space to the low-dimensional feature subspace is set as $Y = u^T X$ (with $X$ arranged with one sample per column in the expressions below), which is obtained by solving:
$$\min_u \sum_{i,j} (y_i - y_j)^2 W_{ij} \quad \text{s.t.} \quad Y D Y^T = 1$$
wherein $D$ is a diagonal matrix,
$$D_{ii} = \sum_j W_{ij}$$
the problem is then equivalent to:
$$\arg\min_u \; u^T X L X^T u \quad \text{s.t.} \quad u^T X D X^T u = 1$$
wherein $L = D - W$ is the Laplacian matrix and $W = (W_{ij})_{m \times m}$, and the solution is given by:
$$X L X^T u = \lambda X D X^T u.$$
6. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 5, wherein: whether $x_i$ and $x_j$ are neighborhood points of each other is determined by the k-neighborhood method, in which $x_i$ and $x_j$ are neighborhood points if either of them is among the $k$ points closest to the other.
7. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 6, wherein: training and calculating the historical signal fault feature subset by using the Boosting framework algorithm to generate the fault result classifier comprises: given $m$ training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$ in the historical signal fault feature subset, wherein $x_i \in X$ and $y_i \in Y = \{-1, +1\}$, initializing the sample weights $D_1(i) = 1/m$; for $t = 1, \ldots, T$, wherein $T$ is the number of integrated models, training a base classifier $h_t$ on the sample distribution $D_t$:
$$h_t: X \to \{-1, +1\}$$
the weight $\alpha_t$ of the base classifier $h_t$ is calculated as follows:
$$\alpha_t = \frac{1}{2} \ln\left(\frac{1 - \varepsilon_t}{\varepsilon_t}\right)$$
wherein $\varepsilon_t$ is the classification error rate of $h_t$ on the distribution $D_t$;
the sample weights are updated as follows:
$$D_{t+1}(i) = \frac{D_t(i) \exp(-\alpha_t y_i h_t(x_i))}{Z_t}$$
wherein $Z_t = \sum_i D_t(i) \exp(-\alpha_t y_i h_t(x_i))$;
the fault result classifier is:
$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$$
8. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 7, wherein: the classification error rate $\varepsilon_t$ is calculated as follows:
$$\varepsilon_t = \sum_{i=1}^{m} D_t(i)\, \mathbb{1}\left[h_t(x_i) \neq y_i\right]$$
9. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 8, wherein: after detection of the signal sample to be detected is completed, the signal characteristic subset to be detected is stored in the historical signal fault characteristic subset according to the fault diagnosis result, and when a preset time is reached, the fault result classifier is updated according to the updated historical signal fault characteristic subset data.
10. A fault monitoring system based on manifold learning and big data analysis, comprising:
the data storage module is used for storing the historical fault signal sample, the historical signal fault feature set and the historical signal fault feature subset, and caching the signal sample to be detected, the signal feature set to be detected and the signal feature subset to be detected;
the statistical calculation module is used for performing statistical analysis and calculation on the historical fault signal sample and the signal sample to be detected to generate a historical signal fault characteristic set and a signal characteristic set to be detected;
the dimension reduction calculation module is used for extracting discriminative features from the historical signal fault feature set and the signal feature set to be detected to perform dimension reduction, and generating a historical signal fault feature subset and a signal feature subset to be detected;
the classification generation module is used for training and calculating the historical signal fault feature subset to generate a fault result classifier;
and the fault diagnosis module is used for predicting and judging the characteristic subset of the signal to be detected by using the fault result classifier to obtain a fault diagnosis result of the signal sample to be detected.
CN202211223483.2A 2022-10-08 2022-10-08 Fault monitoring method and system based on manifold learning and big data analysis Pending CN115496108A (en)

Publications (1)

Publication Number Publication Date
CN115496108A true CN115496108A (en) 2022-12-20

Family

ID=84472830

Country Status (1)

Country Link
CN (1) CN115496108A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117032828A (en) * 2023-08-14 2023-11-10 江苏智先生信息科技有限公司 Batch automatic configuration method for customized special system platform
CN117032828B (en) * 2023-08-14 2024-01-26 江苏智先生信息科技有限公司 Batch automatic configuration method for customized special system platform
CN117435981A (en) * 2023-12-22 2024-01-23 四川泓宝润业工程技术有限公司 Method and device for diagnosing operation faults of machine pump equipment, storage medium and electronic equipment
CN117435981B (en) * 2023-12-22 2024-03-01 四川泓宝润业工程技术有限公司 Method and device for diagnosing operation faults of machine pump equipment, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination