CN115496108A

CN115496108A - Fault monitoring method and system based on manifold learning and big data analysis

Info

Publication number: CN115496108A
Application number: CN202211223483.2A
Authority: CN
Inventors: 彭六保; 胡勇; 曾志生; 邴奇; 佟文杰
Original assignee: Aerospace Intelligent Control Beijing Monitoring Technology Co ltd
Current assignee: Aerospace Intelligent Control Beijing Monitoring Technology Co ltd
Priority date: 2022-10-08
Filing date: 2022-10-08
Publication date: 2022-12-20

Abstract

The invention discloses a fault monitoring method and system based on manifold learning and big data analysis, belonging to the technical field of fault data analysis. The method comprises: obtaining a historical fault signal sample; performing statistical analysis and calculating the time domain mechanism features of the historical fault signal sample to generate a historical signal fault feature set; extracting discriminative features for dimension reduction to generate a historical signal fault feature subset; training on that subset to generate a fault result classifier; collecting vibration signals in real time with a sensor to obtain a signal sample to be detected; performing statistical analysis and calculating the time domain mechanism features of the signal sample to be detected to generate a signal feature set to be detected; extracting discriminative features from the signal feature set to be detected for dimension reduction to generate a signal feature subset to be detected; and performing prediction and judgment on that subset with the fault result classifier to obtain a fault diagnosis result for the signal sample to be detected.

Description

Fault monitoring method and system based on manifold learning and big data analysis

Technical Field

The invention belongs to the technical field of fault data analysis, and particularly relates to a fault monitoring method and system based on manifold learning and big data analysis.

Background

Mechanical fault diagnosis generally comprises three steps: sensor-based monitoring of state signals, signal feature extraction based on signal analysis methods, and fault mode identification based on big data analysis. The common practice at present is to acquire operating parameters through online monitoring equipment and to perform manual analysis and diagnosis once a parameter abnormality is found. However, many large-scale devices have complex structures, many wearing parts, numerous relative movements between structures, and complex stresses on structural parts, so their mechanical faults are diverse, strongly correlated with one another, and highly complex. These factors make manual diagnosis difficult: the diagnosis is less timely, and the accuracy of the result depends largely on the expertise of the diagnostician. If the diagnostic signal has complex components and the diagnostician lacks experience, misjudgment may occur.

Fault diagnosis methods can be classified into model-based methods, signal-based methods, hybrid active methods, and data-driven methods. Among these, data-driven methods build a fault diagnosis model from a large amount of collected historical data. In recent years, the accumulation of massive fault data has provided new opportunities for data-driven fault diagnosis methods, which are receiving increasing attention from researchers and engineers.

Rotating machine fault diagnosis is essentially a process of fault pattern recognition. With the long-term development of data acquisition and data storage technologies, the feature dimension used to describe a fault state keeps increasing, and so does the amount of redundant information, which brings great difficulty to subsequent pattern recognition.

When extracting vibration signal features, traditional data-driven methods mostly assume a linear structure. In the actual fault diagnosis of mechanical equipment, the more precise the monitoring and control system is, the more sensors it has and the more indexes represent the running state of the equipment; data that describe a state with many variables are abstracted as high-dimensional data. High-dimensional data provide extremely rich and detailed information about the state of the equipment, but the feature information is often nonlinear, so directly applying the existing feature extraction methods brings great difficulty to subsequent fault diagnosis.

Disclosure of Invention

Problems to be solved

Aiming at the problem that existing feature information is often nonlinear, so that directly applying the existing feature extraction methods brings great difficulty to subsequent fault diagnosis, the invention provides a fault monitoring method and system based on manifold learning and big data analysis.

Technical scheme

In order to solve the above problems, the present invention adopts the following technical solutions.

A fault monitoring method based on manifold learning and big data analysis adopts the following steps:

Step 1: acquiring a historical fault signal sample, preprocessing the historical fault signal sample, performing statistical analysis and calculating the time domain mechanism features of the historical fault signal sample, and generating a historical signal fault feature set;

Step 2: extracting discriminative features from the historical signal fault feature set by using the LPP algorithm for dimension reduction, and generating a historical signal fault feature subset;

Step 3: operating on the historical signal fault feature subset by using a Boosting framework algorithm to obtain training sample subsets, training on each training sample subset to generate a base classifier, performing n rounds of training to generate n base classifiers, and performing weighted fusion of the n base classifiers to generate a fault result classifier;

Step 4: collecting vibration signals in real time by using a sensor to obtain a signal sample to be detected, preprocessing the signal sample to be detected, performing statistical analysis and calculating the time domain mechanism features of the signal sample to be detected, and generating a signal feature set to be detected;

Step 5: extracting discriminative features from the signal feature set to be detected by using the LPP algorithm for dimension reduction, and generating a signal feature subset to be detected;

Step 6: performing prediction and judgment on the signal feature subset to be detected by using the fault result classifier, and obtaining a fault diagnosis result for the signal sample to be detected.

Preferably, the time domain mechanism features include a time domain mean, a time domain standard deviation, a time domain square root amplitude, a time domain root mean square value, a time domain peak, a time domain waveform index, a time domain peak index, a time domain pulse index, a time domain margin index, a time domain skewness and a time domain kurtosis.

Preferably, the time domain mean $\bar{x}$ is calculated as follows:

$$\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x(n)$$

where $x(n)$ is the time domain sequence of the signal, $n = 1, 2, \ldots, N$, and $N$ is the number of sample points;

the time domain standard deviation $\sigma_x$ is calculated as follows:

the time domain square root amplitude $x_r$ is calculated as follows:

the time domain root mean square value $x_{rms}$ is calculated as follows:

the time domain peak $x_p$ is calculated as follows:

$$x_p = \max_n |x(n)|$$

the time domain waveform index $W$ is calculated as follows:

the time domain peak index $C$ is calculated as follows:

the time domain pulse index $I$ is calculated as follows:

the time domain margin index $L$ is calculated as follows:

the time domain skewness $S$ is calculated as follows:

the time domain kurtosis $K$ is calculated as follows:

Further, after the time domain mechanism features are calculated, they need to be scale-normalized as follows:

$$z = \frac{x - \bar{x}}{\delta}$$

where $z$ is the data after scale normalization, $x$ is the original data, $\bar{x}$ is the mean of the original data, and $\delta$ is the standard deviation of the original data.
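As an illustration of the eleven time domain features named above, the following is a minimal Python sketch. The definitions it uses are the ones commonly found in the vibration-analysis literature and are assumptions here, not formulas confirmed by this text: in particular the standard deviation uses N - 1 in the denominator, the waveform and pulse indexes divide by the mean absolute value, and skewness and kurtosis are normalized by (N - 1)σ³ and (N - 1)σ⁴. The function name `time_domain_features` and the test signal are illustrative only.

```python
import math

def time_domain_features(x):
    """Compute 11 time domain features of a non-constant signal x (list of floats)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n                      # mean absolute value
    # Assumption: sample standard deviation (N - 1 in the denominator).
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    sqrt_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2    # square root amplitude x_r
    rms = math.sqrt(sum(v * v for v in x) / n)                 # root mean square x_rms
    peak = max(abs(v) for v in x)                              # x_p = max |x(n)|
    return {
        "mean": mean,
        "std": std,
        "sqrt_amplitude": sqrt_amp,
        "rms": rms,
        "peak": peak,
        "waveform_index": rms / abs_mean,      # W (assumed: rms over mean absolute value)
        "peak_index": peak / rms,              # C
        "pulse_index": peak / abs_mean,        # I (assumed: peak over mean absolute value)
        "margin_index": peak / sqrt_amp,       # L
        # Assumed normalization for skewness S and kurtosis K.
        "skewness": sum((v - mean) ** 3 for v in x) / ((n - 1) * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / ((n - 1) * std ** 4),
    }

# Illustrative input: five full cycles of a unit sine wave.
signal = [math.sin(2 * math.pi * 5 * k / 1000) for k in range(1000)]
features = time_domain_features(signal)
print(features["rms"], features["peak_index"])
```

For a pure sine wave the peak index should come out close to √2, which is a quick sanity check on the implementation.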

Furthermore, the LPP algorithm extracts discriminative features for dimension reduction by establishing neighborhood information at each sample point and deriving from it a linear transformation that retains the original local information in the subspace after dimension reduction. Let the feature set matrix be $X_{m \times n}$, where $m$ is the number of samples and $n$ is the number of features. First, for each sample point $x_i$, its neighborhood points $x_j$ are selected, and on this basis a weighted adjacency matrix $W = (W_{ij})_{m \times m}$ is established:

$$W_{ij} = \begin{cases} \exp\left(-\dfrac{\|x_i - x_j\|^2}{\sigma^2}\right), & x_i \text{ and } x_j \text{ are neighborhood points} \\ 0, & \text{otherwise} \end{cases}$$

where $\sigma^2$ is a scale parameter. Once the relations between the sample points in the original space are obtained, the weights express the distances between points: a large weight corresponds to a short distance, and a small weight to a long distance. Let the linear transformation from the original high-dimensional sample space to the low-dimensional feature subspace be $Y = u^T X$, with the samples taken as the columns of $X$ in this expression so that $y_i = u^T x_i$. The corresponding objective is as follows:

$$\min_u \frac{1}{2}\sum_{i,j} (y_i - y_j)^2 W_{ij} = \min_u u^T X (D - W) X^T u$$

where $D$ is a diagonal matrix with $D_{ii} = \sum_j W_{ij}$;

the corresponding problem is then as follows:

$$\min_u u^T X L X^T u \quad \text{s.t.} \quad u^T X D X^T u = 1$$

where $L = D - W$ is the Laplacian matrix and $W = (W_{ij})_{m \times m}$. The corresponding solution is given by the generalized eigenvalue problem:

$$X L X^T u = \lambda X D X^T u$$

Further, whether $x_i$ and $x_j$ are neighborhood points of each other is judged by the k-neighborhood method: if either of $x_i$ and $x_j$ is among the k points nearest to the other, the two points are neighborhood points of each other.
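The LPP procedure described above can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the heat-kernel weight, the default choice of `sigma2` (mean squared distance over neighboring pairs), the small ridge term `reg`, and the function name `lpp` are all assumptions introduced for illustration. Samples are stored as rows, so the matrices $X L X^T$ and $X D X^T$ of the text become `X.T @ L @ X` and `X.T @ D @ X`.

```python
import numpy as np

def lpp(X, n_components=2, k=5, sigma2=None, reg=1e-6):
    """X: (m samples, n features). Returns (U, eigenvalues); project with X @ U."""
    m, n = X.shape
    # Pairwise squared Euclidean distances; the diagonal is excluded from neighbor search.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)
    # k-neighborhood rule: i and j are neighbors if either is among the other's k nearest.
    nearest = np.argsort(d2, axis=1)[:, :k]
    knn = np.zeros((m, m), dtype=bool)
    knn[np.repeat(np.arange(m), k), nearest.ravel()] = True
    neighbors = knn | knn.T
    if sigma2 is None:
        sigma2 = d2[neighbors].mean()          # assumed heuristic for the scale parameter
    # Heat-kernel weights on neighboring pairs, zero elsewhere.
    W = np.where(neighbors, np.exp(-d2 / sigma2), 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                                   # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(n)           # ridge keeps B positive definite
    # Solve A u = lambda B u by Cholesky whitening, B = C C^T.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    evals, V = np.linalg.eigh(C_inv @ A @ C_inv.T)   # eigenvalues in ascending order
    U = C_inv.T @ V[:, :n_components]           # eigenvectors of the smallest eigenvalues
    return U, evals[:n_components]

# Synthetic stand-in for a standardized feature set with 11 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 11))
X = (X - X.mean(axis=0)) / X.std(axis=0)
U, evals = lpp(X, n_components=3)
Y = X @ U
print(Y.shape)
```

The same projection matrix `U` obtained from the historical data would then be applied to the features of a signal to be detected, so that both lie in the same subspace; that reuse is a design assumption of this sketch.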

Furthermore, the Boosting framework algorithm trains on the historical signal fault feature subset to generate the fault result classifier as follows. Let the historical signal fault feature subset contain $m$ training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, where $x_i \in X$ and $y_i \in Y = \{-1, +1\}$. Initialize the sample weights $D_1(i) = 1/m$. For $t = 1, \ldots, T$, where $T$ is the number of integrated models, a base classifier $h_t$ is trained on the sample distribution $D_t$:

$$h_t : X \to \{-1, +1\}$$

The weight $\alpha_t$ of the base classifier $h_t$ is calculated as follows:

$$\alpha_t = \frac{1}{2}\ln\frac{1 - \varepsilon_t}{\varepsilon_t}$$

where $\varepsilon_t$ is the classification error rate of $h_t$ on the distribution $D_t$;

the sample weights are updated as follows:

$$D_{t+1}(i) = \frac{D_t(i)\exp(-\alpha_t y_i h_t(x_i))}{Z_t}$$

where $Z_t = \sum_i D_t(i)\exp(-\alpha_t y_i h_t(x_i))$.

The fault result classifier is:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$$

Still further, the classification error rate $\varepsilon_t$ is calculated as follows:

$$\varepsilon_t = \sum_{i=1}^{m} D_t(i)\,\mathbb{I}\left[h_t(x_i) \neq y_i\right]$$
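The training loop described above can be sketched in plain Python as follows. This is a minimal sketch that assumes the standard AdaBoost weight α_t = ½ ln((1 − ε_t)/ε_t) and uses one-feature threshold classifiers (decision stumps) as base classifiers; the choice of stumps, the function names and the toy data are assumptions for illustration, since the text does not specify the base classifier.

```python
import math

def train_stump(X, y, D):
    """Return (weighted error, feature, threshold, polarity) of the best threshold stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[f] <= thr else -pol for row in X]
                err = sum(d for d, p, t in zip(D, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] <= thr else -pol

def adaboost(X, y, T=10):
    m = len(X)
    D = [1.0 / m] * m                                   # D_1(i) = 1/m
    ensemble = []
    for _ in range(T):
        stump = train_stump(X, y, D)
        eps = min(max(stump[0], 1e-12), 1 - 1e-12)      # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, stump))
        # D_{t+1}(i) = D_t(i) * exp(-alpha * y_i * h_t(x_i)) / Z_t
        D = [d * math.exp(-alpha * t * stump_predict(stump, row))
             for d, row, t in zip(D, X, y)]
        Z = sum(D)
        D = [d / Z for d in D]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single stump can separate.
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_train = [1, 1, -1, -1, 1, 1]
model = adaboost(X_train, y_train, T=3)
predictions = [predict(model, row) for row in X_train]
print(predictions)
```

On this toy set, three weak stumps combined by weighted voting classify every training sample correctly, although each stump alone misclassifies at least two samples under uniform weights. The formulas above are binary; handling several fault classes would need an extension such as one-versus-rest, which this sketch does not cover.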

Further, after detection of the signal sample to be detected is finished, its signal feature subset is stored into the historical signal fault feature subset according to the fault diagnosis result, and when a preset time is reached, the fault result classifier is updated according to the updated historical signal fault feature subset.
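The feedback step above can be sketched as a small Python class. This is only an illustration under assumptions: the class name `FeedbackStore`, the callback `train_fn`, and the toy threshold trainer are hypothetical stand-ins for the storage module and the Boosting trainer, and the preset time is modeled as a fixed interval in seconds.

```python
import time

class FeedbackStore:
    """Accumulates diagnosed samples and retrains once a preset interval has elapsed."""

    def __init__(self, train_fn, history, interval_s):
        self.train_fn = train_fn              # hypothetical: list of (features, label) -> classifier
        self.history = list(history)
        self.interval_s = interval_s
        self.last_update = time.monotonic()
        self.classifier = train_fn(self.history)

    def diagnose(self, features, now=None):
        label = self.classifier(features)
        self.history.append((features, label))         # store sample under its diagnosed label
        now = time.monotonic() if now is None else now
        if now - self.last_update >= self.interval_s:  # preset time reached: update classifier
            self.classifier = self.train_fn(self.history)
            self.last_update = now
        return label

def train_threshold(history):
    # Toy stand-in for the Boosting trainer: threshold at the midpoint of the class means.
    pos = [f for f, y in history if y == 1]
    neg = [f for f, y in history if y == -1]
    thr = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda f: 1 if f > thr else -1

store = FeedbackStore(train_threshold, [(0.0, -1), (1.0, 1)], interval_s=60)
first = store.diagnose(0.9, now=store.last_update + 10)    # within the interval: no retraining
second = store.diagnose(0.2, now=store.last_update + 61)   # interval elapsed: retrains
print(first, second, len(store.history))
```

Because the stored labels are the classifier's own diagnoses, a misdiagnosis would be fed back into training; whether the results are reviewed before storage is not stated in the text, so this sketch stores them directly.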

A fault monitoring system based on manifold learning and big data analysis, comprising:

the data storage module is used for storing the historical fault signal sample, the historical signal fault characteristic set and the historical signal fault characteristic subset and caching the signal sample to be detected, the signal characteristic set to be detected and the signal characteristic subset to be detected;

the statistical calculation module is used for performing statistical analysis and calculation on the historical fault signal sample and the signal sample to be detected to generate a historical signal fault characteristic set and a signal characteristic set to be detected;

the dimension reduction calculation module is used for extracting discriminative features from the historical signal fault feature set and the signal feature set to be detected to perform dimension reduction, and generating a historical signal fault feature subset and a signal feature subset to be detected;

the classification generation module is used for training and calculating the historical signal fault feature subset to generate a fault result classifier;

and the fault diagnosis module is used for predicting and judging the characteristic subset of the signal to be detected by using the fault result classifier to obtain a fault diagnosis result of the signal sample to be detected.

In the fault monitoring method and system based on manifold learning and big data analysis, a historical fault signal sample is obtained; its time domain mechanism features are calculated through statistical analysis to generate a historical signal fault feature set; discriminative features are extracted for dimension reduction to generate a historical signal fault feature subset; and training on the historical signal fault feature subset generates a fault result classifier. A sensor then collects vibration signals in real time to obtain a signal sample to be detected; its time domain mechanism features are calculated through statistical analysis to generate a signal feature set to be detected; discriminative features are extracted from the signal feature set to be detected for dimension reduction to generate a signal feature subset to be detected; and the fault result classifier performs prediction and judgment on the signal feature subset to be detected to obtain the fault diagnosis result of the signal sample to be detected, thereby improving the effect of online fault monitoring and analysis.

Advantageous effects

Compared with the prior art, the invention has the beneficial effects that:

(1) According to the method, a low-dimensional manifold structure is recovered from high-dimensional sampling data through manifold learning, namely a low-dimensional manifold in a high-dimensional space is found, corresponding embedding mapping is solved, dimensionality reduction or data visualization is realized, and then an online fault monitoring and analyzing effect is improved by using a big data analysis method based on ensemble learning;

(2) Locality preserving projection (LPP) in manifold learning attends to the local properties of the manifold of the vibration signal, so the method can extract the most discriminative features for dimension reduction; by using the manifold learning method, it extracts the nonlinear features of complex vibration signals and successfully extracts the impact features of fault signals submerged in noise;

(3) Different training sample subsets are obtained by the Boosting framework operating on the training sample set, and each sample subset is used to train a base classifier; thus, given the number of training rounds n, n base classifiers are generated, and the Boosting framework algorithm then performs weighted fusion of the n base classifiers to generate a final result classifier. Combining, integrating or fusing techniques in this way forms a hybrid intelligent fault diagnosis technology, which improves the sensitivity, robustness and accuracy of the diagnosis and prediction system and gives it the capability to solve complex problems.

Drawings

In order to more clearly illustrate the embodiments or exemplary technical solutions of the present application, the drawings needed to be used in the embodiments or exemplary descriptions will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application and therefore should not be considered as limiting the scope, and it is also possible for those skilled in the art to obtain other drawings according to the drawings without inventive efforts.

FIG. 1 is a schematic representation of the steps of the present invention;

FIG. 2 is a schematic diagram of the system of the present invention;

FIG. 3 is a diagram of the PCA cumulative variance contribution rate in Example 3 of the present invention;

FIG. 4 is a diagram of the features extracted from the three types of samples in Example 3 of the present invention.

Detailed Description

To further clarify the objects, technical solutions and advantages of the embodiments of the present application, the embodiments of the present application will be described in detail and completely with reference to the accompanying drawings of the embodiments of the present application, it should be understood that the embodiments described are a part of the embodiments of the present application and not all of the embodiments, and that the components of the embodiments of the present application generally described and illustrated in the drawings herein can be arranged and designed in a variety of different configurations.

Therefore, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application, and all other embodiments that can be derived by one of ordinary skill in the art based on the embodiments in the present application without making creative efforts fall within the scope of the claimed application.

Example 1

As shown in fig. 1, a fault monitoring method based on manifold learning and big data analysis adopts the following steps:

A historical fault signal sample is obtained and preprocessed, and statistical analysis is performed to calculate its time domain mechanism features and generate a historical signal fault feature set. Discriminative features are extracted from the historical signal fault feature set by the LPP algorithm for dimension reduction, generating a historical signal fault feature subset. The Boosting framework algorithm operates on the historical signal fault feature subset to obtain training sample subsets; each training sample subset is used to train a base classifier, n rounds of training generate n base classifiers, and weighted fusion of the n base classifiers generates a fault result classifier. A sensor collects vibration signals in real time to obtain a signal sample to be detected, which is preprocessed; statistical analysis is performed to calculate its time domain mechanism features and generate a signal feature set to be detected. Discriminative features are extracted from the signal feature set to be detected by the LPP algorithm for dimension reduction, generating a signal feature subset to be detected. The fault result classifier then performs prediction and judgment on the signal feature subset to be detected to obtain the fault diagnosis result of the signal sample to be detected.

In the first step, the time domain mechanism features calculated from the preprocessed historical fault signal sample comprise a time domain mean, a time domain standard deviation, a time domain square root amplitude, a time domain root mean square value, a time domain peak, a time domain waveform index, a time domain peak index, a time domain pulse index, a time domain margin index, a time domain skewness and a time domain kurtosis.

The time domain mean $\bar{x}$ is calculated as follows:

$$\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x(n)$$

where $x(n)$ is the time domain sequence of the signal and $N$ is the number of sample points;

the time domain square root amplitude $x_r$ is calculated as follows:

the time domain peak $x_p$ is calculated as follows:

$$x_p = \max_n |x(n)|$$

the time domain waveform index $W$ is calculated as follows:

the time domain peak index $C$ is calculated as follows:

the time domain pulse index $I$ is calculated as follows:

the time domain margin index $L$ is calculated as follows:

the time domain skewness $S$ is calculated as follows:

the time domain kurtosis $K$ is calculated as follows:

After the time domain mechanism features are calculated, they need to be scale-normalized as follows:

$$z = \frac{x - \bar{x}}{\delta}$$

where $z$ is the data after scale normalization, $x$ is the original data, $\bar{x}$ is the mean of the original data, and $\delta$ is the standard deviation of the original data.
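The scale normalization can be illustrated with a short Python sketch that standardizes each feature column to mean 0 and variance 1. The function name `zscore_columns` and the sample values are illustrative, and dividing by the population standard deviation (denominator equal to the number of samples) is an assumption; every column is assumed to be non-constant.

```python
import math

def zscore_columns(rows):
    """Standardize each feature column of rows (list of equal-length lists)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Assumption: population standard deviation (divide by the number of samples).
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / len(c))
            for c, mu in zip(cols, means)]
    scaled = [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)] for row in rows]
    return scaled, means, stds

# Two features on very different scales, e.g. a mean value and a kurtosis-like index.
rows = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
scaled, means, stds = zscore_columns(rows)
print(scaled)
```

After scaling, both columns take the same values despite their different original units. Reusing the returned `means` and `stds` when normalizing a signal to be detected keeps historical and new features on the same scale; that reuse is a design assumption of this sketch.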

Discriminative features are extracted from the historical signal fault feature set by the LPP algorithm for dimension reduction, generating the historical signal fault feature subset. The LPP algorithm establishes neighborhood information at each sample point and derives from it a linear transformation that retains the original local information in the subspace after dimension reduction. Let the feature set matrix be $X_{m \times n}$, where $m$ is the number of samples and $n$ is the number of features. First, for each sample point $x_i$, its neighborhood points $x_j$ are selected. Whether $x_i$ and $x_j$ are neighborhood points of each other is judged by the k-neighborhood method: if either of $x_i$ and $x_j$ is among the k points nearest to the other, the two points are neighborhood points.

On this basis, a weighted adjacency matrix $W = (W_{ij})_{m \times m}$ is established:

$$W_{ij} = \begin{cases} \exp\left(-\dfrac{\|x_i - x_j\|^2}{\sigma^2}\right), & x_i \text{ and } x_j \text{ are neighborhood points} \\ 0, & \text{otherwise} \end{cases}$$

where $\sigma^2$ is a scale parameter. Once the relations between the sample points in the original space are obtained, the weights express the distances between points: a large weight corresponds to a short distance, and a small weight to a long distance. Let the linear transformation from the original high-dimensional sample space to the low-dimensional feature subspace be $Y = u^T X$, with the samples taken as the columns of $X$ in this expression so that $y_i = u^T x_i$. The corresponding objective is as follows:

$$\min_u \frac{1}{2}\sum_{i,j} (y_i - y_j)^2 W_{ij} = \min_u u^T X (D - W) X^T u$$

where $D$ is a diagonal matrix with $D_{ii} = \sum_j W_{ij}$;

the corresponding solution is then given by the generalized eigenvalue problem:

$$X L X^T u = \lambda X D X^T u$$

where $L = D - W$ is the Laplacian matrix.

The Boosting framework algorithm operates on the historical signal fault feature subset to obtain training sample subsets; each training sample subset is used to train a base classifier, n rounds of training generate n base classifiers, and weighted fusion of the n base classifiers generates the fault result classifier. Specifically, let the historical signal fault feature subset contain $m$ training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, where $x_i \in X$ and $y_i \in Y = \{-1, +1\}$. Initialize the sample weights $D_1(i) = 1/m$. For $t = 1, \ldots, T$, where $T$ is the number of integrated models, a base classifier $h_t$ is trained on the sample distribution $D_t$:

$$h_t : X \to \{-1, +1\}$$

The weight $\alpha_t$ of the base classifier $h_t$ is calculated as follows:

$$\alpha_t = \frac{1}{2}\ln\frac{1 - \varepsilon_t}{\varepsilon_t}$$

where $\varepsilon_t$ is the classification error rate of $h_t$ on the distribution $D_t$, calculated as follows:

$$\varepsilon_t = \sum_{i=1}^{m} D_t(i)\,\mathbb{I}\left[h_t(x_i) \neq y_i\right]$$

The sample weights are updated as follows:

$$D_{t+1}(i) = \frac{D_t(i)\exp(-\alpha_t y_i h_t(x_i))}{Z_t}$$

where $Z_t = \sum_i D_t(i)\exp(-\alpha_t y_i h_t(x_i))$.

The fault result classifier is:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$$

A sensor collects vibration signals in real time to obtain a signal sample to be detected, which is preprocessed; statistical analysis is performed to calculate its time domain mechanism features and generate a signal feature set to be detected, and discriminative features are extracted from the signal feature set to be detected by the LPP algorithm for dimension reduction, generating a signal feature subset to be detected.

The fault result classifier performs prediction and judgment on the signal feature subset to be detected to obtain the fault diagnosis result of the signal sample to be detected. After detection of the signal sample to be detected is finished, its signal feature subset is stored into the historical signal fault feature subset according to the fault diagnosis result, and when a preset time is reached, the fault result classifier is updated according to the updated historical signal fault feature subset.

As can be seen from the above description, in this example, the historical fault signal sample is obtained, the time domain mechanism features of the historical fault signal sample are statistically analyzed and calculated, the historical signal fault feature set is generated, the discriminative features are extracted for dimension reduction, the historical signal fault feature subset is generated, the training calculation is performed on the historical signal fault feature subset, the fault result classifier is generated, the sensor is used for collecting the vibration signal in real time, the signal sample to be detected is obtained, the time domain mechanism features of the signal sample to be detected are statistically analyzed and calculated, the signal feature set to be detected is generated, the discriminative features are extracted from the signal feature set to be detected for dimension reduction, the signal feature subset to be detected is generated, the fault result classifier is used for prediction and judgment on the signal feature subset to be detected, and the fault diagnosis result of the signal sample to be detected is obtained.

Example 2

A fault monitoring method based on manifold learning and big data analysis is used for online condition monitoring of rotating equipment based on frequency domain information, so as to predict the running trend of the equipment in time and uncover hidden faults. The method applies the discrete Fourier transform to perform spectrum analysis on the vibration signal of the rotating equipment and compares the spectrum with certain fault characteristic parameters; in this way it can accurately identify abnormal vibration components of the rotating equipment, judge the running state and health of the equipment, and improve the service level of the equipment.
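The spectrum analysis step can be illustrated with a short NumPy sketch. The signal below is synthetic and hypothetical (a 50 Hz component with a weaker 100 Hz harmonic); the sampling frequency of 1280 Hz and the 8192 sampling points are borrowed from Example 3 purely for illustration, and the comparison against fault characteristic parameters is reduced here to reading off the dominant frequency.

```python
import numpy as np

fs = 1280.0            # sampling frequency in Hz (value borrowed from Example 3)
n = 8192               # number of sampling points (value borrowed from Example 3)
t = np.arange(n) / fs
# Hypothetical vibration signal: 50 Hz component plus a weaker 100 Hz harmonic.
x = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 100 * t)

spectrum = np.fft.rfft(x)                     # discrete Fourier transform of a real signal
freqs = np.fft.rfftfreq(n, d=1.0 / fs)        # frequency of each bin in Hz
# Single-sided amplitude spectrum (the DC and Nyquist bins are over-scaled by the
# factor 2; that is ignored in this sketch).
amplitude = 2.0 * np.abs(spectrum) / n

peak_freq = freqs[np.argmax(amplitude)]
harmonic_amp = amplitude[np.argmin(np.abs(freqs - 100.0))]
print(peak_freq, harmonic_amp)
```

With these values the frequency resolution is 1280/8192 = 0.15625 Hz, so both 50 Hz and 100 Hz fall exactly on a bin and their amplitudes are recovered without leakage; for frequencies between bins a window function would normally be applied first.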

In the actual diagnosis process, in order to make the diagnosis accurate and reliable, as much fault sample data as possible is collected to obtain sufficient fault information. While a large amount of data provides useful information, it also makes the data harder to use efficiently: useful knowledge can be drowned in redundant data, which increases the difficulty of feature extraction. Redundant sample data makes the classifier complex and the classification computation expensive, and may affect the classification accuracy.

In the fault mode classification problem, it is desirable that the extracted feature subset helps distinguish different data classes and that the classification error caused by reducing the data dimension is kept within a small range. Therefore, a subset of state-sensitive features is selected from the fault feature set, retaining the feature components that contribute most to the fault to be diagnosed. Locality preserving projection (LPP) in manifold learning attends to the local properties of the manifold of the vibration signal, so the method can extract the most discriminative features for dimension reduction and has a clear advantage in preserving local features. The basic idea of LPP is to first establish neighborhood information at each sample point and then derive a linear transformation from it, so that the original local information is preserved in the reduced-dimension subspace.

Let the sample matrix be $X_{m \times n}$, where $m$ is the number of samples and $n$ is the number of features. First, for each sample point $x_i$, its neighborhood points $x_j$ are selected. The following two methods are generally used to determine whether two points are neighborhood points:

ε-neighborhood method: if $\|x_i - x_j\|_2 < \varepsilon$, the two points are neighborhood points;

k-neighborhood method: if either of $x_i$ and $x_j$ is among the k points nearest to the other, the two points are neighborhood points.
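The two neighborhood criteria can be compared with a small NumPy sketch. The function names and the four one-dimensional sample points are illustrative assumptions; the last point is placed far from the others to show how the two rules treat an outlying sample.

```python
import numpy as np

def pairwise_distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def epsilon_neighbors(X, eps):
    """Neighborhood points under the epsilon rule: ||x_i - x_j||_2 < eps."""
    adj = pairwise_distances(X) < eps
    np.fill_diagonal(adj, False)                  # a point is not its own neighbor
    return adj

def k_neighbors(X, k):
    """Neighborhood points under the k rule: either point is among the other's k nearest."""
    d = pairwise_distances(X)
    np.fill_diagonal(d, np.inf)
    nearest = np.argsort(d, axis=1)[:, :k]
    adj = np.zeros(d.shape, dtype=bool)
    adj[np.repeat(np.arange(len(X)), k), nearest.ravel()] = True
    return adj | adj.T                            # either direction suffices

X = np.array([[0.0], [1.0], [2.5], [10.0]])
eps_adj = epsilon_neighbors(X, eps=2.0)
knn_adj = k_neighbors(X, k=1)
print(eps_adj.astype(int))
print(knn_adj.astype(int))
```

In this example the distant point at 10.0 has no neighbors under the ε rule, whereas the k rule always connects it to its nearest point; this is one reason the k rule is convenient when sample density is uneven.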

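On this basis, a weighted adjacency matrix $W = (W_{ij})_{m \times m}$ is established,

where $D$ is a diagonal matrix with $D_{ii} = \sum_j W_{ij}$;

the corresponding solution is then given by the generalized eigenvalue problem:

$$X L X^T u = \lambda X D X^T u$$

where $L = D - W$ is the Laplacian matrix.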

The basis of LPP is the construction of a nearest neighbor graph that models the local structure of the high-dimensional dataset; this graph models the geometry of the dataset well when the distribution of the detection set is relatively uniform. However, the detection set is generated by random sampling, its distribution is unknown, and the samples are usually insufficient, so there is no guarantee that the local structure of the manifold is always accurately represented. In this case, the transformation matrix obtained may not be optimal, that is, the mapping may deviate considerably.

When applied to fault diagnosis of modern large-scale complex equipment, a traditional single artificial intelligence method suffers from defects such as low precision and poor generality. To address these difficulties, different artificial intelligence technologies can be combined, integrated or fused in a certain way to form a hybrid intelligent fault diagnosis technology, which improves the sensitivity, robustness and accuracy of the diagnosis and prediction system and gives it the capability to solve complex problems.

Boosting is a method for improving the accuracy of a weak classification algorithm. The Boosting framework operates on the training sample set to obtain different training sample subsets, and each sample subset is used to train a base classifier; thus, given the number of training rounds n, n base classifiers are generated, and the Boosting framework algorithm then performs weighted fusion of the n base classifiers to generate the final result classifier. The core idea of the algorithm is to increase the weights of the samples misclassified in the previous round and to decrease the weights of the samples classified correctly. Among the n base classifiers, the recognition rate of each individual classifier is not necessarily high, but the combined result has a high recognition rate, so the recognition rate of the weak classification algorithm is improved.
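The reweighting idea can be made concrete with one round on four samples. The sketch below assumes the standard AdaBoost update, with α = ½ ln((1 − ε)/ε) and exponential reweighting followed by normalization; the labels and the base classifier outputs are hypothetical.

```python
import math

D = [0.25, 0.25, 0.25, 0.25]     # current sample weights (uniform over 4 samples)
y = [1, 1, -1, -1]               # true labels
h = [1, 1, -1, 1]                # hypothetical base classifier output; the last sample is wrong

eps = sum(d for d, yi, hi in zip(D, y, h) if yi != hi)        # weighted error rate = 0.25
alpha = 0.5 * math.log((1 - eps) / eps)                        # classifier weight = 0.5 * ln 3
unnormalized = [d * math.exp(-alpha * yi * hi) for d, yi, hi in zip(D, y, h)]
Z = sum(unnormalized)                                          # normalization factor
D_next = [u / Z for u in unnormalized]
print([round(d, 4) for d in D_next])   # [0.1667, 0.1667, 0.1667, 0.5]
```

After the update the single misclassified sample carries half of the total weight, so the next base classifier is pushed to classify it correctly.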

Example 3

The intelligent operation and maintenance big data cloud platform of Aerospace Intelligent Control Monitoring Technology Co., Ltd. is used to collect low-pressure pump rotor operating vibration data in real time (8192 sampling points, sampling frequency 1280 Hz). The sample information is as follows:

Three types of vibration data are included: normal, unbalanced, and misaligned. Eleven time domain mechanism features are extracted from the three types of time domain vibration data, using the calculation formulas of the 11 time domain mechanism features given in Example 1.

The 11 feature indexes have different dimensions and units, which affects the result of data analysis. To eliminate this influence, the data must be standardized so that the indexes become comparable. After standardization of the raw data, all indexes are of the same order of magnitude and are suitable for comprehensive comparison and evaluation. The raw data set was standardized to a data set with mean 0 and variance 1.
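The feature extraction and standardization steps can be sketched as follows. The definitions used are common textbook forms of the 11 time domain indexes and are assumptions here; the formulas given in Example 1 may differ in detail (for example in how skewness and kurtosis are normalized). The function names are hypothetical. Standardization divides by the standard deviation, which is what yields unit variance.

```python
import numpy as np

def time_domain_features(x):
    """Return 11 common time domain indexes of a vibration signal x."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    mean = x.mean()
    std = x.std(ddof=1)
    sra = np.mean(np.sqrt(abs_x)) ** 2        # square root amplitude
    rms = np.sqrt(np.mean(x ** 2))            # root mean square value
    peak = abs_x.max()                        # x_p = max|x(n)|
    mean_abs = abs_x.mean()
    centered = x - mean
    return {
        "mean": mean,
        "std": std,
        "square_root_amplitude": sra,
        "rms": rms,
        "peak": peak,
        "waveform_index": rms / mean_abs,
        "peak_index": peak / rms,
        "pulse_index": peak / mean_abs,
        "margin_index": peak / sra,
        "skewness": np.mean(centered ** 3) / std ** 3,
        "kurtosis": np.mean(centered ** 4) / std ** 4,
    }

def standardize(F):
    """Z-score each column of the feature matrix F (rows are samples)."""
    mu = F.mean(axis=0)
    sigma = F.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # leave constant columns unscaled
    return (F - mu) / sigma

# Usage: a synthetic 50 Hz sine with 8192 points sampled at 1280 Hz.
n = np.arange(8192)
signal = np.sin(2 * np.pi * 50 * n / 1280)
feats = time_domain_features(signal)
```

In practice one feature vector is computed per recorded signal, the vectors are stacked into a matrix, and `standardize` is applied column by column. The mean and standard deviation learned from the historical samples would normally be reused for the samples to be detected.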

The feature extraction method and the classifier are the LPP feature extraction method and the Boosting classifier, respectively.

The three types of labeled vibration samples are analyzed by PCA, as shown in FIG. 3. The figure shows that the cumulative contribution rate of the first three principal components of PCA is less than 90%, so the three types are difficult to distinguish in 3-dimensional space, which indicates that the samples may have a nonlinear structure in the high-dimensional space.
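The cumulative contribution rate referred to here can be computed from the eigenvalues of the feature covariance matrix. The following is a minimal sketch on synthetic data; the function name is hypothetical and no values from FIG. 3 are reproduced.

```python
import numpy as np

def cumulative_contribution(Z):
    """Cumulative share of variance explained by the leading principal components.

    Z: (m, n) standardized feature matrix, one sample per row.
    """
    Zc = Z - Z.mean(axis=0)
    cov = np.cov(Zc, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # sort eigenvalues in descending order
    eigvals = np.clip(eigvals, 0.0, None)     # clip tiny negative round-off values
    return np.cumsum(eigvals) / eigvals.sum()

# Usage: check whether three components reach a 90% threshold.
Z_demo = np.random.default_rng(0).normal(size=(100, 11))
rates = cumulative_contribution(Z_demo)
three_components_enough = rates[2] >= 0.90
```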

Since the samples do not have a simple linear structure in the original feature space, manifold learning (LPP), which is based on nonlinear feature extraction, is more effective than the linear feature extraction method (PCA), as shown in FIG. 4.

To compare the fault diagnosis results, 75% of each type of sample was randomly selected for training and the remaining 25% was used for testing. The experiment was repeated 10 times and the results were averaged. The classification accuracy for each type of sample is shown in the following table:
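The evaluation protocol (a random 75%/25% split within each class, repeated 10 times and averaged) can be sketched as follows. The nearest-centroid classifier is only a hypothetical placeholder that keeps the example self-contained; it is not the classifier used in this document, and all function names are assumptions.

```python
import numpy as np

def stratified_split(y, train_frac=0.75, rng=None):
    """Randomly split the indices of each class into training and test parts."""
    rng = np.random.default_rng() if rng is None else rng
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        n_train = int(round(train_frac * len(idx)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

def nearest_centroid_predict(X_train, y_train, X_test):
    """Placeholder classifier: assign each test sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    d = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def mean_per_class_accuracy(X, y, predict, n_repeats=10, seed=0):
    """Average the per-class test accuracy over repeated random splits."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    acc = np.zeros((n_repeats, len(classes)))
    for r in range(n_repeats):
        tr, te = stratified_split(y, 0.75, rng)
        pred = predict(X[tr], y[tr], X[te])
        for k, c in enumerate(classes):
            mask = y[te] == c
            acc[r, k] = np.mean(pred[mask] == c)
    return dict(zip(classes.tolist(), acc.mean(axis=0)))

# Usage on three synthetic, well separated classes (0, 1, 2).
rng = np.random.default_rng(0)
X_demo = np.concatenate([rng.normal(loc=10.0 * c, scale=0.5, size=(40, 3)) for c in range(3)])
y_demo = np.repeat(np.arange(3), 40)
result = mean_per_class_accuracy(X_demo, y_demo, nearest_centroid_predict)
```

Any function with the same `predict(X_train, y_train, X_test)` signature can be passed in place of the placeholder.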

Because the samples do not have a simple linear structure in the high-dimensional space, traditional linear feature extraction methods such as PCA (principal component analysis) perform poorly: the average classification accuracy for normal samples is about 82%. Using the LPP method to extract the nonlinear information of the vibration signals, combined with the ensemble learning method Boosting, effectively improves the fault diagnosis result: the accuracies for normal, unbalanced and misaligned samples are 98.8%, 98.7% and 96.6%, respectively, which shows that the method can support better predictive maintenance of equipment.

The advantages of the method are as follows: the operating vibration data of the low-pressure pump rotor are collected in real time, the time domain mechanism features and manifold statistical features of the signals are extracted, and a fault diagnosis and prediction model is established in combination with the ensemble learning Boosting algorithm. The example results show that the method can realize intelligent identification and alarming of equipment faults, reduce the large amount of inspection work, such as manual judgment, required during equipment operation and maintenance, and improve the capability for intelligent management over the whole life cycle of the equipment.

Example 4

As shown in FIG. 2, a fault monitoring system based on manifold learning and big data analysis includes:

As can be seen from the above description, in this embodiment the data storage module stores the historical fault signal sample, the historical signal fault feature set and the historical signal fault feature subset, and caches the signal sample to be detected, the signal feature set to be detected and the signal feature subset to be detected. The statistical calculation module performs statistical analysis and calculation to generate the historical signal fault feature set and the signal feature set to be detected. The dimension reduction calculation module extracts discriminative features for dimension reduction to generate the historical signal fault feature subset and the signal feature subset to be detected. The classification generation module performs training calculation to generate the fault result classifier, and the fault diagnosis module performs prediction and judgment on the signal feature subset to be detected to obtain the fault diagnosis result of the signal sample to be detected.

The above examples are merely representative of preferred embodiments of the present invention, and the description thereof is more specific and detailed, but not to be construed as limiting the scope of the present invention. It should be noted that various changes, modifications and substitutions may be made by those skilled in the art without departing from the spirit of the invention, and all are intended to be included within the scope of the invention.

Claims

1. A fault monitoring method based on manifold learning and big data analysis is characterized by comprising the following steps:

step 2: extracting discriminative features from the historical signal fault feature set by using an LPP algorithm for dimension reduction to generate a historical signal fault feature subset;

step 4: collecting vibration signals in real time by using a sensor to obtain a signal sample to be detected, preprocessing the signal sample to be detected, and performing statistical analysis and calculation of the time domain mechanism features of the signal sample to be detected to generate a signal feature set to be detected;

step 6: performing prediction and judgment on the signal feature subset to be detected by using the fault result classifier to obtain a fault diagnosis result of the signal sample to be detected.

2. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 1, wherein: the time domain mechanism characteristics comprise a time domain mean value, a time domain standard deviation, a time domain square root amplitude, a time domain root mean square value, a time domain peak value, a time domain waveform index, a time domain peak value index, a time domain pulse index, a time domain margin index, a time domain skewness and a time domain kurtosis.

3. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 2, wherein: the calculation formula of the time domain mean value is as follows:

the calculation formula of the time domain peak value $x_p$ is as follows:

$x_p = \max|x(n)|$

the calculation formula of the time domain waveform index W is as follows:

the calculation formula of the time domain peak index C is as follows:

the calculation formula of the time domain pulse index I is as follows:

the calculation formula of the time domain margin index L is as follows:

the calculation formula of the time domain skewness S is as follows:

the calculation formula of the time domain kurtosis K is as follows:

4. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 3, wherein: after the time domain mechanism features are calculated, scale normalization needs to be performed on the time domain mechanism features, and the calculation formula is as follows:

wherein z is the data after scale normalization, x is the original data,

is the mean of the raw data, and δ is the variance of the raw data.

5. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 4, wherein: the LPP algorithm extracts discriminative features for dimension reduction, namely, neighborhood information is first established on the sample points and a linear transformation is derived, so that the original local information is retained in the subspace after dimension reduction; the feature set matrix is denoted $X_{m \times n}$, where m is the number of samples and n is the number of features; first, for each sample point $x_i$, its neighborhood points $x_j$ are selected, and on this basis a weighted adjacency matrix $W = (W_{ij})_{m \times m}$ is constructed,

wherein D is a diagonal matrix,

the corresponding solution is then as follows:

$X L X^{T} u = \lambda X D X^{T} u$.

6. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 5, wherein: whether $x_i$ and $x_j$ are neighborhood points of each other is judged by using the k-neighborhood method, and as long as one of $x_i$ and $x_j$ is among the k points nearest to the other, the two points are neighborhood points.

7. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 6, wherein: the Boosting framework algorithm performs training calculation on the historical signal fault feature subset to generate the fault result classifier as follows: the historical signal fault feature subset contains m training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, wherein $x_i \in X$ and $y_i \in Y = \{-1, +1\}$; the sample weights are initialized as $D_1(i) = 1/m$; for $t = 1, \ldots, T$, where T is the number of integrated models, a base classifier $h_t$ is trained on the sample distribution $D_t$, and the formula is as follows:

$h_t(X) \to \{-1, +1\}$

the calculation formula of the weight of the base classifier $h_t$ is as follows:

wherein $\varepsilon_t$ is the classification error rate of $h_t$ on the distribution $D_t$;

updating sample weights

wherein $Z_t = \sum_i D_t(i) \exp(-\alpha_t y_i h_t(x_i))$

Fault result classifier

8. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 7, wherein: the calculation formula of the classification error rate $\varepsilon_t$ is as follows:

9. The fault monitoring method based on manifold learning and big data analysis as claimed in claim 8, wherein: after detection of the signal sample to be detected is completed, the signal feature subset to be detected is stored in the historical signal fault feature subset according to the fault diagnosis result, and when a preset time is reached, the fault result classifier is updated according to the updated historical signal fault feature subset data.

10. A fault monitoring system based on manifold learning and big data analysis, comprising:

the data storage module is used for storing the historical fault signal sample, the historical signal fault feature set and the historical signal fault feature subset, and caching the signal sample to be detected, the signal feature set to be detected and the signal feature subset to be detected;