CN111898705A - Fault feature parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering

Publication number: CN111898705A
Application number: CN202010833932.XA
Authority: CN
Inventors: 郝慧娟; 程广河; 唐勇伟; 郝凤琦; 李娟
Original assignee: Shandong Computer Science Center National Super Computing Center in Jinan
Current assignee: Shandong Computer Science Center National Super Computing Center in Jinan
Priority date: 2020-08-18
Filing date: 2020-08-18
Publication date: 2020-11-06
Anticipated expiration: 2040-08-18
Also published as: CN111898705B

The invention discloses a fault feature parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering. It proposes adaptive hierarchical clustering built on a fuzzy relation derived from the logsig function and applies it to equipment fault diagnosis. Sensitive features are calculated and selected from the fuzzy relation without prior knowledge, which makes the method more intelligent. Using the preferred features simplifies the feature set, avoids the curse of dimensionality, reduces the computational burden, and improves fault diagnosis efficiency. Adaptive hierarchical clustering combined with the preferred features achieves higher diagnostic accuracy.

Fault feature parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering

Technical Field

The invention relates to a fault characteristic parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering, and belongs to the technical field of big data processing.

Background

With the development of science and technology, large-scale equipment is becoming increasingly complex and its parts increasingly interdependent. The failure of any part can force a shutdown, causing significant economic loss and, in serious cases, endangering personal safety. In addition, if a fault cannot be located accurately, blind repair and disassembly can introduce precision errors and reduce reliability. Fault diagnosis technology is therefore a precondition for safe and stable equipment operation and is equally important for equipment maintenance.

Because there are many measuring points and many monitored parameters (force, temperature, vibration, sound, energy, hydraulic pressure, and so on), condition monitoring produces diverse and complex big data, and equipment fault diagnosis has entered the big data era. High-dimensional features provide richer feature parameters for fault diagnosis, but when the feature dimension is too high and the training sample set is not large, problems such as overfitting arise in fault identification and reduce diagnostic accuracy.

In neural networks, the logsig function, logsig(x) = 1/(1 + e^(-x)), is commonly used. This function can characterize the fuzzy relationship between samples, and thus the ordered structure between them. Across different fault classes, the larger the difference in the same feature, the more sensitive that feature is for distinguishing the classes, and the larger the sensitivity coefficient it is assigned.

The hierarchical clustering algorithm is an unsupervised classification algorithm that suits data sets of arbitrary shape and does not require parameters such as cluster centers or the number of clusters to be determined in advance. However, it has no uniform termination criterion, so a corresponding threshold must still be set, and its computational cost is relatively high.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a fault characteristic parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering.

The invention provides the following technical scheme:

1) Fuzzy preference relation calculation

1.1) Given a system S = <X, Q, U>, where X = {x₁, x₂, …, x_N} denotes the sample set, Q = {q₁, q₂, …, q_J} is the feature set, and U = {u₁, u₂, …, u_C} is the fault set;

for x_k ∈ X, the fuzzy preference relation with respect to q_l ∈ Q is as follows:

where q_il, q_jl ∈ Q; i ≠ j; k is the number of clusters;

1.2) d_ij is further simplified, as shown in formula (2);

where Δq = q_il - q_jl;
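Formulas (1) and (2) are not reproduced in this text, so the following is only a minimal sketch of how a symmetric, logsig-based preference degree could be computed for one feature. It assumes the form d_ij = logsig(k·|Δq|), chosen because it equals 0.5 when Δq = 0 and tends to 1 as |Δq| grows; the function names and the sample values are hypothetical.

```python
import math

def logsig(x):
    """Log-sigmoid 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def fuzzy_preference_matrix(values, k=1.0):
    """Pairwise preference degrees d_ij for the values of one feature.

    ASSUMPTION: d_ij = logsig(k * |q_il - q_jl|). This is an illustrative
    stand-in for formula (2), not the formula from the source.
    """
    return [[logsig(k * abs(qi - qj)) for qj in values] for qi in values]

# Hypothetical values of a single feature over three samples.
feature_values = [0.10, 0.12, 0.90]
d = fuzzy_preference_matrix(feature_values, k=5.0)
```

Under this assumption the matrix is symmetric with 0.5 on the diagonal, and a larger k makes the preference degree rise faster with the same |Δq|.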

2) Sensitivity coefficient calculation

Assume a raw feature set containing C fault classes, {q_m,j, m = 1, 2, …, N; j = 1, 2, …, J}_C, where N is the number of samples per fault class, J is the number of features, and q_m,j represents the j-th feature value of the m-th sample;

the total number of samples in system S is M = N × C, and the total number of data values is L = N × C × J;

for each feature, formula (2) is applied to form a fuzzy relation matrix, from which the sensitivity coefficient is calculated

The sensitivity coefficient of each feature is:

3) Sensitive feature selection

The sensitivity coefficients of all features (SP₁, SP₂, …, SP_J) are sorted from small to large, and the first v are selected in order to give the sensitive features Q′ = {q′₁, q′₂, …, q′_v}, where v is the preset number of sensitive features;

Sensitive feature selection does not consider feature redundancy, so redundant features may still be included. To further improve efficiency and reduce the feature dimension, the invention uses the adaptive hierarchical clustering algorithm to remove redundant features.
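The ranking in step 3) can be sketched as follows. The sensitivity coefficient values are hypothetical, and the `ascending` flag is an assumption added for illustration: its default follows the small-to-large ordering stated in the text.

```python
def select_sensitive_features(sp, v, ascending=True):
    """Return the indices of the v features picked by ranking the
    sensitivity coefficients sp[0..J-1].

    ascending=True ranks from small to large, as the text states;
    the flag itself is a hypothetical convenience, not part of the source.
    """
    order = sorted(range(len(sp)), key=lambda j: sp[j], reverse=not ascending)
    return order[:v]

# Hypothetical sensitivity coefficients SP_1..SP_5.
sp = [0.62, 0.55, 0.91, 0.58, 0.74]
selected = select_sensitive_features(sp, v=3)
```

The function returns feature indices rather than values, so the same indices can be used to slice the feature matrix before the clustering step.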

4) Removing redundant features based on adaptive hierarchical clustering;

For a given clustering of the data set, the silhouette coefficient (contour coefficient) S_k is defined as follows:

where S_I is the silhouette coefficient of an individual sample, T is the number of samples in the data set, and k is the number of clusters;

where a(I) is the average distance between sample x_I and all other samples in the same class C, and b(I) is the minimum, over every class other than C, of the average distance between x_I and all samples in that class;

The selected sensitive features Q′ are normalized to obtain a normalized feature set, which is clustered by the adaptive hierarchical clustering method; the resulting number of clusters c is the number of preferred features, and the centers of the c classes are taken as the preferred features, forming the preferred feature set
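A minimal sketch of step 4) is given below. The formulas are not reproduced in this text, so it rests on several assumptions: the silhouette takes its standard form, S_I = (b(I) - a(I)) / max{a(I), b(I)} and S_k = (1/T) Σ S_I; clusters are merged with average linkage on Euclidean distance; the partition with the highest S_k is kept; and the member nearest each cluster mean stands in for the "center" of the class. The inputs are assumed to be already normalized, and all names and data are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_silhouette(points, labels):
    """S_k = (1/T) * sum over samples of (b - a) / max(a, b)."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            continue  # singleton cluster: its silhouette is taken as 0
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in groups.items() if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def average_linkage(points, c1, c2):
    """Average pairwise distance between two clusters of point indices."""
    return sum(dist(points[i], points[j]) for i in c1 for j in c2) / (len(c1) * len(c2))

def adaptive_hierarchical_clustering(points):
    """Merge from singletons down to 2 clusters and return the labelling
    with the highest mean silhouette (needs at least 3 points)."""
    clusters = [[i] for i in range(len(points))]
    best_score, best_labels = float("-inf"), None
    while len(clusters) > 2:
        p, q = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: average_linkage(points, clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
        labels = [0] * len(points)
        for lab, members in enumerate(clusters):
            for i in members:
                labels[i] = lab
        score = mean_silhouette(points, labels)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_labels, best_score

def representatives(points, labels):
    """Per cluster, the index of the member closest to the cluster mean."""
    chosen = []
    for lab in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == lab]
        center = [sum(col) / len(members) for col in zip(*(points[i] for i in members))]
        chosen.append(min(members, key=lambda i: dist(points[i], center)))
    return chosen

# Hypothetical normalized feature vectors, one row per sensitive feature.
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.0, 1.0), (0.1, 1.0)]
labels, score = adaptive_hierarchical_clustering(features)
preferred = representatives(features, labels)
```

Because every partition from T - 1 clusters down to 2 is scored, no cluster count or distance threshold has to be fixed beforehand; the cost is a brute-force search that is only practical for small feature sets such as the one sketched here.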

The invention has the beneficial effects that:

1. The method removes redundant features with an adaptive hierarchical clustering algorithm and uses the clustering silhouette (contour) coefficient as the index of clustering validity. The number of clusters need not be preset; it is determined adaptively, and the resulting clustering makes the inter-class distance as large as possible and the intra-class distance as small as possible, so that the classes are well separated;

2. The invention proposes adaptive hierarchical clustering built on a fuzzy relation derived from the logsig function and applies it to equipment fault diagnosis. Sensitive features are calculated and selected from the fuzzy relation without prior knowledge, which makes the method more intelligent. Using the preferred features simplifies the feature set, avoids the curse of dimensionality, reduces the computational burden, and improves fault diagnosis efficiency. Adaptive hierarchical clustering combined with the preferred features achieves higher diagnostic accuracy.

3. Monitoring data are often fuzzy and uncertain. The fuzzy preference relation is inherently suited to such data: it better reflects the preferences of a decision maker and describes the system more comprehensively. For fault diagnosis on big data, the method combines the fuzzy preference relation to reduce the feature dimension, remove redundant features, and select from the high-dimensional features the combination carrying the most fault diagnosis information, improving the efficiency of fault diagnosis.

Drawings

FIG. 1 is a graph of the logsig function over the interval [ -1, 1 ];

FIG. 2 is a fuzzy preference relationship based on the features of equation (1);

FIG. 3 is a flow chart of the adaptive hierarchical clustering algorithm of the present invention;

FIG. 4 is a schematic diagram of the sensitive feature selection described in Example 2.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1

As shown in fig. 1 and 3.

A fault characteristic parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering comprises the following steps:

1) Fuzzy preference relation calculation

For x_k ∈ X, the fuzzy preference relation with respect to q_l ∈ Q is as follows:

where q_il, q_jl ∈ Q; i ≠ j; k is the number of clusters;

As can be seen from FIG. 2, d_ij(q) = d_ji(q); when i = j, d_ij(q) = 0.5; as |Δq| increases, d_ij(q) increases from 0.5; and when q_il >> q_jl, d_ij(q) → 1. Therefore, in feature selection it is only necessary to characterize the size of the difference between the two feature values, without specifying whether q_il is greater or less than q_jl.

1.2) d_ij is further simplified, as shown in formula (2);

where Δq = q_il - q_jl;

As can be seen from FIG. 2, d_ij varies considerably with the value of the parameter k, and the preference degree of the fuzzy relation between features changes accordingly.

2) Sensitivity coefficient calculation

The sensitivity coefficient of each feature is:

3) Sensitive feature selection

4) Removing redundant features based on adaptive hierarchical clustering;

The selected sensitive features Q′ are normalized to obtain a normalized feature set, which is clustered by the adaptive hierarchical clustering method; the resulting number of clusters c is the number of preferred features, and the centers of the c classes are taken as the preferred features, forming the preferred feature set

Example 2

As shown in fig. 4.

The method of Example 1 is applied to fault diagnosis and fault type determination for an integrated bearing simulation system, and comprises the following steps:

A vibration sensor is used to acquire vibration signals of the bearing simulation system in four states: normal, outer ring fault, inner ring fault, and rolling element fault;

A1) Feature extraction

Time-domain features, frequency-domain features, IMF component features from EEMD decomposition, and wavelet packet decomposition energies are extracted from the original vibration signal to form a feature set;

A1.1) Time-domain features

Mean value:

Standard deviation:

Root mean square:

Peak value: F_p = max|x(n)|;

Waveform index:

Pulse factor:

Margin index:

Crest factor: F_cf = F_p/F_rms;

Kurtosis:

Skewness:

where x(n) is the time-domain sequence of the signal and N is the number of vibration sample points;
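Most of the formulas for these time-domain features are not reproduced in this text. The sketch below therefore uses common textbook definitions as an assumption; only the peak value max|x(n)| and the crest factor F_p/F_rms are taken from the text. In particular, the population (1/N) normalization of the standard deviation, kurtosis and skewness is a choice made here, and the dictionary keys are hypothetical names.

```python
import math

def time_domain_features(x):
    """Common time-domain statistics of a vibration sequence x(n).

    ASSUMPTION: textbook definitions with population (1/N) moments; the
    source's own formulas may use a different normalization.
    The signal must not be constant or all zero (division by zero).
    """
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)                          # F_p = max|x(n)|
    root_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "waveform_index": rms / abs_mean,
        "pulse_factor": peak / abs_mean,
        "margin_index": peak / root_amp,
        "crest_factor": peak / rms,                        # F_cf = F_p / F_rms
        "kurtosis": sum((v - mean) ** 4 for v in x) / n / std ** 4,
        "skewness": sum((v - mean) ** 3 for v in x) / n / std ** 3,
    }

# Hypothetical four-sample square-wave signal.
signal = [1.0, -1.0, 1.0, -1.0]
feats = time_domain_features(signal)
```

For this square-wave input every dimensionless indicator equals 1 and the skewness is 0, which makes it a convenient sanity check before applying the function to measured vibration data.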

A1.2) Frequency-domain features

Frequency-domain mean value:

Center frequency:

Frequency root mean square:

Standard deviation of frequency:

where f_k is the frequency value of the k-th spectral line, s(k) is the spectrum of the signal x(n), and K is the number of spectral lines;
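The frequency-domain formulas are likewise not reproduced here. As a hedged illustration, the sketch below uses spectrum-weighted definitions that are common in bearing diagnostics: the mean is the average of s(k), and the center frequency, frequency root mean square and frequency standard deviation are moments of f_k weighted by s(k). These definitions, the function name and the sample spectrum are assumptions, not the source's formulas.

```python
import math

def frequency_domain_features(freqs, spectrum):
    """Spectrum-weighted statistics from spectral lines (f_k, s(k)).

    ASSUMPTION: commonly used definitions; s(k) is taken to be a
    non-negative amplitude spectrum with a non-zero sum.
    """
    K = len(spectrum)
    total = sum(spectrum)
    center = sum(f * s for f, s in zip(freqs, spectrum)) / total
    return {
        "mean": total / K,
        "center_frequency": center,
        "rms_frequency": math.sqrt(
            sum(f * f * s for f, s in zip(freqs, spectrum)) / total),
        "std_frequency": math.sqrt(
            sum((f - center) ** 2 * s for f, s in zip(freqs, spectrum)) / total),
    }

# Hypothetical three-line spectrum.
freqs = [1.0, 2.0, 3.0]
spectrum = [1.0, 2.0, 1.0]
spec_feats = frequency_domain_features(freqs, spectrum)
```

With these weighted definitions the squared frequency root mean square equals the squared center frequency plus the squared frequency standard deviation, which gives a quick consistency check on the results.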

A1.3) Energy index

The energy index is defined on the i-th sub-band of layer l of the wavelet packet decomposition.

A2) Sensitive feature selection

A large number of features not only reduces computational efficiency but can also cause the curse of dimensionality. The sensitivity coefficients of the 134 features are calculated by the fuzzy-preference-relation-based method, and 41 sensitive features are selected;

A3) Preferred feature selection

The selected 41 sensitive features are normalized and then clustered with the adaptive hierarchical clustering algorithm. In this example, the finally determined number of clusters is c = 12.

A4) Fault diagnosis

The 12 preferred features are fed into the adaptive hierarchical clustering algorithm, and the actually acquired vibration signals are identified according to the trained fault model to obtain a cluster category, thereby realizing fault diagnosis and determining the fault type. In this embodiment, the classification accuracy reaches 99.4%.

Finally, it should be noted that the above is intended only to illustrate the technical solutions of the present invention and not to limit its protection scope; those of ordinary skill in the art may make simple modifications or equivalent substitutions to the technical solutions without departing from their spirit and scope.

[Table: Serial number | Operating state | Status flag; rows include outer ring fault, inner ring fault, and rolling element fault]

1. A fault characteristic parameter selection method based on fuzzy preference relation and adaptive hierarchical clustering is characterized by comprising the following steps:

1) Fuzzy preference relation calculation

1.1) Given a system S = <X, Q, U>, where X = {x₁, x₂, ..., x_N} denotes the sample set, Q = {q₁, q₂, ..., q_J} is the feature set, and U = {u₁, u₂, ..., u_C} is the fault set;

the fuzzy preference relation with respect to q_l ∈ Q is as follows:

where q_il, q_jl ∈ Q; i ≠ j; k is the number of clusters;

1.2) d_ij is further simplified, as shown in formula (2);

where Δq = q_il - q_jl;

2) Sensitivity coefficient calculation

Assume a raw feature set containing C fault classes, {q_m,j, m = 1, 2, ..., N; j = 1, 2, ..., J}_C, where N is the number of samples per fault class, J is the number of features, and q_m,j represents the j-th feature value of the m-th sample;

The sensitivity coefficient of each feature is:

3) Sensitive feature selection

The sensitivity coefficients of all features (SP₁, SP₂, ..., SP_J) are sorted from small to large, and the first v are selected in order to give the sensitive features Q' = {q'₁, q'₂, ..., q'_v}, where v is the preset number of sensitive features;

4) removing redundant features based on adaptive hierarchical clustering;