CN112232252A

CN112232252A - Optimal transport-based transmission chain unsupervised domain adaptive fault diagnosis method

Info

Publication number: CN112232252A
Application number: CN202011152085.7A
Authority: CN
Inventors: 刘朝华; 蒋林博; 王畅通; 吴亮红; 陈磊; 张红强; 李小花
Original assignee: Hunan University of Science and Technology
Current assignee: Hunan University of Science and Technology
Priority date: 2020-10-23
Filing date: 2020-10-23
Publication date: 2021-01-15
Anticipated expiration: 2040-10-23
Also published as: CN112232252B

Abstract

The invention discloses an optimal transport-based unsupervised domain adaptive fault diagnosis method for a transmission chain, comprising the following steps: building a transmission chain fault database; constructing an unsupervised domain adaptive feature extraction and classification model; establishing an unsupervised domain adaptive fault diagnosis model that incorporates optimal transport; and feeding the target domain data into the trained feature extractor and softmax classifier to obtain predicted labels for the target domain data and to calculate the classification accuracy on the target domain data. The invention combines optimal transport theory with domain adaptation theory: by minimizing the distance between the features and labels of the fault source domain and those of the target domain, it aligns the distributions in the feature space and the label space, solves the domain adaptation problem between the source domain and the target domain in fault diagnosis, and improves fault diagnosis accuracy.

Description

Optimal transport-based transmission chain unsupervised domain adaptive fault diagnosis method

Technical Field

The invention relates to a fault diagnosis method for a transmission chain, in particular to an unsupervised domain adaptive fault diagnosis method for the transmission chain based on optimal transportation.

Background

The transmission chain plays an important role in industrial application and is a component of many industrial systems, such as high-speed rails, airplanes, wind power systems and the like, but the transmission chain is prone to failure. The fault diagnosis of the transmission chain is important for ensuring the safe operation of the system, reducing the machine downtime and saving the maintenance cost. Therefore, the diagnosis of the failure of the drive train has received a great deal of attention, and a large number of failure diagnosis methods have been proposed.

Fault diagnosis methods can be broadly classified into model-based methods and data-driven methods. Early fault diagnosis relied mainly on physical models, which can accurately describe the relation between a fault and the related industrial system. However, model-based diagnosis has two main disadvantages: 1) it depends heavily on a priori knowledge of the system; 2) disturbances in the industrial process and some assumptions about the system (e.g., the form of the noise and the operating conditions of the system) may be inappropriate, which can lead to uncertainty and misdiagnosis. Data-driven fault diagnosis analyzes the acquired data directly, using techniques such as signal processing and machine learning; it reduces the dependence on prior knowledge of the system and is better suited to modern industrial applications. For example, the Hilbert transform and the wavelet transform are widely used for fault feature analysis. While these methods can perform well on fault diagnosis tasks, they still face the challenge of automatically extracting fault features from the raw fault signal, and manual extraction of fault features is very time consuming.

In recent years, deep learning has been widely applied to fault diagnosis. Models such as deep stacked networks and convolutional neural networks extract features automatically and have been used to solve fault diagnosis problems. Most existing methods can detect faults accurately, but their success rests on two assumptions: 1) a large amount of labeled training data is available; 2) the training data of the source domain and the test data of the target domain follow the same distribution. When these two assumptions are violated, the performance of these algorithms may degrade significantly. In practical industrial applications, such as wind power generation systems, the assumptions often do not hold because of changes in the operating environment and instability of the load torque. Moreover, collecting data is very time consuming, so it is often difficult to gather a large amount of labeled training data. It is also generally recognized that models trained from raw data may be less reliable. Even if labeled data is available under certain operating conditions, the data distribution may change under new operating conditions.

Domain adaptation is an effective approach to data imbalance and data scarcity: it mitigates the influence of differences in data distribution and can markedly improve classifier performance. However, existing fault diagnosis methods based on domain adaptation theory have two main disadvantages: 1) when the distributions of the source domain and the target domain are multi-modal, the probability measures used by these methods cannot represent the difference between the two domains; 2) the difference between the source domain and the target domain is considered only in the high-dimensional feature space, while the difference in the label space is often ignored, resulting in insufficient adaptation. Optimal transport (OT) theory is a powerful tool for calculating the distance between probability distributions. The OT distance can be calculated directly from samples of the distributions, without density estimation or other non-parametric methods.

Disclosure of Invention

In order to solve the technical problems, the invention provides the optimal transport-based transmission chain unsupervised domain adaptive fault diagnosis method which is simple in algorithm and high in diagnosis precision.

The technical scheme for solving the problems is as follows: an optimal transportation-based transmission chain unsupervised domain adaptive fault diagnosis method comprises the following steps:

(1) building a transmission chain fault database: acquiring fault data of a source domain and a target domain of the transmission chain, wherein the labeled source domain fault sample set is $\{x_1^s, x_2^s, \ldots, x_m^s\}$ and the corresponding label set is $\{y_1^s, y_2^s, \ldots, y_m^s\}$, where $x_m^s$ is the m-th source domain fault sample, $y_m^s$ is the label corresponding to $x_m^s$, s denotes the source domain, and m is the total number of source domain samples; the unlabeled target domain fault sample set is $\{x_1^t, x_2^t, \ldots, x_n^t\}$, where $x_n^t$ is the n-th target domain fault sample, t denotes the target domain, and n is the total number of target domain samples; the source domain fault sample set and the target domain fault sample set jointly form the transmission chain fault database;

(2) constructing an unsupervised domain adaptive feature extraction and classification model: an automatic encoder is used as a feature extractor f to automatically extract features of source domain and target domain fault samples, and a softmax classifier g is trained by using the source domain samples, so that a prediction label of the target domain fault samples is obtained, and unsupervised domain adaptive feature extraction and classification model construction of the target domain fault samples are realized;

(3) establishing an unsupervised domain adaptive fault diagnosis model fusing optimal transportation: processing fault samples of a source domain and a target domain through an unsupervised domain adaptive feature extraction and classification model to obtain sample features and corresponding labels; integrating the feature extractor f and the softmax classifier g into a target function of an optimal transport solver, calculating a transport planning matrix gamma through sample features and labels, aligning the distribution of a source domain and a target domain, and transporting source domain label information to the target domain; alternately optimizing the unsupervised domain adaptive feature extraction and classification model and the optimal transport solver, thereby obtaining an unsupervised domain adaptive fault diagnosis model fusing optimal transport;

(4) feeding the target domain data into the trained feature extractor f and softmax classifier g to obtain predicted labels for the target domain data, and calculating the classification accuracy on the target domain data.

In the optimal transportation-based unsupervised domain adaptive fault diagnosis method for the transmission chain, in the step (2), an automatic encoder is adopted as a feature extractor f, and the specific process of automatically extracting the features of the fault samples of the source domain and the target domain comprises the following steps:

the autoencoder encodes the input data x into a representative feature y through a function $h_\theta$; this process is expressed as:

$$y = h_\theta(x) = \sigma(Wx + b)$$

where θ denotes the parameters of the encoding part, W is the weight matrix of the encoding process, b is the bias vector of the encoding process, and σ is the activation function;

accordingly, the decoding part reconstructs the representative feature back into the input data through a function $h'_{\theta'}$:

$$x' = h'_{\theta'}(y) = \sigma'(W'y + b')$$

where θ' denotes the parameters of the decoding part, x' is the reconstructed input data, W' is the weight matrix of the decoding process, b' is the bias vector of the decoding process, and σ' is the activation function of the decoding process;

the loss function of the entire autoencoder network is:

$$L_f = \frac{1}{I}\sum_{\rho=1}^{I} \|x'_\rho - x_\rho\|_2^2$$

where $x_\rho$ and $x'_\rho$ denote the ρ-th input data and its reconstruction by the decoder, respectively, ρ = 1, 2, …, I, I is the number of input data, and $\|x'_\rho - x_\rho\|_2$ is the 2-norm of $x'_\rho - x_\rho$.
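As an illustration of the encoding, decoding and reconstruction loss described above, the following Python sketch uses only the standard library. It is a minimal sketch under assumptions, not the implementation from the filing: the sigmoid activation, the layer sizes, the random weights, and the mean squared form of the loss are all assumed for the example, and the helper names (`encode`, `decode`, `reconstruction_loss`) are hypothetical.

```python
import math
import random

def sigmoid(v):
    """Element-wise logistic activation, standing in for sigma and sigma'."""
    return [1.0 / (1.0 + math.exp(-z)) for z in v]

def affine(W, x, b):
    """Compute W x + b for a weight matrix W (list of rows) and a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def encode(x, W, b):
    """y = h_theta(x) = sigma(W x + b)."""
    return sigmoid(affine(W, x, b))

def decode(y, W_dec, b_dec):
    """x' = h'_theta'(y) = sigma'(W' y + b')."""
    return sigmoid(affine(W_dec, y, b_dec))

def reconstruction_loss(inputs, W, b, W_dec, b_dec):
    """Mean over the I inputs of the squared 2-norm of x'_rho - x_rho (assumed form)."""
    total = 0.0
    for x in inputs:
        x_rec = decode(encode(x, W, b), W_dec, b_dec)
        total += sum((r - xi) ** 2 for r, xi in zip(x_rec, x))
    return total / len(inputs)

rng = random.Random(0)
d_in, d_hid = 4, 2  # hypothetical input and feature dimensions
W = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hid)]
b = [0.0] * d_hid
W_dec = [[rng.uniform(-1, 1) for _ in range(d_hid)] for _ in range(d_in)]
b_dec = [0.0] * d_in
samples = [[rng.random() for _ in range(d_in)] for _ in range(5)]

features = [encode(x, W, b) for x in samples]      # representative features y
loss = reconstruction_loss(samples, W, b, W_dec, b_dec)
```

In a trained model the weights would be fitted by minimizing this loss; here they are random, so `loss` only demonstrates how the quantity is computed.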

In the optimal transport-based drive chain unsupervised domain adaptive fault diagnosis method, in the step (2), the softmax classifier has the function of estimating the probability that each sample belongs to each class, and taking the class with the highest probability as the class of the sample;

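given input data and corresponding labels $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(\tau)}, y^{(\tau)})\}$, where $x^{(\tau)}$ denotes the τ-th input data, $y^{(\tau)}$ denotes the label corresponding to $x^{(\tau)}$, and $y^{(\tau)} \in \{1, 2, \ldots, k\}$, with k the number of label categories; the label probabilities of the input data $x^{(\tau)}$ are then:

$$\begin{bmatrix} P(y^{(\tau)} = 1 \mid x^{(\tau)}; \theta) \\ \vdots \\ P(y^{(\tau)} = k \mid x^{(\tau)}; \theta) \end{bmatrix} = \frac{1}{\sum_{l=1}^{k} e^{\theta_l^T x^{(\tau)}}} \begin{bmatrix} e^{\theta_1^T x^{(\tau)}} \\ \vdots \\ e^{\theta_k^T x^{(\tau)}} \end{bmatrix}$$

where $\theta_1, \theta_2, \ldots, \theta_k$ denote the parameters of the softmax classifier, T denotes the transpose of a vector, $\sum_{l=1}^{k} e^{\theta_l^T x^{(\tau)}}$ sums over all $e^{\theta_l^T x^{(\tau)}}$ to normalize the probability values, and $\theta_l^T$ denotes the transpose of the l-th parameter $\theta_l$;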

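cross-entropy loss is used as the loss function of the softmax classifier:

$$L_g = -\frac{1}{\tau}\sum_{\alpha=1}^{\tau}\sum_{\beta=1}^{k} 1\{y^{(\alpha)} = \beta\}\,\log\frac{e^{\theta_\beta^T x^{(\alpha)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(\alpha)}}}$$

where 1{·} is the indicator function, with 1{expression is true} = 1 and 1{expression is false} = 0; $y^{(\alpha)}$ is the label of the α-th input data $x^{(\alpha)}$, α = 1, 2, …, τ; β denotes the β-th class label, β = 1, 2, …, k; and $\theta_\beta^T$ denotes the transpose of the β-th parameter $\theta_\beta$.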
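The softmax classifier and its cross-entropy loss can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the parameter values, the toy inputs, the averaging over the τ samples, and the function names (`softmax_probs`, `predict`, `cross_entropy`) are assumptions made for the example.

```python
import math

def softmax_probs(theta, x):
    """Return [P(y=1|x), ..., P(y=k|x)] for parameter rows theta_1..theta_k."""
    scores = [sum(t * xi for t, xi in zip(row, x)) for row in theta]
    shift = max(scores)  # subtracting the max leaves the probabilities unchanged
    exps = [math.exp(s - shift) for s in scores]
    norm = sum(exps)     # normalization term: sum over all exp(theta_l^T x)
    return [e / norm for e in exps]

def predict(theta, x):
    """Class (1-based, as in the text) with the highest probability."""
    probs = softmax_probs(theta, x)
    return probs.index(max(probs)) + 1

def cross_entropy(theta, xs, ys):
    """Average negative log-probability of the true labels (averaging is assumed)."""
    total = 0.0
    for x, y in zip(xs, ys):
        # the indicator 1{y = beta} keeps only the term of the true class
        total -= math.log(softmax_probs(theta, x)[y - 1])
    return total / len(xs)

theta = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]  # hypothetical parameters, k = 3
xs = [[1.0, 0.0], [0.0, 1.0]]                   # toy feature vectors
ys = [1, 2]                                     # their labels in {1, ..., k}

probs = softmax_probs(theta, xs[0])
loss = cross_entropy(theta, xs, ys)
```

Subtracting the largest score before exponentiating is a common numerical safeguard and does not change the resulting probabilities.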

In the optimal transportation-based unsupervised domain adaptive fault diagnosis method for the transmission chain, in the step (3), an optimal transportation solver is used for calculating the distance between the probability distributions of a source domain and a target domain, and the domain adaptation problem is regarded as a special case of a discrete optimal transportation problem;

let the marginal distributions of the source domain and target domain samples be $\mu_s$ and $\mu_t$, respectively; the distance between the source domain and the target domain over the feature space and the label space is then:

$$C(i, j) = \eta\,\|f(x_i^s) - f(x_j^t)\|^2 + L_g(y_i^s, y_j^t)$$

where C(i, j) is the cost of moving probability mass from the i-th source domain sample feature and label to the j-th target domain sample feature and label; C is the cost matrix, i = 1, 2, …, m, j = 1, 2, …, n; $f(x_i^s)$ denotes the features of the i-th source domain sample; $f(x_j^t)$ denotes the features of the j-th target domain sample; η is a trade-off parameter; $\|f(x_i^s) - f(x_j^t)\|^2$ is the squared L2 distance between the source domain and target domain sample features; $L_g(y_i^s, y_j^t)$ is the cross-entropy loss between the source domain and target domain labels;

since the target domain samples have no corresponding labels, $y_j^t$ cannot be used directly, so the softmax classifier g trained on the source domain generates a label $g(f(x_j^t))$ for each target domain sample feature, namely:

$$C(i, j) = \eta\,\|f(x_i^s) - f(x_j^t)\|^2 + L_g\big(y_i^s, g(f(x_j^t))\big)$$

the objective function for solving the optimal transport plan is:

$$\min_{\gamma \in \Pi} \langle \gamma, C \rangle_F$$

where $\Pi = \{\gamma \in (\mathbb{R}^+)^{m \times n} \mid \gamma 1_n = \mu_s,\ \gamma^T 1_m = \mu_t\}$ denotes the set of all transport plans γ; γ(i, j) is the probability mass transferred from the i-th source domain sample feature and label to the j-th target domain sample feature and label; $1_n$ is an n × 1 vector of ones; $1_m$ is an m × 1 vector of ones; $(\mathbb{R}^+)^{m \times n}$ denotes the m × n matrices with non-negative real entries; $\langle \cdot, \cdot \rangle_F$ denotes the Frobenius inner product;

the initial value $\gamma_0$ of the transport plan is:

$$\gamma_0 = \arg\min_{\gamma \in \Pi} \langle \gamma, C_0 \rangle_F$$

where $C_0$ is the cost matrix for moving probability mass from the source domain samples to the target domain samples, and $\gamma_0(i, j)$ is the probability mass transferred from the i-th source domain sample to the j-th target domain sample.
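To make the cost matrix and the transport plan concrete, the sketch below builds a joint feature-and-label cost and computes a plan whose marginals match the two distributions. Several points are assumptions for illustration: the cost uses a squared L2 feature distance plus a cross-entropy term with trade-off `eta`, labels are 0-based, the target class probabilities are given as inputs, and the plan is obtained with entropic-regularized Sinkhorn iterations, which is a swapped-in approximation and not the network simplex solver named in the text. The function names are hypothetical.

```python
import math

def joint_cost(src_feats, src_labels, tgt_feats, tgt_probs, eta):
    """C(i,j) = eta * ||f(x_i^s) - f(x_j^t)||^2 - log p_j[y_i^s] (assumed form).

    tgt_probs[j] holds the classifier's class probabilities for target sample j;
    labels are 0-based here, unlike the 1-based labels of the text.
    """
    C = []
    for fs, y in zip(src_feats, src_labels):
        row = []
        for ft, p in zip(tgt_feats, tgt_probs):
            dist2 = sum((a - b) ** 2 for a, b in zip(fs, ft))
            row.append(eta * dist2 - math.log(p[y]))
        C.append(row)
    return C

def sinkhorn_plan(C, mu_s, mu_t, reg=0.1, n_iter=200):
    """Approximate transport plan via Sinkhorn scaling (not the exact LP solution)."""
    m, n = len(C), len(C[0])
    K = [[math.exp(-C[i][j] / reg) for j in range(n)] for i in range(m)]
    u, v = [1.0] * m, [1.0] * n
    for _ in range(n_iter):
        # rescale rows, then columns, to match the prescribed marginals
        u = [mu_s[i] / sum(K[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [mu_t[j] / sum(K[i][j] * u[i] for i in range(m)) for j in range(n)]
    return [[u[i] * K[i][j] * v[j] for j in range(n)] for i in range(m)]

# Toy data: two source and two target samples with 2-D features (hypothetical).
src_feats = [[0.0, 0.0], [1.0, 1.0]]
src_labels = [0, 1]
tgt_feats = [[0.1, 0.0], [0.9, 1.0]]
tgt_probs = [[0.9, 0.1], [0.2, 0.8]]

C = joint_cost(src_feats, src_labels, tgt_feats, tgt_probs, eta=1.0)
plan = sinkhorn_plan(C, mu_s=[0.5, 0.5], mu_t=[0.5, 0.5])
transport_cost = sum(plan[i][j] * C[i][j] for i in range(2) for j in range(2))
```

In this toy case almost all mass moves along the diagonal, pairing each source sample with the target sample that is close in feature space and agrees in predicted label. An exact linear-programming solver would be the closer match to the method described; the Sinkhorn variant is used here only to keep the example dependency-free.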

In the above optimal transportation-based transmission chain unsupervised domain adaptive fault diagnosis method, in the step (3), the target loss function to be optimized by the whole model is as follows:

where f' denotes the decoder of the autoencoder; $f'(f(x_i^s))$ and $f'(f(x_j^t))$ denote the source domain feature $f(x_i^s)$ and the target domain feature $f(x_j^t)$ decoded back into input data, respectively; $L_f$ and $L_g$ denote the loss functions of the feature extractor f and the softmax classifier g, respectively;

the target loss function is solved alternately using a coordinate descent algorithm:

when the parameters of the feature extractor f and the softmax classifier g are fixed, the target loss function is written as:

$$\min_{\gamma \in \Pi} \langle \gamma, C \rangle_F$$

the above formula is a standard linear programming problem, which is solved by the network simplex algorithm;

when the transport plan γ is fixed, the target loss function is written as:

the Adam algorithm is used to solve the above equation.
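The two alternating subproblems can be read off an objective of the following shape. This is a sketch under assumptions rather than the formula from the filing: the unweighted sum of the terms and the use of the trade-off parameter η inside the transport term are assumed, while the symbols f, f', g, L_f, L_g, γ and Π are those defined in the text.

```latex
\min_{\gamma \in \Pi,\ f,\ g}\;
  \sum_{i=1}^{m} L_f\big(x_i^s,\, f'(f(x_i^s))\big)
+ \sum_{j=1}^{n} L_f\big(x_j^t,\, f'(f(x_j^t))\big)
+ \sum_{i=1}^{m} L_g\big(y_i^s,\, g(f(x_i^s))\big)
+ \sum_{i=1}^{m}\sum_{j=1}^{n} \gamma(i,j)
  \Big[\eta\,\|f(x_i^s) - f(x_j^t)\|^2 + L_g\big(y_i^s,\, g(f(x_j^t))\big)\Big]
```

Under this assumed form, with f and g fixed only the last double sum depends on γ, and it is linear in γ over Π, which is why that step is a linear program. With γ fixed, the whole expression is a differentiable function of the parameters of f, f' and g, so a gradient-based optimizer such as Adam applies.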

In the optimal transport-based transmission chain unsupervised domain adaptive fault diagnosis method, in step (4), the prediction accuracy of the target domain sample labels is calculated according to the following formula:

$$\text{accuracy} = \frac{1}{n}\sum_{j=1}^{n} 1\{\hat{y}_j^t = y_j^t\}$$

where $\hat{y}_j^t$ is the predicted label of sample $x_j^t$ and $y_j^t$ is the corresponding true label; 1{·} is a binary function: when $\hat{y}_j^t = y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 1; when $\hat{y}_j^t \neq y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 0.
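The accuracy computation amounts to counting matches between predicted and true labels and dividing by the number of target samples. A minimal Python sketch follows; the function name `accuracy` and the toy labels are hypothetical.

```python
def accuracy(predicted, true):
    """Fraction of target samples whose predicted label equals the true label."""
    if len(predicted) != len(true) or not true:
        raise ValueError("label lists must be non-empty and of equal length")
    # the indicator 1{predicted == true} contributes 1 per match, 0 otherwise
    hits = sum(1 for p, t in zip(predicted, true) if p == t)
    return hits / len(true)

predicted_labels = [1, 2, 2, 3]  # toy predictions for n = 4 target samples
true_labels = [1, 2, 3, 3]       # true labels, used only for evaluation
acc = accuracy(predicted_labels, true_labels)  # 3 of 4 match
```

The true target labels enter only at evaluation time; they are not used during training, which is what makes the adaptation unsupervised.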

The invention has the beneficial effects that:

1. The invention uses an autoencoder (AE) network as the feature extractor to extract representative features from the raw data, and uses a softmax classifier trained on source domain samples to classify the target domain fault samples, thereby realizing unsupervised fault diagnosis of the target domain samples.

2. The invention incorporates optimal transport theory into the domain adaptation problem: by aligning the feature space and the label space of the source domain and the target domain, the feature extractor and the softmax classifier trained on source domain samples can be applied to target domain samples, which improves both feature extraction capability and fault diagnosis capability.

Drawings

FIG. 1 is a flow chart of the present invention.

FIG. 2 is a schematic diagram of an automatic encoder according to the present invention.

FIG. 3 is a schematic diagram of the optimal transport problem of the present invention.

FIG. 4 is a bar graph of a comparative experiment of the present invention.

Detailed Description

The invention is further described below with reference to the figures and examples.

As shown in fig. 1, a method for diagnosing an unsupervised domain adaptive fault of a transmission chain based on optimal transportation includes the following steps:

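(1) Building a transmission chain fault database: fault data of a source domain and a target domain of the transmission chain are acquired. The labeled source domain fault sample set is $\{x_1^s, x_2^s, \ldots, x_m^s\}$ and the corresponding label set is $\{y_1^s, y_2^s, \ldots, y_m^s\}$, where $x_m^s$ is the m-th source domain fault sample, $y_m^s$ is the label corresponding to $x_m^s$, s denotes the source domain, and m is the total number of source domain samples. The unlabeled target domain fault sample set is $\{x_1^t, x_2^t, \ldots, x_n^t\}$, where $x_n^t$ is the n-th target domain fault sample, t denotes the target domain, and n is the total number of target domain samples.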

(2) Constructing an unsupervised domain adaptive feature extraction and classification model: an autoencoder is adopted as the feature extractor f to automatically extract the features of the source domain and target domain fault samples, and a softmax classifier g is trained with the source domain samples, so as to obtain predicted labels for the target domain fault samples and realize unsupervised domain adaptive feature extraction and classification of the target domain fault samples.

The auto-encoder is used to learn more compact sample features with class discrimination, thereby speeding up the training process;

As shown in FIG. 2, the autoencoder encodes the input data x into a representative feature y through a function $h_\theta$. This process is expressed as:

$$y = h_\theta(x) = \sigma(Wx + b)$$

where θ denotes the parameters of the encoding part, W is the weight matrix of the encoding process, b is the bias vector of the encoding process, and σ is the activation function.

Accordingly, the decoding part reconstructs the representative feature back into the input data through a function $h'_{\theta'}$:

$$x' = h'_{\theta'}(y) = \sigma'(W'y + b')$$

where θ' denotes the parameters of the decoding part, x' is the reconstructed input data, W' is the weight matrix of the decoding process, b' is the bias vector of the decoding process, and σ' is the activation function of the decoding process.

The loss function of the entire autoencoder network is:

$$L_f = \frac{1}{I}\sum_{\rho=1}^{I} \|x'_\rho - x_\rho\|_2^2$$

where $x_\rho$ and $x'_\rho$ denote the ρ-th input data and its reconstruction by the decoder, respectively, ρ = 1, 2, …, I, and I is the number of input data. $\|x'_\rho - x_\rho\|_2$ is the 2-norm of $x'_\rho - x_\rho$.

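The softmax classifier is one of the most common algorithms for multi-class classification problems. Its main function is to estimate the probability that each sample belongs to each class and to take the class with the highest probability as the class of the sample.

Given input data and corresponding labels $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(\tau)}, y^{(\tau)})\}$, where $x^{(\tau)}$ denotes the τ-th input data, $y^{(\tau)}$ denotes the label corresponding to $x^{(\tau)}$, and $y^{(\tau)} \in \{1, 2, \ldots, k\}$, with k the number of label categories, the label probabilities of the input data $x^{(\tau)}$ are:

$$\begin{bmatrix} P(y^{(\tau)} = 1 \mid x^{(\tau)}; \theta) \\ \vdots \\ P(y^{(\tau)} = k \mid x^{(\tau)}; \theta) \end{bmatrix} = \frac{1}{\sum_{l=1}^{k} e^{\theta_l^T x^{(\tau)}}} \begin{bmatrix} e^{\theta_1^T x^{(\tau)}} \\ \vdots \\ e^{\theta_k^T x^{(\tau)}} \end{bmatrix}$$

where $\theta_1, \theta_2, \ldots, \theta_k$ denote the parameters of the softmax classifier and T denotes the transpose of a vector. $\sum_{l=1}^{k} e^{\theta_l^T x^{(\tau)}}$ sums over all $e^{\theta_l^T x^{(\tau)}}$ to normalize the probability values, and $\theta_l^T$ denotes the transpose of the l-th parameter $\theta_l$.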

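The cross-entropy loss is used as the loss function of the softmax classifier:

$$L_g = -\frac{1}{\tau}\sum_{\alpha=1}^{\tau}\sum_{\beta=1}^{k} 1\{y^{(\alpha)} = \beta\}\,\log\frac{e^{\theta_\beta^T x^{(\alpha)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(\alpha)}}}$$

where 1{·} is the indicator function, with 1{expression is true} = 1 and 1{expression is false} = 0. $y^{(\alpha)}$ is the label of the α-th input data $x^{(\alpha)}$, α = 1, 2, …, τ. β denotes the β-th class label, β = 1, 2, …, k. $\theta_\beta^T$ denotes the transpose of the β-th parameter $\theta_\beta$.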

(3) Establishing an unsupervised domain adaptive fault diagnosis model fusing optimal transportation: processing fault samples of a source domain and a target domain through an unsupervised domain adaptive feature extraction and classification model to obtain sample features and corresponding labels; integrating the feature extractor f and the softmax classifier g into a target function of an optimal transport OT solver, calculating a transport planning matrix gamma through sample features and labels to align the distribution of a source domain and a target domain, and transporting label information of the source domain to the target domain; and alternately optimizing the unsupervised domain adaptive feature extraction and classification model and the OT solver, thereby obtaining the unsupervised domain adaptive fault diagnosis model fusing the optimal transportation.

The optimal transport OT is used to calculate the distance between the source and target domain probability distributions. The domain adaptation problem can be seen as a special case of the discrete OT problem;

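As shown in FIG. 3, let the marginal distributions of the source domain and target domain samples be $\mu_s$ and $\mu_t$, respectively. The distance between the source domain and the target domain over the feature space and the label space is then:

$$C(i, j) = \eta\,\|f(x_i^s) - f(x_j^t)\|^2 + L_g(y_i^s, y_j^t)$$

where C(i, j) is the cost of moving probability mass from the i-th source domain sample feature and label to the j-th target domain sample feature and label. C is the cost matrix, i = 1, 2, …, m, j = 1, 2, …, n. $f(x_i^s)$ denotes the features of the i-th source domain sample. $f(x_j^t)$ denotes the features of the j-th target domain sample. η is a trade-off parameter. $\|f(x_i^s) - f(x_j^t)\|^2$ is the squared L2 distance between the source domain and target domain sample features. $L_g(y_i^s, y_j^t)$ is the cross-entropy loss between the source domain and target domain labels.

Since the target domain samples have no corresponding labels, $y_j^t$ cannot be used directly, so the softmax classifier g trained on the source domain generates a label $g(f(x_j^t))$ for each target domain sample feature, namely:

$$C(i, j) = \eta\,\|f(x_i^s) - f(x_j^t)\|^2 + L_g\big(y_i^s, g(f(x_j^t))\big)$$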

The objective function for solving the optimal transport plan is:

$$\min_{\gamma \in \Pi} \langle \gamma, C \rangle_F$$

where $\Pi = \{\gamma \in (\mathbb{R}^+)^{m \times n} \mid \gamma 1_n = \mu_s,\ \gamma^T 1_m = \mu_t\}$ denotes the set of all transport plans γ, and γ(i, j) is the probability mass transferred from the i-th source domain sample feature and label to the j-th target domain sample feature and label. $1_n$ is an n × 1 vector of ones and $1_m$ is an m × 1 vector of ones. $(\mathbb{R}^+)^{m \times n}$ denotes the m × n matrices with non-negative real entries. $\langle \cdot, \cdot \rangle_F$ denotes the Frobenius inner product.

γ₀(i, j) is the probability mass of the ith source domain sample being transferred to the jth target domain sample;

the objective loss function to be optimized for the entire model is:

where f' denotes the decoder of the autoencoder, and $f'(f(x_i^s))$ and $f'(f(x_j^t))$ denote the source domain feature $f(x_i^s)$ and the target domain feature $f(x_j^t)$ decoded back into input data, respectively. $L_f$ and $L_g$ denote the loss functions of the feature extractor f and the softmax classifier g, respectively.

The target loss function is solved alternately using a coordinate descent algorithm. When the parameters of the feature extractor f and the softmax classifier g are fixed, the target loss function can be written as:

$$\min_{\gamma \in \Pi} \langle \gamma, C \rangle_F$$

The above formula is a standard linear programming problem, which can be solved by the network simplex algorithm.

when the transport plan γ is fixed, the target loss function can be written as:

the Adam algorithm is used to solve the above equation.

The prediction accuracy of the target domain sample labels is calculated according to the following formula:

$$\text{accuracy} = \frac{1}{n}\sum_{j=1}^{n} 1\{\hat{y}_j^t = y_j^t\}$$

where $\hat{y}_j^t$ is the predicted label of sample $x_j^t$ and $y_j^t$ is the corresponding true label. 1{·} is a binary function: when $\hat{y}_j^t = y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 1; when $\hat{y}_j^t \neq y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 0.

To verify the effectiveness of the invention, real bearing fault data from a transmission system are used. The source domain samples are labeled normal and fault samples under one load condition, and the target domain samples are unlabeled normal and fault samples under another load condition. There are four motor loads, 0 hp, 1 hp, 2 hp and 3 hp, which form six cross-domain tasks for experimental verification: 0-1hp, 0-2hp, 0-3hp, 1-2hp, 1-3hp and 2-3hp. The fault diameters are 0.007 inches and 0.014 inches. A support vector machine (SVM), K-nearest neighbor, a softmax classifier, a back-propagation (BP) neural network, transfer component analysis (TCA), joint distribution adaptation (JDA) and correlation alignment (CORAL) are selected for comparison with the proposed method. The experimental results are shown in FIG. 4: the method achieves bearing fault diagnosis under the various working conditions, with higher accuracy than the comparison methods.
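The six cross-domain tasks are simply the unordered pairs of the four load conditions. The short Python sketch below enumerates them; reading the first load of each pair as the labeled source domain and the second as the unlabeled target domain is an assumption based on the task names, and the variable names are hypothetical.

```python
from itertools import combinations

loads_hp = [0, 1, 2, 3]  # the four motor load conditions, in horsepower

# Each pair (source load, target load) defines one cross-domain task.
task_pairs = list(combinations(loads_hp, 2))
task_names = [f"{src}-{tgt}hp" for src, tgt in task_pairs]
```

Four loads give 4 × 3 / 2 = 6 pairs, matching the six tasks listed in the experiment.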

In conclusion, the optimal transport-based transmission chain unsupervised domain adaptive fault diagnosis method aligns the distributions of the source domain and the target domain by minimizing the distance between their fault feature spaces and label spaces, extracts deep features of the target domain samples more effectively, ensures that the label classifier trained on the source domain transfers well to target domain data, realizes unsupervised domain adaptive label prediction for the target domain, and improves fault diagnosis accuracy.

Claims

1. An optimal transport-based transmission chain unsupervised domain adaptive fault diagnosis method, characterized by comprising the following steps:

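(1) building a transmission chain fault database: acquiring fault data of a source domain and a target domain of the transmission chain, wherein the labeled source domain fault sample set is $\{x_1^s, x_2^s, \ldots, x_m^s\}$ and the corresponding label set is $\{y_1^s, y_2^s, \ldots, y_m^s\}$, where $x_m^s$ is the m-th source domain fault sample, $y_m^s$ is the label corresponding to $x_m^s$, s denotes the source domain, and m is the total number of source domain samples; the unlabeled target domain fault sample set is $\{x_1^t, x_2^t, \ldots, x_n^t\}$, where $x_n^t$ is the n-th target domain fault sample, t denotes the target domain, and n is the total number of target domain samples;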

2. The optimal transport based drive chain unsupervised domain adaptive fault diagnosis method according to claim 1, characterized in that: in the step (2), an automatic encoder is used as a feature extractor f, and the specific process of automatically extracting the features of the source domain fault sample and the target domain fault sample comprises the following steps:

y＝h_θ(x)＝σ(Wx+b)

x′＝h′_θ′(y)＝σ′(W′y+b′)

the loss function of the entire autoencoder network is:

3. The optimal transport based drive chain unsupervised domain adaptive fault diagnosis method according to claim 2, characterized in that: in the step (2), the softmax classifier has the function of estimating the probability of each sample belonging to each class and taking the class with the highest probability as the class of the sample;

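given input data and corresponding labels $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(\tau)}, y^{(\tau)})\}$, where $y^{(\tau)} \in \{1, 2, \ldots, k\}$ is the label corresponding to the τ-th input data $x^{(\tau)}$ and k is the number of label categories, the label probabilities of the input data $x^{(\tau)}$ are:

$$\begin{bmatrix} P(y^{(\tau)} = 1 \mid x^{(\tau)}; \theta) \\ \vdots \\ P(y^{(\tau)} = k \mid x^{(\tau)}; \theta) \end{bmatrix} = \frac{1}{\sum_{l=1}^{k} e^{\theta_l^T x^{(\tau)}}} \begin{bmatrix} e^{\theta_1^T x^{(\tau)}} \\ \vdots \\ e^{\theta_k^T x^{(\tau)}} \end{bmatrix}$$

where $\theta_1, \theta_2, \ldots, \theta_k$ denote the parameters of the softmax classifier, $\sum_{l=1}^{k} e^{\theta_l^T x^{(\tau)}}$ sums over all $e^{\theta_l^T x^{(\tau)}}$ to normalize the probability values, and $\theta_l^T$ denotes the transpose of the l-th parameter $\theta_l$;

cross-entropy loss is used as the loss function of the softmax classifier:

$$L_g = -\frac{1}{\tau}\sum_{\alpha=1}^{\tau}\sum_{\beta=1}^{k} 1\{y^{(\alpha)} = \beta\}\,\log\frac{e^{\theta_\beta^T x^{(\alpha)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(\alpha)}}}$$

where $\theta_\beta^T$ denotes the transpose of the β-th parameter $\theta_\beta$.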

4. The optimal transport based drive chain unsupervised domain adaptive fault diagnosis method according to claim 3, characterized in that: in the step (3), the optimal transport solver is used for calculating the distance between the probability distributions of the source domain and the target domain, and the domain adaptation problem is regarded as a special case of the discrete optimal transport problem;

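$f(x_i^s)$ denotes the features of the i-th source domain sample;

since the target domain samples have no corresponding labels, the softmax classifier g trained on the source domain generates a label $g(f(x_j^t))$ for each target domain sample feature $f(x_j^t)$, namely:

$$C(i, j) = \eta\,\|f(x_i^s) - f(x_j^t)\|^2 + L_g\big(y_i^s, g(f(x_j^t))\big)$$

the objective function for solving the optimal transport plan is:

$$\min_{\gamma \in \Pi} \langle \gamma, C \rangle_F$$

where $\Pi = \{\gamma \in (\mathbb{R}^+)^{m \times n} \mid \gamma 1_n = \mu_s,\ \gamma^T 1_m = \mu_t\}$ denotes the set of all transport plans γ; γ(i, j) is the probability mass transferred from the i-th source domain sample feature and label to the j-th target domain sample feature and label; $1_n$ is an n × 1 vector of ones; $1_m$ is an m × 1 vector of ones; $(\mathbb{R}^+)^{m \times n}$ denotes the m × n matrices with non-negative real entries; $\langle \cdot, \cdot \rangle_F$ denotes the Frobenius inner product;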

5. The optimal transport based drive chain unsupervised domain adaptive fault diagnosis method according to claim 4, characterized in that: in the step (3), the objective loss function to be optimized by the whole model is as follows:

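where f' denotes the decoder of the autoencoder; $f'(f(x_i^s))$ and $f'(f(x_j^t))$ denote the source domain feature $f(x_i^s)$ and the target domain feature $f(x_j^t)$ decoded back into input data, respectively;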

when the transport plan γ is fixed, the target loss function is written as:

the Adam algorithm is used to solve the above equation.

6. The optimal transport based drive chain unsupervised domain adaptive fault diagnosis method according to claim 5, characterized in that: in the step (4), the prediction accuracy of the target domain sample labels is calculated according to the following formula:

$$\text{accuracy} = \frac{1}{n}\sum_{j=1}^{n} 1\{\hat{y}_j^t = y_j^t\}$$

where $\hat{y}_j^t$ is the predicted label of sample $x_j^t$ and $y_j^t$ is the corresponding true label; 1{·} is a binary function: when $\hat{y}_j^t = y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 1; when $\hat{y}_j^t \neq y_j^t$, $1\{\hat{y}_j^t = y_j^t\}$ is 0.