CN113419519B

CN113419519B - Electromechanical product system or equipment real-time fault diagnosis method based on width learning

Info

Publication number: CN113419519B
Application number: CN202110796980.0A
Authority: CN
Inventors: 刘杰; 王冲
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2021-07-14
Filing date: 2021-07-14
Publication date: 2022-05-13
Anticipated expiration: 2041-07-14
Also published as: CN113419519A

Abstract

The invention provides a width learning-based real-time fault diagnosis method for an electromechanical product system or equipment. On the basis of the traditional width learning system, several improvements are introduced, including loss function optimization through cost-sensitive learning, random inactivation (Dropout) and ensemble learning, so that an improved width learning system is obtained. To address the heterogeneous data types and severe class imbalance of actual monitoring data while preserving training and optimization efficiency, the cost weight, the inactivation probability and similar quantities are set as adjustable parameters, and the final result is predicted by ensemble voting according to the bagging algorithm, which mitigates the uncertainty and class imbalance problems common in fault diagnosis. The improved width learning system trains quickly, predicts accurately, and is stable and robust; applied to real-time monitoring of the health state of a complex system or equipment, it can help prevent faults and provide maintenance suggestions.

Description

Electromechanical product system or equipment real-time fault diagnosis method based on width learning

Technical Field

The application relates to the field of fault diagnosis, in particular to a fault diagnosis method for realizing real-time and accurate monitoring of a fault state by adopting a width learning system aiming at fault diagnosis of an electromechanical product system or equipment.

Background

With the rapid development of information technology and artificial intelligence, electromechanical products are becoming increasingly integrated and intelligent, and their complexity and risk are rising accordingly. Electromechanical products are closely coupled during operation and therefore readily affect one another, so that a small fault may trigger a chain reaction that damages the whole system. This can cause huge economic losses and may also endanger the lives of the personnel involved. Condition monitoring and fault diagnosis of electromechanical product systems or equipment are therefore becoming increasingly important in system health and safety management, and have become a focus of research.

To ensure safe and stable operation of electromechanical products, daily maintenance at the present stage mainly relies on relatively conservative operation and maintenance strategies such as regular inspection and maintenance and part replacement. These strategies cannot fundamentally improve the reliability and robustness of the system, and they make operation and maintenance increasingly costly. If an emergency such as a shock or impact occurs, or if periodic inspection fails to detect wear and degradation of components in time, the system is likely to be at great risk. Regular maintenance also depends on the overall competence of the maintenance support personnel, so the system is more exposed to human factors. The demand for accurate and stable fault diagnosis technology in the operation and maintenance of electromechanical product systems is therefore increasingly urgent, and precise maintenance and maintenance in advance based on such technology are also attracting attention.

Many methods are currently available for fault diagnosis, such as expert system models, physical models and data-driven models. Because complex systems and equipment are designed according to the fail-safe principle, their functions and structures are complex and contain redundancy, so studying their mechanisms with traditional approaches such as expert system models and physical models is very difficult or even infeasible. A data-driven model avoids the dependence on prior knowledge of the equipment: given enough relevant monitoring data, maintenance data and the like, a fault diagnosis model can be obtained quickly and at low cost.

Deep learning networks are among the most common algorithms in data-driven models. They have unique advantages in processing high-dimensional data and building complex nonlinear models, and are therefore widely used in many fields. However, most deep learning networks have complex structures and involve a large number of hyper-parameters, which makes the parameters and structure difficult to tune, makes training extremely time-consuming, and makes the deep structure extremely difficult to analyze theoretically. When deep learning is applied to condition monitoring of a complex system or equipment, an emergency often changes the operation and maintenance environment abruptly, and the existing fault diagnosis model is then no longer applicable. If the model structure and parameters can be updated and adjusted in time to fit the current state of the system or equipment, the reliability of the system and equipment can be effectively guaranteed.

Therefore, deep networks and related methods aimed at improving model training speed are attracting attention. The Broad Learning System (BLS), referred to in this document as the width learning system, offers an alternative to deep learning networks: it avoids a deep hierarchical structure and expands the network through incremental learning, which greatly improves the efficiency of model training and optimization. BLS balances model accuracy and efficiency well, and its effectiveness was verified in various fields, including fault diagnosis, soon after its introduction. In fault diagnosis, BLS also shows good adaptability, fast computation and high classification accuracy. However, BLS has so far mainly been used for fault diagnosis of devices such as rotors and bearings, whose monitoring data are uniform in type and low in dimension. Existing applications and improvements of BLS in fault diagnosis mainly aim to improve classification accuracy, for example by combining BLS with existing feature extraction methods such as principal component analysis and the Hilbert transform, and do not comprehensively consider the class imbalance problem commonly encountered in fault diagnosis. Consideration of the stability of the BLS model parameters and structure is also lacking.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide a width learning-based real-time fault diagnosis method for an electromechanical product system or equipment. In order to solve the problem that the structure and parameters of the model are difficult to determine and reduce the influence of uncertain factors, the Dropout layer is added into the BLS, and the integrated learning is adopted, so that the real-time monitoring on the health state of the electromechanical product system or equipment is realized on the basis of meeting the requirement of diagnosis precision.

In order to achieve the purpose, the solution adopted by the invention is as follows:

a width learning-based real-time fault diagnosis method for an electromechanical product system or equipment comprises the following steps:

step 1: collecting fault states and monitoring data of electromechanical product systems or equipment to obtain historical monitoring data;

step 2: performing data preprocessing on the historical monitoring data obtained in the step 1 to obtain a preprocessed data set [ X, Y ], wherein X is a relevant variable of a fault state of an electromechanical product system or equipment, and Y is a fault state of the electromechanical product system or the equipment;

step 3: introducing cost-sensitive learning into an existing width learning system, setting a random inactivation layer, and performing voting type parallel integrated learning to obtain an improved width learning system, wherein the improved width learning system is obtained by integrating N width learning systems each provided with a cost parameter and a random inactivation layer; according to the preprocessed data set [ X, Y ] obtained in the step 2, the relevant variables X of the fault state of the electromechanical product system or the equipment are used as the input of the improved width learning system, the fault state Y of the electromechanical product system or the equipment is used as the output of the improved width learning system, and the fault diagnosis model of the electromechanical product system or the equipment is obtained by calculation, which specifically comprises the following steps:

step 31: introducing cost-sensitive learning into the existing width learning system in step 3 specifically means setting an adjustable cost parameter in the input layer of the existing width learning system, wherein the adjustable cost parameter converts the original input of the model into a cost-sensitive input, and the adjustable cost parameter is an adjustable weight matrix Λ:

Λ＝diag(λ₁,λ₂,…,λ_K)

in the formula: K is the total number of samples input into the width learning system; λ_i is an adjustable parameter;

when the ith sample is a normal sample, λ_i is 1;

when the ith sample is a failure sample, λ_i is a number greater than 1;

step 32: step 3, a random inactivation layer is arranged in the existing width learning system, specifically, a random inactivation layer is arranged on a hidden layer of the width learning system, the hidden layer of the width learning system comprises a mapping feature node group and an enhanced node group, and the random inactivation layer has the function of randomly inactivating the mapping feature nodes and the enhanced nodes; inputting a fault state related variable X of the electromechanical product system or the equipment in a width learning system provided with an adjustable cost parameter and a random inactivation layer, taking the fault state Y of the electromechanical product system or the equipment as output, and calculating to obtain an output weight W; obtaining a single classifier comprising a cost parameter, a hidden layer and a random inactivation layer according to the output weight W and the fault state related variable X of the electromechanical product system or equipment;

step 33: obtaining an improved width learning system by adopting voting type parallel integrated learning, wherein the improved width learning system is composed of N width learning systems comprising cost parameters, a hidden layer and a random inactivation layer; selecting N-1 different random inactivation probabilities theta in the random inactivation layer in the step 32 to obtain N-1 width learning systems comprising cost parameters, hidden layers and random inactivation layers; randomly adding white Gaussian noise to the relevant variable X of the fault state of the electromechanical product system or equipment to obtain an N-1 group of input sets; correspondingly inputting the N-1 input sets into the N-1 width learning systems comprising cost parameters, hidden layers and random inactivation layers, taking the fault state Y of the electromechanical product system or equipment as output, and establishing N-1 width learning system classifiers according to the method of obtaining a single classifier in the step 32; obtaining an ensemble learning system according to the N-1 width learning system classifiers and the single classifier obtained in step 32, wherein the ensemble learning system comprises N classifiers; predicting a final output result by integrating the learning output result set in a voting mode to obtain a fault diagnosis model of the electromechanical product or equipment;

step 4: acquiring real-time monitoring data, performing on the real-time monitoring data the same data preprocessing as in step 2 to obtain preprocessed real-time monitoring data, inputting the preprocessed real-time monitoring data into the fault diagnosis model of the electromechanical product system or equipment obtained in step 3, obtaining the real-time fault state of the electromechanical product system or equipment, and completing fault diagnosis.

Further, the relevant variable X of the fault state of the electromechanical product system or the equipment passes through the input layer provided with the cost parameter to give the cost-sensitive input ΛX, and n groups of random mapping nodes Zⁿ and m groups of enhanced nodes E^m of the hidden layer are calculated from the cost-sensitive input ΛX:

Zⁿ＝[Z₁,Z₂,…,Z_i,…,Z_n]

Z_i＝φ_i(ΛXW_ei+β_ei)

E^m＝[E₁,E₂,…,E_j,…,E_m]

E_j＝ξ_j(ZⁿW_hj+β_hj)

In the formula: Z_i is the ith group of mapping features; φ_i is a non-linear activation function; W_ei is the optimal input weight matrix determined through sparse self-coding; β_ei is the optimal input bias matrix determined through sparse self-coding; the dimension of the weight matrix W_ei is P × ν, where P is the number of variable columns contained in X; the mapping feature group Z_i contains ν nodes; E_j is the jth group of enhanced nodes; ξ_j is a non-linear activation function; W_hj and β_hj are respectively the random matrix and the bias matrix used for the jth group of high-dimensional features, W_hj and β_hj both have dimension (Σ_n ν) × η, and the enhanced node group E_j comprises η nodes;

the n groups of random mapping feature nodes Zⁿ and the m groups of enhanced nodes E^m are spliced as matrices to obtain the hidden layer H of the width learning system considering cost-sensitive learning:

H＝[Zⁿ|E^m]。

further, in step 32, a random deactivation layer is set in the existing width learning system, which is specifically as follows:

Θ*H/(1-θ)

in the formula: Θ is a 0/1 vector generated randomly with the Bernoulli function according to the random inactivation probability θ;

the final output result of the width learning system provided with the adjustable cost parameter and the random inactivation layer is as follows:

Θ*H/(1-θ)*W＝Θ[Zⁿ|E^m]/(1-θ)*W＝Y

and calculating the output weight W by using ridge regression, and obtaining the single classifier comprising the cost parameter, the hidden layer and the random inactivation layer according to the output weight W and the relevant variables X of the fault state of the electromechanical product system or the equipment.

Preferably, the fault state in step 1 includes a fault and no fault.

Preferably, the monitoring data in step 1 is heterogeneous data, and the heterogeneous data includes continuous features, discrete features and signal features.

Preferably, the data preprocessing in step 2 includes filling missing values, replacing abnormal values, handling dimensional differences, and digitizing; an abnormal value is a record that obviously exceeds the specified environment threshold of the system or equipment; handling dimensional differences refers to data normalization, which converts the monitoring data of the electromechanical product system or equipment into numerical variables between 0 and 1.

Compared with the prior art, the invention has the beneficial effects that:

1. on the basis of the existing width learning system, cost sensitive learning is adopted to fully utilize original data, and the operation burden is not increased basically;

2. the cost sensitive parameter and the random inactivation probability parameter are set as adjustable parameters, so that the flexibility and the adaptability are higher;

3. random inactivation and integrated learning are integrated, so that the model structure and parameters are stable, and the universality is strong;

4. the improved width learning-based system is fast in training, accurate in prediction, high in stability and robustness, applied to monitoring the fault state of a complex system or equipment in real time, capable of preventing faults and providing maintenance suggestions, and has guiding and reference significance for other algorithms and scenes such as fault prediction and the like.

Drawings

FIG. 1 is a block flow diagram of a method for real-time fault diagnosis of an electromechanical product system or device based on width learning according to an embodiment of the present invention;

FIG. 2 is a block diagram of a single width learning system classifier containing cost parameters, hidden layers and random deactivation layers according to an embodiment of the present invention;

FIG. 3 is a flow chart of a comparative experiment for verifying the effectiveness of the present invention based on examples in an embodiment of the present invention;

FIG. 4 is a schematic diagram of a t-SNE projection of a one-year health status monitoring data of a typical electromechanical brake system in accordance with an embodiment of the present invention.

Detailed Description

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

The embodiment of the invention provides a width learning-based real-time fault diagnosis method for an electromechanical product system or equipment, which comprises the following specific steps of:

step 1: collecting fault states and related monitoring data of the electromechanical product system or equipment over a period of time as historical monitoring data;

generally, the collected system or equipment fault state monitoring data is heterogeneous data, including continuous features, discrete features and signal features.

Step 2: carrying out data preprocessing on the historical monitoring data, including handling of missing values, abnormal values and dimensional differences, digitization and the like, so as to obtain a preprocessed data set [ X, Y ]. Extracting the key indexes of the signal features and digitizing the discrete features converts the monitoring data into numerical data. The method operates on the numerical data obtained after preprocessing, which comprise continuous features and discrete features.

If the discrete fault characteristic variables in the original data set include variables that take only one state, those variables are of no value for diagnosing from the data whether the system is faulty; they therefore need to be deleted, and the subsequent analysis uses the remaining discrete fault characteristic variables together with the continuous fault characteristic variables.

The electromechanical product system or equipment fault state related variable X is used as an input to the improved width learning system. The historical monitoring data need to be cleaned and converted, for example by filling missing values, replacing abnormal values and handling dimensional differences. An abnormal value is a record that is manually judged to be abnormal, for example a wind speed or temperature record that obviously exceeds the specified environmental threshold of the system or equipment: if the specified operating temperature range of the equipment is 0-50 ℃, temperature records in the monitoring data that obviously exceed this range (for example, above 100 ℃) should be deleted. Handling dimensional differences means data normalization, which balances the dimensional differences between numerical variables. The monitoring data are normalized and converted into numerical variables between 0 and 1.
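
The cleaning steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the procedure of the embodiment: the function name `preprocess`, the per-column thresholds `low` and `high`, and the choice of the column median as the fill value are all hypothetical.

```python
import numpy as np

def preprocess(X, low, high):
    """Clean and normalize a numeric monitoring matrix X (rows = samples).

    low/high are hypothetical per-column environment thresholds; readings
    outside them are treated as abnormal values.
    """
    X = np.asarray(X, dtype=float).copy()
    low, high = np.asarray(low, dtype=float), np.asarray(high, dtype=float)
    # Discard out-of-threshold readings by marking them as missing.
    X[(X < low) | (X > high)] = np.nan
    # Fill missing values with the column median (an assumed fill strategy).
    med = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = med[cols]
    # Drop variables that take only one state; they carry no diagnostic value.
    keep = X.max(axis=0) > X.min(axis=0)
    X = X[:, keep]
    # Min-max normalization to [0, 1] removes dimensional differences.
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    return X, keep
```

For real-time data in step 4, the same column selection and the same minimum and maximum values would have to be reused rather than recomputed; the sketch omits that bookkeeping.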

The electromechanical product system or device failure state Y is used as an output of the improved width learning system, i.e., whether or not there is a failure. Various faults may occur in the system or the equipment, and the invention only distinguishes fault and no fault, and does not distinguish fault types.

Step 3: the method adopts a width learning system for fault diagnosis and, aiming at the uncertainty and class imbalance problems in the fault diagnosis of complex systems and equipment, makes the following improvements to obtain the improved width learning system:

step 31: in order to fully utilize the information of the existing data and not increase the operation burden as much as possible, on the basis of deeply analyzing the width learning system, the invention adds the cost sensitive learning into the width learning system, thereby not only effectively reducing the influence of class imbalance, but also greatly improving the identification accuracy and the detection rate of abnormal states.

Cost-sensitive learning is introduced into the existing width learning system by setting an adjustable cost parameter in the input layer of the existing width learning system, wherein the adjustable cost parameter converts the original input of the model into a cost-sensitive input, and the adjustable cost parameter is an adjustable weight matrix Λ:

Λ＝diag(λ₁,λ₂,…,λ_K)

when the ith sample is a normal sample, λ_i is 1;

when the ith sample is a failure sample, λ_i is a number greater than 1 and is denoted as λ.

When a single width learning system classifier is trained, λ is increased gradually until the model prediction accuracy, that is, the proportion of correctly predicted samples among all samples, becomes stable or begins to decline; an approximate local optimal solution is thereby obtained.
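
As an illustration of the cost-sensitive input, the sketch below builds a diagonal weight matrix with one entry per sample and applies it to X. It assumes labels coded as 1 for a fault sample and 0 for a normal sample; the function name `cost_sensitive_input` and the argument `lam` (the value λ used for fault samples) are hypothetical.

```python
import numpy as np

def cost_sensitive_input(X, y, lam):
    """Return (Lambda, Lambda @ X) for labels y (1 = fault, 0 = normal)."""
    if lam <= 1:
        raise ValueError("lam must be greater than 1 for fault samples")
    # lambda_i = 1 for normal samples, lambda_i = lam for fault samples.
    weights = np.where(np.asarray(y) == 1, float(lam), 1.0)
    Lam = np.diag(weights)  # K x K diagonal weight matrix
    return Lam, Lam @ np.asarray(X, dtype=float)
```

For a large number of samples K, multiplying each row of X by its weight (`weights[:, None] * X`) gives the same result without storing a K × K matrix.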

Assume the preprocessed data set is [X, Y]∈R^(K×(P+Q)), where K is the total sample size contained in the monitoring data of the electromechanical product system or equipment, i.e., the total number of samples input into the width learning system and also the total number of rows of data, P is the number of variable columns contained in X, and Q is the number of variable columns in Y. The cost-sensitive input is then ΛX. As shown in fig. 2, n groups of mapping feature nodes Zⁿ can be calculated from the cost-sensitive input ΛX:

Zⁿ＝[Z₁,Z₂,…,Z_i,…,Z_n] (1)

Z_i＝φ_i(ΛXW_ei+β_ei) (2)

Wherein: X∈R^(K×P) is the original input data, i.e. K samples of dimension P; Z_i is the ith group of mapping features; φ_i is a non-linear activation function; W_ei is the optimal input weight matrix determined by sparse self-encoding; β_ei is the optimal input bias matrix determined by sparse self-encoding. Here the weight W_ei and the bias β_ei are both randomly generated, and the number of nodes in each mapping feature group Z_i is determined by the dimensions of the weight W_ei and the bias β_ei: if the dimension of the input X is K × P and the dimension of the weight W_ei is P × ν, then Z_i contains ν mapping feature nodes.

The enhanced nodes are then used to supplement the random mapping feature nodes, giving m groups of enhanced nodes E^m:

E^m＝[E₁,E₂,…,E_j,…,E_m] (3)

E_j＝ξ_j(ZⁿW_hj+β_hj) (4)

Wherein: E_j is the jth group of enhanced nodes; ξ_j is a non-linear activation function; W_hj and β_hj are respectively the random matrix and the bias matrix used for the jth group of high-dimensional features; both are also generated randomly and have dimension (Σ_n ν) × η; the enhanced node group E_j contains η nodes.

Zⁿ＝{Z₁,Z₂,...,Z_n} and E^m＝{E₁,E₂,...,E_m} are spliced as matrices to obtain the actual input H of the width learning system, namely the hidden layer of the width learning system considering cost-sensitive learning:

H＝[Zⁿ|E^m] (5)
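
The construction of H from formulas (1) to (5) can be sketched as follows. The sketch makes several assumptions that are not in the text: tanh is used for both φ_i and ξ_j, weights and biases are drawn from a standard normal distribution without any sparse self-encoding step, and each bias is a row vector broadcast over the samples. The function name `hidden_layer` and its `seed` argument are hypothetical.

```python
import numpy as np

def hidden_layer(X_cost, n, v, m, eta, seed=0):
    """Build H = [Z^n | E^m] from a cost-sensitive input of shape K x P."""
    rng = np.random.default_rng(seed)
    P = X_cost.shape[1]
    # n groups of mapping feature nodes, v nodes per group.
    Z = np.hstack([
        np.tanh(X_cost @ rng.standard_normal((P, v)) + rng.standard_normal(v))
        for _ in range(n)
    ])
    # m groups of enhanced nodes, eta nodes per group, computed from all of Z^n.
    E = np.hstack([
        np.tanh(Z @ rng.standard_normal((n * v, eta)) + rng.standard_normal(eta))
        for _ in range(m)
    ])
    return np.hstack([Z, E])  # shape K x (n*v + m*eta)
```

A usable classifier would also have to return and store the random weights, since the same weights are needed to map new monitoring data at prediction time.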

The result H＝[Zⁿ|E^m] is taken as the actual input of the width learning system, and the final output result of the model is HW, where W is the output weight. This result corresponds to the electromechanical product system or equipment fault state Y in the preprocessed data set, that is, the fault state Y is used as the output of the width learning system, which gives the following equation (6):

HW＝[Zⁿ|E^m]W＝Y (6)

solving for the output weight W using ridge regression yields:

W＝(H^T H+cI)^(-1) H^T Y (7)

wherein: the value c is a further constraint on the sum of squared weights of the output weight W; L is the total number of hidden layer nodes; I is the L-order identity matrix; T denotes matrix transposition; the total number L of hidden layer nodes is:

L＝nν+mη (8)

in the formula: v is a mapping feature group Z_iThe number of included nodes; eta is enhanced node group E_jThe number of nodes involved.
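
The ridge regression step can be illustrated with the usual closed-form ridge solution, which is consistent with the quantities c, I and L described above. The sketch solves the regularized normal equations directly instead of forming an explicit inverse; the function name `ridge_output_weights` is hypothetical.

```python
import numpy as np

def ridge_output_weights(H, Y, c):
    """Solve (H^T H + c I) W = H^T Y for the output weights W."""
    L = H.shape[1]  # total number of hidden layer nodes, L = n*v + m*eta
    return np.linalg.solve(H.T @ H + c * np.eye(L), H.T @ Y)
```

A larger c shrinks the weights more strongly, while a very small c approaches the ordinary least-squares solution.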

Step 32: a random inactivation layer is arranged on a hidden layer of the width learning system, so that the stability of a model structure and parameters is improved;

Introducing Dropout into BLS means randomly deleting nodes of the hidden layer H of the width learning system considering cost-sensitive learning. To this end, a vector Θ is generated randomly with the Bernoulli function according to the probability θ; Θ is a 0/1 vector whose dimension equals the total number of hidden layer nodes L＝nν+mη. Θ*H then represents the nodes in H being randomly inactivated with probability θ. To keep the expected total over the nodes unchanged, the nodes remaining after Dropout are further scaled, so that the hidden layer H′ after random inactivation is:

H′＝Θ*H/(1-θ) (9)

at this time, the fault state Y of the electromechanical product system or equipment is taken as the output of the width learning system:

Θ*HW′/(1-θ)＝Y (10)
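
The random inactivation of hidden nodes can be sketched as below, reading the operation as H′ = Θ*H/(1-θ) with one mask entry per hidden node. The function name `dropout_hidden` and the `seed` argument are hypothetical, and drawing each mask entry as 1 with probability 1-θ is an assumption about how the Bernoulli function is parameterized.

```python
import numpy as np

def dropout_hidden(H, theta, seed=0):
    """Inactivate hidden nodes with probability theta and rescale the rest."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    rng = np.random.default_rng(seed)
    L = H.shape[1]
    # One Bernoulli draw per node: 1 = node kept, 0 = node inactivated.
    mask = rng.binomial(1, 1.0 - theta, size=L)
    # Scaling by 1/(1 - theta) keeps the expected node output unchanged.
    return mask * H / (1.0 - theta), mask
```

Because the mask has length L, an inactivated node is zeroed for every sample, which matches the description of deleting nodes rather than individual entries.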

obtaining the output weight W′ after random inactivation by ridge regression:

W′＝(H′^T H′+cI)^(-1) H′^T Y (11)

when p enhanced nodes are newly added, which is equivalent to adding p columns to the matrix H′, the new hidden layer H^(m+1) is recorded as:

in the formula: … and … are respectively the random matrix and the bias matrix of the p newly added enhanced nodes, and the dimensionalities of the random matrix and the bias matrix are …; … comprises p nodes;

a new output weight W^(m+1) is derived from the incremental learning algorithm of the width learning system; to simplify the expression of the output weight W^(m+1), symbols D and B^T are introduced, corresponding to formula (13) and formula (14) respectively:

wherein (H′)⁺ is the pseudo-inverse matrix of H′, specifically:

wherein: ε is an adjusting parameter for constraining the identity matrix I;

the symbol C introduced in formula (12) corresponds to formula (13):

wherein (C)⁺ is the pseudo-inverse matrix of C, specifically:

after the symbols D and B^T are introduced, the output weight W^(m+1) is specifically:

After enhanced nodes are added to the width learning system, the new output weight W^(m+1) can be obtained quickly by calculating only the pseudo-inverse for the added nodes, without recalculating all weights. From the new output weight W^(m+1), a single classifier containing the cost parameter, the hidden layer and the random inactivation layer can be obtained.
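
The quantities named in this step follow the pattern of the enhancement-node increment known from the broad learning literature. As a sketch only, assuming the step uses that standard update with H′ as the existing hidden layer and W′ as the existing output weight, the relations can be written as below. The symbols W_{h_{m+1}} and β_{h_{m+1}} for the new random matrix and bias, and the use of the same ε in both pseudo-inverses, are assumptions of this sketch and are not taken from the text.

```latex
\begin{aligned}
H^{m+1} &= \left[\, H' \;\middle|\; \xi\!\left(Z^{n} W_{h_{m+1}} + \beta_{h_{m+1}}\right) \right] \\
D &= (H')^{+}\, \xi\!\left(Z^{n} W_{h_{m+1}} + \beta_{h_{m+1}}\right) \\
C &= \xi\!\left(Z^{n} W_{h_{m+1}} + \beta_{h_{m+1}}\right) - H' D \\
B^{T} &=
\begin{cases}
C^{+}, & C \neq 0 \\
\left(I + D^{T} D\right)^{-1} D^{T} (H')^{+}, & C = 0
\end{cases} \\
(H')^{+} &= \left(H'^{T} H' + \varepsilon I\right)^{-1} H'^{T}, \qquad
C^{+} = \left(C^{T} C + \varepsilon I\right)^{-1} C^{T} \\
W^{m+1} &= \begin{bmatrix} W' - D B^{T} Y \\ B^{T} Y \end{bmatrix}
\end{aligned}
```

With H′ of size K × L and p new columns, D is L × p, C is K × p and B^T is p × K, so W^(m+1) has L + p rows, one per hidden node after the increment.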

Step 33: and voting type parallel integrated learning is adopted, white noise is added, different parameters are set, the influence of uncertain factors is reduced, and the robustness of the model is improved. The method specifically comprises the following steps:

N-1 different random inactivation probabilities θ are selected for the random inactivation layer in step 32 to obtain N-1 width learning systems each comprising a cost parameter, a hidden layer and a random inactivation layer; white Gaussian noise is randomly added to the fault state related variable X of the electromechanical product system or equipment to obtain N-1 groups of input sets; the N-1 groups of input sets are correspondingly input into the N-1 width learning systems comprising cost parameters, hidden layers and random inactivation layers, the fault state Y of the electromechanical product system or equipment is taken as the output, and N-1 width learning system classifiers are established according to the method for obtaining a single classifier in step 32; an ensemble learning system comprising N classifiers is obtained from the N-1 width learning system classifiers and the single classifier obtained in step 32; the final output result is predicted by voting over the set of output results of the ensemble learning system, giving the fault diagnosis model of the electromechanical product system or equipment.

To improve the generalization accuracy and stability of the model, different Dropout probabilities θ are selected to build multiple BLS classifiers, and the final output result is predicted by voting. Although this increases the amount of computation, it greatly reduces the burden of determining the model structure and parameters and can improve robustness in practical applications.
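
The two ingredients of the ensemble step, noisy copies of the input and a vote over the classifier outputs, can be sketched as follows. The helper names `add_white_noise` and `majority_vote`, the noise level `sigma`, and the rule that a tie is resolved as no fault are hypothetical choices made for this illustration.

```python
import numpy as np

def add_white_noise(X, sigma, rng):
    """Return a copy of X with zero-mean Gaussian white noise added."""
    X = np.asarray(X, dtype=float)
    return X + rng.normal(0.0, sigma, size=X.shape)

def majority_vote(predictions):
    """Combine an N x K array of 0/1 predictions from N classifiers."""
    predictions = np.asarray(predictions)
    votes = predictions.sum(axis=0)  # number of "fault" votes per sample
    # A sample is labelled as a fault only if more than half the classifiers agree.
    return (2 * votes > predictions.shape[0]).astype(int)
```

In use, each of the N-1 additional classifiers would be trained on its own noisy input set with its own inactivation probability θ, and the rows passed to `majority_vote` would be the label predictions of all N classifiers on the same samples.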

Step 4: acquiring real-time monitoring data, preprocessing the acquired real-time monitoring data in the same way as in step 2 to obtain preprocessed real-time monitoring data, inputting the preprocessed real-time monitoring data into the electromechanical product system or equipment fault diagnosis model obtained in step 3, obtaining the fault state corresponding to the real-time monitoring data of the electromechanical product system or equipment, and completing fault diagnosis.

As shown in fig. 1, based on historical monitoring data of an electromechanical product system or equipment, the cost-sensitive data are input into a width learning system, and the actual input of the width learning system is randomly inactivated to construct a single classifier; white noise is randomly added to the cost-sensitive input and different random inactivation probabilities are selected to construct multiple classifiers, which together form an ensemble learning system; after the real-time monitoring data of the electromechanical product system or equipment are preprocessed, the classifiers in the ensemble learning system vote to output the real-time fault state of the electromechanical product system or equipment.

The effectiveness of the method provided by this embodiment is verified on actual monitoring data of a high-speed rail braking system.

To check the effectiveness of the proposed method, this part builds a width learning system model and deep learning networks (an ANN and a CNN) on one year of actual monitoring data from a high-speed train in operation, and runs comparative experiments to verify that the width learning system can greatly reduce model training time while maintaining a certain accuracy, and that the method handles the class imbalance problem well. The test flow is shown in figure 3.

The input data X are 43 variables that may be associated with a brake system failure. They include train-level conditions such as GPS position, speed, mode of operation, external power supply, operating time, line voltage and line current, and brake-system-level conditions such as internal temperature, battery voltage, detected slip or slide, ED braking state, TCL braking state and achieved braking force. For confidentiality reasons, these variables are not listed in detail here and are named Var1, Var2, …, Var43. Not all of these 43 variables are useful for the fault detection task.

First, the raw data collected by the sensors are cleaned and converted, for example by filling missing values and normalizing the data. Data normalization balances the dimensional differences between numerical variables. In addition, depending on the data requirements of the model, some variables in the raw data need to be converted to numerical values. After data processing, the 43 variables in the input data become numerical variables in [0, 1].

The output data Y is the fault state of the high-speed rail brake system, i.e. whether or not the brake system is faulty. Various faults may occur in the brake system, such as closure of the emergency brake valve stopcock, closure of the truck pneumatic brake stopcock, MTB isolation and stopcock closure; here only fault and no fault are distinguished. There are 28837 data points in the normal state and only 159 in the fault state, so fault samples make up about 0.55% of the 28996 data points and the data are severely imbalanced.

To reduce the influence of random factors, the average precision and running time over ten-fold cross-validation are observed in the experiments. In addition, to obtain an approximately locally optimal solution, parameters such as the number of nodes, the cost-sensitive weight and the Dropout probability are started at small values and gradually increased until the test precision remains essentially unchanged or begins to decrease. Under the same operating environment, the model outputs the average training time and the average G-mean and F1-score of the fault class in the test results over the ten-fold cross-validation.
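One way to split such imbalanced data into ten folds is sketched below. The text only states that ten-fold cross-validation is used; stratifying the folds by class, so that every fold receives a share of the 159 fault samples, is an assumption of this sketch, as is the function name.

```python
import numpy as np

def stratified_folds(y, k, rng):
    """Split sample indices into k folds with similar class proportions.

    Stratification is an assumption; the source does not say how the folds
    were drawn.
    """
    y = np.asarray(y)
    folds = [[] for _ in range(k)]
    for label in np.unique(y):
        # Shuffle the indices of this class, then deal them out across folds.
        idx = rng.permutation(np.flatnonzero(y == label))
        for fold, part in zip(folds, np.array_split(idx, k)):
            fold.extend(part.tolist())
    return [np.array(sorted(f)) for f in folds]

# Class counts taken from the data set described above.
y = np.array([0] * 28837 + [1] * 159)
folds = stratified_folds(y, 10, np.random.default_rng(0))
faults_per_fold = [int(y[f].sum()) for f in folds]
```

Each fold serves once as the test set while the remaining nine are used for training, and the reported metrics are the averages over the ten runs.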

Fig. 4 shows a two-dimensional projection of the processed data set obtained with t-distributed stochastic neighbor embedding (t-SNE), where 1.0 (black) represents a fault sample and 0.0 (gray) represents a normal sample. The t-SNE algorithm is one of the most common and effective methods for visualizing high-dimensional data. It converts the similarity between data points into probabilities and projects the high-dimensional data into a two- or three-dimensional space for visualization. Because the fault samples are completely covered when all data are plotted, only 10% of the normal samples are shown in the figure. By comparison, the t-SNE projection of this 10% subset of normal samples is substantially consistent with that of the entire data set.

It can also be seen from fig. 4 that the data are severely imbalanced and that the two classes overlap strongly, so the precision and recall of the fault class cannot both reach very high levels at the same time, i.e. the F1-score and G-means cannot reach their maximum values simultaneously. The generalization accuracy of the algorithm therefore needs to be evaluated with a composite of the two indexes. Since the two indexes are expected to be close to each other in practical scenarios, the geometric mean F-G of the two indexes, rather than their arithmetic mean, is taken as the composite index.
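The three indexes can be computed as in the sketch below. The text does not define G-means explicitly; taking it as the geometric mean of recall and specificity is an assumption, and the confusion-matrix counts in the example are hypothetical.

```python
import math

def fault_class_metrics(tp, fp, fn, tn):
    """Return (F1-score, G-mean) for the fault class from confusion counts.

    G-mean is assumed here to be sqrt(recall * specificity).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return f1, g_mean

def f_g(f1, g_mean):
    """Composite index F-G: geometric mean of F1-score and G-mean."""
    return math.sqrt(f1 * g_mean)

# Hypothetical counts for one test fold.
f1, g = fault_class_metrics(tp=12, fp=2, fn=4, tn=2882)
composite = f_g(f1, g)
```

The geometric mean never exceeds the arithmetic mean and falls further when one of the two indexes is low, which penalizes a model that is strong on only one of them. Applying `f_g` to the F1-score and G-means reported for BLS in Table 1 (0.60465 and 0.78429) gives approximately 0.6886, which agrees with the tabulated F-G value.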

First, the effects of adding cost sensitivity and Dropout to BLS are each compared with the original BLS. The BLS with cost sensitivity is denoted C-BLS and the BLS with Dropout is denoted D-BLS. The results are shown in Table 1:

TABLE 1 Effect of Using cost sensitivity and Dropout

Metric	BLS	C-BLS	D-BLS
F1-score	0.60465	0.89524	0.7439
G-means	0.78429	0.94894	0.86645
F-G	0.68864	0.9217	0.80284
Average training duration (s)	1.847	1.917	8.623

It can be seen that cost-sensitive learning greatly improves the accuracy of BLS (F-G rises from 0.68864 to 0.9217) while adding almost no computational burden (1.917 s versus 1.847 s), and that Dropout can make the prediction result more stable, at the price of more computation (8.623 s).

Next, different models are compared; the experimental results are shown in Table 2. CD-BLS is the BLS with both cost sensitivity and Dropout added; ANN and CNN are models that use the corresponding deep learning networks:

TABLE 2 Comparison of the effects of different models

Metric	CD-BLS	ANN	CNN
F1-score	0.90816	0.87999	0.91667
G-means	0.95519	0.95743	0.95743
F-G	0.93138	0.91789	0.93683
Average training duration (s)	10.147	232.652	76.899

It can be seen that, at approximately the same generalization accuracy (F-G between 0.918 and 0.937), the training duration of CD-BLS (10.147 s) is much shorter than that of the ANN (232.652 s, about 23 times longer) and the CNN (76.899 s, about 7.6 times longer). The experiments show that the anomaly monitoring model built on the width learning system meets practical requirements for real-time performance and accuracy. Moreover, since the experiment is based on one year of monitoring data from a high-speed train, the efficiency advantage of BLS can be expected to be even more pronounced when the data volume is very large, i.e. when the monitoring data extend over a much longer period.

Compared with the prior art, the width-learning-based real-time fault diagnosis method for an electromechanical product system or equipment effectively solves the class imbalance problem in monitoring data without increasing the computational burden; it greatly reduces training time while achieving high-precision diagnosis, so that the diagnosis model can be updated rapidly in an emergency; and, by adding a Dropout layer and ensemble learning, it improves the applicability, stability and robustness of the model, whose structure and parameters can be used in different scenarios.

The above-mentioned embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements made to the technical solution of the present invention by those skilled in the art without departing from the spirit of the present invention shall fall within the protection scope defined by the claims of the present invention.

Claims

1. A real-time fault diagnosis method of an electromechanical product system or equipment based on width learning is characterized by comprising the following steps:

in the formula: $k$ is the total number of samples input into the width learning system; $\lambda_i$ is an adjustable parameter;

when the $i$-th sample is a normal sample, $\lambda_i$ is 1;

when the $i$-th sample is a failure sample, $\lambda_i$ is a number greater than 1;

step 33: obtaining an improved width learning system by voting-based parallel ensemble learning, wherein the improved width learning system is composed of N width learning systems each comprising cost parameters, a hidden layer and a random inactivation layer; selecting N-1 different random inactivation probabilities $\theta$ in the random inactivation layer of step 32 to obtain N-1 width learning systems comprising cost parameters, hidden layers and random inactivation layers; randomly adding Gaussian white noise to the variables X related to the fault state of the electromechanical product system or equipment to obtain N-1 input sets; inputting the N-1 input sets into the corresponding N-1 width learning systems comprising cost parameters, hidden layers and random inactivation layers, taking the fault state Y of the electromechanical product system or equipment as output, and establishing N-1 width learning system classifiers by the method used to obtain the single classifier in step 32; obtaining an ensemble learning system from the N-1 width learning system classifiers and the single classifier obtained in step 32, the ensemble learning system comprising N classifiers; and predicting the final output result by voting over the set of output results of the ensemble learning system, to obtain a fault diagnosis model of the electromechanical product or equipment;

2. The method for real-time fault diagnosis of an electromechanical product system or equipment based on width learning as claimed in claim 1, wherein the variables X related to the fault state of the electromechanical product system or equipment are converted into the cost-sensitive input $\Lambda X$ by an input layer to which cost parameters are added, and n groups of randomly mapped feature nodes $Z^n$ and m groups of enhanced nodes $E^m$ of a hidden layer are calculated from the cost-sensitive input $\Lambda X$:

$Z^n = [Z_1, Z_2, \ldots, Z_i, \ldots, Z_n]$

$Z_i = \phi_i(\Lambda X W_{ei} + \beta_{ei})$

$E^m = [E_1, E_2, \ldots, E_j, \ldots, E_m]$

$E_j = \xi_j(Z^n W_{hj} + \beta_{hj})$

in the formula: $Z_i$ is the $i$-th group of mapped features; $\phi_i$ is a non-linear activation function; $W_{ei}$ is the optimal input weight matrix determined by sparse self-encoding; $\beta_{ei}$ is the optimal input bias matrix determined by sparse self-encoding; the dimension of the weight matrix $W_{ei}$ is $P \times v$, where $P$ is the number of variable columns contained in X; the mapped feature $Z_i$ contains $v$ nodes; $E_j$ is the $j$-th group of enhanced nodes; $\xi_j$ is a non-linear activation function; $W_{hj}$ and $\beta_{hj}$ are the random matrix and the bias matrix used for the $j$-th group of high-dimensional features, respectively; the dimensions of $W_{hj}$ and $\beta_{hj}$ are both $\left(\sum_{n} v\right) \times \eta$; the enhanced node group $E_j$ contains $\eta$ nodes;

$H = [Z^n | E^m]$.
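For illustration only, the hidden-layer construction of claim 2 can be sketched as below. Several simplifications are assumed and are not part of the claim: the input weights are drawn at random instead of being determined by sparse self-encoding, both activation functions are taken to be tanh, the biases are row vectors broadcast over the samples, and Λ is read as a diagonal matrix of per-sample cost weights. The weight value 5.0 for the fault sample is hypothetical.

```python
import numpy as np

def hidden_layer(X, lam, n, v, m, eta, rng):
    """Build H = [Z^n | E^m] from the cost-sensitive input (sketch).

    lam holds one cost weight per sample, so lam[:, None] * X plays the
    role of the cost-sensitive input with a diagonal cost matrix.
    """
    LX = lam[:, None] * X
    P = X.shape[1]
    # n groups of mapped feature nodes, each with v nodes.
    Z = [np.tanh(LX @ rng.standard_normal((P, v)) + rng.standard_normal(v))
         for _ in range(n)]
    Zn = np.hstack(Z)
    # m groups of enhanced nodes, each with eta nodes, fed by all mapped features.
    E = [np.tanh(Zn @ rng.standard_normal((n * v, eta)) + rng.standard_normal(eta))
         for _ in range(m)]
    Em = np.hstack(E)
    return np.hstack([Zn, Em])

rng = np.random.default_rng(0)
X = rng.random((6, 4))                 # 6 samples, 4 variables
y = np.array([0, 0, 0, 0, 1, 0])       # one fault sample
lam = np.where(y == 1, 5.0, 1.0)       # hypothetical cost weights
H = hidden_layer(X, lam, n=3, v=2, m=2, eta=5, rng=rng)
```

The resulting matrix has one row per sample and n·v + m·η columns, here 3·2 + 2·5 = 16.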

3. The method for real-time fault diagnosis of electromechanical product systems or devices based on width learning according to claim 2, wherein in step 32 a random inactivation layer is set in the existing width learning system, specifically as follows:

$\Theta * H / (1-\theta)$

in the formula: $\Theta$ is a 0-1 vector randomly generated by a Bernoulli function with the random inactivation probability $\theta$;

$\Theta * H / (1-\theta) * W = \Theta [Z^n | E^m] / (1-\theta) * W = Y$
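The random inactivation of claim 3 can be illustrated with the sketch below. It assumes that the 0-1 vector has one entry per hidden node, so that whole columns of H are dropped, and that kept nodes are rescaled by 1/(1-θ). The claim does not state how W is obtained from the final equation; the ridge-regularized least-squares solve and its `ridge` value are assumptions of this sketch.

```python
import numpy as np

def dropout_hidden(H, theta, rng):
    """Apply random inactivation with probability theta to the hidden nodes.

    keep is the 0-1 vector: 1 keeps a node, 0 inactivates it. Kept nodes are
    divided by (1 - theta) so that the expected activation is unchanged.
    """
    keep = rng.binomial(1, 1.0 - theta, size=H.shape[1])
    return keep * H / (1.0 - theta), keep

def solve_output_weights(H_drop, Y, ridge=1e-3):
    """Solve H_drop @ W = Y in the regularized least-squares sense (assumed)."""
    A = H_drop.T @ H_drop + ridge * np.eye(H_drop.shape[1])
    return np.linalg.solve(A, H_drop.T @ Y)

rng = np.random.default_rng(1)
H = np.ones((5, 8))                    # toy hidden-layer output
Y = np.ones((5, 1))                    # toy targets
H_drop, keep = dropout_hidden(H, theta=0.25, rng=rng)
W = solve_output_weights(H_drop, Y)
```

Because the output weights are obtained from a single linear solve, not by iterative back-propagation, retraining with a different θ is inexpensive, which is what makes building several classifiers with different inactivation probabilities practical.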

4. The method for real-time fault diagnosis of electromechanical product systems or devices based on width learning according to claim 1, wherein said fault status in step 1 comprises fault and no fault.

5. The method for real-time fault diagnosis of electromechanical product system or device based on width learning according to claim 1, wherein the monitoring data in step 1 is heterogeneous data, and the heterogeneous data comprises continuous features, discrete features and signal features.

6. The method for real-time fault diagnosis of electromechanical product system or device based on width learning according to claim 1, wherein the data preprocessing in step 2 comprises filling missing values, replacing abnormal values, handling dimensional differences and digitizing; an abnormal value is a record that obviously exceeds a specified environment threshold of the system or equipment; handling dimensional differences refers to data normalization, which converts the monitoring data of the electromechanical product system or equipment into numerical variables with values greater than 0 and less than 1.