CN113505445B

CN113505445B - Real-time fault diagnosis method and system based on sequential random forest

Info

Publication number: CN113505445B
Application number: CN202111058850.3A
Authority: CN
Inventors: 宋佳; 艾绍洁; 赵凯; 苏江城; 尚维泽; 蔡国飙
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2021-09-10
Filing date: 2021-09-10
Publication date: 2022-03-01
Anticipated expiration: 2041-09-10
Also published as: CN113505445A

Abstract

The invention provides a real-time fault diagnosis method and system based on a sequential random forest. The method comprises: acquiring a residual signal to be detected of the nonlinear system to be diagnosed; extracting a time-frequency characteristic vector of the residual signal to be detected based on wavelet packet decomposition; substituting the time-frequency characteristic vector into a trained random forest fault separator, and performing a sequential probability ratio test on the nonlinear system to be diagnosed to obtain target fault information, the target fault information comprising a target fault type and a target fault occurrence time; and substituting the time-frequency characteristic vector and the target fault type into a trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type. The invention solves the technical problems in the prior art that real-time fault diagnosis cannot be realized and that the missed diagnosis rate and the misdiagnosis rate are high.

Description

Real-time fault diagnosis method and system based on sequential random forest

Technical Field

The invention relates to the technical field of unmanned aerial vehicle fault diagnosis, in particular to a real-time fault diagnosis method and system based on sequential random forests.

Background

Existing methods for diagnosing faults through quantitative analysis fall mainly into methods based on analytical models and data-driven methods. The core idea of fault diagnosis based on an analytical model is to estimate the observed values of the system variables from the state, and to take the residual between these estimates and the true values of the system state variables as the source and basis of diagnosis. The residual is typically obtained in two ways: based on a state observer, or based on a reference model. The key point of the former is to design an observer and a filter with strong filtering and high dynamic response characteristics; the difficulty of the latter is to construct an accurate nominal model of the high-dimensional nonlinear system. The estimation precision of the state observer and the design precision of the reference model directly influence, to a certain extent, the accuracy of fault diagnosis.

To realize active fault-tolerant control of a system, fault diagnosis needs to provide two main functions: first, determining the type and severity of the fault, that is, fault separation and identification capability; second, a real-time system state monitoring mechanism that can rapidly detect the occurrence of faults. Data-driven methods based on machine learning do not depend on a model; by learning and extracting features from massive data to fit the nonlinear mapping between residual observation data and the system state, they can greatly improve the capability of the fault diagnosis system. The random forest, a typical ensemble learning method, achieves excellent resistance to overfitting and insensitivity to noise by introducing randomness. Compared with a neural network, the random forest has few hyper-parameters, does not need feature subset selection, can process the high-dimensional data generated by a complex system, uses unbiased estimation for training and learning, and needs neither normalization nor an extra validation set. It is simple to implement and can realize rapid fault diagnosis. In addition, the random forest is capable of both regression and classification: the regression random forest can identify the size of a fault, and the classification random forest can classify the fault mode. Despite these advantages, the random forest considers only the dependence of state data in the spatial domain and ignores the correlation of historical data in the time domain; diagnosis based on a single sample is limited by random disturbances such as sensor noise, so real-time fault diagnosis cannot be realized, and the missed diagnosis rate and the misdiagnosis rate are high.

Disclosure of Invention

In view of the above, the present invention provides a real-time fault diagnosis method and system based on sequential random forest to solve the technical problems of the prior art that real-time fault diagnosis cannot be realized and the missed diagnosis rate and the misdiagnosis rate are high.

In a first aspect, an embodiment of the present invention provides a real-time fault diagnosis method based on a sequential random forest, including: acquiring a residual signal to be detected of the nonlinear system to be diagnosed; extracting a time-frequency characteristic vector of the residual signal to be detected based on wavelet packet decomposition; substituting the time-frequency characteristic vector into a trained random forest fault separator, and performing a sequential probability ratio test on the nonlinear system to be diagnosed to obtain target fault information, the target fault information comprising a target fault type and a target fault occurrence time; and substituting the time-frequency characteristic vector and the target fault type into a trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type.

Further, acquiring a residual signal to be detected of the nonlinear system to be diagnosed, including: establishing a nominal model of the nonlinear system to be diagnosed, and acquiring a nominal system residual error signal of the nominal model; acquiring an observation residual signal of the nonlinear system to be diagnosed; and obtaining the residual signal to be detected based on the nominal system residual signal and the observation residual signal.

Further, obtaining the residual signal to be detected based on the nominal system residual signal and the observation residual signal includes determining the residual signal to be detected by the following formula: $e_l = e_s - g(e_n, t_r, t_c)$; wherein $e_l$ is the residual signal to be detected, $e_s$ is the observation residual signal, $e_n$ is the nominal system residual signal, $g(e_n, t_r, t_c)$ is the nominal residual injection function, and $t_r$ and $t_c$ are respectively the command signal change time and the nominal residual injection duration.

Further, the method further comprises: acquiring a residual error sample training set of the nonlinear system to be diagnosed; the residual sample training set comprises residual sample data in a normal mode and residual sample data in a fault mode; and training a preset random forest fault separator based on the residual error sample training set, and optimizing the structural parameter combination of the preset random forest fault separator by using a sparrow optimization algorithm to obtain the trained random forest fault separator corresponding to the optimal structural parameter combination.

Further, the method further comprises: acquiring a fault type training set of the nonlinear system to be diagnosed; the fault type training set comprises residual sample data of the nonlinear system to be diagnosed under each fault type; and training a preset regression random forest fault identifier based on the fault type training set, and optimizing the structural parameter combination of the preset regression random forest fault identifier by using a sparrow optimization algorithm to obtain the trained regression random forest fault identifier corresponding to the optimal structural parameter combination.

Further, substituting the time-frequency characteristic vector into a random forest fault separator after training, and performing sequential probability ratio test on the nonlinear system to be diagnosed to obtain target fault information, wherein the steps comprise: substituting the time-frequency characteristic vector into a random forest fault separator after training, and obtaining the probability output of each fault type corresponding to the current state according to the forest voting result; calculating sequential probability ratio statistics for each fault type based on the probability outputs; and performing sequential probability ratio test on the nonlinear system to be diagnosed based on the sequential probability ratio statistic of each fault type to obtain target fault information.

Further, calculating sequential probability ratio statistics for each fault type based on the probability outputs includes calculating the sequential probability ratio statistic for each fault type by the following formula:

$$LR_t^k = LR_{t-t_s}^k + \ln\frac{P_k(X_t)}{P_0(X_t)}$$

wherein $LR_t^k$ is the sequential probability ratio statistic of the k-th fault type occurring in the nonlinear system to be diagnosed at time $t$, $LR_{t-t_s}^k$ is the sequential probability ratio statistic of the k-th fault type occurring in the nonlinear system to be diagnosed at time $t - t_s$, $t_s$ is the truncation time interval of the residual signal to be detected, $P_k(X_t)$ is the probability that the k-th fault type occurs in the nonlinear system to be diagnosed at time $t$, and $P_0(X_t)$ is the probability that the nonlinear system to be diagnosed is normal at time $t$.

In a second aspect, an embodiment of the present invention further provides a real-time fault diagnosis system based on a sequential random forest, including an acquisition module, an extraction module, a fault separation module and a fault identification module. The acquisition module is used for acquiring a residual signal to be detected of the nonlinear system to be diagnosed; the extraction module is used for extracting the time-frequency characteristic vector of the residual signal to be detected based on wavelet packet decomposition; the fault separation module is used for substituting the time-frequency characteristic vector into a trained random forest fault separator and performing a sequential probability ratio test on the nonlinear system to be diagnosed to obtain target fault information, the target fault information comprising a target fault type and a target fault occurrence time; and the fault identification module is used for substituting the time-frequency characteristic vector and the target fault type into a trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type.

In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method according to the first aspect when executing the computer program.

In a fourth aspect, the present invention further provides a computer-readable medium having non-volatile program code executable by a processor, where the program code causes the processor to execute the method according to the first aspect.

The invention provides a real-time fault diagnosis method and system based on a sequential random forest. By combining the random forest fault separator with the sequential probability ratio test, samples can be accumulated continuously during fault detection until a fault is detected, so that real-time, self-adaptive, multi-sample fault diagnosis is realized. This solves the technical problems in the prior art that real-time fault diagnosis cannot be realized and that the missed diagnosis rate and the misdiagnosis rate are high.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

Fig. 1 is a flowchart of a real-time fault diagnosis method based on a sequential random forest according to an embodiment of the present invention;

Fig. 2 is a structural diagram of the fault diagnosis residual observer of a quad-rotor unmanned aerial vehicle system according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a real-time fault diagnosis system based on a sequential random forest according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of another real-time fault diagnosis system based on a sequential random forest according to an embodiment of the present invention.

Detailed Description

The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Embodiment 1:

Fig. 1 is a flowchart of a real-time fault diagnosis method based on a sequential random forest according to an embodiment of the present invention; the method is applied to fault detection of a nonlinear system. As shown in Fig. 1, the method specifically includes the following steps:

Step S102: acquiring a residual signal to be detected of the nonlinear system to be diagnosed.

Step S104: extracting the time-frequency characteristic vector of the residual signal to be detected based on wavelet packet decomposition.

Step S106: substituting the time-frequency characteristic vector into the trained random forest fault separator, and performing a sequential probability ratio test on the nonlinear system to be diagnosed to obtain target fault information. The target fault information comprises a target fault type and a target fault occurrence time.

Step S108: substituting the time-frequency characteristic vector and the target fault type into the trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type.

The invention provides a real-time fault diagnosis method based on a sequential random forest. By combining the random forest fault separator with the sequential probability ratio test, samples can be accumulated continuously during fault detection until a fault is detected, so that real-time, self-adaptive, multi-sample fault diagnosis is realized. This solves the technical problems in the prior art that real-time fault diagnosis cannot be realized and that the missed diagnosis rate and the misdiagnosis rate are high.

Optionally, step S102 further includes the steps of:

Step S1021: establishing a nominal model of the nonlinear system to be diagnosed, and acquiring a nominal system residual signal of the nominal model.

Step S1022: acquiring an observation residual signal of the nonlinear system to be diagnosed.

Step S1023: obtaining the residual signal to be detected based on the nominal system residual signal and the observation residual signal. Specifically, the residual signal to be detected is determined by the following formula: $e_l = e_s - g(e_n, t_r, t_c)$; wherein $e_l$ is the residual signal to be detected, $e_s$ is the observation residual signal, $e_n$ is the nominal system residual signal, $g(e_n, t_r, t_c)$ is the nominal residual injection function, and $t_r$ and $t_c$ are respectively the command signal change time and the nominal residual injection duration.

In the following, the embodiment of the present invention takes a quad-rotor unmanned aerial vehicle system as an example of a nonlinear system to be diagnosed, and describes a specific process for acquiring a residual signal to be detected. It should be noted that the quad-rotor drone system in the embodiment of the present invention satisfies the following conditions:

(1) the body of the quad-rotor unmanned aerial vehicle is a rigid body, and neither deformation nor mass change occurs during flight;

(2) the mass distribution of the quad-rotor unmanned aerial vehicle is uniform, and the gravity center is positioned at the geometric center of the unmanned aerial vehicle body;

(3) the quad-rotor unmanned aerial vehicle flies at low altitude and in short distance, so that the surface of the earth is seen as a plane, and the gravity acceleration is unchanged in the flying process;

(4) the influence of earth rotation and revolution on the motion of the quad-rotor unmanned aerial vehicle is ignored, and the ground coordinate system is an inertial coordinate system.

A body coordinate system is selected: the origin $O_b$ is at the centroid of the quad-rotor unmanned aerial vehicle, the $x_b$ axis points to the nose, the $y_b$ axis points to the right of the fuselage, and the $z_b$ axis points to the underside of the fuselage. A ground coordinate system is selected: the origin $O_e$ is at any point on the ground, the $x_e$ axis lies in the horizontal plane and points in a chosen direction, the $y_e$ axis is perpendicular to the ground and points towards the centre of the earth, and the direction of the $z_e$ axis is determined by the right-hand rule.

Euler angles are defined to represent the flight attitude of the quad-rotor unmanned aerial vehicle: the roll angle φ is the angle through which the body coordinate system rotates about the x axis (rolling to the right is positive), the pitch angle θ is the angle through which the body coordinate system rotates about the y axis (nose up is positive), and the yaw angle ψ is the angle through which the body coordinate system rotates about the z axis (yawing to the right is positive).

A mathematical model of the quad-rotor unmanned aerial vehicle, which is composed of a translation model and a rotation model, is constructed as follows:

(1)

In the formula, $x, y, z$ are the three axis directions of the ground coordinate system; $J_x, J_y, J_z$ are the moments of inertia about the three axes; $J_r$ is the moment of inertia of the rotor; $U_2$ is the rolling moment that rotates the body about its $x$ axis; $U_3$ is the pitching moment that rotates the body about its $y$ axis; $U_4$ is the yawing moment of the quad-rotor unmanned aerial vehicle; $\omega_1, \omega_2, \omega_3, \omega_4$ are the speeds of the four rotors; $U_1$ is the total thrust; $D_x, D_y, D_z$ are the air resistance coefficients in the three axis directions; $m$ is the mass of the quad-rotor unmanned aerial vehicle; and $g$ is the gravitational acceleration.

To model faults of the accelerometer sensor that measures the pitch angle, equation (1) is simplified, without loss of generality, to a longitudinal channel model:

(2)

Common sensor fault models are established for the following fault types:

constant bias fault;

stuck fault;

gain variation fault;

outlier data fault.

The sensor faults that this solution is capable of diagnosing include, but are not limited to, the above 4 faults.
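The four fault types named above can be illustrated with a minimal Python sketch. The functional forms, the function name `inject_sensor_fault` and every parameter name are assumptions chosen for illustration; they follow commonly used textbook forms and are not taken from the formulas of this document.

```python
import random


def inject_sensor_fault(y_true, t, t_f, kind, delta=0.1, gain=1.5,
                        stuck_value=0.0, spike=1.0, outlier_prob=0.05,
                        rng=None):
    """Return a hypothetical faulty sensor reading at time t.

    Before the assumed fault time t_f the true value passes through unchanged.
    """
    if t < t_f or kind == "normal":
        return y_true
    if kind == "bias":       # constant deviation added to the reading
        return y_true + delta
    if kind == "stuck":      # output frozen at a fixed value
        return stuck_value
    if kind == "gain":       # reading scaled by a constant factor
        return gain * y_true
    if kind == "outlier":    # occasional spikes on an otherwise good reading
        rng = rng or random
        return y_true + spike if rng.random() < outlier_prob else y_true
    raise ValueError(f"unknown fault kind: {kind}")
```

For example, `inject_sensor_fault(theta, t, t_f=5.0, kind="gain")` would scale a pitch angle reading `theta` by 1.5 once `t` reaches 5.0, which gives labeled fault data for training.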

To better realize attitude control of the quad-rotor unmanned aerial vehicle, a linear active disturbance rejection control (LADRC) controller is adopted, and the pitch angle residual signal is obtained from the linear extended state observer (LESO) of the controller. A third-order LESO is designed based on the longitudinal channel model of equation (2):

(3)

where $e$ is the error between the observed quantity $z_1$ and the controlled object output $y$, $u$ is the output of the controller, and $\beta_1, \beta_2, \beta_3$ are adjustable observer parameters. Fig. 2 is a structural diagram of the fault diagnosis residual observer of a quad-rotor unmanned aerial vehicle system according to an embodiment of the present invention.
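As an illustration of how such an observer produces the residual e, the following sketch steps a generic third-order LESO with forward Euler integration. The observer structure is the commonly used textbook form and is assumed here, not copied from equation (3). The input gain `b0`, the bandwidth parameterization of the β gains, and the toy plant in `simulate` are hypothetical choices.

```python
def leso_step(z, y, u, betas, b0, dt):
    """One forward-Euler step of a generic third-order LESO.

    z = (z1, z2, z3): estimates of output, its derivative, total disturbance.
    Returns the updated estimates and the residual e = z1 - y.
    """
    z1, z2, z3 = z
    beta1, beta2, beta3 = betas
    e = z1 - y                                   # observation residual
    z1_next = z1 + dt * (z2 - beta1 * e)
    z2_next = z2 + dt * (z3 - beta2 * e + b0 * u)
    z3_next = z3 + dt * (-beta3 * e)
    return (z1_next, z2_next, z3_next), e


def simulate(disturbance=2.0, omega_o=20.0, dt=0.001, steps=3000):
    """Observe a toy double integrator y'' = b0*u + disturbance with u = 0."""
    # Bandwidth parameterization of the gains (an assumed tuning rule).
    betas = (3 * omega_o, 3 * omega_o ** 2, omega_o ** 3)
    b0 = 1.0
    y, v = 0.0, 0.0
    z = (0.0, 0.0, 0.0)
    residuals = []
    for _ in range(steps):
        u = 0.0
        z, e = leso_step(z, y, u, betas, b0, dt)
        residuals.append(e)
        # Advance the toy plant with the same Euler step.
        y, v = y + dt * v, v + dt * (b0 * u + disturbance)
    return z, residuals
```

In this toy setting the third state converges to the constant disturbance and the residual decays to zero; a sensor fault added to `y` would show up as a non-zero residual.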

Taking equation (1) as the nominal model of the quad-rotor unmanned aerial vehicle system, the observation residual signal of the quad-rotor unmanned aerial vehicle system and the residual signal of the nominal model, obtained through the LESO of the simulated unmanned aerial vehicle system model, give the final residual signal to be detected:

$e_l = e_s - g(e_n, t_r, t_c)$ (4)

In the formula, $e_l$, $e_s$ and $e_n$ are respectively the residual signal to be detected, the observation residual signal and the nominal system residual signal. $g(e, t_1, t_2) = e$ for $t_1 < t < t_2$ is the nominal residual injection function, where $t_r$ and $t_c$ are respectively the command signal change time and the nominal residual injection duration; the latter is determined empirically from the duration of the residual response caused by a control change.
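The compensation in equation (4) can be sketched as follows. Two points are assumptions of this sketch and are not stated explicitly in the text: the injection window is taken to run from t_r to t_r + t_c, and the injection function is taken to be zero outside that window. The function names are hypothetical.

```python
def nominal_injection(e_n, t, t_r, t_c):
    """g(e_n, t_r, t_c): pass the nominal residual only inside the window.

    Assumed window: t_r < t < t_r + t_c; assumed value outside: 0.
    """
    return e_n if t_r < t < t_r + t_c else 0.0


def residual_to_detect(e_s, e_n, t, t_r, t_c):
    """e_l = e_s - g(e_n, t_r, t_c), evaluated at time t."""
    return e_s - nominal_injection(e_n, t, t_r, t_c)
```

Inside the window the residual caused by a command change is subtracted, so that a commanded manoeuvre is not mistaken for a fault; outside the window the observation residual is passed through unchanged.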

Then, 3-level wavelet packet decomposition is performed on the residual signal to be detected, and the decomposed signals of the 8 frequency bands are reconstructed. For each reconstructed signal, 9 statistical parameters are calculated: waveform index, margin index, pulse index, peak index, absolute mean, mean square error, skewness, kurtosis and square root amplitude. This yields the 72-dimensional (8 × 9) characteristic vector of the residual signal sample.
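A minimal sketch of this feature extraction is given below, using only the Python standard library. Several simplifications are assumptions of the sketch: the Haar wavelet is used (the text does not name a wavelet), the statistics are computed on the band coefficients rather than on reconstructed band signals, the signal length must be a multiple of 8, and the definitions of the nine statistics follow common usage, with the mean square error taken as the population standard deviation.

```python
import math


def haar_split(x):
    """Split a signal into Haar approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_bands(x, levels=3):
    """Full wavelet packet tree: every band is split again at each level."""
    bands = [list(x)]
    for _ in range(levels):
        bands = [half for band in bands for half in haar_split(band)]
    return bands  # 2**levels bands


def nine_statistics(x, eps=1e-12):
    """Nine time-domain statistics of one band (assumed common definitions)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    peak = max(abs(v) for v in x)
    sra = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2  # square root amplitude
    skew = sum((v - mean) ** 3 for v in x) / n / (std ** 3 + eps)
    kurt = sum((v - mean) ** 4 for v in x) / n / (std ** 4 + eps)
    return [
        rms / (abs_mean + eps),   # waveform index
        peak / (sra + eps),       # margin index
        peak / (abs_mean + eps),  # pulse index
        peak / (rms + eps),       # peak index
        abs_mean, std, skew, kurt, sra,
    ]


def extract_features(x):
    """Concatenate the nine statistics of all 8 bands into one vector."""
    return [s for band in wavelet_packet_bands(x) for s in nine_statistics(band)]
```

Applied to a residual window of 64 samples, `extract_features` returns a list of 8 × 9 = 72 values, matching the dimension stated above.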

Optionally, the method provided by the embodiment of the present invention further includes a training and optimizing process for the preset random forest fault separator, where the process is an offline training part. Specifically, the method comprises the following steps:

acquiring a residual error sample training set of a nonlinear system to be diagnosed; the residual sample training set comprises residual sample data in a normal mode and residual sample data in a fault mode;

training a preset random forest fault separator based on a residual sample training set, and optimizing a structural parameter combination of the preset random forest fault separator by using a sparrow optimization algorithm to obtain a trained random forest fault separator corresponding to the optimal structural parameter combination.

Specifically, the residual sample data obtained in each of the k+1 modes (the normal mode and k fault modes) are sampled with sampling time $T_s$, and all samples of length $t_w$ within $[t_f - t_w, t_f + t_w]$ are truncated with truncation interval $t_s$. Characteristic data samples are obtained through feature extraction and labeled respectively, generating the fault residual sliding sample training set.
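The sliding truncation can be sketched as below. The function name `sliding_windows` is hypothetical, and the sketch assumes that every window of length t_w must lie entirely inside [t_f - t_w, t_f + t_w], so that window start times run from t_f - t_w to t_f in steps of t_s.

```python
def sliding_windows(signal, T_s, t_f, t_w, t_s):
    """Cut all windows of length t_w inside [t_f - t_w, t_f + t_w].

    signal is sampled every T_s seconds; consecutive windows start t_s apart.
    """
    win = round(t_w / T_s)            # samples per window
    step = round(t_s / T_s)           # samples between window starts
    first = round((t_f - t_w) / T_s)  # index of the earliest window start
    last = round(t_f / T_s)           # a window starting at t_f ends at t_f + t_w
    return [signal[s:s + win]
            for s in range(first, last + 1, step)
            if s >= 0 and s + win <= len(signal)]
```

Each returned window would then be passed through feature extraction and labeled with its mode.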

A search space is set for the structural parameter combination of the preset random forest model (the number of decision trees contained in the random forest and the number of characteristic variables used at each node of a binary tree), and the population size and the maximum number of generations are specified. A population hierarchy is set, including the proportions of leaders, followers and sentinels, and a danger threshold is given. An initial population is randomly generated in the search space, completing the initialization of the improved sparrow structural parameter optimization algorithm.

Each parameter combination within the set search space is regarded as a sparrow; the positions of the individuals in the population are updated by the sparrow algorithm based on the hybrid performance fitness function, and the population hierarchy is updated. The hybrid performance fitness function is defined as follows:

$\mathrm{fit}_f(X) = \xi_{f1}\,\mathrm{oob}_f(X) + \xi_{f2}\,T_f(X)$ (5)

In the formula, $X$ is the parameter combination variable, $T_f(X)$ is the time taken by the random forest classifier constructed from $X$ to train on the training set obtained above, $\mathrm{oob}_f(X)$ is the out-of-bag error of the corresponding random forest classification, and $\xi_{f1}, \xi_{f2}$ are the weighting coefficients of the hybrid performance.

Lévy flight mutation is applied to specific individuals of the population. Using tournament selection, a certain number of sparrow individuals are drawn with replacement to take part in each tournament, and the individual with the smallest fitness is selected for Lévy flight mutation; the selection step is repeated until the number of selected individuals equals the total number of individuals in the population.

The individual positions are then updated and selected: after the Lévy flight, the position with the smaller fitness replaces the original position. The position updating formula is as follows:

$$X_i^{it\,\prime} = X_i^{it} + X_i^{it} \otimes \mathrm{Levy}(d)$$

In the formula, $\otimes$ denotes point-to-point multiplication, $\mathrm{Levy}(d)$ is the Lévy flight mutation factor, $d$ is the dimension of the structural parameters to be optimized, and $X_i^{it}$ is the position of the i-th mutated individual in generation $it$.

After the maximum number of generations is reached, the iterative computation exits, the individual with the smallest fitness is selected as the optimal individual, and a classification random forest is constructed according to the corresponding optimal structural parameter combination, completing the off-line training.
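The tournament selection and Lévy flight mutation steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the full improved sparrow algorithm: the Lévy step is drawn with Mantegna's method (a common choice that the text does not specify), the mutation takes the multiplicative form x + scale · x · Levy(d), and the names `levy`, `tournament_pick`, `levy_mutate`, `scale` and `bounds` are hypothetical. Rounding of integer parameters such as the number of trees is omitted.

```python
import math
import random


def levy(d, beta=1.5, rng=random):
    """Draw a d-dimensional Levy-distributed step (Mantegna's method)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)
    return [rng.gauss(0, sigma_u) / abs(rng.gauss(0, 1)) ** (1 / beta)
            for _ in range(d)]


def tournament_pick(population, fitness, size, rng=random):
    """Draw `size` contestants with replacement; return the index of the fittest."""
    contestants = [rng.randrange(len(population)) for _ in range(size)]
    return min(contestants, key=lambda i: fitness(population[i]))


def levy_mutate(population, fitness, bounds, size=3, scale=0.01, rng=random):
    """Mutate tournament winners and keep a mutant only if its fitness is smaller."""
    new_pop = [list(p) for p in population]
    for _ in range(len(population)):  # one selection per population member
        i = tournament_pick(new_pop, fitness, size, rng)
        x = new_pop[i]
        step = levy(len(x), rng=rng)
        candidate = [min(max(xj + scale * xj * sj, lo), hi)
                     for xj, sj, (lo, hi) in zip(x, step, bounds)]
        if fitness(candidate) < fitness(x):  # greedy replacement
            new_pop[i] = candidate
    return new_pop
```

Here `fitness` would be the hybrid performance fitness of equation (5), evaluated by training a forest with the candidate parameters; because replacement is greedy, the fitness of each individual never increases in this step.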

Optionally, step S106 further includes the following steps:

Step S1061: substituting the time-frequency characteristic vector into the trained random forest fault separator, and obtaining the probability output of each fault type corresponding to the current state according to the forest voting result.

Step S1062: calculating sequential probability ratio statistics for each fault type based on the probability outputs. Specifically, the sequential probability ratio statistic for each fault type is calculated by the following formula:

$$LR_t^k = LR_{t-t_s}^k + \ln\frac{P_k(X_t)}{P_0(X_t)}$$

wherein $LR_t^k$ is the sequential probability ratio statistic of the k-th fault type occurring in the nonlinear system to be diagnosed at time $t$, $LR_{t-t_s}^k$ is the sequential probability ratio statistic of the k-th fault type occurring in the nonlinear system to be diagnosed at time $t - t_s$, $t_s$ is the truncation time interval of the residual signal to be detected, $P_k(X_t)$ is the probability that the k-th fault type occurs in the nonlinear system to be diagnosed at time $t$, and $P_0(X_t)$ is the probability that the nonlinear system to be diagnosed is normal at time $t$.

Step S1063: performing a sequential probability ratio test on the nonlinear system to be diagnosed based on the sequential probability ratio statistic of each fault type, to obtain the target fault information.

The residual samples of all preceding sampling instants from $t - t_w$ to the current time $t$ are input into the trained classification random forest, and the probability output of each fault for the current state is obtained from the forest voting result:

(6)

In the formula, $k$ is the fault mode label; the indicator term is 1 when the voting result of a binary tree at time $t$ is fault $k$, and 0 otherwise; the i-th binary tree votes according to a classification strategy based on $m$ randomly selected characteristic variables; $n$ is the number of decision trees contained in the random forest; $m$ is the number of characteristic variables used at each node of a binary tree; and $i$ is the cumulative count index.
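The vote-proportion probability described here can be sketched in a few lines. The function name `vote_probabilities` and the label convention (0 for the normal mode, 1 to k for the fault modes) are assumptions of this sketch.

```python
from collections import Counter


def vote_probabilities(tree_votes, num_fault_modes):
    """Turn the class votes of n trees into a probability for each mode.

    tree_votes: one predicted label per decision tree
    (0 = normal, 1..num_fault_modes = fault modes).
    """
    counts = Counter(tree_votes)
    n = len(tree_votes)
    return [counts.get(k, 0) / n for k in range(num_fault_modes + 1)]
```

With ten trees voting `[0, 0, 1, 2, 1, 1, 0, 1, 1, 1]` and two fault modes, the function returns `[0.3, 0.6, 0.1]`: the proportion of trees voting for the normal mode, fault 1 and fault 2.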

The fault mode is taken as a discrete random variable $X = \{0, 1, 2, \dots, k\}$, which at time $t$ follows the probability distribution $h_{n,m}(X_t)$ of equation (6). The sequential probability ratio statistic at time $t$ is calculated for the k-th fault:

$LR_t^k = LR_{t-t_s}^k + \ln\dfrac{P_k(X_t)}{P_0(X_t)}$ (7)

In the formula, $P_k(X)$ represents the probability that a class-k fault occurs in the system, and $P_0(X)$ represents the probability that the system is normal.

The sequential probability ratio statistic $LR_t$ is computed iteratively according to equation (7), and the sequential test is carried out:

(1) If $LR_t \le \ln\dfrac{P_{FNR}}{1 - P_{FPR}}$, the test stops and the system is judged to be in the normal mode. In the formula, $P_{FNR}$ is the missed diagnosis rate and $P_{FPR}$ is the misdiagnosis rate.

(2) If $LR_t \ge \ln\dfrac{1 - P_{FNR}}{P_{FPR}}$, the test stops and the system is judged to be in the k-th fault mode.

(3) If $\ln\dfrac{P_{FNR}}{1 - P_{FPR}} < LR_t < \ln\dfrac{1 - P_{FNR}}{P_{FPR}}$, the test waits for the sample at the next sampling instant, and the system is judged to remain in the state of the previous sampling instant.
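The three-way decision can be illustrated with a minimal sketch of a sequential probability ratio test for a single fault type. The thresholds use the classical Wald form in terms of the missed diagnosis rate and the misdiagnosis rate, which is assumed here; the clipping constant `eps`, which avoids taking the logarithm of zero when no tree votes for a mode, is also an assumption of the sketch.

```python
import math


def wald_thresholds(p_fnr, p_fpr):
    """Classical Wald thresholds on the log-likelihood-ratio statistic."""
    lower = math.log(p_fnr / (1 - p_fpr))   # at or below: decide normal
    upper = math.log((1 - p_fnr) / p_fpr)   # at or above: decide fault
    return lower, upper


def sprt(prob_pairs, p_fnr=0.01, p_fpr=0.01, eps=1e-6):
    """Run the test over a stream of (P_k, P_0) pairs.

    Returns (decision, index of the sample at which the test stopped).
    """
    lower, upper = wald_thresholds(p_fnr, p_fpr)
    lr = 0.0
    for step, (p_k, p_0) in enumerate(prob_pairs):
        lr += math.log(max(p_k, eps) / max(p_0, eps))  # accumulate evidence
        if lr <= lower:
            return "normal", step
        if lr >= upper:
            return "fault", step
    return "undecided", len(prob_pairs)
```

With both rates set to 0.01 the thresholds are about ±4.6, so a run of samples in which 90 % of the trees vote for the fault (an increment of ln 9 ≈ 2.2 per sample) is decided at the third sample.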

In the embodiment of the invention, because the sequential probability ratio test can only decide between the occurrence and the non-occurrence of a fault, the decision rule of the sequential test is improved so that it can be applied to fault testing with multiple fault types. The improvements are as follows:

(1) The sequential probability ratio statistics of the k fault modes are computed simultaneously, and the iterative calculation and test steps above are carried out independently for each fault mode; while the result for a fault mode does not indicate a fault, its independent test continues.

(2) If exactly one fault type is judged to have occurred, the system is judged to have the k-th fault; if two or more fault types are judged to have occurred, the system is judged to have the k-th fault corresponding to the largest sequential probability ratio statistic.

(3) Class k failures continue to be checked independently, other classes of failures suspend checking, and

。

(4) If the test of the k-th fault satisfies the corresponding condition, the fault behavior is judged to have ended, and the procedure returns to step (1) to continue the calculation for all fault modes.

To eliminate the delay in detecting the occurrence of a fault caused by negative values accumulating below the negative threshold, and the delay in detecting the end of a fault caused by positive values accumulating above the positive threshold, the embodiment of the invention also corrects the iterative calculation of the sequential probability ratio statistic:

In the formula, the increment term represents the increment of the sequential probability ratio statistic at time t.
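One way to realize such a correction is to clamp the statistic to the interval between the two thresholds after each increment. The sketch below assumes this clamping form, which may differ from the corrected formula of the embodiment; the function name and the clipping constant `eps` are hypothetical.

```python
import math


def clamped_update(lr_prev, p_k, p_0, lower, upper, eps=1e-6):
    """Add the log-ratio increment, then clamp the statistic to [lower, upper]."""
    increment = math.log(max(p_k, eps) / max(p_0, eps))
    return min(max(lr_prev + increment, lower), upper)
```

Without clamping, 100 normal samples would drive the statistic far below the negative threshold, and a subsequent fault would need a correspondingly long run of samples to climb back. With clamping, the statistic rests at the negative threshold and reaches the positive threshold after only a few faulty samples.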

Optionally, the method provided in the embodiment of the present invention further includes a training and optimizing process for the pre-set regression random forest fault identifier, where the process is an offline training part. Specifically, the method comprises the following steps:

acquiring a fault type training set of a nonlinear system to be diagnosed; the fault type training set comprises residual sample data of the nonlinear system to be diagnosed under each fault type;

and training the preset regression random forest fault identifier based on the fault type training set, and optimizing the structural parameter combination of the preset regression random forest fault identifier by using a sparrow optimization algorithm to obtain the trained regression random forest fault identifier corresponding to the optimal structural parameter combination.

Specifically, k regression random forest identifiers are established, one for each of the k different faults; the identifiers are mutually independent and are established in the same way. The offline part of the fault identification comprises the following steps:

carrying out equal difference value on the size of the kth fault, and sampling residual sample data obtained under different fault sizes for T_sSimply discrete sampling and clipping to obtain [ t ]_f-t_w+t_s,t_f+t_s]And (3) obtaining characteristic data samples through characteristic extraction, respectively labeling the data samples, wherein the labels are values of the size of the fault, and generating a fault residual error sample training set.

Set the search space of the random forest structural parameters (the number of decision trees contained in the random forest and the number of feature variables used for binary splitting at each node), and specify the population size and the maximum number of generations. Set the population hierarchy, including the proportions of leaders, followers and sentinels, and specify a danger threshold. Randomly generate an initial population in the search space to complete the initialization of the improved sparrow structural parameter optimization algorithm.

Each parameter combination within the set parameter search space is regarded as one sparrow. The positions of the individuals in the population are updated by the sparrow algorithm based on the hybrid performance fitness function, and the population hierarchy is updated accordingly. The hybrid performance fitness function is defined as follows:

After the maximum number of generations is reached, the iterative computation exits, the individual with the minimum fitness is selected as the optimal individual, and a regression random forest is constructed from the corresponding optimal structural parameter combination, which completes the offline training.
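The offline optimization loop described above can be illustrated with a minimal sketch. It is not the patented algorithm: the update rules are a simplified form of the commonly published sparrow search rules, the Lévy mutation of the improved variant is omitted, and `toy_hybrid_fitness` is a made-up stand-in for the hybrid performance fitness function (whose formula is not reproduced in this text). The names `sparrow_search`, `leader_frac`, `sentinel_frac` and `danger_threshold` are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def sparrow_search(
    fitness: Callable[[List[float]], float],
    bounds: Sequence[Tuple[float, float]],
    pop_size: int = 20,
    max_iter: int = 30,
    leader_frac: float = 0.2,
    sentinel_frac: float = 0.1,
    danger_threshold: float = 0.8,
    seed: int = 0,
) -> Tuple[List[float], float, List[float]]:
    """Minimise `fitness` over `bounds`; returns (best_params, best_fitness, history)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def decode(u):  # map unit-cube coordinates to real parameter values
        return [lo + ui * (hi - lo) for ui, (lo, hi) in zip(u, bounds)]

    def clip(u):  # keep every coordinate inside the unit cube
        return [min(1.0, max(0.0, ui)) for ui in u]

    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(decode(u)) for u in pop]
    best_fit = min(fit)
    best_u = list(pop[fit.index(best_fit)])
    history = [best_fit]  # best-so-far fitness, one entry per generation
    n_lead = max(1, int(leader_frac * pop_size))
    n_sent = max(1, int(sentinel_frac * pop_size))

    for _ in range(max_iter):
        # Rank the population: the fittest individuals act as leaders.
        order = sorted(range(pop_size), key=fit.__getitem__)
        pop = [pop[i] for i in order]
        top, worst = list(pop[0]), list(pop[-1])
        alarm = rng.random()
        for i in range(pop_size):
            u = pop[i]
            if i < n_lead and alarm < danger_threshold:
                # Leaders search locally while no danger is signalled.
                scale = math.exp(-(i + 1) / (rng.uniform(0.05, 1.0) * max_iter))
                new = [x * scale for x in u]
            elif i < n_lead:
                # Danger signalled: leaders take a random step.
                new = [x + rng.gauss(0.0, 0.1) for x in u]
            elif i > pop_size / 2:
                # Worst-ranked followers fly elsewhere.
                new = [rng.gauss(0.0, 1.0) * math.exp((w - x) / (i + 1) ** 2)
                       for x, w in zip(u, worst)]
            else:
                # Remaining followers move around the best leader.
                new = [t + abs(x - t) * rng.choice((-1.0, 1.0)) / dim
                       for x, t in zip(u, top)]
            pop[i] = clip(new)
        # Sentinels: randomly chosen individuals jump relative to the best point.
        for i in rng.sample(range(pop_size), n_sent):
            pop[i] = clip([b + rng.gauss(0.0, 1.0) * abs(x - b)
                           for x, b in zip(pop[i], best_u)])
        fit = [fitness(decode(u)) for u in pop]
        if min(fit) < best_fit:
            best_fit = min(fit)
            best_u = list(pop[fit.index(best_fit)])
        history.append(best_fit)
    return decode(best_u), best_fit, history


def toy_hybrid_fitness(params: List[float], w: float = 0.7) -> float:
    """Synthetic stand-in: weighted sum of a fake out-of-bag error and a fake training time."""
    n_trees, n_feats = round(params[0]), round(params[1])
    fake_oob_error = 0.05 + 1.0 / n_trees + 0.002 * (n_feats - 6) ** 2
    fake_train_time = n_trees / 500.0
    return w * fake_oob_error + (1.0 - w) * fake_train_time
```

In practice the fitness callable would train a random forest with the candidate structural parameters and combine its measured out-of-bag error and training time; the weighted sum used here is only one assumed way of mixing the two.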

The online part of the fault identification comprises the following steps:

after the fault mode is obtained by fault separation, selecting the regression random forest identifier of the corresponding mode, inputting the residual samples of all preceding moments from t - t_w to the current time t into the trained regression random forest identifier, and averaging the identification results of the decision trees of the random forest to obtain the identification result.
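The online identification step can be sketched as follows, under simplifying assumptions: each decision tree is modelled as a plain callable that maps a feature vector to a fault size, and the identifiers are held in a dictionary keyed by fault mode. The names `identify_fault_size` and `identifiers` are hypothetical and only illustrate the selection and averaging described above.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# A regression tree is modelled as any callable: features -> predicted fault size.
Tree = Callable[[Sequence[float]], float]


def identify_fault_size(
    identifiers: Dict[int, List[Tree]],
    fault_mode: int,
    features: Sequence[float],
) -> float:
    """Select the forest for the separated fault mode and average its trees' outputs."""
    trees = identifiers[fault_mode]
    if not trees:
        raise ValueError("identifier for this fault mode has no trees")
    return mean(tree(features) for tree in trees)


# Toy forest for fault mode 1: three hand-written stumps on the first feature.
toy_identifiers: Dict[int, List[Tree]] = {
    1: [
        lambda f: 0.2 if f[0] < 0.5 else 0.6,
        lambda f: 0.3 if f[0] < 0.4 else 0.7,
        lambda f: 0.1 if f[0] < 0.6 else 0.5,
    ]
}
```

Calling `identify_fault_size(toy_identifiers, 1, features)` returns the mean of the three stump outputs for the given feature vector.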

As can be seen from the above description, the embodiment of the present invention provides a real-time fault diagnosis method based on a sequential random forest. The method treats the fault type as a random variable, takes the proportion of the votes cast by the decision trees of the random forest for each type to the total number of votes as the probability of each fault mode type, and calculates the likelihood probability ratio of the random variable at the current time based on this probability distribution, where the probability ratio depends on the distribution of the random variable at all preceding times. The constructed sequential random forest algorithm can keep adding data during fault detection until a fault is detected, and can therefore realize real-time, adaptive, multi-sample fault diagnosis. Meanwhile, in the process of training the diagnostic model, the random forest hyperparameters are optimized with an improved sparrow algorithm: a Lévy mutation factor is introduced, and a hybrid performance fitness function based on the out-of-bag error and the training time is used to optimize the random forest structure, so that high accuracy is maintained while the data processing time of fault diagnosis is further reduced, which shortens the diagnosis delay.

Compared with the prior art, the real-time fault diagnosis method based on the sequential random forest provided by the embodiment of the invention has the following technical effects:

1. The method diagnoses faults quickly; in this embodiment the fault diagnosis delay is only one t_s.

2. The method can realize accurate fault diagnosis, can recognize tiny faults whose size is at the noise level, needs less training data, and has stronger robustness and generalization capability.

3. The fault diagnosis during the change of the command signal can be realized, and the method is suitable for fault diagnosis of a normal state variable system.

4. The fault diagnosis model based on sequential random forests is constructed fully autonomously, which reduces the subjectivity and the cost of manual parameter tuning.

Example two:

Fig. 3 is a schematic diagram of a real-time fault diagnosis system based on a sequential random forest according to an embodiment of the present invention. As shown in Fig. 3, the system includes an obtaining module 10, an extraction module 20, a fault separation module 30 and a fault identification module 40.

Specifically, the obtaining module 10 is configured to obtain a residual signal to be detected of the nonlinear system to be diagnosed.

Optionally, the obtaining module 10 is further configured to: establishing a nominal model of the nonlinear system to be diagnosed, and acquiring a nominal system residual error signal of the nominal model; acquiring an observation residual signal of a nonlinear system to be diagnosed; and obtaining a residual signal to be measured based on the nominal system residual signal and the observation residual signal.

Specifically, the residual signal to be measured is determined by the following equation: $e_l = e_s - g(e_n, t_r, t_c)$, where $e_l$ is the residual signal to be measured, $e_s$ is the observation residual signal, $e_n$ is the nominal system residual signal, $g(e_n, t_r, t_c)$ is the nominal residual injection function, and $t_r$ and $t_c$ are the command signal change time and the nominal residual injection duration, respectively.
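As an illustration of this subtraction, the sketch below computes the residual to be measured sample by sample. The form of the nominal residual injection function is not given in this passage, so the code assumes one simple possibility: the nominal residual is passed through while t_r <= t < t_r + t_c and is zero otherwise. The function names and the gating rule are assumptions, not the patent's definition.

```python
from typing import List, Sequence


def nominal_injection(
    e_n: Sequence[float], times: Sequence[float], t_r: float, t_c: float
) -> List[float]:
    """Assumed g(e_n, t_r, t_c): nominal residual inside [t_r, t_r + t_c), zero outside."""
    return [en if t_r <= t < t_r + t_c else 0.0 for en, t in zip(e_n, times)]


def residual_to_measure(
    e_s: Sequence[float],
    e_n: Sequence[float],
    times: Sequence[float],
    t_r: float,
    t_c: float,
) -> List[float]:
    """e_l = e_s - g(e_n, t_r, t_c), evaluated at every sample time."""
    if not (len(e_s) == len(e_n) == len(times)):
        raise ValueError("e_s, e_n and times must have the same length")
    g = nominal_injection(e_n, times, t_r, t_c)
    return [es - gi for es, gi in zip(e_s, g)]
```

Under this assumption, the transient caused by a command signal change appears in both the observation residual and the nominal residual and is cancelled during the injection window, while a residual caused by a fault is left in place.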

The extraction module 20 is configured to extract a time-frequency feature vector of the residual signal to be measured based on wavelet packet decomposition.
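One common way to turn a wavelet packet decomposition into a feature vector is to use the relative energy of each terminal node. The sketch below does this with the Haar wavelet in pure Python. The choice of the Haar wavelet, the decomposition depth and the energy features are all assumptions made for illustration; this passage does not specify them.

```python
import math
from typing import List, Sequence, Tuple

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar filter coefficient


def haar_step(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One Haar analysis step: returns (approximation, detail), each half as long."""
    approx = [(x[i] + x[i + 1]) * _S for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * _S for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_energies(signal: Sequence[float], levels: int = 3) -> List[float]:
    """Relative energy of the 2**levels terminal nodes of a full Haar packet tree.

    Nodes are returned in natural (filter-bank) order, not in frequency order.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes: List[List[float]] = [list(signal)]
    for _ in range(levels):
        next_nodes: List[List[float]] = []
        for node in nodes:
            # Unlike the plain wavelet transform, both branches are split again.
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    energies = [sum(v * v for v in node) for node in nodes]
    total = sum(energies) or 1.0  # avoid division by zero for an all-zero signal
    return [e / total for e in energies]
```

For a window of residual samples, `wavelet_packet_energies(window, levels=3)` yields eight non-negative values that sum to one (or all zeros for an all-zero window), which could serve as the input vector of the random forest.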

The fault separation module 30 is configured to substitute the time-frequency feature vector into the trained random forest fault separator and to perform a sequential probability ratio test on the nonlinear system to be diagnosed to obtain target fault information; the target fault information includes a target fault type and a target fault occurrence time.

Optionally, the fault separation module 30 is further configured to substitute the time-frequency feature vector into the trained random forest fault separator, and obtain probability output of each fault type corresponding to the current state according to a forest voting result; calculating sequential probability ratio statistics for each fault type based on the probability outputs; and performing sequential probability ratio test on the nonlinear system to be diagnosed based on the sequential probability ratio statistic of each fault type to obtain target fault information.

Specifically, the fault separation module 30 is further configured to calculate the sequential probability ratio statistic for each fault type by the following equation:

wherein the statistic term is the sequential probability ratio statistic for the k-th fault type occurring in the nonlinear system to be diagnosed at time $t - t_s$, $t_s$ is the truncation time interval of the residual signal to be measured, $P_k(X_t)$ is the probability that the k-th fault type occurs in the nonlinear system to be diagnosed at time $t$, and $P_0(X_t)$ is the probability that the nonlinear system to be diagnosed is normal at time $t$.
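The sketch below shows one way the vote proportions and the sequential probability ratio test could fit together. Because the equation itself is not reproduced in this text, the code assumes the classical Wald form, in which the statistic for fault type k accumulates ln(P_k(X_t) / P_0(X_t)) at every step and is compared with thresholds derived from assumed false-alarm and missed-detection rates. The additive smoothing of the vote counts, the restart at the lower threshold and all function names are likewise assumptions.

```python
import math
from typing import Iterable, List, Optional, Sequence, Tuple


def vote_probabilities(
    votes: Sequence[int], n_classes: int, smooth: float = 1.0
) -> List[float]:
    """Class probabilities as smoothed vote proportions (class 0 = normal)."""
    counts = [smooth] * n_classes  # smoothing keeps every probability above zero
    for v in votes:
        counts[v] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def sprt_thresholds(alpha: float, beta: float) -> Tuple[float, float]:
    """Wald's lower and upper log thresholds for error rates alpha and beta."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


def sequential_diagnosis(
    vote_stream: Iterable[Sequence[int]],
    n_classes: int,
    alpha: float = 0.01,
    beta: float = 0.01,
) -> Optional[Tuple[int, int]]:
    """Return (fault_type, step_index) at the first decision, or None if none is made."""
    lower, upper = sprt_thresholds(alpha, beta)
    stats = [0.0] * n_classes  # stats[0] is unused: class 0 is the normal hypothesis
    for step, votes in enumerate(vote_stream):
        p = vote_probabilities(votes, n_classes)
        for k in range(1, n_classes):
            # Accumulate the log likelihood ratio of fault k against normal.
            stats[k] += math.log(p[k] / p[0])
        decided = [k for k in range(1, n_classes) if stats[k] >= upper]
        if decided:
            return max(decided, key=lambda k: stats[k]), step
        for k in range(1, n_classes):
            if stats[k] <= lower:
                stats[k] = 0.0  # normal accepted for fault k: restart its test
    return None
```

Each element of `vote_stream` holds the class index voted by every tree for one new residual window, so the test keeps consuming data until one fault hypothesis is accepted. The returned step is the decision time; estimating the fault occurrence time from it is outside this sketch.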

The fault identification module 40 is configured to substitute the time-frequency feature vector and the target fault type into the trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type.

The invention provides a real-time fault diagnosis system based on sequential random forests. By combining a random forest fault separator with the sequential probability ratio test, the system can keep adding data during fault detection until a fault is detected, can realize real-time, adaptive, multi-sample fault diagnosis, and solves the technical problems in the prior art that real-time fault diagnosis cannot be realized and that the missed diagnosis rate and the misdiagnosis rate are high.

Optionally, Fig. 4 is a schematic diagram of another real-time fault diagnosis system based on a sequential random forest according to an embodiment of the present invention. As shown in Fig. 4, the system further includes a first training module 50 and a second training module 60.

Specifically, the first training module 50 is configured to obtain a residual sample training set of the nonlinear system to be diagnosed; the residual sample training set comprises residual sample data in a normal mode and residual sample data in a fault mode; training a preset random forest fault separator based on a residual sample training set, and optimizing a structural parameter combination of the preset random forest fault separator by using a sparrow optimization algorithm to obtain a trained random forest fault separator corresponding to the optimal structural parameter combination.

A second training module 60, configured to obtain a fault type training set of the nonlinear system to be diagnosed; the fault type training set comprises residual sample data of the nonlinear system to be diagnosed under each fault type; and training the preset regression random forest fault identifier based on the fault type training set, and optimizing the structural parameter combination of the preset regression random forest fault identifier by using a sparrow optimization algorithm to obtain the trained regression random forest fault identifier corresponding to the optimal structural parameter combination.
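The fault type training set used by the second training module could be assembled along the lines of the offline steps of the first embodiment: equally spaced fault sizes, residual windows clipped to [t_f - t_w + t_s, t_f + t_s], and the fault size as the label. The sketch below is a minimal illustration under assumptions: sample i is taken to have timestamp i * T_s, and `simulate` and `extract` are hypothetical callables standing in for the fault simulation and the feature extraction.

```python
from typing import Callable, List, Sequence, Tuple


def equally_spaced_sizes(lo: float, hi: float, n: int) -> List[float]:
    """n fault sizes from lo to hi inclusive, with a constant difference."""
    if n < 2:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def clip_window(
    residual: Sequence[float], T_s: float, t_f: float, t_w: float, t_s: float
) -> List[float]:
    """Keep samples whose timestamp i * T_s lies in [t_f - t_w + t_s, t_f + t_s]."""
    start, end = t_f - t_w + t_s, t_f + t_s
    eps = 1e-9  # tolerance for floating-point timestamps
    return [r for i, r in enumerate(residual) if start - eps <= i * T_s <= end + eps]


def build_training_set(
    simulate: Callable[[float], Sequence[float]],
    extract: Callable[[Sequence[float]], List[float]],
    sizes: Sequence[float],
    T_s: float,
    t_f: float,
    t_w: float,
    t_s: float,
) -> List[Tuple[List[float], float]]:
    """Return (feature_vector, fault_size) pairs, one per simulated fault size."""
    samples = []
    for size in sizes:
        window = clip_window(simulate(size), T_s, t_f, t_w, t_s)
        samples.append((extract(window), size))  # the label is the fault size
    return samples
```

One such training set would be built per fault type, matching the one-identifier-per-fault arrangement described for the offline part.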

The embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and when the processor executes the computer program, the steps of the method in the first embodiment are implemented.

The embodiment of the invention also provides a computer readable medium with a non-volatile program code executable by a processor, wherein the program code causes the processor to execute the method in the first embodiment.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A real-time fault diagnosis method based on sequential random forests is characterized by comprising the following steps:

acquiring a residual signal to be detected of the nonlinear system to be diagnosed;

extracting a time-frequency characteristic vector of the residual signal to be detected based on wavelet packet decomposition;

substituting the time-frequency characteristic vector into a trained random forest fault separator, and carrying out sequential probability ratio inspection on the nonlinear system to be diagnosed to obtain target fault information; the target fault information comprises a target fault type and target fault occurrence time;

substituting the time-frequency characteristic vector and the target fault type into a trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type;

substituting the time-frequency characteristic vector into a random forest fault separator after training, and carrying out sequential probability ratio inspection on the nonlinear system to be diagnosed to obtain target fault information, wherein the steps comprise:

substituting the time-frequency characteristic vector into a random forest fault separator after training, and obtaining the probability output of each fault type corresponding to the current state according to the forest voting result;

calculating sequential probability ratio statistics for each fault type based on the probability outputs;

performing sequential probability ratio test on the nonlinear system to be diagnosed based on the sequential probability ratio statistic of each fault type to obtain target fault information;

obtaining a residual signal to be detected of a nonlinear system to be diagnosed, comprising the following steps:

establishing a nominal model of the nonlinear system to be diagnosed, and acquiring a nominal system residual error signal of the nominal model;

acquiring an observation residual signal of the nonlinear system to be diagnosed;

obtaining the residual signal to be detected based on the nominal system residual signal and the observation residual signal;

the method further comprises the following steps: obtaining the residual signal to be measured based on the nominal system residual signal and the observation residual signal, including:

determining the residual signal to be measured by the following formula: $e_l = e_s - g(e_n, t_r, t_c)$; wherein $e_l$ is the residual signal to be measured, $e_s$ is the observation residual signal, $e_n$ is the nominal system residual signal, $g(e_n, t_r, t_c)$ is a nominal residual injection function, and $t_r$ and $t_c$ are respectively the command signal change time and the nominal residual injection duration;

the method further comprises the following steps:

acquiring a residual error sample training set of the nonlinear system to be diagnosed; the residual sample training set comprises residual sample data in a normal mode and residual sample data in a fault mode;

and training a preset random forest fault separator based on the residual error sample training set, and optimizing the structural parameter combination of the preset random forest fault separator by using a sparrow optimization algorithm to obtain the trained random forest fault separator corresponding to the optimal structural parameter combination.

2. The method of claim 1, further comprising:

acquiring a fault type training set of the nonlinear system to be diagnosed; the fault type training set comprises residual sample data of the nonlinear system to be diagnosed under each fault type;

and training a preset regression random forest fault identifier based on the fault type training set, and optimizing the structural parameter combination of the preset regression random forest fault identifier by using a sparrow optimization algorithm to obtain the trained regression random forest fault identifier corresponding to the optimal structural parameter combination.

3. The method of claim 1, wherein computing sequential probability ratio statistics for each fault type based on the probability outputs comprises:

calculating a sequential probability ratio statistic for each fault type by:

wherein the statistic term is the sequential probability ratio statistic for the k-th fault type occurring in the nonlinear system to be diagnosed at time $t - t_s$, $t_s$ is the truncation time interval of the residual signal to be measured, $P_k(X_t)$ is the probability that the k-th fault type occurs in the nonlinear system to be diagnosed at time $t$, and $P_0(X_t)$ is the probability that the nonlinear system to be diagnosed is normal at time $t$.

4. A real-time fault diagnosis system based on sequential random forests, characterized by comprising: an acquisition module, an extraction module, a fault separation module and a fault identification module; wherein,

the acquisition module is used for acquiring a residual signal to be detected of the nonlinear system to be diagnosed;

the extraction module is used for extracting the time-frequency characteristic vector of the residual signal to be detected based on wavelet packet decomposition;

the fault separation module is used for substituting the time-frequency characteristic vector into a trained random forest fault separator and carrying out sequential probability ratio inspection on the nonlinear system to be diagnosed to obtain target fault information; the target fault information comprises a target fault type and target fault occurrence time;

the fault identification module is used for substituting the time-frequency characteristic vector and the target fault type into a trained regression random forest fault identifier to obtain fault size information corresponding to the target fault type;

the fault separation module is also used for substituting the time-frequency characteristic vector into a trained random forest fault separator, and obtaining the probability output of each fault type corresponding to the current state according to the forest voting result; calculating sequential probability ratio statistics for each fault type based on the probability outputs; performing sequential probability ratio test on the nonlinear system to be diagnosed based on the sequential probability ratio statistic of each fault type to obtain target fault information;

the obtaining module is further configured to: establishing a nominal model of the nonlinear system to be diagnosed, and acquiring a nominal system residual error signal of the nominal model;

the obtaining module is further configured to: obtaining the residual signal to be measured based on the nominal system residual signal and the observation residual signal, including:

the system further includes a first training module to: acquiring a residual error sample training set of the nonlinear system to be diagnosed; the residual sample training set comprises residual sample data in a normal mode and residual sample data in a fault mode; and training a preset random forest fault separator based on the residual error sample training set, and optimizing the structural parameter combination of the preset random forest fault separator by using a sparrow optimization algorithm to obtain the trained random forest fault separator corresponding to the optimal structural parameter combination.

5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of any of the preceding claims 1 to 3 are implemented when the computer program is executed by the processor.

6. A computer-readable medium having non-volatile program code executable by a processor, wherein the program code causes the processor to perform the method of any of claims 1-3.