Disclosure of Invention
In view of this, the present invention provides a track slab deformation identification method based on track side vibration acceleration, which combines LMD, time domain feature extraction and time-frequency domain feature extraction, can effectively extract key information for detecting track slab deformation, quickly and accurately identify track slab deformation, and has great advantages in practical applications.
In order to achieve the purpose, the invention adopts the following technical scheme:
a rail plate deformation identification method based on rail side vibration acceleration comprises the following steps:
step S1, acquiring original rail side vibration signal data;
step S2, processing the vibration signal data of the rail side to convert the vibration signal data into a data subset with the same time length;
s3, acquiring a characteristic parameter vector matrix of the original rail side vibration signal data based on a local mean decomposition method;
s4, constructing a random forest model, and training based on the characteristic parameter vector matrix to obtain a trained random forest;
and step S5, acquiring real-time rail side vibration signal data, classifying according to the trained random forest, and completing the identification of the track slab defect information.
Further, the step S1 is specifically: and an optical fiber vibration acceleration sensor is arranged on the rail side, the optical fiber sensor is connected to a data acquisition unit through an optical cable, the data acquisition unit filters the measured data, and the analog signals are converted into digital signals through an A/D conversion circuit to obtain the original rail side vibration signal data.
Further, the step S2 is specifically: original rail side vibration signal data are converted into data subsets with the same time length through methods of data interception, wavelet threshold denoising and fixed window segmentation.
Further, the step S3 is specifically:
step S31, calculating all local extreme points N of the original signal x (t)iAnd deducing all adjacent local extreme points NiAnd Ni+1Average value of (d):
will correspond to t (N)i) And t (N)i+1) All mean value points m in betweeniConnecting with a straight line to obtain a local mean line, and smoothing the local mean line by using a sliding average method to obtain a local mean function m11(t);
Step S32: from adjacent local extrema NiAnd Ni+1Obtain the local amplitude ai:
In the same way, corresponding t (N)i) And t (N)i+1) All local amplitudes a in betweeniConnecting with a straight line to obtain a local amplitude line, and smoothing the local amplitude line by using a moving average method to obtain an envelope estimation function a11(t);
Step S33: the local mean function m11(t) separating from the original signal x (t) to obtain:
h11(t)=x(t)-m11(t)
step S34: h is to be11(t) and an envelope estimation function a11(t) phase division demodulation h11(t), obtaining:
to s11(t) repeating the steps of S31 and S32 to obtain S11Envelope estimation function a of (t)12(t);
Repeating the above iterative process until a12(t) 1, in which case s is specified11(t) is a frequency modulated signal;
let s1n(t) is obtained by n iterations, then s1nEnvelope estimation function a of (t)1(n+1)(t) satisfies a1(n+1)Where (t) is 1, the iterative process can be described as:
h11(t)=x(t)-m11(t)
h12(t)=s11(t)-m12(t)
h1n(t)=s1(n-1)(t)-m1n(t)
in the formula:
s11(t)=h12(t)/a11(t)
s12(t)=h12(t)/a12(t)
s1n(t)=h1n(t)/a1n(t)
the iteration termination condition is as follows:
step S35: multiplying the envelope estimation function obtained in the iteration process to obtain an envelope signal a1(t):
Step S36: the first PF of the original signal is derived from the envelope signal a1(t) with a pure frequency-modulated signal s1n(t) the multiplication yields:
PF1(t)=a1(t)s1n(t)
step S37: mixing PF1(t) separating from the original signal x (t) to obtain a residual signal u (t), and repeating the residual signal u (t) as the original signal k times until uk(t) is a monotonic function:
u1(t)=x(t)-PF1(t)
u2(t)=u1(t)-PF2(t)
...
uk(t)=uk-1(t)-PFk(t)
at this time, the original signal x (t) is decomposed into k component signals and 1 residual function uk(t):
Step S37: to quantify PFsCorrelation degree with original signals, introducing Pearson correlation coefficient:
in the formula: x represents raw data and Y represents PFs;
Step S38: calculating time domain characteristic parameters of the vibration signals;
step S39: calculating 1 time-frequency domain characteristic parameter of vibration signal, selecting characteristic PFsAs one of the impairment recognition features;
first, the energy E of the qth characteristic PF is calculatedq:
In the formula: l is the length of the characteristic PF;
the total energy of these valid signatures PF is calculated:
the energy entropy of the feature PF is:
in the formula:
is the percentage of the energy of the qth characteristic PF with respect to the total energy entropy;
and finally forming a characteristic parameter vector matrix through 5 time domain characteristics and 1 time-frequency domain characteristic extracted from the characteristic PF.
Further, the time-domain feature parameters include:
peak-to-peak: vp-p=max(PFq)-min(PFq)
crest factor: c ═ max (PF)q)/PFrms
In the formula: PF (particle Filter)
qRepresents the q-th characteristic PF,
the average value of the characteristic PFs is shown, and r is the number of the characteristic PFs.
Further, the step S4 is specifically:
step S41: randomly extracting m from original rail side vibration signal data in a releasing way by a Bootstrap resampling methodtreeThe self-help sample set is used as a training data set D, and data which are not obtained form an OOB data set;
step S42: for the extracted mtreeGenerating corresponding m respectively for each sample settreeClassifying and returning trees;
step S43: with t features, m is randomly drawn at each node of each treetryCalculating and selecting the characteristic with the maximum information gain as a classification characteristic by adopting an information gain principle;
step S44: the classification regression tree is not cut, and each tree is made to grow to a preset limit;
step S45: and forming a random forest by the generated trees.
Further, the step S43 is specifically:
firstly, calculating the information entropy:
wherein p isiThe proportion of each classification characteristic i in the classification system.
Defining the information gain ratio of the feature a to the training data set D as the ratio of the information gain of the feature a to the entropy of the training data set D:
compared with the prior art, the invention has the following beneficial effects:
the method can effectively capture vibration signals containing train vibration, track slab deformation, noise and environmental vibration information, can effectively eliminate the influence of factors such as environmental vibration, noise, time difference and the like on the identification effect by adopting preprocessing methods such as data interception, wavelet threshold denoising, fixed window segmentation and the like, can effectively extract key information for detecting track slab deformation by using the time domain and time frequency domain feature extraction method based on LMD, and can accurately identify the deformation of the track slab through an intelligent identification algorithm of a random forest model.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides a track slab deformation identification method based on track side vibration acceleration, including the following steps:
step S1: and (5) acquiring a rail side vibration signal. An optical fiber vibration acceleration sensor is arranged on the side of the track, the optical fiber sensor is connected to a data acquisition unit through an optical cable, the data acquisition unit filters measured data, analog signals are converted into digital signals through an A/D conversion circuit, and finally an embedded system of the acquisition unit sends the acquired vibration data to an upper computer through a wireless network;
step S2: and preprocessing vibration data. Converting original data into a data subset with the same time length by methods of data interception, wavelet threshold denoising and fixed window segmentation;
step S3: and extracting time domain and time-frequency domain characteristics. Separating the frequency modulation signal and the envelope signal from the original signal by a Local Mean Decomposition (LMD) method, and multiplying the signals to obtain an instantaneous frequency with the physical meaning of a Product Function (PFs); iterate until all Product Functions (PF) are separated. Finally, obtaining the time-frequency distribution of the original signal;
step S4: and establishing a random forest model architecture to identify the track slab defect information.
Preferably, in this embodiment, step S2 specifically includes:
and step S21, when the train passes through the sensor, the information captured by the sensor is the wheel track vibration signal containing the track slab deformation information, so that the original data needs to be intercepted.
Defining a vibration acceleration threshold delta, wherein the vibration acceleration threshold delta is set to be 0.01g by comparing vibration acceleration values acquired by a sensor when a train does not pass through and passes through;
when the vibration acceleration x (t) at the time t is larger than or equal to delta, recording the x (t) at the time. Recording x (t +1), x (t +2) and x (t +3) until x (t + n-1) is more than or equal to delta, x (t + n) is less than or equal to delta, and x (t + n +1) is less than delta;
and finally obtaining a vibration acceleration data set with the environment vibration information eliminated: x ═ X (t), X (t +1), X (t +2),. X (t + n-1) ].
Step S22: and denoising the new vibration acceleration set. And removing high-frequency noise in the vibration acceleration data set by using a wavelet decomposition method. Five-layer wavelet threshold denoising is carried out on the new vibration acceleration set by adopting a 'sym 3' wavelet function.
Step S23: and carrying out fixed window segmentation on the denoised signal. The duration of the vibration signal is different when trains of different speeds pass the sensor. In order to eliminate the influence of the time difference on the identification effect, a fixed window is adopted to carry out data segmentation on the de-noised signal.
In this embodiment, it is preferable to make the value of the fixed window length equal to the value of the sampling frequency: l ═ f, and the duration is set to 1 s. If the duration of a piece of data is 6s and 4s, respectively, it can be divided into 6 and 4 data subsets, respectively.
In this embodiment, step S3 specifically includes the following steps:
step S31, first calculate all local extreme points N of the original signal x (t)iAnd deducing all adjacent local extreme points NiAnd Ni+1Average value of (d):
will correspond to t (N)i) And t (N)i+1) All mean value points m in betweeniConnecting with a straight line to obtain a local mean line, and smoothing the local mean line by using a sliding average method to obtain a local mean function m11(t)。
Step S32: from adjacent local extrema NiAnd Ni+1Obtain the local amplitude ai:
In the same way, corresponding t (N)i) And t (N)i+1) All local amplitudes a in betweeniConnecting with a straight line to obtain a local amplitude line, and smoothing the local amplitude line by using a moving average method to obtain an envelope estimation function a11(t)。
Step S33: the local mean function m11(t) from the original signal x (t), we can obtain:
h11(t)=x(t)-m11(t)
step S34: h is to be11(t) and an envelope estimation function a11(t) phase division demodulation h11(t), it is possible to obtain:
to s11(t) repeating the steps of S31 and S32 to obtain S11Envelope estimation function a of (t)12(t) of (d). Repeating the above iterative process until a12(t) 1, in which case s is specified11(t) is a frequency modulated signal. Suppose s1n(t) is obtained by n iterations, then s1nEnvelope estimation function a of (t)1(n+1)(t) satisfies a1(n+1)Where (t) is 1, the iterative process can be described as:
h11(t)=x(t)-m11(t)
h12(t)=s11(t)-m12(t)
h1n(t)=s1(n-1)(t)-m1n(t)
in the formula:
s11(t)=h12(t)/a11(t)
s12(t)=h12(t)/a12(t)
s1n(t)=h1n(t)/a1n(t)
the iteration termination condition is as follows:
step S35: multiplying the envelope estimation function obtained in the iteration process to obtain an envelope signal a1(t):
Step S36: the first PF of the original signal is derived from the envelope signal a1(t) with a pure frequency-modulated signal s1n(t) the multiplication yields:
PF1(t)=a1(t)s1n(t)
step S37: mixing PF1(t) separating from the original signal x (t) to obtain a residual signal u (t), and repeating the residual signal u (t) as the original signal k times until uk(t) is a monotonic function:
u1(t)=x(t)-PF1(t)
u2(t)=u1(t)-PF2(t)
...
uk(t)=uk-1(t)-PFk(t)
at this time, the original signal x (t) is decomposed into k component signals and 1 residual function uk(t):
Step S37: to quantify PFsCorrelation degree with original signals, introducing Pearson correlation coefficient:
in the formula: x represents raw data and Y represents PFs。
The Pearson correlation coefficient tends to-1 or 1 when the correlation of the two variables increases, and tends to 0 when the correlation of the two variables decreases. If the correlation coefficient of the PF component with the original data is less than or equal to 0.2, the PF component is discarded and the remaining PF is called a feature PF.
Step S38: 5 time-domain characteristic parameters of the vibration signal are calculated.
Peak-to-peak: vp-p=max(PFq)-min(PFq)
crest factor: c ═ max (PF)q)/PFrms
In the formula: PF (particle Filter)
qRepresents the q-th characteristic PF,
the average value of the characteristic PFs is shown, and r is the number of the characteristic PFs.
Step S39: calculating 1 time-frequency domain characteristic parameter of vibration signal, selecting characteristic PFsAs one of the impairment recognition features.
First, the energy E of the qth characteristic PF is calculatedq:
In the formula: l is the length of the characteristic PF.
The total energy of these valid signatures PF is calculated:
the energy entropy of the feature PF is:
in the formula:
is the percentage of the energy of the qth characteristic PF with respect to the total energy entropy, and
and finally forming a characteristic parameter vector matrix through 5 time domain characteristics and 1 time-frequency domain characteristic extracted from the characteristic PF. Thus, 6 features can be extracted from the dataset.
In this embodiment, step S4 specifically includes the following steps:
step S41: from the original training dataset, m is randomly drawn with a back-put by the Bootstrap resampling methodtreeAnd (4) forming an OOB data set by using the self-help sample set as a training data set D and the data which is not obtained.
Step S42: for the extracted mtreeGenerating corresponding m respectively for each sample settreeA classification regression tree. In the invention, m is taken according to classification precisiontreeIs 1000.
Step S43: with t features, m is randomly drawn at each node of each treetryA characteristic
In general, the splitting feature
Preferably, in this embodiment, t is 6, so m is set
tryBy calculating the amount of information each feature contains, at m, 3
tryAnd selecting one feature with the most classification capability from the features to perform node splitting.
And calculating and selecting the feature with the maximum information gain as the classification feature by adopting an information gain principle.
Firstly, calculating the information entropy:
wherein p isiThe proportion of each classification characteristic i in the classification system.
Defining the information gain ratio of the feature a to the training data set D as the ratio of the information gain of the feature a to the entropy of the training data set D:
step S44: and (4) not cutting the classification regression tree, and enabling each tree to grow to the maximum. The maximum limit here means that there is only one sample point in all child nodes.
Step S45: and forming a random forest by the generated trees, classifying the data of the test set by using the random forest, determining the classification result according to the voting amount of the tree classifier, and finally finishing the identification of the track slab defect information.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.