CN113569928B - Train running state detection data missing processing model and reconstruction method - Google Patents


Info

Publication number
CN113569928B
Authority
CN
China
Prior art keywords
data
missing
module
network
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110792198.1A
Other languages
Chinese (zh)
Other versions
CN113569928A (en)
Inventor
张昌凡
陈泓润
何静
曹源
杨皓楠
徐逸夫
印玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University of Technology
Original Assignee
Hunan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University of Technology filed Critical Hunan University of Technology
Priority to CN202110792198.1A priority Critical patent/CN113569928B/en
Publication of CN113569928A publication Critical patent/CN113569928A/en
Application granted granted Critical
Publication of CN113569928B publication Critical patent/CN113569928B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods


Abstract

The invention discloses a train running state detection data missing processing model and a reconstruction method, in which a new variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) is constructed to reconstruct missing data. First, a GRU module is introduced into the encoder to fuse the bottom-layer and high-layer features of the data, so that the VAE-FGAN learns the correlation between measured data in an unsupervised training mode. Second, an SE-NET attention mechanism is introduced into the whole generation network to strengthen the feature extraction network's expression of data features. Finally, parameter sharing is achieved through transfer learning and pre-training. The invention not only maintains high reconstruction accuracy but also conforms well to the distribution law of the measurement data, solving the prior-art problems of poor model generalization and unstable training when some fault operation data are scarce or absent during high-speed train operation.

Description

Train running state detection data missing processing model and reconstruction method
Technical Field
The invention relates to the technical field of missing-data reconstruction, and in particular to a train running state detection data missing processing model and reconstruction method.
Background
High-speed trains play an important role in the transportation system, and the safety of their operation permits no error. High-speed train lines often pass through complex environments such as mountain areas and tunnels, where network faults, transmission interruptions and harmonic interference can occur, leaving large gaps in the monitored data. Fault-related feature information of the missing data segments cannot then be obtained, and later multi-source information fusion incurs larger errors, which hinders fault diagnosis. Traditional methods such as the EM algorithm and the KNN algorithm cannot adequately model the correlations between complex data features and between different devices in a high-speed train.
In recent years, generative adversarial networks have become quite popular. Application 202011072927.8 discloses a method for reconstructing missing high-speed train measurement data based on a generative adversarial network, but when such models are trained on discrete high-speed train data it is difficult to guarantee that samples following the original data distribution are generated from random noise, Nash equilibrium is hard to reach, and gradients vanish. Moreover, deep learning techniques rely on large-scale, high-quality, complete data to train deep network structures. Because little useful data is produced during the actual operation of a high-speed train, the generalization ability of a deep learning model is hard to guarantee; transfer learning, however, allows features and parameters learned on different data to be shared, which effectively alleviates the difficulty of training a deep network model on small-sample high-speed train operation data.
Disclosure of Invention
Aiming at the defects that during high-speed train operation some fault operation data are very scarce, operating-dimension measurement data are lost under complex and changeable working conditions, and existing generative models used for reconstructing small-sample data show poor generalization and unstable training, the invention provides a high-speed train measurement missing-data processing model based on a transfer-generated adversarial network under small-sample data, which solves the problem of inaccurate reconstruction caused by extensive missing values in small-sample high-speed train measurements.
A further technical problem solved by the invention is to provide a reconstruction method for missing train running state detection data.
The aim of the invention is realized by the following technical scheme:
The train running state detection data missing processing model comprises a data acquisition module, a data preprocessing module, a variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) module, a transfer learning parameter sharing module and a data missing part reconstruction module. The data acquisition module transmits data to the data preprocessing module; the data preprocessing module transmits the processed data to the VAE-FGAN module; the VAE-FGAN module generates samples and transmits the data to the transfer learning parameter sharing module to obtain the missing data; and the transfer learning parameter sharing module transmits the obtained missing data to the data missing part reconstruction module for reasonable interpolation, outputting a complete data result.
Further, the data acquisition module comprises one or more of a current sensor, a voltage sensor, a temperature sensor, a humidity sensor, a displacement sensor and an electrical frequency sensor.
Further, the variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) comprises an encoder E, a generator G and a discriminator D; the encoder E captures data feature information, the generator G generates new samples, and the discriminator D judges the authenticity of the generated data while classifying it.
Furthermore, a GRU network module is introduced into the encoder E; since the GRU network has unique advantages in learning data features, this improves the encoder E's ability to acquire the deep semantics of the data and thus the quality of the generated data.
Further, the encoder E and the generator G are each provided with an SE-NET attention mechanism, which uses a weight to represent the importance of each channel to the next stage.
Further, the SE-NET attention mechanism comprises an SE module, a Squeeze operation, an Excitation operation and feature fusion. After weight distribution and the Squeeze operation are applied to each channel, the network obtains a global description; the Excitation operation and feature fusion allow the fully connected layer to fuse all input feature information well, and the Sigmoid function maps the input cleanly to the 0-1 interval.
According to the above high-speed train running state detection data missing processing model, a method for reconstructing missing high-speed train measurement data with a transfer-generated adversarial network under small-sample data is provided, comprising the following steps:
s1, collecting a high-speed train operation and maintenance data set, and preprocessing the collected discrete data.
S2, learning the inter-data correlation features with the variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN):
the VAE-FGAN learns the feature distribution of the input data through coding and reconstruction. During training, the encoder E extracts and compresses the features of the samples in the complete data set and codes them through a linear network to a latent space z, where z carries the important latent features of the captured data. New samples are generated by the generator G from the description of the latent variable z; using variational inference, the distance between the two distributions is computed so that the posterior distribution continually approaches the expected distribution, with the KL divergence selected as part of the loss function.
S3, constructing a parameter sharing model by transfer learning, and generating the data of the missing part of the small-sample feature data.
S4, interpolating the missing part of the data and outputting a complete data result.
Further, the operation and maintenance data set in step S1 includes one or more of ac voltage, dc voltage, monitored output current, temperature of the collecting device, oil level, humidity of the collecting device, and power frequency of the receiver.
Further, the data preprocessing in step S1 includes space-time correction, registration and data dimension lifting; after the collected discrete high-speed train measurement data are segmented, they are mapped in high dimension into a 2-D grid matrix form.
Further, the transfer learning model adopts the variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) as the basic network structure; a generator G_p is pre-trained using sample data, and the trained parameters are migrated to the generator G_m in the main training network, which is fine-tuned using data from a small number of samples.
Further, the sample data size ratio of pre-training to main training is 5-20:1; preferably, the ratio is 10:1.
Further, in the interpolation of the missing data portion in step S4, the part used to interpolate the missing data is determined by the context similarity and the defined KL divergence, so that the final output measurement result is as close as possible to the real part.
Compared with the prior art, the invention has the following beneficial effects:
under this missing-data reconstruction method, a new variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) is constructed for reconstructing missing data, and the GRU semantic fusion module is applied inside it to fuse the bottom-layer and high-layer features of the data, effectively improving the model's reconstruction accuracy; an SE-NET attention mechanism is introduced into the whole generation network to strengthen the feature extraction network's expression of data features; finally, parameter sharing is achieved through transfer learning and pre-training. The transfer-generated adversarial network model can learn the relevant characteristics of the data from small-sample measurement data, maintains high reconstruction accuracy under different missing rates, and conforms well to the distribution law of the measurement data.
Drawings
FIG. 1 is the missing-data reconstruction framework of the variational autoencoding-generative adversarial network under small-sample data;
FIG. 2 shows the network structure of the encoder, decoder and discriminator, where k denotes the convolution kernel size, c the number of channels, and h the number of hidden layers of the gated recurrent network GRU;
FIG. 3 is a schematic diagram of a GRU-based semantic fusion module, which is a core part of the invention, and is used for highly integrating bottom-layer feature information and high-layer feature information;
FIG. 4 shows the SE-NET attention mechanism added in the encoder and generator;
FIG. 5 shows the reconstruction effect for missing data at specific positions; a reconstruction result very close to the normal value is obtained by exploiting the context feature relation of the measurement data;
FIG. 6 is a view of the missing-data reconstruction of the invention, from which the distribution characteristics of the reconstructed data and the real data can be effectively observed.
Detailed Description
The present invention is further illustrated and described below with reference to examples, which are not intended to be limiting in any way. Unless otherwise indicated, the methods and apparatus used in the examples were conventional in the art and the starting materials used were all conventional commercially available.
Example 1
The embodiment provides a high-speed train running state detection data missing processing model.
Referring to fig. 1, the high-speed train running state detection data missing processing model comprises a data acquisition module, a data preprocessing module, a variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) module, a transfer learning parameter sharing module and a data missing part reconstruction module.
The data acquisition module comprises one or more of a current sensor, a voltage sensor, a temperature sensor, a humidity sensor, a displacement sensor and an electric frequency sensor.
The preprocessing module comprises multi-dimensional data space-time correction and registration, and realizes data dimension increase through mapping to a high dimension.
The variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN) module comprises the encoder E of the VAE and a generator G; the encoder E, the generator G and a discriminator D form the VAE-GAN backbone network structure, an SE-NET attention mechanism is added to the encoder E and the generator G respectively, and the encoder E is combined with a GRU network model to obtain the encoder semantic fusion structure.
As shown in fig. 2, the activation functions of the VAE's encoder E, generator G and discriminator D are ReLU functions; to improve the discrimination performance of the discriminator, its activation function differs from that of the other convolution layers, and the LeakyReLU function is selected.
As in fig. 3, the GRU module involves three quantities: the output h_{t-1} at the previous moment, the input x_t at the current moment, and the output h_t at the current moment. In the formulas, z_t is the update gate and r_t the reset gate; the larger the value of the update gate, the more information from the previous moment is brought in. The relations are:
z_t = σ(W_z·[h_{t-1}, x_t])
r_t = σ(W_r·[h_{t-1}, x_t])
where σ is the sigmoid function, by which data are converted into values in the range 0-1, and W_z, W_r represent the weights of the current state.
Deep features between data are extracted by exploiting the GRU's ability to learn from context data. The data of the first layer serves as the output of the GRU module's previous step, the data of the second layer serves as the input of the GRU module's current step, and useful data information is retained and then output; the output data then undergoes feature semantic fusion with the data of Layer 1 and Layer 2 and is passed to the next GRU module. This structural design integrates the correlation between data, fully combines bottom-layer feature information with high-layer features, and improves the authenticity of the missing data reconstructed by the whole model.
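The two gate equations above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the patented disclosure; the weight matrices and dimensions are random placeholders, not trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_gates(h_prev, x_t, W_z, W_r):
    # Update gate z_t and reset gate r_t computed from the
    # concatenation [h_{t-1}, x_t], matching the equations above.
    hx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ hx)
    r_t = sigmoid(W_r @ hx)
    return z_t, r_t

rng = np.random.default_rng(0)
h_prev, x_t = rng.normal(size=4), rng.normal(size=3)
W_z, W_r = rng.normal(size=(4, 7)), rng.normal(size=(4, 7))
z_t, r_t = gru_gates(h_prev, x_t, W_z, W_r)
```

Because both gates pass through the sigmoid, their activations always lie strictly between 0 and 1, which is what lets the update gate act as a soft switch on information from the previous moment.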
As shown in FIG. 4, the SE-NET attention mechanism mainly comprises an SE module, a Squeeze operation, an Excitation operation and feature fusion. After weight distribution and the Squeeze operation are applied to each channel, the network obtains a global description; the Excitation operation and feature fusion allow the fully connected layer to fuse all input feature information well, and the Sigmoid function maps the input cleanly to the 0-1 interval.
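The Squeeze and Excitation steps just described can be sketched numerically as follows. This is a minimal illustration and not the patented implementation; the reduction ratio and all weights are assumed placeholders.

```python
import numpy as np

def se_block(features, w1, w2):
    # Squeeze: global average pool gives one descriptor per channel.
    # Excitation: two small fully connected layers ending in a sigmoid;
    # the resulting 0-1 weights rescale each channel of the input.
    squeezed = features.mean(axis=(1, 2))            # global description, shape (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid -> values in (0, 1)
    return features * weights[:, None, None]

rng = np.random.default_rng(1)
fmap = rng.normal(size=(8, 4, 4))   # (channels, height, width)
w1 = rng.normal(size=(2, 8))        # reduction ratio 4 (assumed)
w2 = rng.normal(size=(8, 2))
out = se_block(fmap, w1, w2)
```

Since every channel weight is in (0, 1), the block can only attenuate channels, never amplify them, which is how it expresses per-channel importance.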
Example 2
This embodiment provides, according to the high-speed train running state detection data missing processing model of embodiment 1, a method for reconstructing missing high-speed train measurement data with a transfer-generated adversarial network under small-sample data, comprising the following steps:
s1, acquiring a high-speed train operation and maintenance data set through a data acquisition module, and preprocessing acquired discrete data:
the preprocessing comprises space-time correction, registration and data dimension lifting processes, and the acquired discrete measurement data of the high-speed train is segmented and intercepted and then mapped into a 2-D grid matrix form in a high-dimension mode.
S2, learning the inter-data correlation features with the variational autoencoding-generative adversarial semantic fusion network (VAE-FGAN):
the VAE-FGAN learns the feature distribution of the input data through coding and reconstruction. During training, the encoder E extracts and compresses the features of the samples in the complete data set and codes them through a linear network to a latent space z, where z carries the important latent features of the captured data. New samples are generated by the generator G from the description of the latent variable z; using variational inference, the distance between the two distributions is computed so that the posterior distribution continually approaches the expected distribution, with the KL divergence selected as part of the loss function.
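For a Gaussian approximate posterior and a standard normal prior, the KL term used in the loss has a well-known closed form. The sketch below is illustrative only; the latent dimension of 8 is an assumption, not a value from the disclosure.

```python
import numpy as np

def kl_divergence_gaussian(mu, log_var):
    # Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over latent
    # dimensions; this is the standard VAE regularizer on the latent space z.
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

# When the approximate posterior equals the prior (mu = 0, sigma = 1),
# the KL term vanishes; any deviation makes it positive.
kl_at_prior = kl_divergence_gaussian(np.zeros(8), np.zeros(8))
kl_shifted = kl_divergence_gaussian(np.ones(8), np.zeros(8))
```

Minimizing this term is what drives the posterior distribution toward the expected distribution, as described above.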
S3, constructing a parameter sharing model by transfer learning to generate the data of the missing part of the small-sample feature data: a generator G_p is pre-trained using sample data, and the trained parameters are migrated to the generator G_m in the main training network, which is fine-tuned using data from a small number of samples. In the transfer learning, one VAE-FGAN is trained first, its parameters are transferred to the other VAE-FGAN by parameter freezing, and a small amount of data is used for fine-tuning; the two VAE-FGAN structures are identical, but the data they are trained on differ.
The sample data size ratio of pre-training to main training is preferably 10:1.
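The parameter migration and freezing step can be sketched as copying parameter groups and marking which remain trainable. This is a schematic illustration only; the group names ("encoder", "generator", "discriminator") and the choice of frozen groups are assumptions for the sketch, not prescriptions from the disclosure.

```python
def transfer_parameters(pretrained, target, freeze=("encoder", "discriminator")):
    # Copy all pre-trained parameter groups into the main-training network
    # (parameter sharing), then report which groups stay trainable so that
    # only those are updated during small-sample fine-tuning.
    for name, params in pretrained.items():
        target[name] = dict(params)
    return [name for name in target if name not in freeze]

pretrained = {"encoder": {"w": 1.0}, "generator": {"w": 2.0}, "discriminator": {"w": 3.0}}
target = {}
trainable = transfer_parameters(pretrained, target)
```

The frozen groups keep the features learned on the large pre-training set, while fine-tuning adapts only the remaining parameters to the small sample.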
S4, reasonably interpolating the missing part of the data and outputting a complete data result:
in the interpolation of the missing data part, a binary mask matrix M with the same dimensions as the measurement data input to the model is first established to describe the feature data containing missing samples; for the missing part of a data sample, the corresponding elements of the mask matrix M are 0, and for the complete part they are 1. The Hadamard product of the measurement data X and M is then computed, and different degrees of data missing are expressed through this matrix operation.
Secondly, to ensure that the non-missing parts remain unchanged and the reconstructed data resembles the original measured data, a context-constrained similarity L_r is defined. It drives the generator to keep producing data that best matches the complete part and guarantees that the reconstructed data has a context relation consistent with the complete data:
L_r(z) = ‖X ⊙ M − G(z) ⊙ M‖_2
where L_r(z) is the context-constrained similarity loss function; X is the measurement data containing missing values; G(z) is the generated data sample; M is the binary mask matrix; ⊙ denotes the Hadamard product. Note that only the non-missing part of the data enters the calculation.
The discriminator loss is used to ensure that the reconstructed data is as close to real data as possible; the loss L_d is defined as:
L_d(z) = −D(G(z))
where G(z) denotes the data generated by the generator, and D denotes the discriminator network output, i.e. the KL distance between the reconstructed generated sample and the real sample.
In summary, the loss function for reconstructing missing data consists of the similarity loss and the discriminator loss:
L(z) = L_r(z) + λL_d(z)
where L(z) is the loss function for the data reconstruction, L_r(z) is the similarity loss, and λ is a hyper-parameter.
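The combined loss above can be sketched directly from its two terms. This is an illustrative sketch; the value of λ and the stand-in discriminator score are assumptions, not values from the disclosure.

```python
import numpy as np

def reconstruction_loss(x, g_z, mask, d_of_gz, lam=0.1):
    # L(z) = L_r(z) + lambda * L_d(z):
    #   L_r(z) = || X (.) M - G(z) (.) M ||_2   (context similarity,
    #            computed only on the observed entries)
    #   L_d(z) = -D(G(z))                       (discriminator term)
    l_r = np.linalg.norm(x * mask - g_z * mask)
    l_d = -d_of_gz
    return l_r + lam * l_d

x = np.array([1.0, 2.0, 3.0])
mask = np.array([1.0, 1.0, 0.0])     # third entry is missing
perfect = reconstruction_loss(x, np.array([1.0, 2.0, 99.0]), mask, 0.0)
imperfect = reconstruction_loss(x, np.zeros(3), mask, 0.0)
```

Note that the generated value at the missing position does not affect L_r at all; only the discriminator term constrains it.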
Example 3
In this embodiment, 32 days of equipment operation and maintenance data from a certain high-speed train are used, taking five features of the same device (the maximum, minimum and average of the locating AC voltage, and the maximum and minimum of the locating DC voltage); choosing the same device strengthens the correlation of the data features. To verify the model's interpolation of missing data, the sample data sequence is shuffled; the pre-training model is trained on the three AC-voltage feature samples (maximum, minimum and average), and the main training model is fine-tuned and interpolation-verified on the DC maximum and minimum. After the main training model is migrated via the pre-training model's parameters, 100 sample records are used for parameter fine-tuning, and random missing and interpolation of different degrees are applied to the remaining 250 sample records in the main training model to check the model's generalization under small-sample data. Note that the data used to train the model must be complete data without missing values, and the fine-tuning data and the verification data must not overlap, in order to evaluate the model's ability to interpolate missing data.
The hardware environment is an Intel(R) Xeon(R) E-2124G CPU at 3.41 GHz and an NVIDIA GeForce GTX 1660 GPU; the platform versions are Python 3.7.7 and torch 1.4.0.
In the experiment, the learning rate of the pre-trained encoder E and discriminator D is set to 0.0001, the learning rate of the generator G to 0.00002, and the learning rate of the main training model to 0.00002.
In the missing-data reconstruction evaluation, two indexes, the mean absolute error (MAE) and the mean absolute percentage error (MAPE), are adopted to evaluate the model's reconstruction effect, calculated as:
MAE = (1/n) Σ_i |x_i − x̂_i|,  MAPE = (100%/n) Σ_i |x_i − x̂_i| / |x_i|
where n is the number of data points, x_i denotes the original high-speed train measurement data, and x̂_i denotes the complete data after reconstruction. The change of these two indexes determines the reconstruction effect of the missing data: the smaller the values of MAE and MAPE, the better the reconstruction.
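The two evaluation indexes can be computed as follows. The example values are illustrative only.

```python
import numpy as np

def mae(x, x_hat):
    # Mean absolute error between original and reconstructed data.
    return np.mean(np.abs(x - x_hat))

def mape(x, x_hat):
    # Mean absolute percentage error, in percent; assumes no zero
    # entries in the original measurement data x.
    return 100.0 * np.mean(np.abs((x - x_hat) / x))

x = np.array([100.0, 200.0, 400.0])
x_hat = np.array([110.0, 190.0, 400.0])
err_mae = mae(x, x_hat)     # (10 + 10 + 0) / 3
err_mape = mape(x, x_hat)   # (10% + 5% + 0%) / 3
```

Because MAPE normalizes each error by the true value, it stays comparable across features with different magnitudes, which is useful when mixing AC and DC voltage features.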
As described above, the data used in the experiment are all complete high-speed train operation and maintenance data, used to verify whether the model fits the data-missing situations a high-speed train faces in a complex operating environment. Therefore, the experiment applies the Hadamard product of a binary mask matrix and the complete data to represent missing data. Considering that the positions of missing measurement data during actual high-speed train operation are uncertain and uncontrollable, the generated mask matrix is set randomly, where 1 represents complete and 0 represents missing; the number of missing measurements is controlled by the number of 0 entries in the mask matrix. For example, when the experiment uses 250 samples to achieve a 20% missing rate, the missing count of the randomly generated mask fluctuates slightly around 50 sampling points.
As shown in fig. 5, assuming that the measurements numbered 2, 5, 8, 9 and 14 in the high-speed train system are all missing due to a communication failure, the model rebuilds the data in this context, as shown in fig. 6, based on the context characteristics and prior knowledge of the data. Under this specific missing-data condition, the VAE-FGAN model obtains a reconstruction result very close to the normal value by measuring the context feature relation of the data.
As shown in FIG. 6, 250 samples are taken for each feature with the model of the invention, of which 50 are randomly missing. The degree of difference between the reconstructed data and the original data is reflected by the coincidence of the curve and the points: when the error between them is zero, the curve and the points coincide. The reconstruction of the maximum and minimum of the locating DC voltage follows the same distribution-change law as the original data; the reconstructed minimum of the locating DC voltage fits the original measured data closely, with very small error at the extreme values. By learning the feature law among the data of the same device, the VAE-GAN semantic fusion model provided by the invention largely restores the original data information distribution and achieves high reconstruction accuracy.
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art. It is not necessary here nor is it exhaustive of all embodiments. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the invention are desired to be protected by the following claims.

Claims (7)

1. The train running state detection data missing processing device is characterized by comprising a data acquisition module, a data preprocessing module, a variation self-coding-generation countermeasure semantic fusion network module, a transfer learning parameter sharing module and a data missing part reconstruction module; the data acquisition module transmits the data to the data preprocessing module, the data preprocessing module transmits the processed data to the variation self-coding-generation countermeasure semantic fusion network module, the variation self-coding-generation countermeasure semantic fusion network module generates a sample and transmits the data to the transfer learning parameter sharing module to obtain missing data, and the transfer learning parameter sharing module transmits the obtained missing data to the data missing part reconstruction module to carry out reasonable interpolation and output a complete data result;
the variation self-coding-generating countermeasure semantic fusion network module comprises a coder E and a generator G of the VAE, wherein the coder E, the generator G and a discriminator D form a VAE-GAN backbone network structure, a attention mechanism SE-NET is respectively added into the coder E and the generator G, and the coder E is combined with a GRU network model to obtain a coder semantic fusion structure;
the activation functions of the VAE's encoder E, generator G and discriminator D are ReLU functions; to improve the discrimination performance of the discriminator, its activation function differs from that of the other convolution layers, and the LeakyReLU function is selected;
the GRU module includes three quantities: output at last momentInput +.>Output at present moment->?>And->The more the value of the update gate is, the more information is brought in at the last moment, the relation is as follows:
wherein σ is the sigmoid function, by which data are converted into values in the range 0-1, and W_z, W_r represent the weights of the current state;
deep features among data are extracted by utilizing the GRU's ability to learn from context data: the data of the first layer serves as the output of the GRU module's previous step, the data of the second layer serves as the input of the GRU module's current step, and useful data information is retained and then output; the output data then undergoes feature semantic fusion with the data of Layer 1 and Layer 2 and is output to the next GRU module.
2. The train operation state detection data loss processing device according to claim 1, wherein the data acquisition module includes one or more of a current sensor, a voltage sensor, a temperature sensor, a humidity sensor, a displacement sensor, and an electrical frequency sensor.
3. A method for reconstructing train running state detection missing data, characterized in that the train running state detection missing-data processing device according to claim 1 or 2 is used, the method comprising the following steps:
S1, collecting a high-speed train operation and maintenance data set, and preprocessing the collected discrete data;
S2, learning the correlation characteristics among the data by using the variational auto-encoding generative adversarial semantic fusion network:
the variational auto-encoding generative adversarial semantic fusion network learns the feature distribution of the input data through encoding and reconstruction; during training, the encoder E extracts and compresses the features of the samples in the complete data set and encodes the samples into a latent space z through a linear network, where z captures the important latent features of the data; the generator G generates new samples according to the description of the latent variable z; variational inference is used to measure the distance between the two distributions, with the KL divergence selected as part of the loss function so that the posterior distribution continuously approaches the desired distribution;
S3, constructing a parameter sharing model by using a transfer learning model, and generating the data of the missing part of the small-sample characteristic data;
S4, interpolating the data of the missing part and outputting a complete data result.
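The KL-divergence term mentioned in step S2 has a well-known closed form when both the approximate posterior and the prior are diagonal Gaussians. The sketch below assumes a standard-normal prior, which is the usual VAE choice but is not stated in the patent text:

```python
import numpy as np

def kl_divergence_gaussian(mu, log_var):
    # KL( N(mu, exp(log_var)) || N(0, I) ), summed over latent dimensions:
    # 0.5 * sum(mu^2 + exp(log_var) - log_var - 1)
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)

# a posterior identical to the prior has zero divergence
print(kl_divergence_gaussian(np.zeros(8), np.zeros(8)))  # 0.0

# any other posterior is penalised, pulling it toward the prior
mu = np.array([0.5, -1.0])
log_var = np.array([0.2, -0.3])
print(kl_divergence_gaussian(mu, log_var) > 0.0)
```

Minimising this term alongside the reconstruction loss is what drives the encoder's posterior toward the desired distribution during training.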
4. The method for reconstructing train running state detection missing data according to claim 3, wherein the operation and maintenance data set in step S1 includes one or more of AC voltage, DC voltage, monitored output current, temperature of the acquisition device, oil level, humidity of the acquisition device, and power supply frequency of the receiver.
5. The method for reconstructing train running state detection missing data according to claim 3, wherein the data preprocessing in step S1 comprises space-time correction, registration and data dimension raising: the collected discrete measurement data of the high-speed train are segmented and intercepted, and then mapped to a higher dimension to form a 2-D grid matrix.
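A minimal sketch of the preprocessing idea in claim 5: a discrete 1-D measurement sequence is segmented and intercepted, then stacked into a 2-D grid matrix. The segment length and array sizes here are hypothetical, chosen only for illustration:

```python
import numpy as np

def to_grid_matrix(signal, segment_len):
    # segment and intercept the 1-D measurement sequence, discarding
    # the incomplete tail, then stack segments into a 2-D grid matrix
    n_segments = len(signal) // segment_len
    trimmed = signal[: n_segments * segment_len]
    return trimmed.reshape(n_segments, segment_len)

# hypothetical discrete measurements from one sensor channel
signal = np.arange(10.0)                 # 10 samples
grid = to_grid_matrix(signal, segment_len=4)
print(grid.shape)  # (2, 4): the last 2 samples are intercepted away
```

The resulting 2-D grid can then be fed to the convolutional encoder in the same way an image would be.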
6. The method for reconstructing train running state detection missing data according to claim 3, wherein the transfer learning model adopts the variational auto-encoding generative adversarial semantic fusion network as the basic network structure; a generator is pre-trained using the sample data, the well-trained parameters are migrated to the generator in the main training network, and fine tuning is performed using the data of a small number of samples.
7. The method for reconstructing train running state detection missing data according to claim 6, wherein the ratio of the sample data amount used for pre-training to that used for main training is 5-20:1.
CN202110792198.1A 2021-07-13 2021-07-13 Train running state detection data missing processing model and reconstruction method Active CN113569928B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110792198.1A CN113569928B (en) 2021-07-13 2021-07-13 Train running state detection data missing processing model and reconstruction method


Publications (2)

Publication Number Publication Date
CN113569928A CN113569928A (en) 2021-10-29
CN113569928B true CN113569928B (en) 2024-01-30

Family

ID=78164679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110792198.1A Active CN113569928B (en) 2021-07-13 2021-07-13 Train running state detection data missing processing model and reconstruction method

Country Status (1)

Country Link
CN (1) CN113569928B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114169396B (en) * 2021-11-05 2022-09-20 华中科技大学 Training data generation model construction method and application for aircraft fault diagnosis
CN116506309B (en) * 2023-06-27 2023-09-08 新唐信通(浙江)科技有限公司 Vehicle-mounted ATP communication signal comprehensive monitoring system and method
CN117634687A (en) * 2023-11-30 2024-03-01 中国人民解放军96901部队 Oil guarantee demand prediction method based on generation of countermeasure network and migration learning
CN117828280B (en) * 2024-03-05 2024-06-07 山东新科建工消防工程有限公司 Intelligent fire information acquisition and management method based on Internet of things

Citations (9)

Publication number Priority date Publication date Assignee Title
CN108583624A (en) * 2018-04-12 2018-09-28 中车青岛四方机车车辆股份有限公司 Train operation state method for visualizing and device
CN109308689A (en) * 2018-10-15 2019-02-05 聚时科技(上海)有限公司 The unsupervised image repair method of confrontation network migration study is generated based on mask
WO2019035364A1 (en) * 2017-08-16 2019-02-21 ソニー株式会社 Program, information processing method, and information processing device
CN110212528A (en) * 2019-06-19 2019-09-06 华北电力大学 Reconstructing method is lacked based on the power distribution network metric data for generating confrontation and dual Semantic Aware
CN110634539A (en) * 2019-09-12 2019-12-31 腾讯科技(深圳)有限公司 Artificial intelligence-based drug molecule processing method and device and storage medium
CN111462264A (en) * 2020-03-17 2020-07-28 中国科学院深圳先进技术研究院 Medical image reconstruction method, medical image reconstruction network training method and device
CN111859978A (en) * 2020-06-11 2020-10-30 南京邮电大学 Emotion text generation method based on deep learning
CN112395737A (en) * 2020-10-09 2021-02-23 湖南工业大学 Method for reconstructing measurement data loss of high-speed train based on generation countermeasure network
CN112528548A (en) * 2020-11-27 2021-03-19 东莞市汇林包装有限公司 Self-adaptive depth coupling convolution self-coding multi-mode data fusion method

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20200242736A1 (en) * 2019-01-29 2020-07-30 Nvidia Corporation Method for few-shot unsupervised image-to-image translation


Non-Patent Citations (2)

Title
Reconstruction method for missing measurement data based on Wasserstein Generative Adversarial Network; Changfan Zhang et al.; JASIII; Vol. 25, No. 2; pp. 195-203 *
Research on intrusion detection methods based on improved CGANs; Peng Zhonglian; Wan Wei; Jing Tao; Wei Jinxia; Netinfo Security, No. 05, pp. 53-62 *

Also Published As

Publication number Publication date
CN113569928A (en) 2021-10-29

Similar Documents

Publication Publication Date Title
CN113569928B (en) Train running state detection data missing processing model and reconstruction method
Wilson et al. Deep learning-aided cyber-attack detection in power transmission systems
CN115018021B (en) Machine room abnormity detection method and device based on graph structure and abnormity attention mechanism
CN114297947B (en) Data-driven wind power system twinning method and system based on deep learning network
CN109144987A (en) Electric system based on deep learning measures missing values method for reconstructing and its application
CN115293280A (en) Power equipment system anomaly detection method based on space-time feature segmentation reconstruction
CN114492150B (en) Digital twin-body-based power distribution network typical service scene early warning method
CN113505477A (en) Process industry soft measurement data supplementing method based on SVAE-WGAN
WO2024087129A1 (en) Generative adversarial multi-head attention neural network self-learning method for aero-engine data reconstruction
Liao et al. Data-driven missing data imputation for wind farms using context encoder
CN114841072A (en) Differential fusion Transformer-based time sequence prediction method
Tang et al. Self-supervised anomaly pattern detection for large scale industrial data
CN118133203A (en) Fault diagnosis method for electric energy metering detection information
CN116720743A (en) Carbon emission measuring and calculating method based on data clustering and machine learning
CN112215410A (en) Power load prediction method based on improved deep learning
Liu et al. An anomaly detection method based on double encoder–decoder generative adversarial networks
Xue et al. Deep anomaly detection for industrial systems: a case study
CN117668609A (en) Fault classification method based on RTSDGAN-Catoost
Han et al. Filter transfer learning algorithm for missing data imputation in wastewater treatment process
Jiang et al. Enhancing convolutional neural network deep learning for remaining useful life estimation in smart factory applications
CN111581596A (en) Method for predicting concentration of dissolved gas in transformer oil
CN112163020A (en) Multi-dimensional time series anomaly detection method and system
CN115508765B (en) Online self-diagnosis method and system for voltage transformer acquisition device
CN116842358A (en) Soft measurement modeling method based on multi-scale convolution and self-adaptive feature fusion
CN115104753B (en) Cigarette cut-tobacco drier fault detection method and device based on twin neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant