CN114781473A - Method, device and equipment for predicting state of rail transit equipment and storage medium - Google Patents

Info

Publication number
CN114781473A
Authority
CN
China
Prior art keywords
data
monitoring data
state
prediction
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210203142.2A
Other languages
Chinese (zh)
Inventor
胡祖翰
徐余明
刘利平
石先明
刘留
王凯
苏昭阳
郑胜洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Railway Siyuan Survey and Design Group Co Ltd
Original Assignee
China Railway Siyuan Survey and Design Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Railway Siyuan Survey and Design Group Co Ltd
Priority to CN202210203142.2A
Publication of CN114781473A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G06Q50/40

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Marketing (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method, a device, equipment and a storage medium for predicting the state of rail transit equipment; the method comprises the following steps: acquiring monitoring data of target equipment in a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an environment external to the target device; screening a characteristic data set from the monitoring data according to the state to be predicted of the target equipment, wherein the characteristic data set is a set of the monitoring data of which the relevance with the state to be predicted of the target equipment reaches a preset range; processing the characteristic data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.

Description

Rail transit equipment state prediction method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of intelligent operation and maintenance of urban rail transit, in particular to a method, a device, equipment and a storage medium for predicting the state of rail transit equipment.
Background
Urban rail transit, including subways, has developed rapidly in China and has become an important component of urban construction. According to the "Overview of local and urban rail transit lines in China in 2020" published by the China Urban Rail Transit Association, as of December 31, 2020, the total operating mileage of urban rail transit in China was about 7978 km across 45 cities, with about 1241 km and 36 lines newly put into operation. As new lines continue to open in various places, the quantity of related equipment has increased greatly, and facility and equipment systems bear a heavy load.
At present, most rail transit operation and maintenance in China relies on a management mode of corrective (fault) maintenance and planned maintenance. This mode lags behind the increasing intelligence and complexity of equipment: it readily leads to both insufficient maintenance and excessive maintenance, which in turn cause frequent faults, wasted resources and potential safety hazards.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for predicting the state of rail transit equipment. The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a method for predicting the state of rail transit equipment, which comprises the following steps:
acquiring monitoring data of target equipment in a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an external environment of the target device;
screening a characteristic data set from the monitoring data according to the state to be predicted of the target equipment, wherein the characteristic data set is a set of the monitoring data of which the relevance with the state to be predicted of the target equipment reaches a preset range;
processing the characteristic data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.
In the foregoing solution, the screening out a feature data set from the monitoring data according to the state to be predicted of the target device includes:
performing data cleaning on the monitoring data to obtain effective monitoring data, wherein the data cleaning at least comprises repeated value processing;
and screening the characteristic data set from the effective monitoring data based on the mutual information corresponding to the effective monitoring data according to an entropy method.
In the foregoing solution, the repeated value processing includes:
rejecting, according to the Pearson coefficient, the monitoring data whose correlation coefficient with the state to be predicted is lower than a preset value;
screening the monitoring data subjected to rejection processing according to the sliding window and a preset Euclidean distance threshold value;
and downsampling the screened monitoring data to obtain the effective monitoring data.
In the foregoing solution, the processing the feature data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted includes:
randomly generating k feature data subsets with the same capacity as the feature data set according to an ensemble learning method and the feature data set; k is an integer greater than 1;
constructing k regression tree models corresponding to the feature data subsets one by one according to each feature data subset;
respectively inputting the k characteristic data subsets into the corresponding regression tree models to obtain k groups of prediction data;
and determining the prediction result according to the average value of the k groups of prediction data.
In the above scheme, the method further comprises: and determining the position information of a monitoring point according to the structure information of the target equipment, wherein the monitoring point is used for acquiring the monitoring data.
The embodiment of the invention also provides a device for predicting the state of the rail transit equipment, which comprises:
the data acquisition unit is used for acquiring monitoring data of the target equipment in a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an external environment of the target device;
the data processing unit is used for screening out a characteristic data set from the monitoring data according to the to-be-predicted state of the target equipment, wherein the characteristic data set is a set of monitoring data, the relevance of which with the to-be-predicted state of the target equipment reaches a preset range;
the prediction unit is used for processing the characteristic data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.
In the above solution, the data processing unit includes:
the data cleaning unit is used for cleaning the monitoring data to obtain effective monitoring data, wherein the data cleaning at least comprises repeated value processing;
and the characteristic processing unit is used for screening the characteristic data set from the effective monitoring data based on the mutual information corresponding to the effective monitoring data according to an entropy method.
In the above scheme, the data cleaning unit is specifically configured to: reject, according to the Pearson coefficient, the monitoring data whose correlation coefficient with the state to be predicted is lower than a preset value; screen the monitoring data remaining after rejection according to a sliding window and a preset Euclidean distance threshold; and downsample the screened monitoring data to obtain the effective monitoring data.
In the foregoing solution, the prediction unit is further configured to: randomly generating k feature data subsets with the same capacity as the feature data set according to an ensemble learning method and the feature data set; k is an integer greater than 1; constructing k regression tree models which correspond to the feature data subsets one by one according to each feature data subset; respectively inputting the k characteristic data subsets into the corresponding regression tree models to obtain k groups of prediction data; and determining the prediction result according to the average value of the k groups of prediction data.
In the above scheme, the apparatus further comprises: and the determining unit is used for determining the position information of a monitoring point according to the structure information of the target equipment, and the monitoring point is used for acquiring the monitoring data.
An embodiment of the present invention further provides an electronic device, where the electronic device at least includes: a processor and a storage medium configured to store executable instructions, wherein: the processor is configured to execute stored executable instructions configured to perform the rail transit equipment state prediction method provided by the above-described embodiments.
The embodiment of the invention also provides a computer-readable storage medium, which stores executable instructions, and when the executable instructions are executed by a processor, the method for predicting the state of the rail transit equipment provided by the embodiment of the invention is realized.
According to the embodiment of the invention, on one hand, through predicting the running state trend of the target equipment, predictive maintenance can be carried out, the fault reason can be positioned, the quality is ensured, the cost is saved, the efficiency is improved, and the problems of low maintenance efficiency, long equipment downtime, high operation cost, safety and the like caused by the traditional maintenance method are solved. On the other hand, the effectiveness of the data is improved and the efficiency of training and calculating the prediction model is improved by screening out the characteristic data set from the monitoring data.
Drawings
Fig. 1 is a schematic view of an implementation scenario of a method for predicting a state of a rail transit device according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a prediction apparatus of an urban rail transit operation device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of positions of bridge measuring points provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating steps of establishing a fault prediction model of a rail transit operating device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a fault prediction model of a rail transit operating device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating the effect of the prediction based on the regression tree algorithm according to the embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a device for predicting a status of a rail transit apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail with reference to the accompanying drawings, the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, references to the terms "first \ second \ third" are only to distinguish similar objects and do not denote a particular order or importance, but rather "first \ second \ third" may, where permissible, be interchanged in a particular order or sequence so that embodiments of the invention described herein may be practiced in other than the order shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
The following describes a method for predicting the state of rail transit equipment provided by the embodiment of the invention. Referring to fig. 1, fig. 1 is a schematic flow chart of a method for predicting a state of a rail transit device according to an embodiment of the present invention; the rail transit equipment state prediction method provided by the embodiment of the invention comprises the following steps:
step S110: acquiring monitoring data of target equipment in a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an external environment of the target device;
step S120: screening a characteristic data set from the monitoring data according to the state to be predicted of the target equipment, wherein the characteristic data set is a set of the monitoring data of which the relevance with the state to be predicted of the target equipment reaches a preset range;
step S130: processing the characteristic data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.
In one embodiment, the target device is a target rail transit device, including but not limited to bridges, rails, viaducts, etc. The states to be predicted include, but are not limited to, the operational status of the target device. For example, when the target device is a bridge, the state to be predicted includes, but is not limited to, the vibration acceleration state of the bridge.
In one embodiment, the sensing data of the target device comprises target device operation state data; the sensing data of the external environment of the target device includes state data of the external environment of the target device within a preset range, including but not limited to temperature, humidity, dust concentration, wind speed, and the like of the external environment.
In an embodiment, the preset time period is a time period from a preset historical time to a current time. The prediction result is the parameter value change condition of the state to be predicted in a period of time after the current moment. The preset historical time is the time before the current time, and can be set by a user at will. The preset range can also be set by a user according to the prediction requirement.
In an embodiment, the relevance of the monitoring data to the to-be-predicted state of the target device may be the correlation between the monitoring data and the to-be-predicted state parameter, calculated with a correlation analysis algorithm (e.g., the Pearson correlation coefficient, the entropy method, etc.). If the correlation coefficient between the monitoring data and the to-be-predicted state parameter is within a preset range, the monitoring data is feature data and belongs to the feature data set.
In an embodiment, the relevance of the monitoring data to the to-be-predicted state of the target device may also be determined from changes in the historical monitoring data: if a change in the monitoring data brings a significant change in the to-be-predicted state, the monitoring data is feature data and belongs to the feature data set. A significant change here means that the change in the parameter corresponding to the state to be predicted reaches a preset threshold. In one embodiment, the preset threshold is determined based on the type of the target device and the state to be predicted, so it differs for different target devices and different states to be predicted. In another embodiment, the preset threshold may be set by the user according to the prediction requirement.
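As an illustration of correlation-based screening, the following minimal sketch keeps only the monitoring channels whose Pearson coefficient with the state to be predicted reaches a lower bound. The channel names, the sample values, the use of the absolute value of the coefficient, and the function names `pearson` and `screen_by_pearson` are assumptions made for this example and are not taken from the embodiment.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no linear information
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, cov / (sx * sy)))


def screen_by_pearson(channels, target, low=0.2):
    """Keep channels whose |r| with the target is at least `low` (assumed rule)."""
    return {name: values for name, values in channels.items()
            if abs(pearson(values, target)) >= low}


# Hypothetical data: the state to be predicted and three monitored channels.
acceleration = [0.10, 0.12, 0.15, 0.19, 0.22, 0.27]
channels = {
    "vibration_frequency": [1.0, 1.2, 1.5, 1.9, 2.2, 2.7],  # follows the target
    "humidity": [50, 50, 50, 50, 50, 50],                    # constant
    "wind_speed": [3, 1, 4, 1, 5, 2],                        # weakly related
}
selected = screen_by_pearson(channels, acceleration)
print(sorted(selected))
```

With these made-up values only the vibration-frequency channel passes the 0.2 bound; the bound itself would be chosen per application.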
In one embodiment, the predictive model consists of a series of regression trees. The Regression Tree may be a CART Regression Tree (Classification And Regression Tree).
In some embodiments, the processing the feature data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted includes:
randomly generating k feature data subsets with the same capacity as the feature data set according to an ensemble learning method and the feature data set; k is an integer greater than 1;
constructing k regression tree models which correspond to the feature data subsets one by one according to each feature data subset;
respectively inputting the k characteristic data subsets into the corresponding regression tree models to obtain k groups of prediction data;
and determining the prediction result according to the average value of the k groups of prediction data.
In an embodiment, the ensemble learning method includes, but is not limited to, the Bagging (bootstrap aggregating) ensemble method. In this embodiment, the accuracy of the prediction is improved by using an ensemble algorithm.
Specifically, k sets of data (k feature data subsets) are randomly drawn from the feature data set by using the ensemble learning method, and each of the k sets contains the same amount of data. k regression tree models are built from the k sets of data. The k feature data subsets are respectively input into the corresponding regression tree models to obtain k groups of prediction data, and the prediction results of the k regression tree models are averaged to obtain the final prediction result.
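The bagging procedure described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the small least-squares tree, the depth limit, the minimum leaf size, the toy data and all function names are hypothetical, and sampling with replacement is assumed for drawing the k subsets.

```python
import random


def fit_tree(rows, targets, depth=0, max_depth=3, min_leaf=2):
    """Fit a small least-squares regression tree; returns a nested dict."""
    mean = sum(targets) / len(targets)
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return {"value": mean}
    best = None
    for j in range(len(rows[0])):
        for s in sorted({r[j] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[j] <= s]
            right = [i for i, r in enumerate(rows) if r[j] > s]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            c1 = sum(targets[i] for i in left) / len(left)
            c2 = sum(targets[i] for i in right) / len(right)
            loss = (sum((targets[i] - c1) ** 2 for i in left)
                    + sum((targets[i] - c2) ** 2 for i in right))
            if best is None or loss < best[0]:
                best = (loss, j, s, left, right)
    if best is None:
        return {"value": mean}
    _, j, s, left, right = best
    return {
        "feature": j,
        "threshold": s,
        "left": fit_tree([rows[i] for i in left], [targets[i] for i in left],
                         depth + 1, max_depth, min_leaf),
        "right": fit_tree([rows[i] for i in right], [targets[i] for i in right],
                          depth + 1, max_depth, min_leaf),
    }


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while "value" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]


def fit_bagged_trees(rows, targets, k=10, seed=0):
    """Draw k bootstrap subsets of the same size as the data; fit one tree each."""
    rng = random.Random(seed)
    n = len(rows)
    trees = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        trees.append(fit_tree([rows[i] for i in idx], [targets[i] for i in idx]))
    return trees


def predict_bagged(trees, row):
    """Average the k tree predictions to obtain the final prediction."""
    preds = [predict_tree(t, row) for t in trees]
    return sum(preds) / len(preds)


# Toy data (made up): the target depends only on the first feature.
rows = [[float(i), float(i % 3)] for i in range(30)]
targets = [2.0 * r[0] for r in rows]
trees = fit_bagged_trees(rows, targets, k=15)
estimate = predict_bagged(trees, [10.0, 1.0])
print(len(trees), round(estimate, 2))
```

Averaging over trees fitted to different bootstrap subsets reduces the variance of any single tree, which is the usual motivation for bagging.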
In one embodiment, constructing a regression tree model based on the subset of feature data includes:
The data in the feature data subset are placed in a root node, the optimal feature is selected, and the root node is divided into two internal nodes; the internal nodes are divided further by feature in the same way, so that the training set is finally divided into a finite number of subsets. The subset partitioning process is as follows: among all input feature vectors, select the j-th feature $x^{(j)}$ as the division feature and $s$ as the division point, defining two regions:
$$R_1(j,s)=\{x \mid x^{(j)} \le s\}, \qquad R_2(j,s)=\{x \mid x^{(j)} > s\}$$
where $R_1(j,s)$ represents the left subtree of the division point and $R_2(j,s)$ represents the right subtree. Solving:
$$\min_{j,s}\left[\min_{c_1}\sum_{x_i\in R_1(j,s)}(y_i-c_1)^2+\min_{c_2}\sum_{x_i\in R_2(j,s)}(y_i-c_2)^2\right]$$
yields the optimal division variable $j$ and division point $s$, where $y_i$ represents the output value corresponding to sample $x_i$, $c_1$ represents the output variable of the left subtree, and $c_2$ represents the output variable of the right subtree. The feature vector set is divided into two subsets in this way, and the division process is repeated until a stop condition is met. The constructed regression tree is then processed with a pruning algorithm to prevent overfitting. The pruning algorithm includes, but is not limited to, PEP (Pessimistic Error Pruning).
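A worked example may help make the split search concrete. The sketch below enumerates every candidate feature j and division point s and keeps the pair with the smallest summed squared error; for a fixed split the inner minimisation over c1 and c2 is solved by the mean of each side. The function name `best_split` and the six sample points are assumptions for illustration only.

```python
def best_split(rows, targets):
    """Exhaustive least-squares search for the division feature j and point s."""
    best = None
    for j in range(len(rows[0])):
        for s in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, targets) if row[j] <= s]
            right = [y for row, y in zip(rows, targets) if row[j] > s]
            if not left or not right:
                continue  # a split must leave samples on both sides
            c1 = sum(left) / len(left)    # minimiser of the left-hand sum
            c2 = sum(right) / len(right)  # minimiser of the right-hand sum
            loss = (sum((y - c1) ** 2 for y in left)
                    + sum((y - c2) ** 2 for y in right))
            if best is None or loss < best["loss"]:
                best = {"feature": j, "threshold": s,
                        "c1": c1, "c2": c2, "loss": loss}
    return best


# Made-up samples with an obvious jump between x = 3 and x = 10.
rows = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
targets = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
split = best_split(rows, targets)
print(split["feature"], split["threshold"])
```

Here the search settles on feature 0 with division point 3.0, giving side means of about 1.0 and 5.0.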
In some embodiments, the screening out the feature data set from the monitoring data according to the to-be-predicted state of the target device includes:
performing data cleaning on the monitoring data to obtain effective monitoring data, wherein the data cleaning at least comprises repeated value processing;
and screening the characteristic data set from the effective monitoring data based on the mutual information corresponding to the effective monitoring data according to an entropy method.
The mutual information corresponding to the effective monitoring data comprises the mutual information between the data corresponding to the state to be predicted and the other effective monitoring data.
Specifically, the entropy method is used to define the mutual information of two groups of data (the data corresponding to the state to be predicted and the other effective monitoring data), which is given by the formula
Figure BDA0003530368960000073
where x_i and y_i are the data variables of features X and Y respectively, N is the sample size, and p is a probability density function, which can be obtained with the ksdensity function. The mutual information reflects the amount of information shared by the two groups of features: the larger the mutual information, the stronger the correlation between the two, and the smaller it is, the weaker the correlation. When the features are unrelated, the mutual information is close to 0; when the features have a functional relationship, the mutual information tends to infinity. Obtaining the correlation with the entropy method overcomes the inability of the Pearson correlation coefficient to handle nonlinearity, so that nonlinearly related elements in the data are screened out as well, which improves the effectiveness of the monitoring data used for prediction and the accuracy of the prediction.
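To illustrate how a density-based mutual information score behaves, the following sketch uses one common sample estimator, the average of log(p(x_i, y_i) / (p(x_i) p(y_i))) over the N samples, with Gaussian kernel density estimates. This estimator, the Silverman bandwidth rule, the product kernel for the joint density and all function names are assumptions chosen for the example; the formula in the source is not reproduced here, and the Gaussian kernel only stands in for the ksdensity function (a MATLAB kernel density routine) mentioned above.

```python
import math
import random


def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth for a Gaussian kernel (assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)


def gauss(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def mutual_information(xs, ys):
    """Average of log(p(x, y) / (p(x) p(y))) over the samples, densities by KDE."""
    n = len(xs)
    hx, hy = silverman_bandwidth(xs), silverman_bandwidth(ys)
    total = 0.0
    for xi, yi in zip(xs, ys):
        kx = [gauss((xi - xj) / hx) / hx for xj in xs]
        ky = [gauss((yi - yj) / hy) / hy for yj in ys]
        p_x = sum(kx) / n
        p_y = sum(ky) / n
        p_xy = sum(a * b for a, b in zip(kx, ky)) / n  # product kernel
        total += math.log(p_xy / (p_x * p_y))
    return total / n


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
noise = [rng.gauss(0, 1) for _ in range(200)]
y_nonlinear = [xi ** 2 for xi in x]  # functional but nonlinear dependence on x

mi_dep = mutual_information(x, y_nonlinear)
mi_ind = mutual_information(x, noise)
print(mi_dep > mi_ind)
```

The quadratic relationship has a Pearson coefficient near zero for symmetric data, yet its mutual information estimate is clearly larger than that of the independent pair, which is the nonlinear sensitivity the paragraph above refers to.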
In some embodiments, the data cleaning further comprises: default value processing, abnormal value processing and denoising processing.
In one embodiment, the default value processing includes completing the missing values, for example by mean interpolation, same-class mean interpolation, modeling prediction, high-dimensional mapping, multiple imputation, maximum likelihood estimation, compressed sensing, matrix completion and the like.
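Mean interpolation, the simplest of the listed options, can be sketched as below. Representing a missing reading as `None`, the function name `fill_missing_with_mean` and the sample temperatures are assumptions for this example.

```python
def fill_missing_with_mean(values):
    """Replace missing readings (None) with the mean of the observed readings."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a series with no observed values")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


# Hypothetical temperature readings with two gaps.
temperature = [20.0, None, 22.0, 24.0, None]
filled = fill_missing_with_mean(temperature)
print(filled)
```

Mean interpolation preserves the series mean but flattens local variation, so the other listed methods may be preferable when gaps are long.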
The embodiment adopts a method of data cleaning and characteristic data set screening, improves the effectiveness and accuracy of data, reduces the training error and the training time of the prediction model, and enables the prediction model to have better robustness.
In some embodiments, the repeated value processing comprises:
rejecting, according to the Pearson coefficient, the monitoring data whose correlation coefficient with the state to be predicted is lower than a preset value;
screening the monitoring data after the elimination processing according to the sliding window and a preset Euclidean distance threshold value;
and downsampling the screened monitoring data to obtain the effective monitoring data.
In an embodiment, the preset value may be set by a user and is typically 0.2.
Specifically, the Pearson coefficient between the data corresponding to the state to be predicted and each other type of monitoring data is calculated. For example, if the state to be predicted is the acceleration state of the bridge, the Pearson coefficient between the bridge acceleration and the vibration frequency, and so on, is calculated. Monitoring data with a Pearson coefficient lower than 0.2 are removed. The remaining types of monitoring data are arranged in columns, and a sliding window of size k is set. Starting from the first row of data, the Euclidean distance between the different types of monitoring data (that is, the features) in the first row is calculated and then differenced against the corresponding features of the remaining k-1 rows in the window; a minimum threshold is set, and if the difference is within the minimum threshold, the first row of data is deleted. The window is then slid forward and the above steps are repeated until the end of the data. Assuming that the number of features is n before screening and m after screening, the number of Euclidean distance calculations between different features required before screening is
Figure BDA0003530368960000091
and the number required after screening is
Figure BDA0003530368960000092
so the computation time is reduced on the order of the square. Because the initial data volume is very large, downsampling is needed after the repeated value processing; the sampling principle is to sample at integer multiples of the original sampling interval without damaging the original running state trend. Through correlation screening, repeated value processing and downsampling, the volume of monitoring data can be reduced by a factor of 261, which greatly shortens the calculation time, so that training can be carried out and a prediction result given in a more timely manner.
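One possible reading of the sliding-window step, followed by downsampling, is sketched below: a row is treated as a repeated value when any of the next k-1 rows lies within a Euclidean distance threshold of it. This interpretation, the threshold value, the downsampling factor, the function names and the sample rows are all assumptions for illustration.

```python
import math


def drop_near_duplicates(rows, window=3, threshold=1e-3):
    """Drop a row when any of the next window-1 rows lies within the threshold."""
    kept = []
    for i, row in enumerate(rows):
        neighbours = rows[i + 1:i + window]
        if any(math.dist(row, other) <= threshold for other in neighbours):
            continue  # a near-identical row follows; that later row is kept
        kept.append(row)
    return kept


def downsample(rows, factor):
    """Keep every factor-th row, i.e. sample at a multiple of the original interval."""
    return rows[::factor]


# Hypothetical rows of (acceleration, vibration frequency) readings.
rows = [
    [0.10, 1.0],
    [0.10, 1.0],  # repeat of the previous row
    [0.20, 1.1],
    [0.30, 1.2],
    [0.30, 1.2],  # repeat of the previous row
    [0.40, 1.3],
]
deduped = drop_near_duplicates(rows, window=3, threshold=1e-6)
sampled = downsample(deduped, 2)
print(len(rows), len(deduped), len(sampled))
```

In practice the downsampling factor would be chosen so that the trend of the running state remains visible in the reduced series.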
In some embodiments, the method further comprises: and determining the position information of a monitoring point according to the structure information of the target equipment, wherein the monitoring point is used for acquiring the monitoring data.
According to the structure and the usage conditions of the target equipment, monitoring points are arranged on the target equipment and/or around the target equipment, so that the acquired data more comprehensively cover the monitoring data that influence the state of the target equipment, enabling more accurate prediction.
In another embodiment, determining the location information of the monitoring point according to the structure information of the target device further comprises: and determining the position information of the monitoring point according to the structural information of the target equipment and the historical fault maintenance information of the target equipment.
In an embodiment, the method further comprises: and respectively storing the monitoring data under the condition that the state to be predicted is normal and the monitoring data under the condition that the state to be predicted is abnormal in the characteristic data set so as to facilitate the subsequent state analysis of the target equipment.
In some embodiments, the method further comprises: and determining whether the state to be predicted of the target equipment is abnormal or not based on the prediction result.
Specifically, the prediction result is compared with the actually acquired monitoring data corresponding to it, and if the error between the prediction result and the actually acquired monitoring data exceeds a preset abnormal threshold, the state to be predicted of the target device is determined to be abnormal.
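The comparison can be sketched as follows. Using the largest absolute point-wise error as the error measure, the function name `is_abnormal`, the threshold of 0.05 and the sample values are assumptions for this example; the embodiment does not specify the error measure.

```python
def is_abnormal(predicted, measured, abnormal_threshold):
    """Flag the state as abnormal when the largest absolute error exceeds the threshold."""
    errors = [abs(p - m) for p, m in zip(predicted, measured)]
    return max(errors) > abnormal_threshold


# Hypothetical predicted and measured acceleration values.
predicted = [0.20, 0.21, 0.22]
normal_reading = [0.21, 0.20, 0.23]
drifting_reading = [0.20, 0.22, 0.35]

print(is_abnormal(predicted, normal_reading, 0.05))
print(is_abnormal(predicted, drifting_reading, 0.05))
```

Other error measures, such as a mean absolute error over a window, could be substituted without changing the overall flow.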
According to the method, on one hand, through predicting the running state trend of the target equipment, predictive maintenance can be carried out, the fault reason can be positioned, the quality is ensured, the cost is saved, the efficiency is improved, and the problems of low maintenance efficiency, long equipment downtime, high operation cost, safety and the like caused by the traditional maintenance method are solved. On the other hand, the effectiveness of the data is improved and the efficiency of training and calculating the prediction model is improved by screening out the characteristic data set from the monitoring data.
A specific example is provided below in connection with the above embodiments:
At present, the urban rail transit equipment state monitoring system collects a large volume of data of many types. Traditional data processing yields low prediction accuracy and cannot meet current requirements. To solve these problems and raise the management level of subway maintenance work, information-based means need to be continuously strengthened, and the latest internet and computer technologies introduced to manage the equipment maintenance of subway companies. In current fault diagnosis, however, after a fault occurs the equipment is mainly dispatched to the corresponding maintenance depot, where an engineer analyzes the fault data in the system, judges the type and degree of the fault, and makes a maintenance decision. This conventional maintenance method leads to inefficient maintenance and long equipment downtime, which increases operating costs and, most seriously, raises significant safety concerns. On this basis, an urban rail transit key equipment fault prediction method based on the random forest algorithm is designed, which converts the traditional operation-and-maintenance modes of fault repair and planned repair into a predictive repair mode.
The urban rail transit operation equipment prediction method based on the random forest algorithm is applied to the urban rail transit operation equipment prediction device shown in fig. 2. The device comprises the following modules: a data acquisition module, a data processing module, a data storage module and a fault prediction module. Taking bridge equipment as an example, the data acquisition module includes: an acceleration detection module, a pressure detection module, a cable force detection module, a vibration frequency detection module, a displacement detection module, a deflection detection module, a crack detection module, a temperature detection module, a humidity detection module, a wind speed detection module and the like.
The urban rail transit key equipment prediction method based on the random forest algorithm comprises the following steps:
Step 1, acquiring data of rail transit operation equipment.
Specifically, when the rail transit operation equipment is bridge equipment, the data acquisition module acquires real-time data from the monitoring sensors of the rail transit bridge equipment. Monitoring of the bridge equipment comprises real-time monitoring of the internal structure of the bridge and real-time monitoring of the external environment of the bridge: the internal-structure monitoring covers acceleration, pressure, cable force, vibration, structure temperature, deflection and cracks, and the external-environment monitoring covers temperature, humidity and wind speed. In this example, a bridge in a certain area is taken as an example and the selected time period is set to 3 months. Five measuring points are selected in total, located at the upper side of the main-span midpoint, the upper side of the main-span quarter points, and the middle of the pier beam, as shown in fig. 3; Table 1 is an example of the statistical measurement data of the bridge at measuring point 1 over the 3 months. Within the 3 months, the structure displacement, temperature and support displacement are sampled once per second, giving 7776853 sampling data points in total, while the acceleration, pressure and wind speed sensors are sampled at 50 Hz, giving 388500231 sampling data points.
Table 1 statistical measurement data example
Figure BDA0003530368960000111
Step 2, processing data of the rail transit operation equipment. The data processing module performs data processing on the acquired data, including data cleaning and feature extraction. Data cleaning comprises missing value processing, abnormal value processing, repeated value processing and denoising of the acquired data. Feature extraction comprises: screening out, from the cleaned data, the data features (i.e. the types of data, such as frequency data, pressure data, etc.) that have a significant influence on the state of the rail transit operation equipment.
During data acquisition, the value of the data stays constant over a period of time because the actual sampling interval is too short. For the repeated value processing of data cleaning, in this example all collected data are first analyzed; the analysis shows that the data of each bridge sensor conform to a normal distribution and that most of the different features (different types of data) have a certain linear relationship. Therefore, before the repeated values are processed, the Pearson coefficient is calculated for the features, with the following formula:
Figure BDA0003530368960000112
where α_i and β_i are the data variables of the state parameter to be predicted and of another feature of the bridge, respectively, and m is the sample data volume. The Pearson coefficients between the target parameter and the other parameters are calculated, and features whose Pearson coefficient is lower than 0.2 are removed. Repeated value processing is then carried out as follows: (1) arrange the screened features in columns and set a sliding window of size k; (2) slide the window starting from the first row of data, calculate the Euclidean distances between different features of the first row, then take the difference between these Euclidean distances and those of the different features of the remaining k-1 rows, set a minimum threshold, and delete the first row of data if the difference is within the minimum threshold; (3) slide the window onward and execute the above steps in turn until the end. Assuming that the number of features is n before screening and m after screening, the number of Euclidean distance calculations between different features needed in step (2) before screening is
Figure BDA0003530368960000121
Then, after screening, the number of calculations required is
Figure BDA0003530368960000122
Screening thus reduces the computation time on the order of a square. Because the initial data volume is large, down-sampling is carried out after steps (1) to (3); the sampling principle is to sample at a multiple of the original sampling interval without destroying the original running-state trend. Through correlation screening, repeated value processing and down-sampling, the data volume is reduced by a factor of 261 relative to the original, the calculation time is greatly shortened, and training can be carried out and a prediction result given in a more timely manner.
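The cleaning sequence of this example (Pearson screening, sliding-window repeated value removal, down-sampling) can be sketched as follows. This is a sketch under stated assumptions rather than the implementation of the embodiment: the standard sample Pearson coefficient is used, the 0.2 threshold is applied to its absolute value, and step (2) is read as "delete a row when its Euclidean distance to any of the next k-1 rows is within the threshold". All function names and sample values are hypothetical.

```python
import math


def pearson(a, b):
    """Standard sample Pearson coefficient of two equally long sequences."""
    m = len(a)
    mean_a, mean_b = sum(a) / m, sum(b) / m
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_a * var_b)


def screen_features(target, features, min_corr=0.2):
    """Keep features whose |Pearson coefficient| with the target is >= min_corr."""
    return {name: col for name, col in features.items()
            if abs(pearson(target, col)) >= min_corr}


def drop_repeated_rows(rows, k, eps):
    """Delete a row that lies within eps of any of the following k-1 rows."""
    kept = []
    for i, row in enumerate(rows):
        window = rows[i + 1:i + k]
        if any(math.dist(row, other) <= eps for other in window):
            continue
        kept.append(row)
    return kept


def downsample(rows, step):
    """Keep every step-th row, i.e. sample at a multiple of the interval."""
    return rows[::step]


# Hypothetical data: the target is linear in one feature, unrelated to the other.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = {
    "vibration_frequency": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "humidity": [5.0, 1.0, 4.0, 4.0, 1.0, 5.0],
}
kept_features = screen_features(target, features)

rows = [(1.0, 2.0), (1.0, 2.0), (1.0, 2.01), (3.0, 4.0), (3.0, 4.0), (5.0, 6.0)]
deduplicated = drop_repeated_rows(rows, k=3, eps=0.05)
print(sorted(kept_features), deduplicated, downsample(deduplicated, 2))
```

Screening before the distance step matters because the distance work grows with the number of retained features, which is the saving the example refers to.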
For feature extraction: the features of the rail transit operation equipment are screened by judging, according to the state of the target bridge equipment, how the data of the target sensor change. For example, when predicting the bridge amplitude acceleration state, the vibration frequency increases as the amplitude performance state of the bridge deteriorates. Other feature state values are extracted in a similar way. This example adopts an entropy method: the mutual information of two groups of data is first defined, and can be expressed by the formula
Figure BDA0003530368960000123
where x_i and y_i are the data variables of features X and Y respectively, N is the sample size, and p is the probability density function, which can be obtained with the ksdensity function. The mutual information reflects the amount of information shared by the two groups of features: the larger the mutual information, the stronger the correlation between the two, and the smaller it is, the weaker the correlation. When the features are unrelated, the mutual information is close to 0; when the features have a functional relationship, the mutual information tends to infinity. Obtaining the correlation by the entropy method overcomes the inability of the Pearson correlation coefficient to handle nonlinear relationships, so that nonlinearly related features in the acquired data are also captured by the screening.
The data storage module separately stores the abnormal data and the normal data output by each detection module in the abnormal state and the normal state.
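A mutual information estimate of this kind can be sketched as follows. Note the substitution: the example obtains the probability densities with a kernel density estimate (ksdensity), whereas this sketch bins the samples into a histogram, which is simpler to show with the standard library. The function name, the bin count and the sample data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram estimate of the mutual information I(X;Y) in nats."""
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant column
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(x), to_bins(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    count_x, count_y = Counter(bx), Counter(by)
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over the occupied cells.
    return sum((c / n) * math.log(c * n / (count_x[i] * count_y[j]))
               for (i, j), c in joint.items())


# A functional relationship (y = 2x) versus two unrelated features.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
mi_dependent = mutual_information(xs, [2.0 * v for v in xs], bins=4)
mi_independent = mutual_information([0.0, 0.0, 1.0, 1.0],
                                    [0.0, 1.0, 0.0, 1.0], bins=2)
print(mi_dependent, mi_independent)
```

With a histogram the estimate is bounded by the logarithm of the bin count, so the dependent case reaches log 4 here instead of growing without bound as it does for continuous densities.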
Step 3, predicting faults of the rail transit operation equipment. As shown in fig. 4, establishing the fault prediction model of the rail transit operation equipment comprises the following steps:
Step S410: determining the input and the output of the model. The features that have a significant influence on the amplitude acceleration, obtained from the collected data after data processing and feature screening, are input into the operation equipment fault prediction model; these features include vibration frequency, deflection and vertical displacement. The output of the operation equipment fault prediction model is the bridge amplitude acceleration.
Step S420: constructing a CART regression tree. All the screened collected data are placed in the root node; the optimal feature is selected from the data to divide the root node into two internal nodes, the internal nodes are further divided by features, and the collected data are finally divided into a finite number of subsets. The subset partitioning process is as follows: among all input features, select the j-th feature x^(j) as the splitting feature and s as the split point, and define two regions:
$$R_1(j, s) = \{x \mid x^{(j)} \le s\}, \qquad R_2(j, s) = \{x \mid x^{(j)} > s\}$$
where R1(j, s) represents the left subtree of the feature split point and R2(j, s) represents the right subtree. Solving:
$$\min_{j,\,s}\left[\min_{c_1}\sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in R_2(j,s)} (y_i - c_2)^2\right]$$
yields the optimal splitting variable j and split point s, where y_i represents the output value of the prediction model, c1 represents the output variable of the left subtree, and c2 represents the output variable of the right subtree. The feature vector set is divided into two subsets in turn, and the division process is repeated until a stop condition is met.
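The split search of step S420 can be sketched as an exhaustive scan over features and candidate split points. This is a minimal sketch assuming the usual least-squares CART criterion, in which c1 and c2 are the mean outputs of the two regions and the candidate split points are the observed feature values; the function name `best_split` and the sample data are hypothetical.

```python
def best_split(X, y):
    """Return (j, s, loss) minimising the summed squared error of both regions.

    X is a list of feature rows, y the list of outputs. Region 1 holds the
    rows with x[j] <= s, region 2 the rows with x[j] > s.
    """
    best = None
    for j in range(len(X[0])):
        # The largest value is skipped so that region 2 is never empty.
        for s in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            c1 = sum(left) / len(left)    # optimal constant for region 1
            c2 = sum(right) / len(right)  # optimal constant for region 2
            loss = (sum((yi - c1) ** 2 for yi in left)
                    + sum((yi - c2) ** 2 for yi in right))
            if best is None or loss < best[2]:
                best = (j, s, loss)
    return best


# Hypothetical rows: feature 0 separates the outputs, feature 1 does not.
X = [[1.0, 5.0], [2.0, 7.0], [3.0, 5.0], [4.0, 7.0]]
y = [1.0, 1.0, 3.0, 3.0]
split = best_split(X, y)
print(split)
```

Applying the same search recursively to each region, until a stop condition such as a minimum node size is met, grows the full regression tree.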
Step S430: optimizing the constructed regression tree with a pruning algorithm. PEP (pessimistic error pruning) is adopted to prevent the regression tree from overfitting.
Step S440: constructing a random forest to realize fault prediction of the rail transit operation equipment. The random forest is composed of a series of regression trees, and the accuracy of the algorithm is improved by the Bagging ensemble method; fig. 5 is a schematic structural diagram of the fault prediction model of the rail transit operation equipment. The specific steps are as follows: k groups of data are randomly drawn with replacement from the extracted bridge monitoring index sample set to serve as the training sets of the individual trees; every group has the same data volume and serves as the input data of one regression tree. k regression trees are built from the k training sets, each regression tree is split according to the method in step S420 to obtain the corresponding regression prediction data, and k groups of results are thus obtained. The prediction results of the k regression trees are averaged to obtain the final prediction result, namely
Figure BDA0003530368960000133
The predicted results are shown in FIG. 6.
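The Bagging procedure of step S440 can be sketched as follows. To keep the sketch short, each tree is a depth-1 regression tree (a single least-squares split) standing in for the full pruned CART tree of steps S420 and S430; the function names, the seed and the sample data are hypothetical.

```python
import random


def fit_stump(X, y):
    """Fit a depth-1 regression tree and return it as a prediction function."""
    best = None
    for j in range(len(X[0])):
        for s in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            c1, c2 = sum(left) / len(left), sum(right) / len(right)
            loss = (sum((v - c1) ** 2 for v in left)
                    + sum((v - c2) ** 2 for v in right))
            if best is None or loss < best[0]:
                best = (loss, j, s, c1, c2)
    if best is None:  # no valid split: fall back to the mean output
        mean = sum(y) / len(y)
        return lambda row: mean
    _, j, s, c1, c2 = best
    return lambda row: c1 if row[j] <= s else c2


def fit_forest(X, y, k, seed=0):
    """Train k trees, each on a bootstrap sample of the same size as X."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict(trees, row):
    """Average the predictions of all trees."""
    return sum(tree(row) for tree in trees) / len(trees)


# Hypothetical one-feature data with a step in the output.
X = [[float(i)] for i in range(10)]
y = [0.0] * 5 + [10.0] * 5
forest = fit_forest(X, y, k=25)
low, high = predict(forest, [1.0]), predict(forest, [8.0])
print(low, high)
```

Averaging over bootstrap samples smooths the differences between individual trees, which is the accuracy benefit the Bagging step relies on.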
The method provided by this example fully mines the latent information in the acquired sensing data, discovers the value in the data on the basis of experience, and predicts the running-state trend of the equipment. Predictive maintenance can thus be carried out and the cause of a failure located, which ensures quality, saves cost, improves efficiency, and solves the problems of low maintenance efficiency, long equipment downtime, high operating cost and safety risks brought by the traditional maintenance method.
The following continues to describe the rail transit equipment state prediction apparatus provided in the embodiments of the present invention. In some embodiments, the rail transit equipment state prediction apparatus may be implemented as software modules. Referring to fig. 7, fig. 7 is a schematic structural diagram of a rail transit equipment state prediction apparatus according to an embodiment of the present invention, and a rail transit equipment state prediction apparatus 700 according to an embodiment of the present invention includes:
the data acquisition unit 710 is configured to acquire monitoring data of a target device within a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an external environment of the target device;
a data processing unit 720, configured to screen a feature data set from the monitoring data according to the to-be-predicted state of the target device, where the feature data set is a set of monitoring data whose association with the to-be-predicted state of the target device reaches a preset range;
the prediction unit 730 is configured to process the feature data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.
In some embodiments, the data processing unit comprises:
the data cleaning unit is used for cleaning the monitoring data to obtain effective monitoring data, wherein the data cleaning at least comprises: repeated value processing;
and the characteristic processing unit is used for screening the characteristic data set from the effective monitoring data based on the mutual information corresponding to the effective monitoring data according to an entropy method.
In some embodiments, the data cleaning unit is specifically configured to: reject, according to a Pearson coefficient, the monitoring data whose correlation coefficient with the state to be predicted is lower than a preset value; screen the monitoring data remaining after the rejection according to a sliding window and a preset Euclidean distance threshold; and downsample the screened monitoring data to obtain the effective monitoring data.
In some embodiments, the prediction unit is further configured to: randomly generating k feature data subsets with the same capacity as the feature data set according to an ensemble learning method and the feature data set; k is an integer greater than 1; constructing k regression tree models corresponding to the feature data subsets one by one according to each feature data subset; respectively inputting the k characteristic data subsets into the corresponding regression tree models to obtain k groups of prediction data; and determining the prediction result according to the average value of the k groups of prediction data.
In some embodiments, the apparatus further comprises: a determining unit, configured to determine the position information of a monitoring point according to the structure information of the target equipment, the monitoring point being used for acquiring the monitoring data.
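The cooperation of the data acquisition unit 710, the data processing unit 720 and the prediction unit 730 can be sketched as a small composition of three callables. This is an illustrative sketch only: the class name, the field names and the stub behaviour are hypothetical, and the stub predictor simply averages one feature instead of running a random forest model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Series = Dict[str, List[float]]


@dataclass
class StatePredictionDevice:
    acquire: Callable[[], Series]            # role of data acquisition unit 710
    screen: Callable[[Series, str], Series]  # role of data processing unit 720
    predict: Callable[[Series], float]       # role of prediction unit 730

    def run(self, state_to_predict: str) -> float:
        """Acquire monitoring data, screen the feature set, then predict."""
        monitoring = self.acquire()
        features = self.screen(monitoring, state_to_predict)
        return self.predict(features)


# Stub units with hypothetical data, standing in for real implementations.
device = StatePredictionDevice(
    acquire=lambda: {"vibration_frequency": [1.0, 2.0, 3.0],
                     "humidity": [40.0, 41.0, 42.0]},
    screen=lambda data, state: {"vibration_frequency": data["vibration_frequency"]},
    predict=lambda feats: sum(feats["vibration_frequency"]) / 3,
)
result = device.run("amplitude_acceleration")
print(result)
```

Keeping the three units as separate callables mirrors the logical division of the apparatus, so any one unit can be replaced without touching the others.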
An embodiment of the present invention further provides an electronic device, where the electronic device at least includes: a processor and a storage medium configured to store executable instructions, wherein:
the processor is configured to execute stored executable instructions configured to perform the rail transit equipment state prediction method provided by the embodiment of the invention.
It should be noted that fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, and as shown in fig. 8, the device 800 at least includes: a processor 810, a communication interface 820, and a memory 830, wherein:
the processor 810 generally controls the overall operation of the device 800.
The communication interface 820 may enable the device to communicate with other devices over a network.
The Memory 830 is configured to store instructions and applications executable by the processor 810, and may also cache data to be processed or already processed by the processor 810 and modules of the device 800 (e.g., image data, audio data, voice communication data, and video communication data), and may be implemented by a FLASH Memory (FLASH) or a Random Access Memory (RAM).
It should be noted that, in the embodiment of the present application, if the rail transit equipment state prediction method is implemented in the form of a software functional module and is sold or used as an independent product, the rail transit equipment state prediction method may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a server to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, the embodiment of the application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps in the rail transit equipment state prediction method provided by the above embodiment.
Here, it should be noted that: the above description of the storage medium and device embodiments is similar to the description of the method embodiments above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the embodiments of the storage medium and apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
Of course, the apparatus in the embodiment of the present application may have other similar protocol interaction implementation cases, and those skilled in the art can make various corresponding changes and modifications according to the embodiment of the present application without departing from the spirit and essence of the present application, but these corresponding changes and modifications should fall within the scope of the claims appended to the method of the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply any order of execution, and the order of execution of the processes should be determined by their functions and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description, and do not represent the advantages and disadvantages of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element identified by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the modules is only one logical functional division, and in actual implementation, there may be other division ways, such as: multiple modules or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or modules may be electrical, mechanical or other.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules; the network module can be located in one place or distributed on a plurality of network modules; some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A rail transit equipment state prediction method is characterized by comprising the following steps:
acquiring monitoring data of target equipment in a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an external environment of the target device;
screening a characteristic data set from the monitoring data according to the state to be predicted of the target equipment, wherein the characteristic data set is a set of the monitoring data of which the relevance with the state to be predicted of the target equipment reaches a preset range;
processing the characteristic data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.
2. The method of claim 1, wherein the screening out feature data sets from the monitored data according to the to-be-predicted state of the target device comprises:
performing data cleaning on the monitoring data to obtain effective monitoring data, wherein the data cleaning at least comprises: repeated value processing;
and screening the characteristic data set from the effective monitoring data based on the mutual information corresponding to the effective monitoring data according to an entropy method.
3. The method of claim 2, wherein the repeated value processing comprises:
rejecting, according to the Pearson coefficient, the monitoring data whose correlation coefficient with the state to be predicted is lower than a preset value;
screening the monitoring data after the elimination processing according to the sliding window and a preset Euclidean distance threshold value;
and downsampling the screened monitoring data to obtain the effective monitoring data.
4. The method according to claim 1, wherein the processing the feature data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted comprises:
randomly generating k feature data subsets with the same capacity as the feature data set according to an ensemble learning method and the feature data set; k is an integer greater than 1;
constructing k regression tree models which correspond to the feature data subsets one by one according to each feature data subset;
respectively inputting the k characteristic data subsets into the corresponding regression tree models to obtain k groups of prediction data;
and determining the prediction result according to the average value of the k groups of prediction data.
5. The method of claim 1, further comprising: determining the position information of a monitoring point according to the structural information of the target equipment, wherein the monitoring point is used for acquiring the monitoring data.
6. A rail transit equipment state prediction device is characterized by comprising:
the data acquisition unit is used for acquiring monitoring data of the target equipment in a preset time period; wherein the monitoring data comprises: sensing data of the target device and sensing data of an external environment of the target device;
the data processing unit is used for screening out a characteristic data set from the monitoring data according to the to-be-predicted state of the target equipment, wherein the characteristic data set is a set of monitoring data, the relevance of which with the to-be-predicted state of the target equipment reaches a preset range;
the prediction unit is used for processing the characteristic data set by using a prediction model to obtain a prediction result corresponding to the state to be predicted; wherein the prediction model is constructed based on a random forest regression algorithm.
7. The apparatus of claim 6, wherein the data processing unit comprises:
the data cleaning unit is used for cleaning the monitoring data to obtain effective monitoring data, wherein the data cleaning at least comprises: repeated value processing;
and the characteristic processing unit is used for screening the characteristic data set from the effective monitoring data based on the mutual information corresponding to the effective monitoring data according to an entropy method.
8. The device according to claim 7, wherein the data cleaning unit is configured to: reject, according to a Pearson coefficient, the monitoring data whose correlation coefficient with the state to be predicted is lower than a preset value; screen the monitoring data remaining after the rejection according to a sliding window and a preset Euclidean distance threshold; and downsample the screened monitoring data to obtain the effective monitoring data.
9. An electronic device, characterized in that the device comprises at least: a processor and a storage medium configured to store executable instructions, wherein:
the processor is configured to execute stored executable instructions configured to perform the rail transit equipment condition prediction method provided in any one of the preceding claims 1 to 5.
10. A computer-readable storage medium having computer-executable instructions stored therein, the computer-executable instructions being configured to perform the rail transit equipment condition prediction method provided by any one of claims 1 to 5.
CN202210203142.2A 2022-03-03 2022-03-03 Method, device and equipment for predicting state of rail transit equipment and storage medium Pending CN114781473A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210203142.2A CN114781473A (en) 2022-03-03 2022-03-03 Method, device and equipment for predicting state of rail transit equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210203142.2A CN114781473A (en) 2022-03-03 2022-03-03 Method, device and equipment for predicting state of rail transit equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114781473A true CN114781473A (en) 2022-07-22

Family

ID=82422709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210203142.2A Pending CN114781473A (en) 2022-03-03 2022-03-03 Method, device and equipment for predicting state of rail transit equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114781473A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116519021A (en) * 2023-06-29 2023-08-01 西北工业大学 Inertial navigation system fault diagnosis method, system and equipment
CN116519021B (en) * 2023-06-29 2023-09-15 西北工业大学 Inertial navigation system fault diagnosis method, system and equipment

Similar Documents

Publication Publication Date Title
CN111368890A (en) Fault detection method and device and information physical fusion system
CN105677791B (en) Method and system for analyzing operation data of a wind turbine generator set
CN108921301A (en) Machine learning model update method and system based on self-learning
CN109753591A (en) Operation flow predictability monitoring method
CN104778622A (en) Method and system for predicting TPS transaction event threshold value
CN114357594A (en) Bridge abnormity monitoring method, system, equipment and storage medium based on SCA-GRU
CN114066262A (en) Method, system and device for estimating cause-tracing reasoning of abnormal indexes after power grid dispatching and storage medium
CN115204536A (en) Building equipment fault prediction method, device, equipment and storage medium
CN114970926A (en) Model training method, enterprise operation risk prediction method and device
CN114037140A (en) Prediction model training method, prediction model training device, prediction model data prediction method, prediction model data prediction device, prediction model data prediction equipment and storage medium
CN114781473A (en) Method, device and equipment for predicting state of rail transit equipment and storage medium
CN116739376A (en) Highway pavement preventive maintenance decision method based on data mining
CN110110339A (en) Day-ahead hydrologic forecast error calibration method and system
CN111145535B (en) Travel time reliability distribution prediction method under complex scene
CN109635008B (en) Equipment fault detection method based on machine learning
CN116756825A (en) Group structural performance prediction system for middle-small span bridge
CN107590747A (en) Power grid asset turnover rate computational methods based on the analysis of comprehensive energy big data
CN116149895A (en) Big data cluster performance prediction method and device and computer equipment
CN115858606A (en) Method, device and equipment for detecting abnormity of time series data and storage medium
CN106681791A (en) Incremental virtual machine anomaly detection method based on symmetric neighbor relation
CN112395167A (en) Operation fault prediction method and device and electronic equipment
CN116627093B (en) Nitrile glove processing control method, system, equipment and storage medium
CN116957361B (en) Ship task system health state detection method based on virtual-real combination
CN115190038B (en) State determination method and device
WO2022059183A1 (en) Information processing device, information processing method, and information processing program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination