CN112699172A - Data processing method and device for railway vehicle

Info

Publication number: CN112699172A
Application number: CN202110013882.5A
Authority: CN
Inventors: 梁建英; 范龙庆; 高世萍; 尚永涛; 于伟凯; 王斌儒
Original assignee: CRRC Qingdao Sifang Co Ltd
Current assignee: CRRC Qingdao Sifang Co Ltd
Priority date: 2021-01-06
Filing date: 2021-01-06
Publication date: 2021-04-23

Abstract

The application provides a data processing method and device for a rail vehicle, applied to a Docker container. A plurality of operating data of the rail vehicle can be acquired from a target storage location; the operating data is parsed with multiple threads based on a preset parsing rule to obtain corresponding parsed data; the parsed data is stored asynchronously so as to be written into a MongoDB database; and a preset data processing model processes the target data that it requires from the MongoDB database to obtain a processing result corresponding to the target data. That is, adopting the distributed column-type database MongoDB and multithreading technology solves the problem of rapidly storing and reading multi-field, large-batch, large-volume data; adopting Docker container technology and a micro-service architecture solves the problems of system resource isolation and scheduling, and realizes multi-threaded data parsing, asynchronous access, and automatic operation of the diagnosis model; and packaging the Docker container and binding it to the CPU serial number of the workstation realizes data encryption and guarantees data security.

Description

Data processing method and device for railway vehicle

Technical Field

The present application relates to the field of trains, and in particular, to a data processing method and apparatus for a rail vehicle.

Background

During the operation of rail vehicles, the rail vehicle fault prediction and health management system (PHM system) has become an important technical means of ensuring safe train operation. By receiving train operating state data and applying technologies such as big data and artificial intelligence, the PHM system comprehensively analyzes train operating data, maintenance data, and environmental data, thereby realizing train state monitoring, fault diagnosis, fault prediction and early warning, and health assessment.

The traditional PHM system is built on a big-data distributed cluster and requires a large hardware investment. If such a cluster is used to build a PHM system for only a small number of rail vehicles, the investment cost is high and resources are easily wasted.

Meanwhile, evaluating the operating state of a small number of rail vehicles usually has to meet a timeliness requirement. Taking business vehicles as an example, one day of operating data from two trains must be processed within 1.5 hours. However, rail vehicle operating state data comes in various formats, such as SDR, MVB, DBT, and SBV, and its volume is usually large, so the parsing and storage process is difficult to complete on time with conventional relational database (RDBMS) storage technology. For example, SDR data is recorded at about 2000 fields every 100 milliseconds, MVB data at about 6000 fields every 200 milliseconds, and each train generates 3.6 GB of raw data per day (calculated on 16 hours of operation).
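
As a rough illustration of the load implied by these figures, the following sketch works through the arithmetic. It uses only the numbers quoted above; the variable names are illustrative, decimal units (1 GB = 1000 MB) are an assumption, and the field counts cover the SDR and MVB streams only.

```python
# Back-of-envelope figures derived from the numbers quoted in the text.
SDR_FIELDS, SDR_PERIOD_MS = 2000, 100   # about 2000 fields every 100 ms
MVB_FIELDS, MVB_PERIOD_MS = 6000, 200   # about 6000 fields every 200 ms
HOURS_PER_DAY = 16                      # daily operating time used in the text
RAW_GB_PER_TRAIN = 3.6                  # raw data per train per day
TRAINS = 2                              # two trains per day
DEADLINE_HOURS = 1.5                    # processing deadline

seconds_per_day = HOURS_PER_DAY * 3600

# Field values recorded per second for each stream.
sdr_fields_per_s = SDR_FIELDS * 1000 // SDR_PERIOD_MS
mvb_fields_per_s = MVB_FIELDS * 1000 // MVB_PERIOD_MS

# Field values per train per day for the SDR and MVB streams combined.
field_values_per_day = (sdr_fields_per_s + mvb_fields_per_s) * seconds_per_day

# Sustained throughput needed to finish both trains within the deadline.
# Assumption: 1 GB = 1000 MB.
required_mb_per_s = TRAINS * RAW_GB_PER_TRAIN * 1000 / (DEADLINE_HOURS * 3600)

print(sdr_fields_per_s, mvb_fields_per_s, field_values_per_day)
print(round(required_mb_per_s, 2))
```

Under these assumptions the two streams together produce about 2.88 billion field values per train per day, and parsing plus storage must sustain roughly 1.33 MB of raw input per second to meet the 1.5-hour limit.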

Therefore, how to build a PHM system suitable for a small number of rail vehicles to rapidly store and process a large amount of operation data of the small number of rail vehicles is a technical problem to be solved in the field.

Disclosure of Invention

To solve the above technical problem, the application provides a data processing method and device for a rail vehicle, with which a PHM system suitable for a small number of rail vehicles can be built, large volumes of data from those vehicles can be stored and processed rapidly, and the timeliness requirement of rail vehicle operating state monitoring can be met.

In order to achieve the purpose, the technical scheme is as follows:

In one aspect, an embodiment of the present application provides a method for data processing of a rail vehicle, where the method includes:

obtaining a plurality of operating data of the rail vehicle from a target storage location;

performing multi-thread analysis on the plurality of running data based on a preset analysis rule to obtain corresponding analysis data;

asynchronously storing the analysis data to write into a MongoDB database;

and processing target data required by the data processing model in the MongoDB database by using a preset data processing model to obtain a processing result corresponding to the target data so as to obtain the running state of the railway vehicle.

Optionally, before asynchronously storing the parsed data, the method further includes:

based on a preset processing rule, performing data preprocessing on part of data in the analysis data; the data preprocessing comprises at least one of field compression, value conversion, function calculation and label calculation.

Optionally, the method further includes:

and responding to the data query request, and acquiring data or a processing result corresponding to the data query request from the MongoDB database.

Optionally, the number of threads for analyzing the plurality of running data is determined according to the calculation resources and/or the memory resources of the Docker container; and the thread number for asynchronously storing the analysis data is determined according to at least one of the connection number of the MongoDB database, the calculation resource and the memory resource of the Docker container.

Optionally, the format of the operation data includes at least one of the following unstructured data types: MVB, SBR, DBT, SBV, and the analysis data is JsonObject data object.

In another aspect, an embodiment of the present application provides an apparatus for data processing of a rail vehicle, where the apparatus is applied to a Docker container, and the apparatus includes:

an operation data acquisition unit for acquiring a plurality of operation data of the rail vehicle from the target storage location;

the analysis unit is used for carrying out multi-thread analysis on the plurality of running data based on a preset analysis rule to obtain corresponding analysis data;

the storage unit is used for asynchronously storing the analysis data so as to write the analysis data into a MongoDB database;

and the processing unit is used for processing target data required by the data processing model in the MongoDB database by using a preset data processing model to obtain a processing result corresponding to the target data so as to acquire the running state of the railway vehicle.

Optionally, the apparatus further comprises:

the preprocessing unit is used for preprocessing part of data in the analysis data based on a preset processing rule before the analysis data is asynchronously stored; the data preprocessing comprises at least one of field compression, value conversion, function calculation and label calculation.

Optionally, the apparatus further comprises:

and the acquisition unit is used for responding to the data query request and acquiring the data or the processing result corresponding to the data query request from the MongoDB database.

According to the above technical solution, the data processing method and device provided by the embodiments of the application are applied to a Docker container. A plurality of operating data of a rail vehicle can be acquired from a target storage location; the operating data is parsed with multiple threads based on a preset parsing rule to obtain corresponding parsed data; the parsed data is stored asynchronously so as to be written into a MongoDB database; and a preset data processing model processes the target data that it requires from the MongoDB database to obtain a processing result corresponding to the target data, thereby obtaining the operating state of the rail vehicle. That is, adopting the distributed column-type database MongoDB and multithreading technology solves the problem of rapidly storing and reading multi-field, large-batch, large-volume data; adopting Docker container technology and a micro-service architecture solves the problems of system resource isolation and scheduling, and realizes multi-threaded data parsing, asynchronous access, and automatic operation of the diagnosis model; and packaging the Docker container and binding it to the CPU serial number of the workstation realizes data encryption and guarantees data security.

Drawings

To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a data processing method for a rail vehicle according to an embodiment of the present disclosure;

Fig. 2 is a schematic diagram of a data processing device of a rail vehicle according to an embodiment of the present application.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanying the present application are described in detail below with reference to the accompanying drawings.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein, and it will be apparent to those of ordinary skill in the art that the present application is not limited by the specific embodiments disclosed below.

As described in the background, the operating data of a rail vehicle needs to be processed during its operation in order to perform PHM on the vehicle. Evaluating the operating state of the rail vehicle usually has to meet a timeliness requirement: taking business vehicles as an example, one day of operating data from two trains must be processed within 1.5 hours. However, offline data analysis of rail vehicles involves data in various formats, such as SDR, MVB, DBT, and SBV, whose volume is usually large, so the parsing and storage process is difficult to complete on time with conventional relational database (RDBMS) storage technology. For example, SDR data is recorded at about 2000 fields every 100 milliseconds, MVB data at about 6000 fields every 200 milliseconds, and each train generates 3.6 GB of raw data per day (calculated on 16 hours of operation).

Rapidly storing and processing large volumes of rail vehicle operating data is a technical problem to be solved in the art. At present, PHM systems for rail vehicles are built on distributed computing clusters. However, building a PHM system on a distributed computing cluster for only a small number of rail vehicles incurs a high investment cost and easily wastes resources. Meanwhile, current service stations run Windows, so, in consideration of the working habits of the operation and maintenance personnel at the service stations, a stand-alone PHM system needs to be built and deployed under the Windows operating system, where direct deployment consumes a large amount of host resources. In addition, the aggregation query efficiency of a relational database is very low and cannot meet the statistical query requirements of the business.

To solve the above problems, the embodiments of the present application provide a data processing method and device for a rail vehicle, applied to a Docker container. A plurality of operating data of the rail vehicle can be acquired from a target storage location; the operating data is parsed with multiple threads based on a preset parsing rule to obtain corresponding parsed data; the parsed data is stored asynchronously so as to be written into a MongoDB database; and a preset data processing model processes the target data that it requires from the MongoDB database to obtain a processing result corresponding to the target data, thereby obtaining the operating state of the rail vehicle. That is, adopting the distributed column-type database MongoDB and multithreading technology solves the problem of rapidly storing and reading multi-field, large-batch, large-volume data; adopting Docker container technology and a micro-service architecture solves the problems of system resource isolation and scheduling, and realizes multi-threaded data parsing, asynchronous access, and automatic operation of the diagnosis model; and packaging the Docker container and binding it to the CPU serial number of the workstation realizes data encryption and guarantees data security.

Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.

Exemplary method

Referring to Fig. 1, which is a flowchart of a data processing method for a rail vehicle according to an embodiment of the present application, the method may be applied to a Docker container and includes the following steps:

S101, acquiring a plurality of operating data of the rail vehicle from a target storage location.

At present, due to the particularity of the operation tasks undertaken by some rail vehicles (such as a business vehicle), the operation data cannot be transmitted back in real time, and the operation states of the rail vehicles cannot be monitored and evaluated in real time.

Therefore, in the embodiment of the application, maintenance personnel download the operating data from the rail vehicle using an encrypted USB flash drive and store it at the target storage location, which ensures that the operating data is copied quickly and securely. The format of the operating data of the rail vehicle may comprise at least one of the following unstructured data types: MVB, SBR, DBT, SBV. The operating data comprises approximately 160 data files per day.

The target storage location may be a storage location corresponding to the Docker container. The Docker container may run on a stand-alone PC server, and the target storage location may be a designated directory on that server. The PC server may run the Windows operating system, so that staff can operate the system without changing their working habits, which improves ease of operation. Running on the stand-alone PC server, the Docker container can encapsulate code and data and can be bound to the serial number of the server, thereby realizing data encryption and guaranteeing data security. The Docker container can load the image file in the Windows environment, so automatic deployment and operation are achieved without a second round of environment setup, which improves convenience.

An acquisition program may be provided in the Docker container. After the PC server host starts, the program automatically and continuously scans the target storage location and acquires operating data from it. Specifically, a thread in the Docker container may scan the target storage location and cache the directory listing of the operating data in the cache controller; the operating data is then read asynchronously into the cache according to that listing. The asynchronous reading may use a Spring-Boot micro-service program architecture and may use a bisection (binary search) algorithm. When the asynchronous reading completes, a log of the read time is recorded in the Docker log container.
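
By way of non-limiting illustration, the scan-then-read step might be sketched as follows. The sketch uses a Python thread pool as a stand-in for the Spring-Boot micro-service described above; the function names `scan_directory` and `read_into_cache`, the file names, and the worker count are hypothetical, and the bisection step and the read-time logging are omitted.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def scan_directory(target_dir: Path) -> list[Path]:
    """Return the sorted listing of regular files in the target storage location."""
    return sorted(p for p in target_dir.iterdir() if p.is_file())


def read_into_cache(paths: list[Path], max_workers: int = 4) -> dict[str, bytes]:
    """Read every listed file on a worker thread and cache its bytes by file name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        contents = list(pool.map(Path.read_bytes, paths))
    return {path.name: data for path, data in zip(paths, contents)}


# Usage with a throwaway directory standing in for the target storage location.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "run_001.mvb").write_bytes(b"\x01\x02")
    (root / "run_002.sbv").write_bytes(b"\x03")
    listing = scan_directory(root)      # directory listing, cached first
    cache = read_into_cache(listing)    # file contents, read concurrently
```

Separating the listing from the read keeps the scanning thread cheap, while the reads overlap their I/O waits on the pool threads.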

S102, performing multi-threaded parsing of the plurality of operating data based on a preset parsing rule to obtain corresponding parsed data.

In the embodiment of the application, the acquired operating data is parsed with multiple threads according to a parsing rule pre-configured through algorithm rules, to obtain the corresponding parsed data. The preset parsing rule may be an XML file, and the XML file may be configured in the cache. Parsing the operating data under the preset parsing rule improves parsing efficiency and accuracy. For example, for binary (0/1) data values, the pre-configured algorithm maps each item of operating data one-to-one to its parsing function, so the parsing method is determined quickly and there is no need to traverse the parsing functions looking for a match. This improves parsing efficiency, and the improvement can be especially large when the rail vehicle data includes the parameters of the parsing functions.
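
The idea of looking up the parsing function directly from a pre-configured rule, rather than traversing candidates, might be sketched as follows. This is a minimal illustration only: the XML schema, the field names, the parser names `uint` and `flag`, and the byte layout are all hypothetical and are not taken from the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file: each <field> gives a byte offset, a length and a parser name.
RULES_XML = """
<rules format="MVB">
  <field name="speed" offset="0" length="2" parser="uint"/>
  <field name="door_closed" offset="2" length="1" parser="flag"/>
</rules>
"""

# Parsing functions are found by a direct dictionary lookup on the parser name,
# so no list of candidate functions has to be traversed.
PARSERS = {
    "uint": lambda raw: int.from_bytes(raw, "big"),
    "flag": lambda raw: raw != b"\x00",
}


def load_rules(xml_text: str) -> list[tuple[str, int, int, str]]:
    """Turn the XML rule text into (name, offset, length, parser) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (f.get("name"), int(f.get("offset")), int(f.get("length")), f.get("parser"))
        for f in root.iter("field")
    ]


def parse_record(record: bytes, rules: list[tuple[str, int, int, str]]) -> dict:
    """Apply each rule to its byte slice of one record and return a JSON-style dict."""
    return {
        name: PARSERS[parser](record[offset:offset + length])
        for name, offset, length, parser in rules
    }


rules = load_rules(RULES_XML)
parsed = parse_record(bytes([0x01, 0x2C, 0x01]), rules)
```

Because the rules are data rather than code, a new field can be supported by editing the rule file alone, which matches the configurability the text attributes to the XML rules.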

The operating data may be parsed with multiple threads: the threads parse the operating data concurrently according to the preset parsing rule without extra waiting time, which can greatly improve parsing efficiency and accuracy. Using multithreading to parse the operating data also makes the fullest use of hardware resources, ensuring that no resources sit idle and that the potential of the hardware is fully realized. The number of threads for parsing the plurality of operating data may be determined according to the computing resources and/or the memory resources of the Docker container. The computing resources include the number of CPU cores, and the memory resources include the cache.

The operating data may be read into the cache in advance, so that parsing works on the operating data in the cache. Once the number of parsing threads has been determined, threads may be allocated according to the file names of the operating data, so that multiple threads parse the operating data and produce the parsed data corresponding to the plurality of operating data. The parsed data may take the form of a JsonObject data object, and the parsing may handle large data files and multiple fields.
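
One possible way to derive the thread count from the container's resources and to allocate files to threads by file name is sketched below. The application does not specify the formula or the allocation scheme, so the `min` rule, the per-task memory figure, and the CRC32-based assignment are assumptions made purely for illustration.

```python
import zlib


def parser_thread_count(cpu_cores: int, memory_bytes: int, bytes_per_task: int) -> int:
    """Bound the thread count by CPU cores and by how many tasks fit in memory."""
    by_memory = memory_bytes // bytes_per_task
    return max(1, min(cpu_cores, by_memory))


def assign_by_file_name(file_names: list[str], threads: int) -> dict[int, list[str]]:
    """Map each file name to a thread index using a stable CRC32 hash of the name."""
    buckets: dict[int, list[str]] = {i: [] for i in range(threads)}
    for name in file_names:
        buckets[zlib.crc32(name.encode("utf-8")) % threads].append(name)
    return buckets


# Hypothetical container limits: 8 cores, 2 GiB of memory, 512 MiB per parsing task.
threads = parser_thread_count(
    cpu_cores=8, memory_bytes=2 * 1024**3, bytes_per_task=512 * 1024**2
)
# About 160 files per day, as stated in the text; the names are made up.
names = [f"run_{i:03d}.mvb" for i in range(160)]
buckets = assign_by_file_name(names, threads)
```

A hash of the file name gives a repeatable assignment, so the same file always lands on the same thread index across runs.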

In the embodiment of the application, storing to and releasing from the cache may be handled by cache control logic, and starting and destroying threads may be handled by thread control logic, so that thread tasks are scheduled. The multi-threaded parsing of operating data may use a Spring-Boot micro-service program architecture, and a log may be generated after the operating data has been parsed.

S103, asynchronously storing the parsed data so as to write it into the MongoDB database.

In the embodiment of the application, the parsed data may be stored asynchronously so as to be written into the MongoDB database. The asynchronous storage may be executed with multiple threads: the threads store the parsed data concurrently without extra waiting time, which can greatly improve the efficiency and accuracy of data storage. Using multithreading to store the parsed data also makes the fullest use of hardware resources, ensuring that no resources sit idle and that the potential of the hardware is fully realized. The number of threads for parsing the plurality of operating data may be determined according to the computing resources of the Docker container, or according to both its computing resources and its memory resources. The computing resources include the number of CPU cores, and the memory resources include the cache. The number of threads for asynchronously storing the parsed data is determined according to at least one of the number of connections to the MongoDB database, the computing resources of the Docker container, and the memory resources of the Docker container.

As a possible implementation, the parsed data may be acquired and distributed to threads that can perform asynchronous storage. Each thread converts the parsed data into the file objects required by the MongoDB database using a pre-configured storage rule, and the generated file objects are written into the MongoDB database. Specifically, the JsonObject data object obtained after parsing may be monitored, the file objects required by the MongoDB database may be generated by iterating over the JsonObject data object, and the generated file objects may be written into the MongoDB database. The pre-configured storage rule may be an XML file and may be configured in the cache.
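
The asynchronous hand-off between parsing and storage might be sketched as a producer-consumer queue, as below. This is an illustration under stated assumptions: the helper names `start_writers` and `stop_writers`, the batch size, and the record fields are hypothetical, and an in-memory list stands in for the database so that the sketch runs without a MongoDB server. In a real deployment the `write_batch` callable could be, for example, PyMongo's `Collection.insert_many`.

```python
import queue
import threading

_STOP = object()  # sentinel telling a writer thread to finish


def start_writers(write_batch, workers: int, batch_size: int):
    """Start writer threads that drain a shared queue and write objects in batches."""
    pending: queue.Queue = queue.Queue()

    def run() -> None:
        batch = []
        while True:
            item = pending.get()
            if item is _STOP:
                break
            batch.append(dict(item))  # copy so the stored object is independent
            if len(batch) == batch_size:
                write_batch(batch)
                batch = []
        if batch:
            write_batch(batch)  # flush the final partial batch

    threads = [threading.Thread(target=run) for _ in range(workers)]
    for thread in threads:
        thread.start()
    return pending, threads


def stop_writers(pending: queue.Queue, threads: list) -> None:
    """Send one stop sentinel per writer and wait for all of them to finish."""
    for _ in threads:
        pending.put(_STOP)
    for thread in threads:
        thread.join()


# In-memory stand-in for the database collection (assumption for this sketch).
stored: list[dict] = []
stored_lock = threading.Lock()


def write_batch(batch: list[dict]) -> None:
    with stored_lock:
        stored.extend(batch)


pending, writer_threads = start_writers(write_batch, workers=2, batch_size=3)
for seq in range(10):
    # The parsing side keeps producing without waiting for the writes to finish.
    pending.put({"seq": seq, "speed": 100 + seq})
stop_writers(pending, writer_threads)
```

The number of writer threads would in practice be capped by the number of available database connections, in line with the thread-count rule described above.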

As another possible implementation, before the parsed data is stored asynchronously, data preprocessing may also be performed on part of the parsed data based on a pre-configured processing rule. Specifically, the parsed data may be acquired and distributed to threads that can perform preprocessing; each thread preprocesses the parsed data using the pre-configured processing rule, then converts it into the file objects required by the MongoDB database and writes the generated file objects into the MongoDB database. The pre-configured processing rule may be an XML file and may be configured in the cache. The data preprocessing may include at least one of field compression, value conversion, function calculation, and label calculation. Preprocessed data is generally smaller, which makes rapid storage easier to achieve.
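
The four kinds of preprocessing named above might look as follows for a single parsed record. The field names, the scale and offset of the temperature conversion, and the speed threshold are hypothetical, and the rules are hard-coded here for brevity rather than loaded from an XML file as the text describes.

```python
def preprocess(record: dict) -> dict:
    """Apply the four kinds of preprocessing named in the text to one parsed record."""
    out = {}
    # Field compression: pack the individual door flags into one integer bitmask.
    out["doors"] = sum(bit << i for i, bit in enumerate(record["door_flags"]))
    # Value conversion: raw counts to degrees Celsius (scale and offset are hypothetical).
    out["axle_temp_c"] = [raw / 10 - 40 for raw in record["axle_temp_raw"]]
    # Function calculation: derive the mean axle temperature.
    out["axle_temp_mean_c"] = sum(out["axle_temp_c"]) / len(out["axle_temp_c"])
    # Label calculation: tag the record with a coarse speed band (threshold is hypothetical).
    out["speed_band"] = "high" if record["speed_kmh"] >= 200 else "low"
    return out


sample = {
    "door_flags": [1, 0, 1, 0, 0, 0, 0, 0],
    "axle_temp_raw": [900, 1000],
    "speed_kmh": 250,
}
compact = preprocess(sample)
```

In this example eight separate flag fields collapse into one integer, which illustrates why preprocessed records tend to be smaller and faster to store.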

The multi-threaded storage of parsed data may use a Spring-Boot micro-service program architecture, and a log may be generated after the parsed data has been stored. The MongoDB database is based on distributed file storage and can improve data query efficiency.

S104, processing the target data required by a preset data processing model in the MongoDB database by using that model, to obtain a processing result corresponding to the target data and thereby the operating state of the rail vehicle.

In the embodiment of the present application, the data processing model may be built in the Python language, and XML may be used to configure the algorithm rules. There may be a plurality of data processing models, such as a bogie axle temperature model, an air conditioner state model, a motor state model, and other working-condition mechanism models, to realize functions such as axle temperature mean-value early warning, air conditioner fault alerts, and motor fault alerts.

The target data is the data on which each model executes. One model may call multiple items of data, but normally it does not call all the data in the MongoDB database. The data processing model performs calculations on the target data corresponding to that model to obtain the corresponding processing result, and the processing result is stored in the prediction result table of the MongoDB database, thereby obtaining the operating state of the rail vehicle. The operating state may be a normal or abnormal result, so that the operating state is monitored, and an analysis report may be generated once the operating state of the rail vehicle has been obtained.
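
As a non-limiting illustration of such a model, an axle temperature mean-value early warning might be sketched as follows. The function name, the record layout, the result fields, and the 90 degree limit are hypothetical; the application does not disclose the actual model logic or thresholds.

```python
from statistics import mean


def axle_temperature_model(target_data: list[dict], limit_c: float = 90.0) -> dict:
    """Flag the state as abnormal when the mean axle temperature exceeds the limit."""
    temps = [row["axle_temp_c"] for row in target_data]
    avg = mean(temps)
    return {
        "model": "bogie_axle_temperature",
        "mean_c": avg,
        "status": "abnormal" if avg > limit_c else "normal",
    }


# Target data: only the fields this model needs, not the whole database.
hot = axle_temperature_model(
    [{"axle_temp_c": 80.0}, {"axle_temp_c": 85.0}, {"axle_temp_c": 120.0}]
)
cool = axle_temperature_model([{"axle_temp_c": 70.0}, {"axle_temp_c": 80.0}])
```

The returned dictionary has the shape of a record that could be written to a prediction result table, with the status field carrying the normal or abnormal outcome.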

In addition, when service personnel need to query or analyze the data or the processing results, the data or processing result corresponding to a data query request may be obtained from the MongoDB database in response to that request and displayed using the web application server. This provides business personnel with data display, data analysis, and data mining functions and improves the user experience. To ensure that the background data processing program does not occupy too much CPU and memory while the front-end service is in use, the web application server may be deployed in a Docker container.

The target data includes bogie axle temperature data corresponding to a bogie axle temperature model, air conditioner operation data corresponding to an air conditioner state model, motor operation data corresponding to a motor state model, and the like.

The target data can be processed by using a Spring-Boot micro-service program architecture, and a log can be generated after the target data is processed.

In addition, the pre-configured parsing rule, storage rule, processing rule, and the like allow secondary development through data configuration, which improves the flexibility of data processing. The micro-service programs, the web application server, and MongoDB all use Docker containers to isolate computing resources and memory, ensuring that each component runs independently. Moreover, the Docker container and each program in it may be added to the Windows startup items so that they start automatically after the host boots, which reduces the difficulty of use.

The embodiment of the application provides a data processing method of a railway vehicle, which is applied to a Docker container, can acquire a plurality of running data of the railway vehicle from a target storage position, performs multi-thread analysis on the running data based on a preset analysis rule to obtain corresponding analysis data, asynchronously stores the analysis data to write the analysis data into a MongoDB database, and processes the target data required by a data processing model in the MongoDB database by utilizing a preset data processing model to obtain a processing result corresponding to the target data so as to acquire the running state of the railway vehicle. That is, the distributed column-type database MongoDB and the multithreading technology are adopted, so that the problems of rapid storage and reading of multi-field, large-batch and large-volume data are solved; by adopting a Docker container technology and a micro-service architecture, the problems of system resource isolation and scheduling are solved, and data multithread analysis, asynchronous access and automatic operation of a diagnosis model are realized; and a method of packaging a Docker container and binding the Docker container with a CPU serial number of a workstation is adopted, so that data encryption is realized, and data security is guaranteed.

Exemplary devices

Referring to Fig. 2, a schematic diagram of a data processing device of a rail vehicle according to an embodiment of the present application is provided. The data processing device of the rail vehicle may comprise:

an operation data acquisition unit 201 for acquiring a plurality of operation data of the rail vehicle from the target storage location;

the analysis unit 202 is configured to perform multi-thread analysis on the multiple pieces of operating data based on a preconfigured analysis rule to obtain corresponding analysis data;

the storage unit 203 is used for asynchronously storing the analysis data so as to write the analysis data into a MongoDB database;

the processing unit 204 is configured to process target data required by the data processing model in the MongoDB database by using a preset data processing model, and obtain a processing result corresponding to the target data, so as to obtain an operating state of the rail vehicle.

In some embodiments, the apparatus further comprises:

In some embodiments, the number of threads for analyzing the plurality of operation data is determined according to the calculation resources and/or the memory resources of the Docker container; and the thread number for asynchronously storing the analysis data is determined according to at least one of the connection number of the MongoDB database, the calculation resource and the memory resource of the Docker container.

In some embodiments, the format of the operational data includes at least one of the following unstructured data types: MVB, SBR, DBT, SBV, and the analysis data is JsonObject data object.

The embodiment of the application provides a data processing device of a railway vehicle, which is applied to a Docker container, can acquire a plurality of running data of the railway vehicle from a target storage position, performs multi-thread analysis on the running data based on a preset analysis rule to obtain corresponding analysis data, asynchronously stores the analysis data to write the analysis data into a MongoDB database, and processes the target data required by a data processing model in the MongoDB database by utilizing a preset data processing model to obtain a processing result corresponding to the target data so as to acquire the running state of the railway vehicle. That is, the distributed column-type database MongoDB and the multithreading technology are adopted, so that the problems of rapid storage and reading of multi-field, large-batch and large-volume data are solved; by adopting a Docker container technology and a micro-service architecture, the problems of system resource isolation and scheduling are solved, and data multithread analysis, asynchronous access and automatic operation of a diagnosis model are realized; and a method of packaging a Docker container and binding the Docker container with a CPU serial number of a workstation is adopted, so that data encryption is realized, and data security is guaranteed.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points.

The foregoing is merely a preferred embodiment of the present application and, although the present application discloses the foregoing preferred embodiments, the present application is not limited thereto. Those skilled in the art can now make numerous possible variations and modifications to the disclosed embodiments, or modify equivalent embodiments, using the methods and techniques disclosed above, without departing from the scope of the claimed embodiments. Therefore, any simple modification, equivalent change and modification made to the above embodiments according to the technical essence of the present application still fall within the protection scope of the technical solution of the present application without departing from the content of the technical solution of the present application.

Claims

1. A data processing method of a railway vehicle is applied to a Docker container and comprises the following steps:

asynchronously storing the analysis data to write into a MongoDB database;

2. The method of claim 1, further comprising, prior to asynchronously storing the parsed data:

3. The method of claim 1, further comprising:

4. The method according to any one of claims 1 to 3, wherein the number of threads for parsing the plurality of operation data is determined according to computing resources and/or memory resources of the Docker container; and the thread number for asynchronously storing the analysis data is determined according to at least one of the connection number of the MongoDB database, the calculation resource and the memory resource of the Docker container.

5. The method of any of claims 1-3, wherein the format of the operational data comprises at least one of the following unstructured data types: MVB, SBR, DBT, SBV, and the analysis data is JsonObject data object.

6. A data processing device of a railway vehicle is applied to a Docker container and comprises the following components:

7. The apparatus of claim 6, further comprising:

8. The apparatus of claim 6, further comprising:

9. The apparatus according to any one of claims 6 to 8, wherein the number of threads for parsing the plurality of operation data is determined according to a computing resource and/or a memory resource of the Docker container; and the thread number for asynchronously storing the analysis data is determined according to at least one of the connection number of the MongoDB database, the calculation resource and the memory resource of the Docker container.

10. The apparatus of any of claims 6-8, wherein the format of the operational data comprises at least one of the following unstructured data types: MVB, SBR, DBT, SBV, and the analysis data is JsonObject data object.