CN113807704A - Intelligent algorithm platform construction method for urban rail transit data - Google Patents

Publication number
CN113807704A
Authority
CN
China
Prior art keywords
algorithm
model
rail transit
data
urban rail
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111101607.5A
Other languages
Chinese (zh)
Inventor
刘占英
李峰
张振义
陈瑞军
孟伟君
刘芽
楚研
王彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohhot Urban Transportation Investment And Construction Group Co ltd
Original Assignee
Hohhot Urban Transportation Investment And Construction Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohhot Urban Transportation Investment And Construction Group Co ltd
Priority to CN202111101607.5A
Publication of CN113807704A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/32 Indexing scheme for image data processing or generation, in general involving image mosaicing

Abstract

The invention discloses a method for constructing an intelligent algorithm platform for urban rail transit data. The platform provides an algorithm model service that comprises: S1, acquiring the urban rail transit data to be predicted and the corresponding fields, and performing auxiliary calibration and cleaning on the data; S2, performing feature engineering on the cleaned data to obtain a feature training set, and training multiple algorithm models on that set; S3, extracting from the algorithm platform the algorithm models related to the corresponding fields to form a set, and loading the set to obtain an algorithm integration model; and S4, inputting the cleaned data into the algorithm integration model to obtain prediction result data. Common algorithms are preset in the system, so users can select an algorithm and run training directly without attending to model development.

Description

Intelligent algorithm platform construction method for urban rail transit data
Technical Field
The invention relates to the field of public transportation, in particular to an intelligent algorithm platform construction method for urban rail transit data.
Background
The urban rail transit industry is developing rapidly and its data volume keeps growing; data processing has moved from handling single data sets, then multiple data sets, to today's big data processing.
With the rise of big data technology, data offers ever greater advantages and wider influence, and extracting valuable information from massive data has become the core work of algorithm engineers. In practice, however, engineering requirements mean that a separate engineering project must be set up for each project and a separate computing environment provided for each algorithm. As a result, algorithm engineers often have to put most of their effort into engineering rather than algorithms, repeat a great deal of work, and lose efficiency. How to separate algorithms from engineering, so that algorithm engineers can focus on the algorithms and avoid unnecessary repeated work, has become a difficult problem in this field.
Urban rail transit is an important link in the closed loop of a transportation platform, and working efficiency and user experience are the core competitiveness of transportation services. As passenger volume grows, lines are added and traffic scenarios become more complex, the algorithms used in rail transit face increasing challenges in being faster (algorithms must iterate and go online quickly), better (the business relies more and more on machine learning algorithms to produce positive effects) and more accurate (predictions such as passenger flow must be close to the true values).
Disclosure of Invention
The invention aims to provide an intelligent algorithm platform construction method for urban rail transit data that offers urban rail transit users prediction services based on a variety of machine learning algorithm models. The data to be predicted can come directly from other platforms in the urban rail transit system, and the prediction results can be applied directly to those platforms, which makes the service convenient to use. The method addresses the problem that algorithm engineers currently have to carry out tedious engineering development and cannot focus their limited energy on iterating the algorithm strategy.
An intelligent algorithm platform construction method for urban rail transit data, wherein the intelligent algorithm platform provides an algorithm model service, and the algorithm model service comprises:
S1, acquiring the urban rail transit data to be predicted and the corresponding fields, and performing auxiliary calibration and cleaning on the data to obtain cleaned urban rail transit data and corresponding fields;
S2, performing feature engineering on the cleaned urban rail transit data to obtain a feature training set, and training multiple algorithm models on the feature training set, wherein the feature engineering comprises statistical feature engineering, graph feature engineering and depth feature engineering;
S3, extracting from the algorithm platform the algorithm models related to the corresponding fields to form a set, and performing algorithm loading on the set, wherein algorithm loading means loading and matching the algorithm models in a specified order to form an algorithm integration model;
and S4, inputting the cleaned urban rail transit data into the algorithm integration model to obtain prediction result data.
Further, the algorithm integration model comprises multi-level processing algorithm models, and the levels are processed in priority order: priority of the first-level processing algorithm model > … > priority of the N-th-level processing algorithm model, where N is an integer greater than or equal to 1.
Further, each level of processing algorithm model comprises a plurality of optimal algorithm models; each optimal algorithm model is obtained by calling a pre-trained evaluation model to evaluate the algorithm models in the corresponding algorithm model class and select the best one; an algorithm model class is a group of algorithm models that process data with different methods to obtain the same or substantially the same result; the processing includes dimensionality reduction, association, clustering, classification and regression; and the evaluation model evaluates the ability of an algorithm model to process and predict the data.
Further, when the processing is classification, the evaluation (discriminant) model performs the following steps:
S301, acquiring a test data set corresponding to the urban rail transit data to be predicted from a preset sample database;
S302, inputting the test data set into the algorithm models in the algorithm model class to obtain a plurality of test result sets, building a confusion matrix for each test result set, and calculating the corresponding evaluation indexes from the confusion matrix, wherein the evaluation indexes comprise accuracy, precision, recall, ROC curve, AUC and F1 score (the harmonic mean of precision and recall);
and S303, taking the algorithm model with the best evaluation index as the optimal algorithm model.
Further, there are at least two optimal algorithm models, and they are connected in parallel or in series to form the processing of the corresponding level: the optimal algorithm models of the first level are connected in parallel or in series to form first-level processing; the optimal algorithm models of the N-th level are connected in parallel or in series to form N-th-level processing, where N is an integer greater than or equal to 1.
Further, the algorithm platform comprises a 3D model and image class, an operation and maintenance performance verification class, and a driving index verification class:
the 3D model and image class includes: an image stitching algorithm, an OCR algorithm and GAN-based image generation;
the operation and maintenance performance verification class includes: a model drift and update algorithm, an index measurement algorithm and an anomaly detection algorithm;
the driving index verification class includes: a time series classification prediction algorithm, a time series regression prediction algorithm, a multi-objective optimization algorithm and a clustering algorithm.
Further, the urban rail transit data is the inbound or outbound passenger flow of urban rail transit at time t;
the corresponding field is: the value of the inbound or outbound passenger flow of urban rail transit in time window t+1;
the algorithm set comprises: first-level processing by a clustering algorithm model and second-level processing by a long short-term memory (LSTM) neural network model;
the clustering model is applied to the spatial distribution to extract the line features, station features and section passenger flow features of different subway stations, and these three are the spatial features; the clustering model is applied to the temporal distribution to extract the daily passenger flow distribution over one week and, after each day is divided into several time periods, the passenger flow distribution of each period, and these two are the temporal features;
the urban rail transit passenger flow data, combined with the temporal and spatial features, is input into the LSTM neural network model to obtain the predicted inbound or outbound passenger flow in time window t+1.
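As an illustration of how training samples for the second-level model might be prepared, the sketch below pairs a run of past passenger flow counts with the count in the next time window. It is a minimal sketch, not the patented implementation: the function name make_windows, the history parameter and the sample counts are hypothetical, and the clustering and LSTM models themselves are not shown.

```python
from typing import List, Tuple


def make_windows(flow: List[int], history: int) -> List[Tuple[List[int], int]]:
    """Pair each run of `history` counts with the count in the next window.

    Each pair is (inputs up to time t, target at time window t+1).
    """
    return [
        (flow[i:i + history], flow[i + history])
        for i in range(len(flow) - history)
    ]


# Hypothetical passenger counts for five consecutive time windows.
samples = make_windows([10, 12, 15, 11, 9], history=3)
```

Each sample could then be extended with the temporal and spatial features described above before being passed to a sequence model.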
Further, the urban rail transit data is images of rail transit vehicle inspection parts;
the corresponding field is: suspected-fault images of the components;
the algorithm set comprises: first-level processing by a classification algorithm model and second-level processing by a fault detection model;
the classification model processes the inspection part images to obtain labelled rail transit vehicle images classified by structure and function, and test samples of the inspection parts are input into the fault detection model to obtain a set of suspected-fault images.
Further, the urban rail transit data is rail transit integrated monitoring alarm data and parameter configuration;
the corresponding field is: alarm short messages;
the algorithm set comprises: first-level processing by a classification algorithm model, second-level processing by a purification algorithm model, and third-level processing by a decision algorithm model;
the collected alarm data, together with the equipment and parameter configuration, is input into the classification algorithm model, which groups the alarm data belonging to the same equipment or monitored object and passes the grouped data to the purification algorithm model; the purification algorithm model processes and refines the alarm data in order of level and time to produce concise, clean data, which is passed to the decision algorithm model; the decision algorithm model generates the short message content from the concise, clean data and finally produces the output message, which involves determining the alarm content, suggestions, handling measures, requested feedback and recipients.
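The three-level alarm flow above can be pictured with a small sketch. This is a simplified illustration under stated assumptions, not the patented models: the Alarm record, the rule that level 1 is the most severe, the duplicate-text filter used for purification, and the message format are all hypothetical, and the message contains only the alarm content (suggestions, measures, feedback requests and recipients are omitted).

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Alarm(NamedTuple):
    device: str
    level: int   # assumed convention: 1 is the most severe
    time: int    # seconds since an arbitrary start
    text: str


def classify(alarms: List[Alarm]) -> Dict[str, List[Alarm]]:
    """Level 1: group alarms that belong to the same device."""
    groups: Dict[str, List[Alarm]] = defaultdict(list)
    for alarm in alarms:
        groups[alarm.device].append(alarm)
    return dict(groups)


def purify(group: List[Alarm]) -> List[Alarm]:
    """Level 2: order by level then time, and drop repeated alarm texts."""
    seen, kept = set(), []
    for alarm in sorted(group, key=lambda a: (a.level, a.time)):
        if alarm.text not in seen:
            seen.add(alarm.text)
            kept.append(alarm)
    return kept


def decide(device: str, refined: List[Alarm]) -> str:
    """Level 3: build the message text from the most severe alarm."""
    top = refined[0]
    return f"[L{top.level}] {device}: {top.text} (+{len(refined) - 1} more)"


def build_messages(alarms: List[Alarm]) -> List[str]:
    """Run the three levels in order for every device."""
    return [decide(dev, purify(grp)) for dev, grp in classify(alarms).items()]
```

In this sketch each level is a plain function so that the order of processing is explicit; in the platform each level would be a model selected from the corresponding algorithm model class.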
An algorithm service device, comprising:
one or more memories for storing data,
one or more processors for executing programs,
and a plurality of modules stored in the memory and executed by the processor, the modules comprising:
a data receiving and calibration module, used for acquiring the urban rail transit data to be predicted and the corresponding fields, and performing auxiliary calibration and cleaning on the data to obtain cleaned urban rail transit data and corresponding fields;
a feature extraction module, used for performing feature engineering on the cleaned urban rail transit data to obtain a feature training set, and training multiple algorithm models on the feature training set, wherein the feature engineering comprises statistical feature engineering, graph feature engineering and depth feature engineering;
an algorithm extraction and loading module, used for extracting from the algorithm platform the algorithm models related to the corresponding fields to form a set, and performing algorithm loading on the set, wherein algorithm loading means loading and matching the algorithm models in a specified order to form an algorithm integration model;
and a prediction result return module, used for inputting the cleaned urban rail transit data into the algorithm integration model to obtain prediction result data.
Supervised learning on training samples with explicit labels or results remains the main way of training models, both in traditional machine learning and in deep learning. Deep learning in particular needs more data to improve model performance. Some large public data sets exist, such as ImageNet and COCO in the image field, and can serve as references. Most enterprise developers, however, need to customize AI model application services with actual business data from their own professional field so that the services fit the business well. Collecting and labelling business scenario data is therefore an essential step in practical AI model development.
By establishing an intelligent algorithm platform that provides integrated data access, cleaning and conversion, model construction and model application services, the invention makes data prediction processing more accurate, consistent, complete and efficient.
The intelligent algorithm platform comprises the algorithm model service and algorithm model training. Algorithm model training is aimed at back-end algorithm engineers: it provides various machine learning algorithms for users to select and use dynamically, automatically applies complex computation to the analysis of massive data, and covers the algorithm engineer's whole workflow of research, development, going online and evaluating algorithm effect: data processing, feature engineering, model training, model evaluation, model publishing, online prediction and effect evaluation.
The algorithm model service is aimed at front-end users and provides services based on machine learning algorithm models: it acquires the prediction data uploaded by a service user, calls the corresponding machine learning algorithm model according to the prediction request, executes the prediction service, and sends the prediction result to the service user.
The invention has the following beneficial effects:
1. A matching algorithm model is provided for each kind of urban rail transit data, and common algorithms are preset in the system, so an algorithm can be selected and trained directly without attending to model development.
2. Models are trained directly with the preset algorithms, without writing code. Common frameworks (such as TensorFlow and PyTorch) are supported, so users do not need to configure the algorithm framework themselves, which saves development cost.
3. The intelligent algorithm middle platform can carry a variety of algorithm capabilities. It greatly improves the efficiency of data science, algorithm modelling and training, model version control, model deployment and other steps in algorithm development and application, and lets those capabilities deliver their value within one framework.
Drawings
FIG. 1 is a schematic diagram of the service method of the algorithm model of the present invention;
FIG. 2 is a schematic service flow diagram of the algorithm model of the present invention;
FIG. 3 is a schematic diagram of the algorithm service device of the present invention;
FIG. 4 is a schematic diagram of the three toolboxes used for feature engineering in the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "longitudinal", "lateral", "horizontal", "inner", "outer", "front", "rear", "top", "bottom", and the like indicate orientations or positional relationships that are based on the orientations or positional relationships shown in the drawings, or that are conventionally placed when the product of the present invention is used, and are used only for convenience in describing and simplifying the description, but do not indicate or imply that the device or element referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus should not be construed as limiting the invention.
In the description of the present invention, it should also be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "open," "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The intelligent algorithm platform is a one-stop data modelling and analysis platform built on a cloud platform, big data computing services, a deep learning framework and the related software and hardware. Its functional modules can include data import and export, data preprocessing, statistical analysis, data mining, model development, model training, model release and model management.
The system architecture of the intelligent algorithm platform can be divided into the algorithm model service and algorithm model training. The algorithm model service provides a unified prediction service that users can manage conveniently, with basic management and scheduling functions such as creation, scale-out, scale-in and deletion.
Algorithm model training provides binary classification evaluation, multi-class classification evaluation, regression model evaluation and clustering model evaluation, which cover the evaluation of classification, regression and clustering algorithm models. Model services can run on CPUs and GPUs. PMML (Predictive Model Markup Language) models, TensorFlow models and custom models can be connected seamlessly. New models are released safely through mechanisms such as a deployment mechanism that splits traffic by percentage, version control and quick rollback, and an intelligent operation and maintenance monitoring chart is provided.
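Splitting traffic by percentage can be done in many ways; the sketch below shows one common approach, assumed here for illustration only, in which a hash of the request identifier decides whether a request goes to the new model version. The function name route, the version labels and the use of SHA-256 are hypothetical choices, not details from the patent.

```python
import hashlib


def route(request_id: str, new_percent: int) -> str:
    """Send roughly `new_percent` percent of requests to the new version.

    Hashing the request id keeps the decision stable for a given request.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100          # bucket in the range 0..99
    return "new" if bucket < new_percent else "stable"
```

Raising new_percent step by step from 0 to 100 moves traffic gradually to the new model, and setting it back to 0 acts as a quick rollback.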
Algorithm model training also includes a human-in-the-loop stage: a small number of samples is randomly drawn from the validation data set and labelled by a person, the labelled samples are used for training and fed back to the model, and the model is adjusted according to the feedback.
Algorithm model training also manages the types and versions of the running algorithms: a starting version and a list of available historical versions are configured in advance for each type of algorithm, the running state of the algorithm is monitored in real time, and if an abnormality occurs the algorithm is rolled back to the last stable version.
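A minimal sketch of version management with rollback follows. It rests on simplifying assumptions that are not in the patent: an abnormality is modelled as a Python exception raised by the active version, and the previous entry in the version list is treated as the last stable version. The class name VersionedAlgorithm is hypothetical.

```python
from typing import Callable, List


class VersionedAlgorithm:
    """Holds the versions of one algorithm type and tracks the active one."""

    def __init__(self, versions: List[Callable[[float], float]]):
        self.versions = versions            # ordered oldest to newest
        self.active = len(versions) - 1     # start on the newest version

    def predict(self, x: float) -> float:
        """Run the active version; on an exception roll back and retry."""
        while True:
            try:
                return self.versions[self.active](x)
            except Exception:
                if self.active == 0:
                    raise                   # no older version is left
                self.active -= 1            # roll back one version
```

A production system would more likely detect abnormalities from monitoring metrics than from exceptions, but the control flow is similar.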
The overall architecture of the algorithm model service can be roughly divided into two layers, an API layer and an execution layer.
API layer: the online prediction service has two types of API, a prediction service API and a prediction request API, and each is designed differently in the framework according to its characteristics and requirements. The prediction service API is responsible for creating, deploying, deleting and modifying prediction services. The prediction request API processes the prediction requests sent by clients and returns the prediction results.
Execution layer: all computing resources are managed through a cloud management application, and each node in its cluster can be a cloud server or a physical machine.
In one example, the algorithm platform comprises a prediction platform that lets users manage prediction services conveniently and provides basic management and scheduling functions such as creation, scale-out, scale-in and deletion. The prediction platform has a generic prediction runtime that provides a unified prediction API service. It supports the model formats of various frameworks, running on CPUs and GPUs, and acceleration of various models.
An intelligent algorithm platform construction method for urban rail transit data, wherein the intelligent algorithm platform provides an algorithm model service, and the algorithm model service comprises:
S1, acquiring the urban rail transit data to be predicted and the corresponding fields, and performing auxiliary calibration and cleaning on the data to obtain cleaned urban rail transit data and corresponding fields;
S2, performing feature engineering on the cleaned urban rail transit data to obtain a feature training set, and training multiple algorithm models on the feature training set, wherein the feature engineering comprises statistical feature engineering, graph feature engineering and depth feature engineering;
S3, extracting from the algorithm platform the algorithm models related to the corresponding fields to form a set, and performing algorithm loading on the set, wherein algorithm loading means loading and matching the algorithm models in a specified order to form an algorithm integration model;
and S4, inputting the cleaned urban rail transit data into the algorithm integration model to obtain prediction result data.
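The prediction path through steps S1, S3 and S4 can be sketched as follows. This is an illustrative outline under stated assumptions rather than the platform's actual code: REGISTRY, clean, load_ensemble and predict are hypothetical names, cleaning is reduced to dropping records with missing values, models are plain functions, and the training step S2 is not shown.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[float]]
Model = Callable[[List[Record]], List[Record]]

# Hypothetical registry: field name -> models stored in loading order (S3).
REGISTRY: Dict[str, List[Model]] = {}


def clean(records: List[Record]) -> List[Record]:
    """S1 (simplified): drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def load_ensemble(field: str) -> Model:
    """S3: load the models registered for a field, in their stored order."""
    models = REGISTRY[field]

    def ensemble(records: List[Record]) -> List[Record]:
        for model in models:            # the output of one model feeds the next
            records = model(records)
        return records

    return ensemble


def predict(records: List[Record], field: str) -> List[Record]:
    """S4: run the cleaned data through the integrated model."""
    return load_ensemble(field)(clean(records))
```

Keeping the registry keyed by field mirrors the idea that the corresponding field determines which models are extracted and in what order they are loaded.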
Auxiliary (assisted) calibration, also known as an active learning algorithm, typically consists of two modules:
the first module is an anomaly detection module. It finds the data in the urban rail transit data whose pattern is least similar to normal data and treats it as possible abnormal samples;
the second module is an anomaly retrieval module. It searches the urban rail transit data for fault samples with an efficient time series search algorithm and pushes the abnormal samples to operation and maintenance personnel for calibration.
This function addresses the large amount of manual effort spent on producing data labels during the development of intelligent rail transit applications: existing manual labelling is replaced by automatic labelling, which greatly improves development efficiency.
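The two modules can be illustrated with a small sketch. It is a simplified stand-in under stated assumptions, not the patented algorithm: the series is cut into fixed-size windows, "least similar to normal" is approximated by Euclidean distance to the element-wise median window, and retrieval is a brute-force nearest-neighbour search rather than an efficient time series index. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def windows(series: Sequence[float], size: int) -> List[List[float]]:
    """Split a series into consecutive non-overlapping windows."""
    return [list(series[i:i + size])
            for i in range(0, len(series) - size + 1, size)]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two windows of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_candidates(wins: List[List[float]], top_k: int) -> List[int]:
    """Module 1: rank windows by distance to the element-wise median window."""
    size = len(wins[0])
    median = [sorted(w[i] for w in wins)[len(wins) // 2] for i in range(size)]
    ranked = sorted(range(len(wins)),
                    key=lambda i: distance(wins[i], median), reverse=True)
    return ranked[:top_k]


def retrieve_similar(wins: List[List[float]], query: int, top_k: int) -> List[int]:
    """Module 2: find the windows closest to a confirmed fault window."""
    others = [i for i in range(len(wins)) if i != query]
    return sorted(others, key=lambda i: distance(wins[i], wins[query]))[:top_k]
```

Once operation and maintenance personnel confirm one candidate as a fault, retrieve_similar proposes further windows for them to calibrate, which is the active learning loop in miniature.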
Feature engineering
Because of the characteristics of the rail transit industry, its data is structured, highly coupled and strongly principle-driven, and applying traditional feature engineering methods produces a large amount of valuable data that other business systems can use directly. Feature engineering is therefore organised into the functional modules of three toolboxes: statistical features, graph features and depth features.
Feature construction means manually building new features from the raw data. Developers look in the raw data for features that carry physical meaning for rail transit or express business logic. New features are typically created by mixing or combining attributes, or by decomposing or slicing the original features.
Feature extraction works on the raw data and aims to construct new features automatically, converting the raw features into a set of variables with clear physical or statistical meaning, or into kernel features. For example, the number of distinct values of a feature in the raw data can be reduced by transforming the feature values.
Feature selection aims to select from the feature set the subset of features with the greatest statistical significance, thereby reducing dimensionality.
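A short sketch of feature construction and feature selection follows. It is illustrative only: the column names entries, exits and line, the constructed net_flow feature, and the use of a variance threshold as the selection criterion are assumptions made for the example, not details from the patent.

```python
from statistics import pvariance
from typing import Dict, List

Row = Dict[str, float]


def construct(rows: List[Row]) -> List[Row]:
    """Feature construction: combine raw attributes into a new feature."""
    return [{**r, "net_flow": r["entries"] - r["exits"]} for r in rows]


def select(rows: List[Row], min_variance: float) -> List[str]:
    """Feature selection: keep features whose variance exceeds a threshold."""
    names = list(rows[0].keys())
    return [n for n in names
            if pvariance([r[n] for r in rows]) > min_variance]
```

A constant column such as the line number of a single-line data set carries no information for prediction and is dropped by the variance filter, which reduces dimensionality as described above.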
Further, the algorithm integration model comprises a multi-stage processing algorithm model, and the multi-stage processing is processed according to a priority order: and setting the priority of a primary processing algorithm model to be more than … and more than the priority of an N-level processing algorithm model, wherein N is an integer more than or equal to 1.
Further, each level of processing algorithm model comprises a plurality of optimal algorithm models, each optimal algorithm model is obtained by calling a pre-trained evaluation model for optimal evaluation on a plurality of algorithm models in a corresponding algorithm model class, the algorithm model class is a plurality of algorithm models which are used for processing data by using different methods to obtain the same or substantially the same result, the processing comprises dimensionality reduction, association, clustering, classification and regression, and the evaluation model is used for evaluating the capability of the algorithm models for processing and predicting the data.
Further, when the processing is classification, the evaluation model specifically performs the following steps:
S301, acquiring a test data set corresponding to the urban rail transit data to be predicted from a preset sample database;
S302, inputting the test data set into the plurality of algorithm models in the algorithm model class to obtain a plurality of test result sets, establishing a confusion matrix for each test result set, and calculating the corresponding evaluation indexes from the confusion matrix, wherein the evaluation indexes comprise accuracy, precision, recall, ROC curve, AUC and F1 score (harmonic mean);
The F1 score is the harmonic mean of precision and recall. It lies closer to the smaller of the two, so the F value is largest when precision and recall are close to each other. Many recommendation systems use the F value as their evaluation index. It satisfies 2/F1 = 1/Precision + 1/Recall, i.e. F1 = 2 × Precision × Recall / (Precision + Recall), where Precision is the precision rate and Recall is the recall rate.
For the classification problem, the evaluation indexes used are: accuracy, precision (computed from the confusion matrix), recall, F-beta score, AUC and KS.
For the regression problem, the evaluation indexes used are: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE) and the coefficient of determination.
For the clustering problem, the evaluation indexes used are: Rand index, mutual information and silhouette coefficient.
S303, taking the algorithm model with the best evaluation index as the optimal algorithm model.
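Steps S301 to S303 can be sketched for the binary case as follows. This is a simplified illustration, not the patent's pre-trained evaluation model: it ranks candidates by F1 only, omits ROC and AUC, and the test data and the two threshold classifiers are hypothetical.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from the confusion matrix."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # 2/F1 = 1/Precision + 1/Recall, rearranged to avoid dividing by zero.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def select_optimal(models, x_test, y_test, key="f1"):
    """S302 and S303: score every candidate on the test set and keep the best one."""
    scores = {name: evaluate(y_test, [m(x) for x in x_test]) for name, m in models.items()}
    best = max(scores, key=lambda name: scores[name][key])
    return best, scores


# S301 (assumed data): a tiny test set and two candidate classifiers of the same class.
X_TEST = [0.1, 0.4, 0.6, 0.9, 0.3, 0.8]
Y_TEST = [0, 0, 1, 1, 1, 1]
CANDIDATES = {
    "threshold_0.5": lambda x: int(x >= 0.5),
    "threshold_0.2": lambda x: int(x >= 0.2),
}

best, scores = select_optimal(CANDIDATES, X_TEST, Y_TEST)
print(best, round(scores[best]["f1"], 3))
```

Passing a different `key` (for example `"recall"`) changes the criterion without touching the rest of the selection logic.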
Further, the number of optimal algorithm models is at least two, and the optimal algorithm models are connected in parallel or in series to form the processing of the corresponding stage: the plurality of optimal algorithm models in the primary processing are connected in parallel or in series to form the primary-stage processing; the plurality of optimal algorithm models in the N-level processing are connected in parallel or in series to form the N-stage processing, wherein N is an integer greater than or equal to 1.
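Series and parallel connection within one stage can be sketched as two small combinators. The names `in_series` and `in_parallel`, and the choice of averaging as the way to merge parallel outputs, are assumptions made for illustration; the patent does not specify how parallel results are combined.

```python
from statistics import mean


def in_series(*models):
    """Series connection: the output of each model is the input of the next."""
    def run(x):
        for m in models:
            x = m(x)
        return x
    return run


def in_parallel(*models, combine=list):
    """Parallel connection: every model sees the same input; outputs are combined."""
    def run(x):
        outputs = [m(x) for m in models]
        return combine(outputs)
    return run


# Two hypothetical models standing in for the optimal algorithm models of one stage.
add_one = lambda x: x + 1
double = lambda x: x * 2

series_stage = in_series(add_one, double)                   # double(add_one(x))
parallel_stage = in_parallel(add_one, double, combine=mean)  # mean of both outputs

print(series_stage(3), parallel_stage(3))  # 8 5
```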
Further, the algorithm platform comprises a 3D model and image class, an operation and maintenance effectiveness verification class and a driving index verification class:
the 3D model and image class includes: an image stitching algorithm, an OCR algorithm and a GAN image generation algorithm;
the operation and maintenance effectiveness verification class includes: a model drift and update algorithm, an index measurement algorithm and an anomaly detection algorithm;
the driving index verification class includes: a time-series classification prediction algorithm, a time-series regression prediction algorithm, a multi-objective optimization algorithm and a clustering algorithm.
Example 1
The urban rail transit data is the urban rail transit passenger flow volume data at time t;
the corresponding field is: the urban rail transit passenger flow volume at time t+1;
the algorithm set comprises: the first-stage processing is a clustering algorithm model, and the second-stage processing is a long short-term memory (LSTM) neural network model;
performing clustering model processing on the spatial distribution characteristics to extract the line features, station features and section passenger flow features of different subway stations, these three being the spatial features; performing clustering model processing on the temporal distribution characteristics to extract the daily passenger flow distribution within one week and, after dividing each day into a plurality of time periods, the passenger flow distribution of each time period, these two distributions being the temporal features;
and inputting the urban rail transit passenger flow data, together with the temporal features and spatial features, into the long short-term memory neural network model to obtain the predicted inbound or outbound passenger flow of urban rail transit for the t+1 time window.
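The data flow of Example 1 can be sketched without a deep-learning library: a small k-means groups stations by their flow profile (first stage), and each station's series is then cut into samples that pair a window ending at time t, the cluster label and the target time period with the flow at t+1. The station profiles, window length and number of periods per day are hypothetical, and the LSTM itself is not implemented here; the samples are simply what such a sequence model would be trained on.

```python
def kmeans(points, k, iters=10):
    """Plain k-means with deterministic initialisation (first k points as centers)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def make_samples(series, cluster, window=3, periods_per_day=4):
    """Build (features, target) pairs: a window ending at time t predicts the flow at t+1."""
    samples = []
    for t in range(window - 1, len(series) - 1):
        history = series[t - window + 1 : t + 1]
        period_of_target = (t + 1) % periods_per_day   # temporal feature
        samples.append((history + [cluster, period_of_target], series[t + 1]))
    return samples


# Hypothetical per-period flow profiles of three stations (spatial clustering input).
PROFILES = [
    [10, 80, 20, 70],   # station A: strong peaks
    [12, 85, 22, 75],   # station B: strong peaks
    [40, 42, 41, 43],   # station C: flat demand
]
labels = kmeans(PROFILES, k=2)

# Samples for station A over six consecutive periods.
samples = make_samples([10, 80, 20, 70, 11, 82], cluster=labels[0])
print(labels, samples[0])
```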
Example 2
The urban rail transit data is an image of a rail transit vehicle inspection location;
the corresponding field is: a suspected fault image of the component;
the algorithm set comprises: the first-stage processing is a classification algorithm model, and the second-stage processing is a fault detection model;
performing classification model processing on the images of the rail transit vehicle inspection locations to obtain labeled rail transit vehicle images classified by structure and function, and inputting test samples of the inspection location into the fault detection model for detection to obtain a set of suspected fault images.
Example 3
The urban rail transit data is rail transit comprehensive monitoring alarm data and parameter configuration;
the corresponding field is: alarm short messages;
the algorithm set comprises: the first-stage processing is a classification algorithm model, the second-stage processing is a purification algorithm model, and the third-stage processing is a decision algorithm model;
taking the collected alarm data together with the equipment and parameter configuration as input to the classification algorithm model, which groups the alarm data belonging to the same equipment or monitored object, and taking the classified alarm data as the input of the next-stage purification algorithm model; the purification algorithm model processes and refines the alarm data according to level, time order, duplicate detection and the like to generate concise, clean data, which is then used as the input of the next-stage decision algorithm model; the decision algorithm model generates the short message content from the concise, clean data and finally generates an output message, wherein the generation comprises determining the alarm content, suggestion, handling measure, requested feedback content and recipient.
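The three stages of Example 3 can be sketched with plain rules standing in for the trained models. The alarm records, the convention that level 1 is the most severe, the suggestion texts and the default recipient are all hypothetical.

```python
from collections import defaultdict

# Hypothetical alarm records; level 1 is assumed to be the most severe.
ALARMS = [
    {"device": "PUMP-1", "level": 2, "time": 100, "text": "pressure low"},
    {"device": "PUMP-1", "level": 2, "time": 105, "text": "pressure low"},
    {"device": "PUMP-1", "level": 1, "time": 110, "text": "pump stopped"},
    {"device": "FAN-3", "level": 3, "time": 101, "text": "vibration high"},
]


def classify(alarms):
    """Stage 1: group alarms that belong to the same device or monitored object."""
    groups = defaultdict(list)
    for alarm in alarms:
        groups[alarm["device"]].append(alarm)
    return dict(groups)


def purify(group):
    """Stage 2: order by level and time, then drop repeated alarm texts."""
    seen, kept = set(), []
    for alarm in sorted(group, key=lambda a: (a["level"], a["time"])):
        if alarm["text"] not in seen:
            seen.add(alarm["text"])
            kept.append(alarm)
    return kept


def decide(device, alarms, recipient="duty-officer"):
    """Stage 3: build the short message from the most severe remaining alarm."""
    top = alarms[0]
    return {
        "to": recipient,
        "content": f"[L{top['level']}] {device}: {top['text']} (+{len(alarms) - 1} related)",
        "suggestion": "inspect on site" if top["level"] == 1 else "monitor",
    }


messages = [decide(device, purify(group)) for device, group in classify(ALARMS).items()]
for message in messages:
    print(message["to"], "|", message["content"], "|", message["suggestion"])
```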
Example 4
The present embodiment aims to provide an algorithm service device, including:
one or more memories for storing data,
one or more processors for executing programs, and
a plurality of modules stored in the memory and executed by the processor, the modules comprising:
the data receiving and calibrating module, used for acquiring the urban rail transit data to be predicted and the corresponding fields, and performing auxiliary calibration and cleaning on the urban rail transit data to obtain cleaned urban rail transit data and corresponding fields;
the feature extraction module, used for performing feature engineering on the cleaned urban rail transit data to obtain a feature training set, and training on the feature training set to obtain various algorithm models, wherein the feature engineering comprises statistical feature engineering, graph feature engineering and depth feature engineering;
the algorithm extraction and loading module, used for extracting from the algorithm platform, according to the corresponding fields, the algorithm models related to the fields to form a set, and performing algorithm loading on the set, wherein algorithm loading means loading and matching the algorithm models in a specified order to form an algorithm integration model;
and the prediction result returning module, used for inputting the cleaned urban rail transit data into the algorithm integration model to obtain prediction result data.
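How the modules hand data to one another can be sketched as a single class. This is an illustrative skeleton under stated assumptions: `AlgorithmService`, its method names, the field key and the two toy models are hypothetical, the cleaning rule is a placeholder, and the training performed by the feature extraction module is omitted (the platform is assumed to hold already trained models).

```python
from typing import Any, Callable, Dict, List

Model = Callable[[Any], Any]


class AlgorithmService:
    def __init__(self, platform: Dict[str, List[Model]]):
        # Maps a field name to its models, already listed in loading order.
        self.platform = platform

    def receive_and_calibrate(self, records, field):
        """Data receiving and calibrating module: drop missing or negative readings."""
        cleaned = [r for r in records if r is not None and r >= 0]
        return cleaned, field

    def load_algorithms(self, field) -> Model:
        """Algorithm extraction and loading module: chain the field's models in order."""
        models = self.platform[field]

        def integrated(data):
            for model in models:
                data = model(data)
            return data

        return integrated

    def predict(self, records, field):
        """Prediction result returning module: run cleaned data through the chain."""
        cleaned, field = self.receive_and_calibrate(records, field)
        return self.load_algorithms(field)(cleaned)


def smooth(xs):
    """Toy model 1: average each pair of neighbouring readings."""
    return [(a + b) / 2 for a, b in zip(xs, xs[1:])]


def last_value(xs):
    """Toy model 2: naive forecast, repeat the latest smoothed value."""
    return xs[-1]


service = AlgorithmService({"passenger_flow_next": [smooth, last_value]})
prediction = service.predict([100, None, 120, -5, 140], "passenger_flow_next")
print(prediction)  # 130.0
```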
The foregoing is only a preferred embodiment of the present invention and does not limit the present invention in any way; any simple modification, equivalent replacement or improvement made to the above embodiment within the spirit and principle of the present invention still falls within the protection scope of the present invention.

Claims (10)

1. An intelligent algorithm platform construction method for urban rail transit data, characterized in that the intelligent algorithm platform provides services for algorithm models, and the services of the algorithm models comprise:
S1, acquiring urban rail transit data to be predicted and corresponding fields, and performing auxiliary calibration and cleaning on the urban rail transit data to obtain cleaned urban rail transit data and corresponding fields;
S2, performing feature engineering on the cleaned urban rail transit data to obtain a feature training set, and training on the feature training set to obtain various algorithm models, wherein the feature engineering comprises statistical feature engineering, graph feature engineering and depth feature engineering;
S3, extracting from the algorithm platform, according to the corresponding fields, the algorithm models related to the fields to form a set, and performing algorithm loading on the set, wherein algorithm loading means loading and matching the algorithm models in a specified order to form an algorithm integration model;
and S4, inputting the cleaned urban rail transit data into the algorithm integration model to obtain prediction result data.
2. The method for constructing the intelligent algorithm platform for the urban rail transit data according to claim 1, wherein the algorithm integration model comprises multi-stage processing algorithm models, and the stages are processed in order of priority: the priority of the primary processing algorithm model > … > the priority of the N-level processing algorithm model, wherein N is an integer greater than or equal to 1.
3. The method as claimed in claim 2, wherein each level of processing algorithm model comprises a plurality of optimal algorithm models; each optimal algorithm model is obtained by calling a pre-trained evaluation model to evaluate the plurality of algorithm models in the corresponding algorithm model class and select the best one; the algorithm model class is a group of algorithm models that process data by different methods to obtain the same or substantially the same result; the processing comprises dimensionality reduction, association, clustering, classification and regression; and the evaluation model is used for evaluating the capability of the algorithm models to process and predict data.
4. The method for constructing the intelligent algorithm platform of the urban rail transit data according to claim 3, wherein the processing is classification, and the evaluation model specifically performs the following steps:
S301, acquiring a test data set corresponding to the urban rail transit data to be predicted from a preset sample database;
S302, inputting the test data set into the plurality of algorithm models in the algorithm model class to obtain a plurality of test result sets, establishing a confusion matrix for each test result set, and calculating the corresponding evaluation indexes from the confusion matrix, wherein the evaluation indexes comprise accuracy, precision, recall, ROC curve, AUC and F1 score (harmonic mean);
and S303, taking the algorithm model with the best evaluation index as the optimal algorithm model.
5. The method for constructing the intelligent algorithm platform for the urban rail transit data according to claim 3, wherein the number of optimal algorithm models is at least two, and the optimal algorithm models are connected in parallel or in series to form the processing of the corresponding stage: the plurality of optimal algorithm models in the primary processing are connected in parallel or in series to form the primary-stage processing; the plurality of optimal algorithm models in the N-level processing are connected in parallel or in series to form the N-stage processing, wherein N is an integer greater than or equal to 1.
6. The method for constructing the intelligent algorithm platform of the urban rail transit data according to claim 1, wherein the algorithm platform comprises a 3D model and image class, an operation and maintenance effectiveness verification class and a driving index verification class:
the 3D model and image class includes: an image stitching algorithm, an OCR algorithm and a GAN image generation algorithm;
the operation and maintenance effectiveness verification class includes: a model drift and update algorithm, an index measurement algorithm and an anomaly detection algorithm;
the driving index verification class includes: a time-series classification prediction algorithm, a time-series regression prediction algorithm, a multi-objective optimization algorithm and a clustering algorithm.
7. The method for constructing the urban rail transit data intelligent algorithm platform according to claim 1, wherein the urban rail transit data is urban rail transit inbound or outbound passenger flow volume data at time t;
the corresponding field is: the urban rail transit inbound or outbound passenger flow volume for the t+1 time window;
the algorithm set comprises: the first-stage processing is a clustering algorithm model, and the second-stage processing is a long short-term memory (LSTM) neural network model;
performing clustering model processing on the spatial distribution characteristics to extract the line features, station features and section passenger flow features of different subway stations, these three being the spatial features; performing clustering model processing on the temporal distribution characteristics to extract the daily passenger flow distribution within one week and, after dividing each day into a plurality of time periods, the passenger flow distribution of each time period, these two distributions being the temporal features;
and inputting the urban rail transit passenger flow data, combined with the temporal features and spatial features, into the long short-term memory neural network model to obtain the predicted inbound or outbound passenger flow of urban rail transit for the t+1 time window.
8. The method for constructing the intelligent algorithm platform for the urban rail transit data according to claim 1, wherein the urban rail transit data is an image of a rail transit vehicle inspection location;
the corresponding field is: a suspected fault image of the component;
the algorithm set comprises: the first-stage processing is a classification algorithm model, and the second-stage processing is a fault detection model;
performing classification model processing on the images of the rail transit vehicle inspection locations to obtain labeled rail transit vehicle images classified by structure and function, and inputting test samples of the inspection location into the fault detection model for detection to obtain a set of suspected fault images.
9. The method for constructing the intelligent algorithm platform of the urban rail transit data according to claim 1, wherein the urban rail transit data is rail transit comprehensive monitoring alarm data and parameter configuration;
the corresponding field is: alarm short messages;
the algorithm set comprises: the first-stage processing is a classification algorithm model, the second-stage processing is a purification algorithm model, and the third-stage processing is a decision algorithm model;
taking the collected alarm data together with the equipment and parameter configuration as input to the classification algorithm model, which groups the alarm data belonging to the same equipment or monitored object, and taking the classified alarm data as the input of the next-stage purification algorithm model; the purification algorithm model processes and refines the alarm data according to level and time order to generate concise, clean data, which is then used as the input of the next-stage decision algorithm model; the decision algorithm model generates the short message content from the concise, clean data and finally generates an output message, wherein the generation comprises determining the alarm content, suggestion, handling measure, requested feedback content and recipient.
10. An algorithmic service device comprising:
one or more memories for storing data,
one or more processors for executing programs, and
a plurality of modules stored in the memory and executed by the processor, the modules comprising:
the data receiving and calibrating module, used for acquiring the urban rail transit data to be predicted and the corresponding fields, and performing auxiliary calibration and cleaning on the urban rail transit data to obtain cleaned urban rail transit data and corresponding fields;
the feature extraction module, used for performing feature engineering on the cleaned urban rail transit data to obtain a feature training set, and training on the feature training set to obtain various algorithm models, wherein the feature engineering comprises statistical feature engineering, graph feature engineering and depth feature engineering;
the algorithm extraction and loading module, used for extracting from the algorithm platform, according to the corresponding fields, the algorithm models related to the fields to form a set, and performing algorithm loading on the set, wherein algorithm loading means loading and matching the algorithm models in a specified order to form an algorithm integration model;
and the prediction result returning module, used for inputting the cleaned urban rail transit data into the algorithm integration model to obtain prediction result data.
CN202111101607.5A 2021-09-18 2021-09-18 Intelligent algorithm platform construction method for urban rail transit data Pending CN113807704A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111101607.5A CN113807704A (en) 2021-09-18 2021-09-18 Intelligent algorithm platform construction method for urban rail transit data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111101607.5A CN113807704A (en) 2021-09-18 2021-09-18 Intelligent algorithm platform construction method for urban rail transit data

Publications (1)

Publication Number Publication Date
CN113807704A true CN113807704A (en) 2021-12-17

Family

ID=78895984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111101607.5A Pending CN113807704A (en) 2021-09-18 2021-09-18 Intelligent algorithm platform construction method for urban rail transit data

Country Status (1)

Country Link
CN (1) CN113807704A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330562A (en) * 2021-12-31 2022-04-12 大箴(杭州)科技有限公司 Small sample refinement classification and multi-classification model construction method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330562A (en) * 2021-12-31 2022-04-12 大箴(杭州)科技有限公司 Small sample refinement classification and multi-classification model construction method
CN114330562B (en) * 2021-12-31 2023-09-26 大箴(杭州)科技有限公司 Small sample refinement classification and multi-classification model construction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination