CN107704971A - Data processing method and device for real-time prediction of the airport security-check passenger count - Google Patents

Publication number
CN107704971A
Authority
CN
China
Prior art keywords
data
prediction
airport security
airport
extraction
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711027543.2A
Other languages
Chinese (zh)
Inventor
王殿胜
薄满辉
权珩
张凯伦
籍焱
谢世局
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Travelsky Mobile Technology Co Ltd
Original Assignee
China Travelsky Mobile Technology Co Ltd
Application filed by China Travelsky Mobile Technology Co Ltd
Priority to CN201711027543.2A
Publication of CN107704971A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention discloses a data processing method for predicting, in real time, the number of passengers passing through airport security checks. The method comprises: preprocessing the data used, the preprocessing including data cleaning, structuring, and data integration; extracting features from the preprocessed data; inputting the extracted features into multiple prediction sub-models, each of which predicts the security-check passenger count and outputs its own prediction result; and evaluating the prediction result of each sub-model separately, assigning each sub-model a weight according to its evaluation result to form a combined model, and using the combined model to predict the security-check passenger count and output the corresponding prediction result. The invention predicts with a combination of models: the prediction results of a time series model, a random forest, and GBDT are evaluated, and a weight is then assigned to each model to form the combined prediction model, which achieves higher prediction accuracy. The invention also provides a processing device for real-time prediction of the airport security-check passenger count.

Description

Data processing method and device for real-time prediction of the airport security-check passenger count
Technical Field
The present invention relates to the field of data processing, and in particular to a data processing method and device for real-time prediction of the airport security-check passenger count.
Background
Airports have huge passenger throughput. Airport services such as security screening, security protection, and emergency response all seek to predict future passenger throughput so that manpower and materials can be allocated in advance, allowing them to serve passengers better and respond to emergencies.
The International Air Transport Association (IATA) forecasts that over the next 20 years passenger demand will grow at an average annual rate of 3.7% and demand for air travel will double. China's air transport service quality does not yet match its status as a major civil aviation country. Compared with developed civil aviation countries, there remains a gap in service level and quality, and the service cannot yet fully meet the needs of reform and opening up and of socio-economic development.
As airport passenger volume grows, security screening bears great pressure in keeping civil aviation safe and efficient, and it is also one of the links with which passengers are least satisfied. Accurately predicting the airport security-check passenger count therefore helps optimize the scheduling of airport service resources and improves security-check efficiency. Airports currently reduce passenger waiting time at security in two ways: one is to increase the amount of security-check service resources; the other is to schedule service resources intelligently, that is, according to the number of passengers queuing for security. In practice, intelligent scheduling of service resources is the effective way to address long waiting times at security, and it has value for the development and construction of future smart airports.
The Chinese invention patent "Data processing method and device for analyzing passenger arrivals at the airport security-check area" (application number: 201610607353.7) discloses a technical solution that predicts the number of passengers queuing for security from real-time check-in counter data. The Chinese invention patent "Data integration model for dynamic resource allocation and intelligent scheduling in airport terminals" (application number: 201510067010.1) discloses a technical solution for data integration that supports dynamic resource allocation and intelligent scheduling in the terminal based on the real-time number of passengers. Some existing approaches also predict passenger flow on the basis of flight plans.
The technical solutions disclosed in the two patents above, and the existing solutions for predicting airport passenger flow, have the following defects. First, the data they use are limited. Second, the data used in existing research on airport passenger flow prediction come either from real-time check-in counter data or from flight plan data, and these data are insufficient to predict the security-check passenger count accurately. Third, most airports currently predict the security-check passenger count from the planned volume of departing flights, which cannot track passenger flow accurately and in real time, so dynamic scheduling of airport security resources is very difficult. Fourth, existing approaches that predict the security-check passenger count from real-time check-in counter data do not consider the influence of historical data, so prediction accuracy cannot be guaranteed.
In view of these problems in the related art, there is currently no method that combines multiple machine learning algorithms to process real airport passenger data and accurately predict the airport security-check passenger count by building a model; that is, a data processing method and device for real-time prediction of the airport security-check passenger count are lacking.
Summary of the Invention
The technical problem solved by the present invention is the above-mentioned lack of a method that combines multiple machine learning algorithms to process real-time and historical airport passenger data. The invention provides a data processing method for real-time prediction of the airport security-check passenger count, which predicts that count accurately by building a combined model. The invention also provides a processing device for real-time prediction of the airport security-check passenger count.
To solve the above technical problem, the invention adopts the following technical solution: a data processing method for real-time prediction of the airport security-check passenger count, comprising:
Preprocessing the data used, the preprocessing including data cleaning, structuring, and data integration, where the data used include airport passenger historical data, passenger ticket booking data, agent reservation system data, departure data, passenger ticket data, and the airport's future flight plan data;
Extracting features from the preprocessed data, the feature extraction including extracting history features of the airport security-check passenger flow, flight features, and passengers' future itinerary features;
Inputting the extracted features into multiple prediction sub-models to predict the airport security-check passenger count, each sub-model outputting its own prediction result, the multiple prediction sub-models including an ARMA model, a linear regression model, a random forest regression model, and a GBDT model;
Evaluating the prediction results output by the multiple prediction sub-models separately, assigning each prediction sub-model a weight according to its evaluation result to form a combined model, and using the combined model to predict the airport security-check passenger count and output the corresponding prediction result.
As a further elaboration of the above technical solution:
In the above technical solution, the data cleaning of the data used includes: removing abnormal records from the data used and/or filling in missing values in the data used;
The structuring of the data used includes: standardizing the data used and storing the result, and/or discretizing the data used and storing the result, where standardization maps field values to a continuous interval according to business and model requirements, and discretization converts a continuous field into a discrete field according to cut points;
The integration of the data used includes: integrating the required different data sources into one wide data table according to the needs of predicting the airport security-check passenger flow.
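The integration step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each data source is keyed by time slot; the source names, column names, and sample values are hypothetical and do not come from the patent. Every source contributes its columns to one row per time slot, and a value that a source does not provide is left as None.

```python
def build_wide_table(sources):
    """Merge {source_name: {time_slot: {column: value}}} into one wide table.

    Returns {time_slot: {"source.column": value}}; a column with no value
    for a given slot is set to None.
    """
    # Collect every prefixed column name and every time slot that appears.
    columns = sorted({
        f"{name}.{col}"
        for name, table in sources.items()
        for row in table.values()
        for col in row
    })
    slots = sorted({slot for table in sources.values() for slot in table})

    wide = {}
    for slot in slots:
        row = dict.fromkeys(columns)  # start with None in every column
        for name, table in sources.items():
            for col, value in table.get(slot, {}).items():
                row[f"{name}.{col}"] = value
        wide[slot] = row
    return wide


# Hypothetical sample sources keyed by hourly time slot.
sources = {
    "history": {"08:00": {"count": 420}, "09:00": {"count": 510}},
    "flight_plan": {
        "08:00": {"departures": 6},
        "09:00": {"departures": 9},
        "10:00": {"departures": 4},
    },
}
wide = build_wide_table(sources)
```

Prefixing each column with its source name keeps columns from different sources distinct once they share a row.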
In the above technical solution, the feature extraction from the preprocessed data includes:
Extracting statistical features of the historical airport security-check passenger flow at the same time point as a given time point, the statistical features including the mean, median, standard deviation, maximum, and minimum of the passenger flow; or extracting the total airport security-check passenger flow within a specific duration before the prediction time node; and/or extracting the flight information planned by the airport;
Storing the extracted statistical features, total passenger flow, and flight information as a feature matrix.
In the above technical solution, inputting the extracted features into multiple prediction sub-models for prediction includes:
Vectorizing the extracted features according to the input requirements of each prediction sub-model and creating feature matrices, each feature matrix including a training-set sub-feature matrix and a test-set sub-feature matrix;
Inputting the training-set sub-features into the algorithm package of the matching prediction sub-model in Spark MLlib for model training to form a trained model;
Inputting the test-set sub-features into the matching trained model for testing, and outputting the test and prediction results.
In the above technical solution, evaluating the prediction result output by each prediction sub-model, assigning each prediction sub-model a weight according to the evaluation result, and forming a combined model includes:
Evaluating the prediction result output by each prediction sub-model using the mean squared error, and outputting the evaluation result;
Determining the importance of each prediction sub-model and its weight according to the corresponding evaluation result;
Forming the combined model used for prediction according to the weight of each prediction sub-model.
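The evaluation and combination steps can be written compactly as follows. The mean squared error is the standard definition; the weighted-sum form of the combined model and the normalization of the weights to sum to one are assumptions made here for illustration, since the text states only that weights are assigned according to the evaluation results. The symbols $y_t$ (observed count at time $t$), $\hat{y}_{k,t}$ (prediction of sub-model $k$), $w_k$ (weight of sub-model $k$), and $K$ (number of sub-models) are introduced for this sketch.

```latex
\mathrm{MSE}_k = \frac{1}{n}\sum_{t=1}^{n}\bigl(y_t - \hat{y}_{k,t}\bigr)^2,
\qquad
\hat{y}_t = \sum_{k=1}^{K} w_k\,\hat{y}_{k,t},
\qquad
w_k \ge 0,\quad \sum_{k=1}^{K} w_k = 1 .
```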
According to another aspect of the present invention, a processing device for real-time prediction of the airport security-check passenger count is provided, comprising:
A preprocessing module, for preprocessing the data used for predicting the airport security-check passenger count, the preprocessing including data cleaning, structuring, and data integration;
A feature extraction module, for extracting from the preprocessed data features including history features of the airport security-check passenger flow, flight features, and passengers' future itinerary features;
A prediction module, for inputting the extracted features into multiple prediction sub-models to predict the airport security-check passenger count, each sub-model outputting its own prediction result;
A processing module, for evaluating the prediction results output by the multiple prediction sub-models separately, assigning each prediction sub-model a weight according to its evaluation result to form a combined model, and using the combined model to predict the airport security-check passenger count and output the corresponding prediction result.
As a further elaboration of the above device, the preprocessing module includes:
A data cleaning module, for removing abnormal records from the data used and/or filling in missing values in the data used;
A structuring module, for standardizing the data used and storing the result, and/or discretizing the data used and storing the result;
A data integration module, for integrating the required different data sources into one wide data table according to the needs of predicting the airport security-check passenger flow.
As a further elaboration of the above device, the feature extraction module includes:
A first feature extraction module, for extracting statistical features of the historical airport security-check passenger flow at the same time point as a given time point;
A second feature extraction module, for extracting the airport security-check passenger flow within a specific duration before the prediction time node;
A flight feature extraction module, for extracting the flight information planned by the airport.
As a further elaboration of the above device, the prediction module includes:
A processing unit, for vectorizing the extracted features according to the input requirements of each prediction sub-model;
A creating unit, for creating, from the vectorized features, feature matrices that include a training-set sub-feature matrix and a test-set sub-feature matrix;
A training unit, for inputting the training-set sub-features into the algorithm package of the matching prediction sub-model in Spark MLlib for model training to form a trained model;
A test unit, for inputting the test-set sub-features into the matching trained model for testing and outputting the test and prediction results.
As a further elaboration of the above device, the processing module includes:
An evaluation unit, for evaluating the prediction results output by the multiple prediction sub-models separately;
A first processing unit, for assigning each prediction sub-model a weight according to the evaluation result and forming the combined model;
A second processing unit, for predicting the airport security-check passenger count according to the combined model and outputting the corresponding prediction result.
The beneficial effects of the data processing method of the present invention are as follows. The invention selects, for prediction, the history features, flight schedule features, and passengers' future itinerary features that play an important role in predicting airport passenger flow, which improves the prediction. In addition, the invention predicts with a combination of models: the prediction results of a time series model, a random forest, and GBDT are evaluated, and a weight is then assigned to each model to form the combined prediction model, which achieves higher prediction accuracy.
Brief Description of the Drawings
The accompanying drawings described here provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of the data processing method of the present invention for predicting the airport security-check passenger count;
Fig. 2 is a structural diagram of the device for real-time prediction of the airport security-check passenger count according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
The embodiments described with reference to the drawings are exemplary. They are intended to explain this application and should not be construed as limiting it. The invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in those embodiments may be combined with one another.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the drawings are used to distinguish similar objects and do not describe a specific order or sequence.
Before the specific embodiments are described, the following techniques are explained to make the later description easier to follow. The prediction sub-models used here are: the ARMA model (autoregressive moving-average model), the linear regression model, the random forest regression model, and the GBDT (gradient boosting decision tree) model.
The ARMA model (Auto-Regressive and Moving Average Model) is an important method for studying time series. It is a "mixture" of the autoregressive model (AR model) and the moving average model (MA model), and it is built mainly by analyzing stationary random time series. Its form is simple, it is convenient to fit to data, and it makes it easy to analyze the structure and intrinsic properties of the data. It gives optimal forecasting and control in the minimum-variance sense and is a relatively accurate short-term forecasting model. Because its order can be adjusted to the situation, it is also fairly flexible. It does, however, require a relatively large amount of historical data (typically more than 50 observations), a requirement that the available historical data fully meet. The present invention applies time series analysis separately to the airport's historical passenger flow, mines the trend of the airport passenger flow, and makes predictions.
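For reference, the standard textbook form of an ARMA($p$, $q$) model is given here. The patent does not state the orders or coefficients it uses, so $p$, $q$, $c$, $\varphi_i$, and $\theta_j$ are generic symbols, with $X_t$ the passenger flow at time $t$ and $\varepsilon_t$ a white-noise error term.

```latex
X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

The first sum is the autoregressive (AR) part and the second is the moving-average (MA) part; setting $q = 0$ or $p = 0$ recovers a pure AR or pure MA model.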
Linear regression is a statistical analysis method that uses regression analysis from mathematical statistics to determine the quantitative interdependence between two or more variables, and it is very widely used. Among regression models, simple (single-variable) regression is the simplest and most robust, but it is often inadequate for describing the behavior of complex systems, so prediction techniques based on multiple regression are more common. Traditional multiple regression models are usually linear. Because insignificant variables may be present and the independent variables may be correlated with one another, the normal equations of the regression can become severely ill-conditioned, which affects the stability of the regression equation. A basic problem in multiple linear regression is therefore finding the "optimal" regression equation.
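The ill-conditioning mentioned above can be made concrete with the standard least-squares formulation. The notation is introduced here and is not taken from the patent: $X$ is the $n \times d$ feature matrix, $y$ the vector of observed passenger counts, and $\beta$ the coefficient vector.

```latex
X^{\top} X\,\beta = X^{\top} y
\quad\Longrightarrow\quad
\hat{\beta} = \bigl(X^{\top} X\bigr)^{-1} X^{\top} y
```

When two or more columns of $X$ are nearly collinear, $X^{\top} X$ is close to singular, so small changes in the data produce large changes in $\hat{\beta}$. This is the instability of the regression equation that the paragraph describes.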
Random forest regression and gradient boosting decision tree regression are both machine learning methods that perform regression with multiple decision trees. In a random forest, the features and data that each decision tree receives are random, and the trees run in parallel, which gives relatively good generalization ability.
GBDT is an iterative decision tree algorithm. Its final prediction is obtained by accumulating the regression trees produced at each intermediate step, and it has relatively high prediction accuracy.
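The difference between the two tree ensembles can be summarized with their usual textbook forms. The symbols are generic and are not taken from the patent: $T_b$ is the $b$-th tree of a forest of $B$ trees, $h_m$ is the regression tree fitted at boosting step $m$, $F_0$ is the initial prediction, and $\nu$ is a learning rate.

```latex
\hat{y}_{\mathrm{RF}}(x) = \frac{1}{B}\sum_{b=1}^{B} T_b(x),
\qquad
F_M(x) = F_0(x) + \sum_{m=1}^{M} \nu\, h_m(x)
```

The random forest averages independently trained trees, which is why they can run in parallel. GBDT adds trees one after another, each fitted to the errors left by the current sum, which is the accumulation of intermediate regression trees described above.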
Embodiment 1
The present invention provides a data processing method for real-time prediction of the airport security-check passenger count. Fig. 1 is a flow chart of the method. As shown in Fig. 1, the method includes the following steps:
Step S102: preprocessing the data used, the preprocessing including data cleaning, structuring, and data integration;
Step S104: extracting features from the preprocessed data;
Step S106: inputting the extracted features into multiple prediction sub-models to predict the airport security-check passenger count, each sub-model outputting its own prediction result;
Step S108: evaluating the prediction results output by the multiple prediction sub-models separately, assigning each prediction sub-model a weight according to its evaluation result to form a combined model, and using the combined model to predict the airport security-check passenger count and output the corresponding prediction result.
Steps S102 to S108 of this embodiment preprocess the data used (data cleaning, structuring, and data integration), extract features from the preprocessed data, input the extracted features into multiple prediction sub-models that each output a prediction result, evaluate those results separately, assign each sub-model a weight according to its evaluation result to form a combined model, and use the combined model to predict the airport security-check passenger count and output the corresponding result. This solves the problems in the related art, where predicting with a single model and a single kind of feature leads to low prediction accuracy, large single-model prediction error, and limited use of data. Multi-model combined prediction improves prediction accuracy and prevents the bias of single-model prediction.
In this embodiment, the following should be noted and further explained.
In step S102, the data used include airport passenger historical data, passenger ticket booking data, agent reservation system data, departure data, passenger ticket data, and the airport's future flight plan data. The airport passenger historical data give the historical security-check passenger flow at each time point of each day. The ticket booking data, agent reservation system data, departure data, passenger ticket data, and similar sources give the number of passengers who will arrive at the airport at each time point in the next 24 hours. The airport's flight plan for the next 24 hours gives information on the flights about to take off from the airport, including flight departure time and aircraft type.
In step S104, the purpose of the method of this embodiment is to predict the airport security-check passenger flow over the next 24 hours, and the main features of that flow are its history features, flight features, and so on. Because the airport's daily flight schedule is basically stable, the security-check passenger flow is periodic to some degree: the flows at the same time point in two cycles generally do not differ much. History features therefore characterize, to a certain extent, the stability of the airport passenger flow; that is, historical flows at the same time point are relatively close, so this feature has high reference value. Flight features comprise the airport's planned daily flight information, that is, information on the flights passengers will take, including flight number, origin, destination, scheduled departure time, actual departure time, operating aircraft type, number of flights, and so on. In practice, a security-check peak usually occurs some time before a period in which many flights are scheduled, so flight features serve as important features for predicting the security-check passenger flow and directly influence the prediction of future passenger flow. Feature extraction therefore includes extracting the history features of the airport security-check passenger flow, flight features, and passengers' future itinerary features.
In step S106, to form an accurate combined model, the prediction sub-models in this embodiment include the ARMA model (autoregressive moving-average model), the linear regression model, the random forest regression model, and the GBDT (gradient boosting decision tree) model. The sub-models listed are ones that work well and are commonly used in practice. Other models may also be selected, and the model-building process may even be decomposed and refined by writing custom algorithms, which can likewise improve prediction accuracy. This embodiment should therefore be understood as forming an optimized combined model that contains multiple prediction sub-models with corresponding weights, applying the features that influence the future airport security-check passenger flow to the combined model, and on that basis predicting the number of passengers who will need to pass through security, with the aim of improving the accuracy of that prediction. Multi-model combined prediction improves prediction accuracy and avoids the prediction bias that results from using a single model.
Preferably, in an optional implementation of this embodiment, in step S102:
The data cleaning of the data used includes step S102-1: removing abnormal records from the data used and/or filling in missing values in the data used;
The structuring of the data used includes step S102-2: standardizing the data used and storing the result, and/or discretizing the data used and storing the result;
The integration of the data used includes step S102-3: integrating the required different data sources into one wide data table according to the needs of predicting the airport security-check passenger flow. Integrating the data into a wide table makes feature extraction for the model algorithms easier.
It should be noted that, for step S102-1, abnormal records may be removed before missing values are filled in, missing values may be filled in before abnormal records are removed, or the two may be done at the same time. Abnormal records are removed to prevent them from interfering with the model algorithms. Missing values are filled in so that records missing only a few attributes can still be used in the model. For step S102-2, standardization maps field values to a continuous interval according to business and model requirements, for example to the interval [0, 1], and discretization converts a continuous field into a discrete field according to cut points.
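Steps S102-1 and S102-2 can be sketched in a few lines of Python. The sketch makes several assumptions that the patent does not specify: a record counts as abnormal when its passenger count is missing, negative, or above a hypothetical upper limit; missing values are filled with the median of the known values; standardization is min-max scaling to [0, 1]; and the field names, cut points, and sample records are invented for illustration.

```python
from bisect import bisect_right
from statistics import median


def clean_records(records, max_count):
    """Drop abnormal records, then fill missing 'departures' values.

    Assumed rule: a record is abnormal if its count is missing, negative,
    or greater than max_count. Missing departures get the median of the
    known departures among the kept records.
    """
    kept = [
        dict(r) for r in records
        if r["count"] is not None and 0 <= r["count"] <= max_count
    ]
    known = [r["departures"] for r in kept if r["departures"] is not None]
    fill_value = median(known)
    for r in kept:
        if r["departures"] is None:
            r["departures"] = fill_value
    return kept


def min_max_scale(values):
    """Map values linearly onto the interval [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def discretize(value, cut_points):
    """Return the index of the bin that value falls into (cut_points sorted)."""
    return bisect_right(cut_points, value)


# Hypothetical hourly records; the second has an abnormal count and the
# third is missing its departures value.
records = [
    {"slot": "08:00", "count": 420, "departures": 6},
    {"slot": "09:00", "count": -5, "departures": 9},
    {"slot": "10:00", "count": 380, "departures": None},
    {"slot": "11:00", "count": 510, "departures": 8},
]
cleaned = clean_records(records, max_count=5000)
counts = [r["count"] for r in cleaned]
scaled = min_max_scale(counts)
bins = [discretize(c, [400, 500]) for c in counts]
```

Here cleaning runs before filling, which is one of the orderings the text allows; the other orderings would only change which records contribute to the median.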
Preferably, in an optional implementation of this embodiment, in step S104 the feature extraction from the preprocessed data is performed as follows:
Step S104-1: extracting statistical features of the historical airport security-check passenger flow at the same time point as a given time point, the statistical features including the mean, median, standard deviation, maximum, and minimum of the passenger flow;
Step S104-2: extracting the total airport security-check passenger flow within a specific duration before the prediction time node;
Step S104-3: extracting the flight information planned by the airport;
Step S104-4: storing the extracted statistical features, total passenger flow, and flight information as a feature matrix.
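Steps S104-1, S104-2, and S104-4 can be sketched as follows. The data layout (a mapping from date to per-slot counts), the dates, the counts, and the choice of the population standard deviation are assumptions made for this illustration; the patent names the statistics but not how they are computed or stored.

```python
from statistics import mean, median, pstdev


def same_slot_stats(history, slot):
    """Statistics of the historical flow at one time slot across days.

    history: {date: {slot: count}}. The population standard deviation is
    used here as an assumption.
    """
    values = [day[slot] for day in history.values() if slot in day]
    return {
        "mean": mean(values),
        "median": median(values),
        "std": pstdev(values),
        "max": max(values),
        "min": min(values),
    }


def recent_total(series, index, window):
    """Sum of the `window` counts immediately before position `index`."""
    start = max(0, index - window)
    return sum(series[start:index])


# Hypothetical history: counts at the 08:00 slot on three earlier days.
history = {
    "2017-10-01": {"08:00": 400},
    "2017-10-02": {"08:00": 420},
    "2017-10-03": {"08:00": 440},
}
stats = same_slot_stats(history, "08:00")

# Hypothetical counts for the slots of the current day so far.
today = [100, 120, 150, 130, 160]
total_before = recent_total(today, index=4, window=3)

# One row of the feature matrix: five statistics, the recent total, and a
# hypothetical planned number of departures taken from the flight plan.
planned_departures = 6
feature_row = [
    stats["mean"], stats["median"], stats["std"],
    stats["max"], stats["min"], total_before, planned_departures,
]
```

Stacking one such row per prediction time point gives the feature matrix of step S104-4.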
It should be noted that, for steps S104-1 to S104-4, there is no required order among extracting the statistical features of the historical security-check passenger flow at the same time point, extracting the total security-check passenger flow within a specific duration before the prediction time node, and extracting the flight information planned by the airport. These extractions may be performed one after another or at the same time. After extraction, the features must be vectorized according to the input requirements of each prediction sub-model and stored as a feature matrix. For example, features to be used in the random forest or GBDT algorithms need to be One-Hot encoded. It should be emphasized that feature selection in this embodiment is not limited to these three kinds of features. Other features, and combinations of other features, may also be selected, and such combinations can equally achieve the purpose of feature selection in this embodiment.
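One-Hot encoding of a categorical feature can be sketched as below. The aircraft type is used as the example category because the flight information includes it; the sample values and the alphabetical ordering of the categories are assumptions of this sketch.

```python
def one_hot(values):
    """One-hot encode a list of categorical values.

    Returns (categories, rows): categories is the sorted list of distinct
    values, and each row has a 1 in the column of its own category.
    """
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical aircraft types of four planned flights.
aircraft_types = ["A320", "B737", "A320", "A330"]
categories, encoded = one_hot(aircraft_types)
```

Each encoded row can be appended to the numeric features of the same record before the feature matrix is passed to a sub-model.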
Preferably, in an optional implementation of this embodiment, inputting the extracted features into multiple prediction submodels for prediction in step S106 comprises the following steps:
Step S106-1: vectorize the extracted features according to the input requirements of each prediction submodel and create feature matrices, each feature matrix comprising a training-set sub-feature matrix and a test-set sub-feature matrix;
Step S106-2: input the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model;
Step S106-3: input the test-set sub-features into the matching trained model for testing, and output the test and prediction results.
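The split, train, and test sequence of steps S106-1 to S106-3 can be illustrated with a small stand-in. The source trains its submodels with Spark MLlib; the sketch below deliberately swaps in a hand-written single-feature least-squares fit so that it runs without a Spark cluster. The chronological 75/25 split, the sample rows, and the function names are assumptions made for illustration only.

```python
def split_matrix(rows, train_fraction=0.75):
    """Split rows chronologically into training and test sub-matrices."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Hypothetical rows: (planned departures in the slot, observed security flow).
rows = [(2, 50), (4, 90), (6, 130), (5, 110), (3, 70), (7, 150), (8, 170), (1, 30)]

train, test = split_matrix(rows)                      # step S106-1
a, b = fit_line([x for x, _ in train],
                [y for _, y in train])                # step S106-2
predictions = [a * x + b for x, _ in test]            # step S106-3
```

In a Spark deployment the fit would presumably be delegated to the corresponding MLlib estimators (for example the linear regression, random forest regressor, and gradient-boosted tree regressor classes), each receiving its own vectorized feature matrix.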
Preferably, in an optional implementation of this embodiment, evaluating the prediction result output by each prediction submodel in step S108, and assigning a weight value to each prediction submodel according to the evaluation results to form a combined model, comprises the following steps:
Step S108-1: evaluate the prediction result output by each prediction submodel using the mean square error, and output the evaluation result;
Step S108-2: determine the importance of each prediction submodel and its weight value according to the evaluation result corresponding to that submodel;
Step S108-3: form the combined model used for prediction according to the weight value of each prediction submodel.
It should be noted that, in steps S108-2 to S108-3, when the combined model is formed, the weight values determined from the importance of each prediction model are usually converted into proportional coefficients. The proportional coefficients are obtained by reducing the ratio of the weight values to its simplest form, and may be expressed as percentages or as an integer ratio.
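The evaluation and weighting steps can be sketched as below. The source states only that weights follow from the mean-square-error evaluation; the inverse-MSE rule used here is one common choice and is an assumption, as are the sample predictions and all function names.

```python
from functools import reduce
from math import gcd


def mse(predicted, actual):
    """Mean square error between predictions and observed values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def inverse_error_weights(errors):
    """Assumed rule: weight each submodel by 1/MSE, normalised to sum to 1.

    A submodel with zero error would need special handling (division by zero).
    """
    inverse = {name: 1.0 / e for name, e in errors.items()}
    total = sum(inverse.values())
    return {name: v / total for name, v in inverse.items()}


def integer_ratio(values):
    """Reduce integer weights to their simplest integer ratio."""
    divisor = reduce(gcd, values)
    return [v // divisor for v in values]


def combine(predictions, weights):
    """Weighted sum of the submodel predictions, point by point."""
    names = list(predictions)
    length = len(predictions[names[0]])
    return [sum(weights[n] * predictions[n][i] for n in names) for i in range(length)]


# Hypothetical test-set observations and submodel outputs.
actual = [100, 120, 140]
predictions = {
    "linear_regression": [102, 118, 142],
    "random_forest": [101, 121, 139],
    "gbdt": [102, 122, 142],
}

errors = {name: mse(p, actual) for name, p in predictions.items()}  # step S108-1
weights = inverse_error_weights(errors)                             # step S108-2
combined = combine(predictions, weights)                            # step S108-3
ratio = integer_ratio([25, 100, 25])  # the same weights written as an integer ratio
```

Here the random forest stand-in has the lowest error and therefore receives the largest share of the combined prediction.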
From the description of the above embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the embodiments of the present invention.
Embodiment 2
This embodiment further provides a device for predicting the number of people at airport security in real time. The device is used to implement the above embodiment and its preferred implementations; what has already been explained is not repeated. As used below, the terms "module" and "unit" may denote a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiment is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 2 is a structural schematic diagram of the device for predicting the number of people at airport security in real time according to an embodiment of the present invention. As shown in Fig. 2, the device includes:
A preprocessing module 22, configured to preprocess the usage data used for predicting the number of people at airport security, the preprocessing including data cleaning, structuring, and data integration;
A feature extraction module 24, coupled to the preprocessing module 22 and configured to extract, from the preprocessed data, features including historical features of airport security passenger flow, flight features, and passengers' future itinerary features;
A prediction module 26, coupled to the feature extraction module 24 and configured to input the extracted features into multiple prediction submodels to predict the number of people at airport security, and to output the prediction results separately;
A processing module 28, coupled to the prediction module 26 and configured to evaluate the prediction results output by the multiple prediction submodels separately, assign a weight value to each prediction submodel according to the evaluation results to form a combined model, predict the number of people at airport security according to the combined model, and output the matching prediction result.
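The coupling of the four modules can be pictured as a simple chain in which each module consumes the output of the previous one. The sketch below is only a structural illustration: the class name, the callable fields, and the trivial stand-in functions are hypothetical and do not reproduce the logic of any module.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SecurityCheckPredictor:
    """Chains stand-ins for modules 22, 24, 26 and 28 in that order."""
    preprocess: Callable[[Any], Any]               # preprocessing module 22
    extract_features: Callable[[Any], Any]         # feature extraction module 24
    predict_with_submodels: Callable[[Any], Any]   # prediction module 26
    combine: Callable[[Any], Any]                  # processing module 28

    def run(self, raw_data: Any) -> Any:
        data = self.preprocess(raw_data)
        features = self.extract_features(data)
        sub_predictions = self.predict_with_submodels(features)
        return self.combine(sub_predictions)


# Trivial placeholder behaviour, chosen only so the chain can be executed.
predictor = SecurityCheckPredictor(
    preprocess=lambda raw: [x for x in raw if x is not None],
    extract_features=lambda data: sum(data) / len(data),
    predict_with_submodels=lambda f: {"model_a": f + 1, "model_b": f - 1},
    combine=lambda preds: sum(preds.values()) / len(preds),
)
result = predictor.run([10, None, 20])
```

Keeping each module behind a plain callable interface means the modules could run in one process or be distributed across several, which matches the source's remark that they may sit in one processor or in several.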
Preferably, the preprocessing module 22 in this embodiment may include: a data cleaning module, configured to remove abnormal records from the usage data and/or fill in missing values in the usage data; a structuring module, coupled to the data cleaning module and configured to standardize and store the usage data, and/or to discretize and store the usage data; and a data integration module, coupled to the structuring module and/or the data cleaning module and configured to integrate the different data sources required for predicting airport security passenger flow into a single wide data table.
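The three preprocessing sub-modules can be sketched as follows. The concrete choices are assumptions, since the source does not fix them: abnormal records are those outside a plausible range, missing values are filled with the mean of the valid values, standardization is min-max scaling onto [0, 1], and integration is a join on a hypothetical `slot` key. Field names and sample records are likewise illustrative.

```python
from bisect import bisect_right


def clean(records, low=0, high=10_000):
    """Drop out-of-range counts and fill missing counts with the valid mean."""
    valid = [r["count"] for r in records
             if r["count"] is not None and low <= r["count"] <= high]
    fill = sum(valid) / len(valid)
    cleaned = []
    for r in records:
        count = r["count"]
        if count is None:
            cleaned.append({**r, "count": fill})   # fill a missing value
        elif low <= count <= high:
            cleaned.append(dict(r))                # keep a normal record
    return cleaned


def standardize(values):
    """Map values onto the continuous interval [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def discretize(value, cut_points):
    """Turn a continuous value into a bucket index using sorted cut points."""
    return bisect_right(cut_points, value)


def integrate(security, flights):
    """Join two data sources on the time slot to build wide-table rows."""
    by_slot = {f["slot"]: f for f in flights}
    return [{**s, **by_slot.get(s["slot"], {})} for s in security]


# Hypothetical inputs: one missing value and one abnormal (negative) count.
security = [{"slot": 0, "count": 100}, {"slot": 1, "count": None},
            {"slot": 2, "count": -5}, {"slot": 3, "count": 300}]
flights = [{"slot": 0, "departures": 2}, {"slot": 1, "departures": 4},
           {"slot": 3, "departures": 5}]

cleaned = clean(security)
scaled = standardize([r["count"] for r in cleaned])
levels = [discretize(r["count"], [150, 250]) for r in cleaned]
wide = integrate(cleaned, flights)
```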
Preferably, the feature extraction module 24 in this embodiment may include: a first feature extraction module, configured to extract statistical features of the historical airport security passenger flow at the same time point as a given time point; a second feature extraction module, coupled to the first feature extraction module and configured to extract the airport security passenger flow within a specific duration before the prediction time node; and a flight feature extraction module, coupled to the first feature extraction module and/or the second feature extraction module and configured to extract the flight information planned by the airport.
Preferably, the prediction module 26 in this embodiment may include: a processing unit, configured to vectorize the extracted features according to the input requirements of each prediction submodel; a creating unit, coupled to the processing unit and configured to create, from the vectorized features, a feature matrix comprising a training-set sub-feature matrix and a test-set sub-feature matrix; a training unit, coupled to the creating unit and configured to input the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model; and a test unit, coupled to the training unit and configured to input the test-set sub-features into the matching trained model for testing and to output the test and prediction results.
Preferably, the processing module 28 in this embodiment may include: an evaluation unit, configured to evaluate the prediction results output by the multiple prediction submodels separately; a first processing unit, coupled to the evaluation unit and configured to assign a weight value to each prediction submodel according to the evaluation results and form a combined model; and a second processing unit, coupled to the first processing unit and configured to predict the number of people at airport security according to the combined model and output the matching prediction result.
It should be noted that each of the above modules and units can be implemented in software or hardware. In the latter case, this can be done in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in multiple processors respectively.
The above is not intended to limit the scope of the present invention. Any modification, equivalent change, or variation made to the above embodiments in accordance with the technical essence of the present invention still falls within the scope of the technical solution of the present invention.

Claims (10)

  1. A data processing method for predicting the number of people at airport security in real time, characterized by comprising:
    preprocessing usage data, the preprocessing including data cleaning, structuring, and data integration, the usage data including airport passenger historical data, ticket booking data, agent booking system data, departure data, ticket data, and the airport's future flight plan data;
    performing feature extraction on the preprocessed data, the feature extraction including extracting historical features of airport security passenger flow, flight features, and passengers' future itinerary features;
    inputting the extracted features into multiple prediction submodels to predict the number of people at airport security, and outputting the prediction results separately, the multiple prediction models including a RAMA model, a linear regression model, a random forest regression model, and a GBDT model;
    evaluating the prediction results output by the multiple prediction submodels separately, assigning a weight value to each prediction submodel according to the evaluation results to form a combined model, predicting the number of people at airport security according to the combined model, and outputting the matching prediction result.
  2. The data processing method for predicting the number of people at airport security in real time according to claim 1, characterized in that:
    performing data cleaning on the usage data includes: removing abnormal records from the usage data and/or filling in missing values in the usage data;
    structuring the usage data includes: standardizing and storing the usage data, and/or discretizing and storing the usage data, wherein standardization includes mapping field values onto a continuous interval according to business and model requirements, and discretization includes converting continuous fields into discrete fields according to cut points;
    integrating the usage data includes: integrating the different data sources required for predicting airport security passenger flow into a single wide data table.
  3. The data processing method for predicting the number of people at airport security in real time according to claim 1, characterized in that performing feature extraction on the preprocessed data includes:
    extracting statistical features of the historical airport security passenger flow at the same time point as a given time point, the statistical features including the mean, median, standard deviation, maximum, and minimum of the passenger flow; or extracting the total airport security passenger flow within a specific duration before the prediction time node; and/or extracting the flight information planned by the airport;
    storing the extracted statistical features, total passenger flow, and flight information in a feature matrix.
  4. The data processing method for predicting the number of people at airport security in real time according to claim 1, characterized in that inputting the extracted features into multiple prediction submodels for prediction includes:
    vectorizing the extracted features according to the input requirements of each prediction submodel and creating feature matrices, each feature matrix comprising a training-set sub-feature matrix and a test-set sub-feature matrix;
    inputting the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model;
    inputting the test-set sub-features into the matching trained model for testing, and outputting the test and prediction results.
  5. The data processing method for predicting the number of people at airport security in real time according to claim 1, characterized in that evaluating the prediction result output by each prediction submodel, and assigning a weight value to each prediction submodel according to the evaluation results to form a combined model, includes:
    evaluating the prediction result output by each prediction submodel using the mean square error, and outputting the evaluation result;
    determining the importance of each prediction submodel and its weight value according to the evaluation result corresponding to that submodel;
    forming the combined model used for prediction according to the weight value of each prediction submodel.
  6. A processing device for predicting the number of people at airport security in real time, characterized by comprising:
    a preprocessing module, configured to preprocess the usage data used for predicting the number of people at airport security, the preprocessing including data cleaning, structuring, and data integration;
    a feature extraction module, configured to extract, from the preprocessed data, features including historical features of airport security passenger flow, flight features, and passengers' future itinerary features;
    a prediction module, configured to input the extracted features into multiple prediction submodels to predict the number of people at airport security, and to output the prediction results separately;
    a processing module, configured to evaluate the prediction results output by the multiple prediction submodels separately, assign a weight value to each prediction submodel according to the evaluation results to form a combined model, predict the number of people at airport security according to the combined model, and output the matching prediction result.
  7. The device according to claim 6, characterized in that the preprocessing module includes:
    a data cleaning module, configured to remove abnormal records from the usage data and/or fill in missing values in the usage data;
    a structuring module, configured to standardize and store the usage data, and/or to discretize and store the usage data;
    a data integration module, configured to integrate the different data sources required for predicting airport security passenger flow into a single wide data table.
  8. The device according to claim 6, characterized in that the feature extraction module includes:
    a first feature extraction module, configured to extract statistical features of the historical airport security passenger flow at the same time point as a given time point;
    a second feature extraction module, configured to extract the airport security passenger flow within a specific duration before the prediction time node;
    a flight feature extraction module, configured to extract the flight information planned by the airport.
  9. The device according to claim 6, characterized in that the prediction module includes:
    a processing unit, configured to vectorize the extracted features according to the input requirements of each prediction submodel;
    a creating unit, configured to create, from the vectorized features, a feature matrix comprising a training-set sub-feature matrix and a test-set sub-feature matrix;
    a training unit, configured to input the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model;
    a test unit, configured to input the test-set sub-features into the matching trained model for testing and to output the test and prediction results.
  10. The device according to claim 6, characterized in that the processing module includes:
    an evaluation unit, configured to evaluate the prediction results output by the multiple prediction submodels separately;
    a first processing unit, configured to assign a weight value to each prediction submodel according to the evaluation results and form a combined model;
    a second processing unit, configured to predict the number of people at airport security according to the combined model and output the matching prediction result.
CN201711027543.2A 2017-10-27 2017-10-27 A kind of data processing method and device of real-time estimate airport security number Pending CN107704971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711027543.2A CN107704971A (en) 2017-10-27 2017-10-27 A kind of data processing method and device of real-time estimate airport security number

Publications (1)

Publication Number Publication Date
CN107704971A true CN107704971A (en) 2018-02-16

Family

ID=61176394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711027543.2A Pending CN107704971A (en) 2017-10-27 2017-10-27 A kind of data processing method and device of real-time estimate airport security number

Country Status (1)

Country Link
CN (1) CN107704971A (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020015104A1 (en) * 2018-07-18 2020-01-23 平安科技(深圳)有限公司 Method, apparatus, computer device, and storage medium for predicting flow rate of passengers presenting security risk
CN109002988B (en) * 2018-07-18 2023-10-27 平安科技(深圳)有限公司 Risk passenger flow prediction method, apparatus, computer device and storage medium
CN109002988A (en) * 2018-07-18 2018-12-14 平安科技(深圳)有限公司 Risk passenger method for predicting, device, computer equipment and storage medium
CN109214612A (en) * 2018-11-20 2019-01-15 广东机场白云信息科技有限公司 One kind being based on XGBOOST airport traffic spatial and temporal distributions prediction technique
CN111435486A (en) * 2019-01-15 2020-07-21 阿里巴巴集团控股有限公司 Ticket checking resource allocation method and device
CN111435486B (en) * 2019-01-15 2024-04-02 阿里巴巴集团控股有限公司 Ticket checking resource allocation method and device
CN112016731B (en) * 2019-05-31 2024-02-27 杭州海康威视系统技术有限公司 Queuing time prediction method and device and electronic equipment
CN112016731A (en) * 2019-05-31 2020-12-01 杭州海康威视系统技术有限公司 Queuing time prediction method and device and electronic equipment
CN110222905A (en) * 2019-06-14 2019-09-10 智慧足迹数据科技有限公司 A kind of method and device for predicting flow of the people
CN110717616A (en) * 2019-08-30 2020-01-21 中国南方航空股份有限公司 Civil aviation unit human resource prediction method, electronic equipment and storage medium
CN110675626A (en) * 2019-09-27 2020-01-10 汉纳森(厦门)数据股份有限公司 Traffic accident black point prediction method, device and medium based on multidimensional data
CN110675626B (en) * 2019-09-27 2021-01-12 汉纳森(厦门)数据股份有限公司 Traffic accident black point prediction method, device and medium based on multidimensional data
CN110751340A (en) * 2019-10-29 2020-02-04 广东机场白云信息科技有限公司 Method and system for forecasting and analyzing pedestrian flow in airport security check area
CN110837928A (en) * 2019-11-05 2020-02-25 沈阳民航东北凯亚有限公司 Method and device for predicting security check time
CN110807558A (en) * 2019-11-06 2020-02-18 深圳微品致远信息科技有限公司 Method and device for predicting departure taxi time based on deep neural network
CN111191114A (en) * 2019-11-26 2020-05-22 恒大智慧科技有限公司 Cold scenic spot recommendation method and device and storage medium
CN111832929B (en) * 2020-07-09 2023-12-12 民航成都信息技术有限公司 Dynamic scheduling method and system for airport check-in machine
CN111832820A (en) * 2020-07-09 2020-10-27 飞友科技有限公司 Method and system for predicting pedestrian flow of each gate at airport
CN111832929A (en) * 2020-07-09 2020-10-27 民航成都信息技术有限公司 Dynamic scheduling method and system for airport check-in
CN113298306A (en) * 2021-05-24 2021-08-24 建信金融科技有限责任公司 Method and device for predicting number of people at dinner, electronic equipment and medium
CN114897205A (en) * 2022-03-07 2022-08-12 中国民航工程咨询有限公司 Target airport characteristic value prediction method and computer equipment
CN115130783A (en) * 2022-07-29 2022-09-30 中国工商银行股份有限公司 Method and device for predicting network queuing information
CN117809400A (en) * 2023-12-29 2024-04-02 厦门民航凯亚有限公司 Intelligent security check passenger flow monitoring system suitable for terminal building

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180216
