CN107704971A - Data processing method and device for real-time prediction of the number of airport security-check passengers - Google Patents
- Publication number
- CN107704971A CN201711027543.2A
- Authority
- CN
- China
- Prior art keywords
- data
- prediction
- airport security
- airport
- extraction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Abstract
The present invention discloses a data processing method for predicting, in real time, the number of passengers passing through airport security. The method comprises: preprocessing the data in use, the preprocessing including data cleaning, structuring and data integration; extracting features from the preprocessed data; feeding the extracted features into multiple prediction submodels, each of which predicts the number of security-check passengers and outputs its own prediction result; and evaluating the prediction result of each submodel, assigning each submodel a weight according to its evaluation result to form a combined model, and using the combined model to predict the number of security-check passengers and output the corresponding prediction result. The invention predicts with a combination of models: the prediction results of a time-series model, a random forest and GBDT are evaluated, and each model is weighted accordingly to form a combined prediction model that achieves higher prediction accuracy. The invention also provides a processing device for real-time prediction of the number of airport security-check passengers.
Description
Technical field
The present invention relates to the field of data processing, and more particularly to a data processing method and device for real-time prediction of the number of airport security-check passengers.
Background art
Airports handle very large passenger throughput. Airport services such as security checks, security protection and emergency response aim to predict future passenger throughput and to allocate staff and resources in advance accordingly, so as to serve passengers better and respond to emergencies.
The International Air Transport Association (IATA) expects passenger demand to grow at an average annual rate of up to 3.7% over the next 20 years, so that demand for air travel will double. At present the quality of air transport service in China does not yet match the country's standing as a major civil-aviation nation. Compared with developed civil-aviation countries there is still a gap in service level and quality, and the service cannot yet fully meet the needs of reform and opening-up and of socio-economic development.
As airport passenger volume grows, the security check bears heavy pressure in keeping civil aviation safe and efficient, and it is also one of the stages that passengers are most dissatisfied with. Accurately predicting the number of passengers at airport security therefore helps to optimize the scheduling of airport service resources and to improve security-check efficiency. Airports can currently reduce passengers' waiting time at security in two ways: by increasing the amount of security-check service resources, or by scheduling service resources intelligently, that is, according to the number of passengers queuing at security. In practice, intelligent scheduling of service resources is the effective way to address long waiting times at security, and it has value for the development and construction of future smart airports.
The Chinese invention patent "Data processing method and device for analyzing passenger arrivals at an airport security-check area" (application number: 201610607353.7) discloses a technical scheme that predicts the number of passengers queuing at security from real-time check-in counter data. The Chinese invention patent "Data integration model for dynamic resource allocation and intelligent scheduling in airport terminals" (application number: 201510067010.1) discloses a technical scheme that uses real-time passenger numbers for data integration supporting dynamic allocation and intelligent scheduling of terminal resources. Some existing approaches also predict passenger flow on the basis of flight plans.
The technical schemes disclosed in the two patents above, and the existing schemes for predicting airport passenger flow, have the following defects. First, they are limited in the data they use. Second, the data used in existing research on airport passenger flow prediction comes either from real-time check-in counter data or from flight plan data, neither of which is sufficient to predict the number of passengers at security accurately. Third, most airports currently predict the number of security-check passengers from the planned volume of departing flights; this method cannot track passenger flow precisely and in real time, which makes dynamic scheduling of airport security resources very difficult. Fourth, existing airports that predict the number of security-check passengers from real-time check-in counter data do not consider the influence of historical data, so prediction accuracy cannot be guaranteed.
To address these problems in the related art, there is currently no method that combines multiple machine learning algorithms to process real airport passenger data and that accurately predicts the number of airport security-check passengers by building a model. In other words, a data processing method and device for real-time prediction of the number of airport security-check passengers is lacking.
Summary of the invention
The technical problem addressed by the present invention is the lack, noted above, of a method that combines multiple machine learning algorithms to process real-time and historical airport passenger data. The invention provides a data processing method for real-time prediction of the number of airport security-check passengers, which predicts that number accurately by building a combined model. The invention also provides a processing device for real-time prediction of the number of airport security-check passengers.
To solve the above technical problem, the present invention adopts the following technical scheme: a data processing method for real-time prediction of the number of airport security-check passengers, comprising:
preprocessing the data in use, the preprocessing including data cleaning, structuring and data integration, wherein the data in use includes airport passenger historical data, ticket reservation data, agent reservation system data, departure data, ticket data and the airport's future flight plan data;
extracting features from the preprocessed data, the feature extraction including extracting historical features of airport security passenger flow, flight features and passengers' future itinerary features;
feeding the extracted features into multiple prediction submodels to predict the number of airport security-check passengers and outputting the prediction results separately, wherein the multiple prediction submodels include an ARMA model, a linear regression model, a random forest regression model and a GBDT model;
evaluating the prediction result output by each prediction submodel, assigning each prediction submodel a weight according to its evaluation result to form a combined model, and using the combined model to predict the number of airport security-check passengers and output the corresponding prediction result.
The above technical scheme is further elaborated as follows:
In the above technical scheme, performing data cleaning on the data in use includes: removing abnormal records from the data in use and/or filling in missing values in the data in use;
performing structuring on the data in use includes: standardizing the data in use and storing it, and/or discretizing the data in use and storing it, wherein standardization includes mapping field values to a continuous interval according to business and model requirements, and discretization includes converting continuous fields into discrete fields according to cut points;
integrating the data in use includes: integrating the required data sources, according to the needs of predicting airport security passenger flow, into a single wide data table.
In the above technical scheme, extracting features from the preprocessed data includes:
extracting, for a given time point, statistical features of the historical airport security passenger flow at the same time point, the statistical features including the mean, median, standard deviation, maximum and minimum of the passenger flow; or extracting the total airport security passenger flow within a specific period before the prediction time node; and/or extracting the flight information planned by the airport;
storing the extracted statistical features, total passenger flow and flight information in a feature matrix.
In the above technical scheme, feeding the extracted features into the multiple prediction submodels for prediction includes:
vectorizing the extracted features according to the input requirements of each prediction submodel and creating feature matrices, each feature matrix including a training-set sub-feature matrix and a test-set sub-feature matrix;
feeding the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, to form a trained model;
feeding the test-set sub-features into the matching trained model for testing, and outputting the test and prediction results.
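The split into training-set and test-set sub-matrices, followed by training and testing, can be sketched as follows. This is a minimal sketch in plain Python and not the Spark MLlib workflow the scheme refers to: the chronological 75/25 split, the helper names `split_rows` and `fit_simple_regression`, the single flight-count feature and the data values are all assumptions made for illustration, and a one-variable least-squares fit stands in for a trained submodel.

```python
def split_rows(features, targets, train_fraction=0.75):
    """Split a feature matrix chronologically into training and test sub-matrices."""
    cut = int(len(features) * train_fraction)
    return (features[:cut], targets[:cut]), (features[cut:], targets[cut:])

def fit_simple_regression(xs, ys):
    """Least-squares fit of y = a + b*x; a stand-in for one trained submodel."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

# Hypothetical data: planned flights per slot (feature) and security passengers (target).
features = [[2], [4], [6], [8], [10], [12], [14], [16]]
targets = [110, 210, 310, 410, 510, 610, 710, 810]

(train_x, train_y), (test_x, test_y) = split_rows(features, targets)
model = fit_simple_regression([row[0] for row in train_x], train_y)   # training
test_predictions = [model(row[0]) for row in test_x]                  # testing
```

The split is chronological rather than random on the assumption that the rows are ordered in time, so that the test rows are always later than the training rows.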
In the above technical scheme, evaluating the prediction result output by each prediction submodel and assigning each prediction submodel a weight according to the evaluation result to form a combined model includes:
evaluating the prediction result output by each prediction submodel using the mean squared error, and outputting the evaluation result;
determining the importance of each prediction submodel and its weight according to the corresponding evaluation result;
forming the combined model for prediction according to the weights of the prediction submodels.
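The evaluation and weighting steps can be sketched as follows. The text states that each submodel is evaluated by mean squared error and weighted according to that evaluation, but it does not give a weighting formula; the inverse-MSE rule below is therefore an assumption, as are the function names and the sample numbers.

```python
def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def inverse_mse_weights(actual, predictions_by_model):
    """Assumed rule: weight each submodel by 1/MSE, normalized so the weights sum to 1."""
    inverse = {name: 1.0 / mean_squared_error(actual, preds)
               for name, preds in predictions_by_model.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

def combined_prediction(weights, predictions_by_model):
    """Weighted sum of the submodel predictions, slot by slot."""
    n = len(next(iter(predictions_by_model.values())))
    return [sum(weights[name] * preds[i] for name, preds in predictions_by_model.items())
            for i in range(n)]

# Hypothetical test-set values and submodel outputs.
actual = [100, 200, 300]
predictions_by_model = {
    "arma": [110, 190, 310],               # MSE = 100
    "linear_regression": [120, 180, 320],  # MSE = 400
    "random_forest": [105, 195, 305],      # MSE = 25
    "gbdt": [110, 210, 290],               # MSE = 100
}

weights = inverse_mse_weights(actual, predictions_by_model)
forecast = combined_prediction(weights, predictions_by_model)
```

Under this assumed rule a submodel with a smaller error receives a larger weight. A submodel with an MSE of exactly zero would need special handling, which the sketch omits.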
According to another aspect of the present invention, a processing device for real-time prediction of the number of airport security-check passengers is provided, comprising:
a preprocessing module, for preprocessing the data used for predicting the number of airport security-check passengers, the preprocessing including data cleaning, structuring and data integration;
a feature extraction module, for extracting from the preprocessed data features including historical features of airport security passenger flow, flight features and passengers' future itinerary features;
a prediction module, for feeding the extracted features into multiple prediction submodels to predict the number of airport security-check passengers and outputting the prediction results separately;
a processing module, for evaluating the prediction result output by each prediction submodel, assigning each prediction submodel a weight according to the evaluation result to form a combined model, and using the combined model to predict the number of airport security-check passengers and output the corresponding prediction result.
As a further elaboration of the above device, the preprocessing module includes:
a data cleaning module, for removing abnormal records from the data in use and/or filling in missing values in the data in use;
a structuring module, for standardizing the data in use and storing it, and/or discretizing the data in use and storing it;
a data integration module, for integrating the required data sources, according to the needs of predicting airport security passenger flow, into a single wide data table.
As a further elaboration of the above device, the feature extraction module includes:
a first feature extraction module, for extracting, for a given time point, statistical features of the historical airport security passenger flow at the same time point;
a second feature extraction module, for extracting the airport security passenger flow within a specific period before the prediction time node;
a flight feature extraction module, for extracting the flight information planned by the airport.
As a further elaboration of the above device, the prediction module includes:
a processing unit, for vectorizing the extracted features according to the input requirements of each prediction submodel;
a creating unit, for creating, from the vectorized features, feature matrices that include a training-set sub-feature matrix and a test-set sub-feature matrix;
a training unit, for feeding the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, to form a trained model;
a test unit, for feeding the test-set sub-features into the matching trained model for testing, and outputting the test and prediction results.
As a further elaboration of the above device, the processing module includes:
an evaluation unit, for evaluating the prediction result output by each prediction submodel;
a first processing unit, for assigning each prediction submodel a weight according to the evaluation result and forming the combined model;
a second processing unit, for using the combined model to predict the number of airport security-check passengers and output the corresponding prediction result.
The beneficial effects of the data processing method of the present invention are as follows. The invention selects the features that matter for predicting airport passenger flow, namely historical features, flight scheduling features and passengers' future itinerary features, which improves the prediction. In addition, the invention predicts with a combination of models: the prediction results of a time-series model, a random forest and GBDT are evaluated, and each model is weighted accordingly to form a combined prediction model that achieves higher prediction accuracy.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their description are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of the data processing method for predicting the number of airport security-check passengers according to the present invention;
Fig. 2 is a structural diagram of the device for real-time prediction of the number of airport security-check passengers according to an embodiment of the present invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings.
The embodiments described with reference to the drawings are exemplary; they are intended to explain this application and should not be construed as limiting it. The invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
It should be noted that the terms "first", "second" and so on in the description, the claims and the above drawings are used to distinguish similar objects, not to describe a specific order or sequence.
Before the specific embodiments are described, the following techniques are first explained to make the later description easier to follow. The prediction submodels used here are: the ARMA model (autoregressive moving-average model), the linear regression model, the random forest regression model and the GBDT (gradient boosting decision tree) model.
The ARMA model (Auto-Regressive and Moving Average Model) is an important method for studying time series. It is formed by combining the autoregressive model (AR model) with the moving-average model (MA model), and it builds a model mainly by analyzing a stationary random time series. Its form is simple, and it is convenient for fitting data and for analyzing the structure and intrinsic properties of the data. It gives optimal forecasting and control in the minimum-variance sense and is a fairly accurate short-term forecasting model. Because its order can be adjusted to the situation, it is also fairly flexible. It requires a relatively large amount of historical data (generally more than 50), which the existing historical data fully satisfies. The present invention applies time-series analysis to the airport's historical passenger flow alone, mining the trend in airport passenger flow and predicting from it.
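For reference, the standard textbook form of an ARMA(p, q) model is given below. The symbols are the conventional ones and are not taken from this document: X_t is the passenger flow at time t, c is a constant, φ_i and θ_j are the autoregressive and moving-average coefficients, and ε_t is white noise.

```latex
X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

The first sum is the AR part (dependence on past values) and the second sum is the MA part (dependence on past noise terms); the orders p and q are the adjustable orders mentioned above.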
Linear regression is a statistical analysis method that uses regression analysis from mathematical statistics to determine the quantitative interdependence between two or more variables, and it is very widely used. Among regression models, simple regression is the simplest and most robust, but it is often too weak to describe the behavior of complex systems, so prediction techniques based on multiple regression are more common. Traditional multiple regression models are usually linear. Because insignificant variables may be present and the independent variables may be correlated with one another, the normal equations of the regression can become severely ill-conditioned, which affects the stability of the regression equation. A basic problem facing multiple linear regression is therefore to find the "optimal" regression equation.
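The ill-conditioning mentioned here can be seen in the standard least-squares formulation, shown below in conventional notation that is not taken from this document: X is the matrix of independent variables, y the vector of observations and β the vector of coefficients.

```latex
X^{\top} X \, \hat{\beta} = X^{\top} y
\quad\Longrightarrow\quad
\hat{\beta} = \left( X^{\top} X \right)^{-1} X^{\top} y
```

When two or more columns of X are strongly correlated, the matrix X^T X is close to singular, so small changes in the data produce large changes in the estimated coefficients. This is why selecting which variables enter the regression matters.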
Random forest regression and gradient boosting decision tree regression are both machine learning methods that perform regression with multiple decision trees. In a random forest, the features and data given to each decision tree are random and the trees run in parallel, which gives relatively good generalization ability.
GBDT is an iterative decision tree algorithm: the final prediction is obtained by accumulating the outputs of the regression trees from each intermediate step, which gives high prediction accuracy.
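The accumulation of intermediate regression trees can be sketched as follows. This is a minimal sketch of gradient boosting for squared loss in plain Python, assuming one-split trees (stumps) on a single feature; the function names, the learning rate, the number of trees and the data are hypothetical, and a production GBDT implementation differs in many details.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean

def fit_gbdt(xs, ys, n_trees=20, learning_rate=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    trees = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    # The final prediction accumulates the outputs of all intermediate trees.
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)

# Hypothetical data: time slot index and security passenger count.
xs = [1, 2, 3, 4, 5, 6]
ys = [100, 120, 300, 320, 200, 210]
model = fit_gbdt(xs, ys)
```

Each round fits a new tree to what the previous rounds still get wrong, which is the iterative behavior described above; a random forest, by contrast, would train its trees independently and average them.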
Embodiment 1
The present invention provides a data processing method for real-time prediction of the number of airport security-check passengers. Fig. 1 is a flow chart of the method. As shown in Fig. 1, the data processing method includes the following steps:
Step S102: preprocess the data in use, the preprocessing including data cleaning, structuring and data integration;
Step S104: extract features from the preprocessed data;
Step S106: feed the extracted features into multiple prediction submodels to predict the number of airport security-check passengers, and output the prediction results separately;
Step S108: evaluate the prediction result output by each prediction submodel, assign each prediction submodel a weight according to the evaluation result to form a combined model, and use the combined model to predict the number of airport security-check passengers and output the corresponding prediction result.
Steps S102 to S108 of this embodiment preprocess the data in use (data cleaning, structuring and data integration), extract features from the preprocessed data, feed the extracted features into multiple prediction submodels that each predict the number of airport security-check passengers and output a prediction result, evaluate each of those results, weight each submodel according to its evaluation to form a combined model, and use the combined model to predict the number of airport security-check passengers and output the corresponding prediction result. This solves the problems in the related art that predicting with only a single model and a single kind of feature leads to low prediction accuracy, large single-model prediction error and limited use of data. Multi-model combined prediction improves prediction accuracy and prevents the bias of single-model prediction.
The following points of this embodiment need further explanation.
In step S102, the data in use includes airport passenger historical data, ticket reservation data, agent reservation system data, departure data, ticket data and the airport's future flight plan data. The airport passenger historical data gives the historical security passenger flow at each time point of each day. The ticket reservation data, agent reservation system data, departure data, ticket data and so on give the scale of passengers who will arrive at the airport at each time point in the next 24 hours. The airport's flight plan for the next 24 hours gives information on the flights about to take off from the airport, including flight departure time, aircraft type and so on.
In step S104, the purpose of the method of this embodiment is to predict airport security passenger flow over the next 24 hours, and the main features of airport security passenger flow are its historical features, flight features and so on. In general, because the airport's daily flight schedule is basically stable, security passenger flow is periodic to some degree: the difference in airport security passenger flow at the same time point in two cycles is generally not large. Historical features therefore characterize, to some extent, the stability of airport passenger flow; the passenger flow at the same time point in history is comparable, so this feature has high reference value. Flight features include the daily flight information planned by the airport, that is, information about the flights passengers will take, including flight number, place of departure, destination, scheduled departure time, actual departure time, operating aircraft type and so on. In practice, there is usually a peak in security passenger flow shortly before a period in which many flights are scheduled, so flight features can serve as important features for predicting security passenger flow and directly affect the prediction of future airport passenger flow. Feature extraction therefore includes extracting historical features of airport security passenger flow, flight features and passengers' future itinerary features.
In step S106, to form an accurate combined model, the prediction submodels in this embodiment include the ARMA model (autoregressive moving-average model), the linear regression model, the random forest regression model and the GBDT (gradient boosting decision tree) model. These are submodels that work well and are commonly used in practice. Other models may also be selected, and the model-building process may even be decomposed and refined by writing custom algorithms, which can likewise improve prediction accuracy. It should therefore be understood that this embodiment aims to form an optimized combined model that contains multiple prediction submodels with corresponding weights, to apply the features that affect future airport security passenger flow to the combined model, and on that basis to predict the number of passengers who will need to pass through security at the airport, that is, to improve the accuracy of the prediction. Multi-model combined prediction improves prediction accuracy and avoids the prediction bias caused by using a single model.
Preferably, in an optional implementation of this embodiment, in step S102:
performing data cleaning on the data in use includes step S102-1: removing abnormal records from the data in use and/or filling in missing values in the data in use;
performing structuring on the data in use includes step S102-2: standardizing the data in use and storing it, and/or discretizing the data in use and storing it;
integrating the data in use includes step S102-3: integrating the required data sources, according to the needs of predicting airport security passenger flow, into a single wide data table; integrating the data into a wide table makes feature extraction for the model algorithms easier.
It should be noted that, for step S102-1, in this embodiment abnormal records may be removed before missing values are filled in, missing values may be filled in before abnormal records are removed, or both may be done at the same time. Abnormal records are removed to prevent them from interfering with the model algorithms, and missing values are filled in so that records missing a few attributes can still be used in the model. For step S102-2, in this embodiment standardization includes mapping field values to a continuous interval according to business and model requirements, for example mapping field values to the interval [0,1], and discretization includes converting continuous fields into discrete fields according to cut points.
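Steps S102-1 to S102-3 can be sketched as follows. This is a minimal sketch in plain Python, not the implementation described in this document: the record layout, the field names (`hour`, `passengers`, `flights`), the rule that negative counts are abnormal, the mean imputation and the cut points are all assumptions chosen for illustration.

```python
from statistics import mean

def clean(records, field):
    """S102-1 (assumed rules): drop negative counts, fill missing values with the mean."""
    kept = [dict(r) for r in records if r[field] is None or r[field] >= 0]
    fill = mean(r[field] for r in kept if r[field] is not None)
    for r in kept:
        if r[field] is None:
            r[field] = fill
    return kept

def min_max_scale(values):
    """S102-2 standardization: map field values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(value, cut_points):
    """S102-2 discretization: index of the bin that `value` falls into."""
    return sum(value >= c for c in cut_points)

def integrate(passenger_rows, flight_rows, key="hour"):
    """S102-3: join two data sources on a shared key into one wide table."""
    flights_by_key = {r[key]: r for r in flight_rows}
    return [{**p, **flights_by_key[p[key]]}
            for p in passenger_rows if p[key] in flights_by_key]

passenger_rows = [
    {"hour": 6, "passengers": 120},
    {"hour": 7, "passengers": None},   # missing value
    {"hour": 8, "passengers": -5},     # abnormal record
    {"hour": 9, "passengers": 300},
]
flight_rows = [{"hour": 6, "flights": 4}, {"hour": 7, "flights": 9}, {"hour": 9, "flights": 12}]

cleaned = clean(passenger_rows, "passengers")
wide_table = integrate(cleaned, flight_rows)
scaled = min_max_scale([r["passengers"] for r in wide_table])
levels = [discretize(r["passengers"], cut_points=[150, 250]) for r in wide_table]
```

Here the abnormal record is removed before the missing value is filled, which is one of the orders the text allows; filling first would let the abnormal value distort the imputed mean.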
Preferably, in an optional implementation of this embodiment, in step S104, feature extraction on the preprocessed data may be carried out as follows:
Step S104-1: extract statistical features of the historical airport security passenger flow at the same time point as a given time point, the statistical features including the mean, median, standard deviation, maximum, and minimum of the passenger flow;
Step S104-2: extract the total airport security passenger flow within a specific duration before the prediction time node;
Step S104-3: extract the flight information planned by the airport;
Step S104-4: store the extracted statistical features, total passenger flow, and flight information as a feature matrix.
It should be noted that for step S104-1 to step S104-4, in the embodiment of the present embodiment, carry
Take the history sometime put with it is specific before the statistical nature of the airport security flow of the people at time point, extraction predicted time node when
The Flight Information of airport security flow of the people sum and extraction airport planning in length, no precedence relation, can hold one by one
OK, it can also synchronously perform, and extract after feature by the input quantity demand of each prediction submodel, it is necessary to which feature is carried out into vectorization
Handle and constitutive characteristic matrix stores, such as, if the feature of extraction is applied in random forest or GBDT algorithms
Just need to carry out One-Hot codings;It is emphasized that in the present embodiment, the selection of feature is not limited only to be to choose airport
The history sometime put is the same as in specific duration before the statistical nature of the airport security flow of the people at time point, predicted time node
Airport security flow of the people sum and the Flight Information of airport planning, the selection of feature can be further features, and further feature
Combination, the purpose that the present embodiment Feature Selection can be achieved is combined by further feature.
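The feature extraction of steps S104-1 to S104-4 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the patent does not say whether the standard deviation is the population or sample form (the population form is used here), and the weekday field used to demonstrate One-Hot encoding, the single planned-flight count, and all function names are hypothetical.

```python
import statistics

# Hypothetical categorical field used only to demonstrate One-Hot encoding.
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def same_time_stats(history):
    """S104-1: statistics of the passenger flow seen at the same time point on earlier days."""
    return {
        "mean": statistics.mean(history),
        "median": statistics.median(history),
        "std": statistics.pstdev(history),  # population form, an assumption
        "max": max(history),
        "min": min(history),
    }


def window_total(series, end_index, window):
    """S104-2: total flow over the `window` slots immediately before `end_index`."""
    start = max(0, end_index - window)
    return sum(series[start:end_index])


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1 if value == c else 0 for c in categories]


def feature_row(history, series, end_index, window, planned_flights, weekday):
    """S104-4: assemble one row of the feature matrix.

    `planned_flights` stands in for the flight information of S104-3.
    """
    stats = same_time_stats(history)
    numeric = [stats["mean"], stats["median"], stats["std"], stats["max"], stats["min"]]
    return numeric + [window_total(series, end_index, window), planned_flights] + one_hot(weekday, WEEKDAYS)
```

Stacking one such row per prediction time node gives a feature matrix in the sense of step S104-4; the three extraction helpers are independent of each other, which matches the note that they need not run in any particular order.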
Preferably, in an optional implementation of this embodiment, in step S106, inputting the extracted features into the multiple prediction submodels for prediction includes the following steps:
Step S106-1: vectorize the extracted features according to the input requirements of each prediction submodel and create feature matrices, each feature matrix including a training-set sub-feature matrix and a test-set sub-feature matrix;
Step S106-2: input the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model;
Step S106-3: input the test-set sub-features into the matching trained model for testing, and output the test and prediction results.
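The split, train, and test flow of steps S106-1 to S106-3 can be illustrated with a small self-contained Python sketch. It deliberately does not use Spark MLlib: a one-variable least-squares fit stands in for a prediction submodel, and the chronological split, the 0.8 training fraction, and all function names are assumptions made for illustration only.

```python
def chronological_split(rows, targets, train_fraction=0.8):
    """S106-1: split features and targets into a training part and a test part.

    Earlier samples go to training and later ones to testing (an assumed policy).
    """
    cut = int(len(rows) * train_fraction)
    return (rows[:cut], targets[:cut]), (rows[cut:], targets[cut:])


def fit_simple_linear(xs, ys):
    """S106-2 stand-in: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, xs):
    """S106-3: apply the trained model to the test-set features."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]
```

In an actual deployment each submodel would be trained with its own matching algorithm package, as the steps above describe; the sketch only shows how the training-set and test-set parts of a feature matrix are used at different stages.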
Preferably, in an optional implementation of this embodiment, in step S108, evaluating the prediction results output by each prediction submodel, assigning a weight value to each prediction submodel according to the evaluation results, and forming the combined model includes the following steps:
Step S108-1: evaluate the prediction results output by each prediction submodel using the mean squared error, and output the evaluation results;
Step S108-2: determine the importance of each prediction submodel and its weight value according to the evaluation result corresponding to that submodel;
Step S108-3: form the combined model used for prediction according to the weight values of the prediction submodels.
It should be noted that, in steps S108-2 to S108-3, when the combined model is formed, the weight values determined from the importance of each prediction model are usually converted proportionally into proportionality coefficients. The coefficients are obtained by reducing the ratio of the weight values to its simplest form, and may be expressed as percentages or as an integer ratio.
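The evaluation and weighting of steps S108-1 to S108-3 can be sketched as follows. The patent states only that weights follow from a mean-squared-error assessment; the inverse-MSE rule used here (a lower error gives a higher weight) is an assumption, as are the function names.

```python
from functools import reduce
from math import gcd


def mean_squared_error(actual, predicted):
    """S108-1: average squared difference between observed and predicted counts."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def inverse_mse_weights(mse_by_model):
    """S108-2 (assumed rule): weight each submodel by 1/MSE, normalized to sum to 1.

    A submodel with an MSE of exactly 0 would need special handling.
    """
    inverse = {name: 1.0 / mse for name, mse in mse_by_model.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(predictions_by_model, weights):
    """S108-3: combined prediction as the weighted sum of submodel predictions."""
    length = len(next(iter(predictions_by_model.values())))
    return [
        sum(weights[name] * predictions_by_model[name][i] for name in weights)
        for i in range(length)
    ]


def simplest_integer_ratio(integer_weights):
    """Reduce integer weight values to their simplest integer ratio."""
    divisor = reduce(gcd, integer_weights)
    return [w // divisor for w in integer_weights]
```

With this rule, a submodel whose error is three times smaller receives three times the weight, and `simplest_integer_ratio([30, 20, 50])` expresses the same proportions as the integer ratio 3:2:5.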
From the description of the above implementations, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the embodiments of the present invention.
Embodiment 2
This embodiment further provides a device for real-time prediction of the number of airport security passengers. The device is used to implement the above embodiment and its preferred implementations; what has already been explained is not repeated here. As used below, the terms "module" and "unit" may refer to a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 2 is a structural schematic diagram of the device for real-time prediction of the number of airport security passengers according to an embodiment of the present invention. As shown in Fig. 2, the device includes:
A preprocessing module 22, configured to perform preprocessing, including data cleansing, structuring processing, and data integration, on the usage data used for predicting the number of airport security passengers;
A feature extraction module 24, coupled to the preprocessing module 22, configured to extract from the preprocessed data features including historical features of the airport security passenger flow, flight features, and passengers' future itinerary features;
A prediction module 26, coupled to the feature extraction module 24, configured to input the extracted features into multiple prediction submodels to predict the number of airport security passengers, and to output the prediction results separately;
A processing module 28, coupled to the prediction module 26, configured to evaluate the prediction results output by the multiple prediction submodels separately, assign a weight value to each prediction submodel according to the evaluation results to form a combined model, predict the number of airport security passengers according to the combined model, and output the matching prediction result.
Preferably, the preprocessing module 22 of this embodiment may include: a data cleansing module, configured to remove abnormal records from the usage data and/or fill in missing values in the usage data; a structuring processing module, coupled to the data cleansing module, configured to standardize the usage data and store it, and/or to discretize the usage data and store it; and a data integration module, coupled to the structuring processing module and/or the data cleansing module, configured to integrate the required different data sources into a single wide data table according to the requirements of predicting airport security passenger flow.
Preferably, the feature extraction module 24 of this embodiment may include: a first feature extraction module, configured to extract statistical features of the historical airport security passenger flow at the same time point as a given time point; a second feature extraction module, coupled to the first feature extraction module, configured to extract the airport security passenger flow within a specific duration before the prediction time node; and a flight feature extraction module, coupled to the first feature extraction module and/or the second feature extraction module, configured to extract the flight information planned by the airport.
Preferably, the prediction module 26 of this embodiment may include: a processing unit, configured to vectorize the extracted features according to the input requirements of each prediction submodel; a creating unit, coupled to the processing unit, configured to create, from the vectorized features, feature matrices that include a training-set sub-feature matrix and a test-set sub-feature matrix; a training unit, coupled to the creating unit, configured to input the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model; and a test unit, coupled to the training unit, configured to input the test-set sub-features into the matching trained model for testing and to output the test and prediction results.
Preferably, the processing module 28 of this embodiment may include: an evaluation unit, configured to evaluate the prediction results output by the multiple prediction submodels separately; a first processing unit, coupled to the evaluation unit, configured to assign a weight value to each prediction submodel according to the evaluation results and form the combined model; and a second processing unit, coupled to the first processing unit, configured to predict the number of airport security passengers according to the combined model and output the matching prediction result.
It should be noted that the above modules and units may be implemented in software or in hardware. For the latter, this may be done in the following ways, without limitation: the above modules are all located in the same processor; or the above modules are located in multiple processors respectively.
The above is not intended to limit the scope of the present invention. Any modification, equivalent change, or adaptation made to the above embodiments in accordance with the technical essence of the present invention still falls within the scope of the technical solution of the present invention.
Claims (10)
- 1. A data processing method for real-time prediction of the number of airport security passengers, characterized by comprising:
  performing preprocessing, including data cleansing, structuring processing, and data integration, on usage data, the usage data including airport passenger historical data, passenger ticket reservation data, agent seat reservation system data, departure data, passenger ticket data, and the airport's future flight plan data;
  performing feature extraction on the preprocessed data, the feature extraction including extracting historical features of the airport security passenger flow, flight features, and passengers' future itinerary features;
  inputting the extracted features into multiple prediction submodels to predict the number of airport security passengers, and outputting the prediction results separately, the multiple prediction models including a RAMA model, a linear regression model, a random forest regression model, and a GBDT model;
  evaluating the prediction results output by the multiple prediction submodels separately, assigning a weight value to each prediction submodel according to the evaluation results to form a combined model, predicting the number of airport security passengers according to the combined model, and outputting the matching prediction result.
- 2. The data processing method for real-time prediction of the number of airport security passengers according to claim 1, characterized in that:
  performing data cleansing on the usage data includes: removing abnormal records from the usage data and/or filling in missing values in the usage data;
  performing structuring processing on the usage data includes: standardizing the usage data and storing it, and/or discretizing the usage data and storing it, wherein standardization includes mapping field values to a continuous interval according to business and model requirements, and discretization includes converting continuous fields into discrete fields according to cut points;
  integrating the usage data includes: integrating the required different data sources into a single wide data table according to the requirements of predicting airport security passenger flow.
- 3. The data processing method for real-time prediction of the number of airport security passengers according to claim 1, characterized in that performing feature extraction on the preprocessed data includes:
  extracting statistical features of the historical airport security passenger flow at the same time point as a given time point, the statistical features including the mean, median, standard deviation, maximum, and minimum of the passenger flow; or
  extracting the total airport security passenger flow within a specific duration before the prediction time node; and/or
  extracting the flight information planned by the airport;
  storing the extracted statistical features, total passenger flow, and flight information as a feature matrix.
- 4. The data processing method for real-time prediction of the number of airport security passengers according to claim 1, characterized in that inputting the extracted features into the multiple prediction submodels for prediction includes:
  vectorizing the extracted features according to the input requirements of each prediction submodel and creating feature matrices, each feature matrix including a training-set sub-feature matrix and a test-set sub-feature matrix;
  inputting the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model;
  inputting the test-set sub-features into the matching trained model for testing, and outputting the test and prediction results.
- 5. The data processing method for real-time prediction of the number of airport security passengers according to claim 1, characterized in that evaluating the prediction results output by each prediction submodel, assigning a weight value to each prediction submodel according to the evaluation results, and forming the combined model includes:
  evaluating the prediction results output by each prediction submodel using the mean squared error, and outputting the evaluation results;
  determining the importance of each prediction submodel and its weight value according to the evaluation result corresponding to that submodel;
  forming the combined model used for prediction according to the weight values of the prediction submodels.
- 6. A processing device for real-time prediction of the number of airport security passengers, characterized by comprising:
  a preprocessing module, configured to perform preprocessing, including data cleansing, structuring processing, and data integration, on the usage data used for predicting the number of airport security passengers;
  a feature extraction module, configured to extract from the preprocessed data features including historical features of the airport security passenger flow, flight features, and passengers' future itinerary features;
  a prediction module, configured to input the extracted features into multiple prediction submodels to predict the number of airport security passengers, and to output the prediction results separately;
  a processing module, configured to evaluate the prediction results output by the multiple prediction submodels separately, assign a weight value to each prediction submodel according to the evaluation results to form a combined model, predict the number of airport security passengers according to the combined model, and output the matching prediction result.
- 7. The device according to claim 6, characterized in that the preprocessing module includes:
  a data cleansing module, configured to remove abnormal records from the usage data and/or fill in missing values in the usage data;
  a structuring processing module, configured to standardize the usage data and store it, and/or to discretize the usage data and store it;
  a data integration module, configured to integrate the required different data sources into a single wide data table according to the requirements of predicting airport security passenger flow.
- 8. The device according to claim 6, characterized in that the feature extraction module includes:
  a first feature extraction module, configured to extract statistical features of the historical airport security passenger flow at the same time point as a given time point;
  a second feature extraction module, configured to extract the airport security passenger flow within a specific duration before the prediction time node;
  a flight feature extraction module, configured to extract the flight information planned by the airport.
- 9. The device according to claim 6, characterized in that the prediction module includes:
  a processing unit, configured to vectorize the extracted features according to the input requirements of each prediction submodel;
  a creating unit, configured to create, from the vectorized features, feature matrices that include a training-set sub-feature matrix and a test-set sub-feature matrix;
  a training unit, configured to input the training-set sub-features into the algorithm package of the matching prediction submodel in Spark MLlib for model training, forming a trained model;
  a test unit, configured to input the test-set sub-features into the matching trained model for testing and to output the test and prediction results.
- 10. The device according to claim 6, characterized in that the processing module includes:
  an evaluation unit, configured to evaluate the prediction results output by the multiple prediction submodels separately;
  a first processing unit, configured to assign a weight value to each prediction submodel according to the evaluation results and form the combined model;
  a second processing unit, configured to predict the number of airport security passengers according to the combined model and output the matching prediction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711027543.2A CN107704971A (en) | 2017-10-27 | 2017-10-27 | A kind of data processing method and device of real-time estimate airport security number |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711027543.2A CN107704971A (en) | 2017-10-27 | 2017-10-27 | A kind of data processing method and device of real-time estimate airport security number |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107704971A (en) | 2018-02-16 |
Family
ID=61176394
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711027543.2A Pending CN107704971A (en) | 2017-10-27 | 2017-10-27 | A kind of data processing method and device of real-time estimate airport security number |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107704971A (en) |
-
2017
- 2017-10-27 CN CN201711027543.2A patent/CN107704971A/en active Pending
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020015104A1 (en) * | 2018-07-18 | 2020-01-23 | 平安科技(深圳)有限公司 | Method, apparatus, computer device, and storage medium for predicting flow rate of passengers presenting security risk |
CN109002988B (en) * | 2018-07-18 | 2023-10-27 | 平安科技(深圳)有限公司 | Risk passenger flow prediction method, apparatus, computer device and storage medium |
CN109002988A (en) * | 2018-07-18 | 2018-12-14 | 平安科技(深圳)有限公司 | Risk passenger method for predicting, device, computer equipment and storage medium |
CN109214612A (en) * | 2018-11-20 | 2019-01-15 | 广东机场白云信息科技有限公司 | One kind being based on XGBOOST airport traffic spatial and temporal distributions prediction technique |
CN111435486A (en) * | 2019-01-15 | 2020-07-21 | 阿里巴巴集团控股有限公司 | Ticket checking resource allocation method and device |
CN111435486B (en) * | 2019-01-15 | 2024-04-02 | 阿里巴巴集团控股有限公司 | Ticket checking resource allocation method and device |
CN112016731B (en) * | 2019-05-31 | 2024-02-27 | 杭州海康威视系统技术有限公司 | Queuing time prediction method and device and electronic equipment |
CN112016731A (en) * | 2019-05-31 | 2020-12-01 | 杭州海康威视系统技术有限公司 | Queuing time prediction method and device and electronic equipment |
CN110222905A (en) * | 2019-06-14 | 2019-09-10 | 智慧足迹数据科技有限公司 | A kind of method and device for predicting flow of the people |
CN110717616A (en) * | 2019-08-30 | 2020-01-21 | 中国南方航空股份有限公司 | Civil aviation unit human resource prediction method, electronic equipment and storage medium |
CN110675626A (en) * | 2019-09-27 | 2020-01-10 | 汉纳森(厦门)数据股份有限公司 | Traffic accident black point prediction method, device and medium based on multidimensional data |
CN110675626B (en) * | 2019-09-27 | 2021-01-12 | 汉纳森(厦门)数据股份有限公司 | Traffic accident black point prediction method, device and medium based on multidimensional data |
CN110751340A (en) * | 2019-10-29 | 2020-02-04 | 广东机场白云信息科技有限公司 | Method and system for forecasting and analyzing pedestrian flow in airport security check area |
CN110837928A (en) * | 2019-11-05 | 2020-02-25 | 沈阳民航东北凯亚有限公司 | Method and device for predicting security check time |
CN110807558A (en) * | 2019-11-06 | 2020-02-18 | 深圳微品致远信息科技有限公司 | Method and device for predicting departure taxi time based on deep neural network |
CN111191114A (en) * | 2019-11-26 | 2020-05-22 | 恒大智慧科技有限公司 | Cold scenic spot recommendation method and device and storage medium |
CN111832929B (en) * | 2020-07-09 | 2023-12-12 | 民航成都信息技术有限公司 | Dynamic scheduling method and system for airport check-in machine |
CN111832820A (en) * | 2020-07-09 | 2020-10-27 | 飞友科技有限公司 | Method and system for predicting pedestrian flow of each gate at airport |
CN111832929A (en) * | 2020-07-09 | 2020-10-27 | 民航成都信息技术有限公司 | Dynamic scheduling method and system for airport check-in |
CN113298306A (en) * | 2021-05-24 | 2021-08-24 | 建信金融科技有限责任公司 | Method and device for predicting number of people at dinner, electronic equipment and medium |
CN114897205A (en) * | 2022-03-07 | 2022-08-12 | 中国民航工程咨询有限公司 | Target airport characteristic value prediction method and computer equipment |
CN115130783A (en) * | 2022-07-29 | 2022-09-30 | 中国工商银行股份有限公司 | Method and device for predicting network queuing information |
CN117809400A (en) * | 2023-12-29 | 2024-04-02 | 厦门民航凯亚有限公司 | Intelligent security check passenger flow monitoring system suitable for terminal building |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107704971A (en) | A kind of data processing method and device of real-time estimate airport security number | |
Zhang et al. | Market power and its determinants in the Chinese airline industry | |
CN107086935B (en) | People flow distribution prediction method based on WIFI AP | |
CN109214719A (en) | A kind of system and method for the marketing inspection analysis based on artificial intelligence | |
CN109102157A (en) | A kind of bank's work order worksheet processing method and system based on deep learning | |
CN116187640B (en) | Power distribution network planning method and device based on grid multi-attribute image system | |
CN104156805B (en) | Leg running time calculation method based on probability distribution | |
CN114943356B (en) | Short-time demand integrated prediction method for airport arrival passenger to take taxi | |
Ripoll-Zarraga et al. | Exploring the reasons for efficiency in Spanish airports | |
CN104143170A (en) | Low-altitude rescue air traffic dispatching command system and dispatching command method thereof | |
CN110489556A (en) | Quality evaluating method, device, server and storage medium about follow-up record | |
CN113704389A (en) | Data evaluation method and device, computer equipment and storage medium | |
CN116432806A (en) | Rolling prediction method and system for flight ground guarantee node time | |
CN115271227A (en) | Resource scheduling method in cloud environment | |
CN111144673A (en) | Method, device and equipment for evaluating structure of organization personnel and computer readable medium | |
CN116011653A (en) | Method, device and storage medium for predicting opening quantity of cabinet-to-cabinet table | |
Zhang et al. | Research on improvement and optimisation of modelling method of China’s civil aircraft market demand forecast model | |
CN112948115B (en) | Cloud workflow scheduler pressure prediction method based on extreme learning machine | |
CN115689201A (en) | Multi-criterion intelligent decision optimization method and system for enterprise resource supply and demand allocation | |
Novrisal et al. | Simulation of departure terminal in Soekarno-Hatta International airport | |
CN114925663A (en) | Intelligent ticket forming method and system based on overhaul ticket | |
CN113610402A (en) | Land ecological bearing capacity assessment method based on image analysis and related equipment | |
Kaur et al. | An innovative multi-criteria decision-making framework for assessing India's airport operating efficiency | |
CN110532418A (en) | A kind of high net value industry AI intelligent design system | |
Chen et al. | Bayesian Neural Network-Based Demand Forecasting for Express Transportation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180216 |
|
RJ01 | Rejection of invention patent application after publication |