CN114118508A

CN114118508A - OD market aviation passenger flow prediction method based on space-time convolution network

Info

Publication number: CN114118508A
Application number: CN202110878862.4A
Authority: CN
Inventors: 吴薇薇; 林思奇; 季灵; 张皓瑜
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2021-08-02
Filing date: 2021-08-02
Publication date: 2022-03-01

Abstract

The invention discloses an OD market aviation passenger flow prediction method based on a space-time convolutional network, belonging to the technical field of big data. The method comprises: counting, sorting and storing the aviation passenger flow data from each departure airport in a regional multi-airport system to a destination airport; building an OD market aviation passenger flow prediction model based on the space-time convolutional network; determining the optimal hyper-parameter settings of the prediction model under different data sets; and predicting the aviation passenger flow of a plurality of OD markets from historical aviation passenger flow data. The invention solves the technical problems of predicting aviation passenger flow with a space-time convolutional network and of simultaneously predicting the passenger flow of a plurality of OD markets from a plurality of departure airports in the same region to the same destination airport. It applies the space-time convolutional network to aviation passenger flow prediction for the first time and provides a new approach to aviation passenger flow prediction.

Description

OD market aviation passenger flow prediction method based on space-time convolution network

Technical Field

The invention belongs to the technical field of big data, and particularly relates to an OD market aviation passenger flow prediction method based on a space-time convolutional network.

Background

Accurate prediction of air passenger flow has long been a primary task in the overall planning of every party in the air transportation industry. Based on such predictions, an airline can determine whether an OD market is profitable or whether the existing capacity supply should be adjusted, and an airport can assess in a timely manner how well its current infrastructure capacity matches passenger flow. However, passenger flow is influenced by many factors in the air transportation market and is uncertain and hard to predict, so how to predict OD market air passenger flow scientifically and accurately has long been a research hot spot in civil aviation transportation.

According to existing research at home and abroad, traffic flow prediction methods in the transportation field can be divided, by model construction mechanism, into parametric methods and non-parametric methods. In a parametric method, the model structure is determined from theoretical hypotheses and statistical analysis, and the model parameters are estimated from relevant historical time-series data. Relevant models include least-squares regression models, autoregressive moving average models, autoregressive distributed lag models, time-varying parameter models, vector autoregression, grey prediction models, and the like. In addition, from the perspective of system dynamics, a dynamic sub-model can be constructed for each factor that influences passenger flow changes, and an aviation passenger flow prediction is obtained through system simulation. Because parametric prediction models are based on econometric modeling, they attempt to establish and fit a concrete model that accurately reflects passenger flow changes; however, such models can hardly represent non-linear relations among causal variables and generalize poorly to data with severe fluctuations.

Non-parametric methods based on machine learning are also widely applied to market passenger flow prediction. Relevant techniques include BP neural networks, support vector regression, random forest regression, adaptive fuzzy neural networks, wavelet neural networks, and combined prediction models. Although combined prediction models outperform single prediction models, most of the models built have shallow structures and cannot accurately mine the deep information behind large-scale data, and machine learning methods remain limited in processing high-dimensional non-linear traffic data, which restricts the further development of machine learning prediction.

In recent years, against the background of explosive growth in traffic data, deep learning, which developed from artificial neural network research, has provided a new way to further improve passenger flow prediction accuracy. A traditional neural network takes one-dimensional vector data as input. A convolutional neural network differs in that it takes multi-dimensional matrix data as input and replaces the original neuron connection pattern with convolution operations. Multi-layer convolution can automatically extract correlation features among data with spatial characteristics, reducing the errors caused by manually constructed features, so the convolutional neural network can be transferred to the prediction of aviation passenger flow data with space-time characteristics.

In summary, conventional aviation passenger flow prediction methods mostly focus on parametric methods, shallow machine learning and combined prediction. Model construction in parametric methods is influenced to a great extent by subjective human factors, and the accuracy of parameter estimation affects the final prediction precision; shallow machine learning and combined prediction may fall into overfitting during prediction. In addition, existing aviation passenger flow prediction considers only the historical data of a single OD market in the time dimension, ignores the influence on the predicted OD market of spatially adjacent airports with linkage effects, and cannot predict OD market passenger flow from the perspective of a region with multiple airports.

Disclosure of Invention

The invention aims to provide an OD market aviation passenger flow prediction method based on a space-time convolutional network, which solves the technical problems of predicting aviation passenger flow with a space-time convolutional network and of simultaneously predicting the passenger flow of a plurality of OD markets from a plurality of departure airports in the same region to the same destination airport.

In order to achieve the purpose, the invention adopts the following technical scheme:

An OD market aviation passenger flow prediction method based on a space-time convolutional network comprises the following steps:

selecting a regional multi-airport system through a client server, and counting, sorting and storing the aviation passenger flow data from each departure airport in the regional multi-airport system to a destination airport;

obtaining, by a prediction model server, the aviation passenger flow data, preprocessing the data, constructing an OD passenger flow grid map and external influence factor feature vectors, and building an OD market aviation passenger flow prediction model based on the space-time convolutional network, thereby forming the corresponding training set of the prediction model;

wherein the OD market refers to the air passenger transportation market from an origin airport to a destination airport;

determining the optimal hyper-parameter settings of the prediction model under different data sets by adjusting the data features and the network structure of the prediction model, wherein the hyper-parameters comprise network structure hyper-parameters and data feature hyper-parameters, the network structure hyper-parameters comprise the number of convolutional layers, the convolution kernel size and the number of convolution kernels, and the data feature hyper-parameters comprise the sample length of the trend segment and the sample length of the periodic segment;

and simultaneously predicting, by a central server, the aviation passenger flow of a plurality of OD markets according to historical aviation passenger flow data.

Preferably, a manager inputs, through the client server, the three-letter code of each departure airport in the multi-airport system, the three-letter code of the destination airport, and the monthly passenger flow of each OD market;

the client server automatically sorts the records by time to generate passenger flow time-series data for each OD market, and transmits the time-series data to the database module for storage.
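As a rough illustration of the sorting step, the sketch below groups monthly records by OD market and orders each group by time. The record layout `(origin, destination, "YYYY-MM", passengers)`, the function name `build_od_series` and the sample passenger counts are assumptions made for illustration; the method does not specify them.

```python
from collections import defaultdict


def build_od_series(records):
    """Group (origin, dest, 'YYYY-MM', passengers) records into per-OD time series."""
    series = defaultdict(list)
    for origin, dest, month, passengers in records:
        series[(origin, dest)].append((month, passengers))
    # 'YYYY-MM' strings sort chronologically, so a plain sort orders each series by time.
    return {od: sorted(points) for od, points in series.items()}


# Hypothetical sample records; the passenger counts are made up.
records = [
    ("NKG", "CAN", "2010-02", 31000),
    ("HGH", "CAN", "2010-01", 52000),
    ("NKG", "CAN", "2010-01", 30000),
]
od_series = build_od_series(records)
```

Keying the result by the `(origin, destination)` pair keeps one series per OD market, which is the unit the later grid map is built from.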

Preferably, preprocessing the aviation passenger flow data comprises wavelet threshold denoising and data normalization, wherein the wavelet threshold denoising uses soft thresholding with a two-level decomposition;
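A minimal sketch of soft-threshold denoising with a two-level decomposition follows. The wavelet family and the threshold value are not stated in the method, so the Haar wavelet, the caller-supplied threshold `thr` and the requirement that the series length be a multiple of 4 are all assumptions of this sketch.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one Haar level."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr; zero it if |c| <= thr."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def denoise_two_level(signal, thr):
    """Decompose twice, soft-threshold both detail bands, then reconstruct."""
    a1, d1 = haar_step(signal)
    a2, d2 = haar_step(a1)
    a1_hat = haar_inverse(a2, soft_threshold(d2, thr))
    return haar_inverse(a1_hat, soft_threshold(d1, thr))
```

With `thr = 0` the series is reconstructed unchanged; larger thresholds remove more of the high-frequency detail.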

Preferably, constructing the OD passenger flow grid map and the external influence factor feature vectors specifically comprises:

step A1: define $O = \{o_1, o_2, \ldots, o_N\}$ as the set of departure airports, where $N$ is the total number of departure airports; $d$ denotes the destination airport; $\langle o, d \rangle$ denotes an OD market, with $o \in O$;

the prediction model server maps the $N$ departure airports onto an $I \times J$ grid map $R$ according to the longitude, latitude and relative geographic distribution of the departure airports in each regional multi-airport system, where grid $(i, j)$ denotes any cell of the grid map $R$, $i$ is the row number and $j$ is the column number, and the value of grid $(i, j)$ is the passenger flow in month $t$ from the departure airport $o$ located at that position in the system to the destination airport $d$ outside the system;

a two-dimensional tensor $X_t \in \mathbb{R}^{I \times J}$ represents the grid map of OD passenger flow in month $t$ from the departure airports to the same destination airport outside the system;

step A2: define two external influence factor features, namely the month attribute and whether the month contains holidays, and encode them as a 0-1 vector, that is, the month attribute of the predicted month and whether it contains holidays are each represented by 0 or 1, where the first 12 bits correspond to January through December and the 13th bit indicates whether the month contains holidays;
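The encoding in step A2 can be sketched as a small function. The function name `external_vector` and the choice to pass the holiday flag in as an argument are illustrative assumptions; how holidays are determined is not described here.

```python
def external_vector(month, has_holiday):
    """13-element 0-1 vector: positions 1-12 one-hot month, position 13 holiday flag."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    vec = [0] * 13
    vec[month - 1] = 1                   # one-hot month attribute
    vec[12] = 1 if has_holiday else 0    # 13th bit: month contains a holiday
    return vec


# October, flagged as containing a holiday.
e_t = external_vector(10, True)
# e_t == [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
```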

Preferably, the prediction model server builds the OD market aviation passenger flow prediction model based on the space-time convolutional network using the functional modules of the Keras neural network library for Python; given $m$ known historical observations $X_M = \{X_t \mid t = 1, 2, \ldots, m\}$, the model predicts the aviation passenger flow of several OD markets over the next $k$ months, $X_K = \{X_t \mid t = m+1, m+2, \ldots, m+k\}$;

The optimal hyper-parameter settings of the prediction model are determined by varying two kinds of hyper-parameters, those of the network structure and those of the data features; the prediction model is trained and optimized with the Adam optimizer, and an early-stopping strategy is adopted during training to avoid overfitting.

Preferably, the specific steps by which the central server predicts the aviation passenger flow of the plurality of OD markets comprise:

step B1: extracting segments of the OD passenger flow grid map; according to the dependency of passenger flow on the time dimension, two time segments are extracted at different time intervals relative to the prediction time point, namely the trend grid-map segment $X_{Tre}$ and the periodic grid-map segment $X_{Per}$;

the lengths of the two segments, $l_{tre}$ and $l_{per}$, are adjustable data feature hyper-parameters of the prediction model and represent the sample length selected for the trend segment and for the periodic segment, respectively;
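The sketch below shows one way the two segments could be selected. It assumes that the trend segment is the $l_{tre}$ months immediately before the prediction month and that the periodic segment is the same calendar month in each of the preceding $l_{per}$ years (a period of 12 months). Both choices, and the function name `extract_segments`, are assumptions of this sketch rather than the stated extraction form.

```python
def extract_segments(frames, t, l_tre, l_per, period=12):
    """Return (trend, periodic) lists of frames used to predict index t.

    frames is a list of grid maps indexed by month (0-based).
    """
    if t - l_tre < 0 or t - l_per * period < 0:
        raise ValueError("not enough history for the requested lengths")
    # Most recent l_tre months, oldest first.
    trend = [frames[t - k] for k in range(l_tre, 0, -1)]
    # Same calendar month in the previous l_per years, oldest first.
    periodic = [frames[t - k * period] for k in range(l_per, 0, -1)]
    return trend, periodic


# Integers stand in for grid maps so the selected indices are visible.
frames = list(range(40))
trend, periodic = extract_segments(frames, t=36, l_tre=3, l_per=2)
# trend == [33, 34, 35], periodic == [12, 24]
```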

step B2: constructing the space-time convolutional network; based on the two extracted grid-map segments with different time intervals, two space-time convolutional network branches with the same structure are constructed to capture the space-time features of OD passenger flow, each branch consisting of $S+1$ ($S \geq 1$) convolutional layers;

taking the space-time convolutional network branch of the trend part as an example, let $(X_{Tre})^{(0)}$ denote the trend grid-map segment; the first convolutional layer $C_1$ converts $(X_{Tre})^{(0)}$ into a new tensor $(X_{Tre})^{(1)}$ using the learnable parameters of that layer; after convolutional layer $C_1$, a further $(S-1)$ convolutional layers are stacked in the same way, and after the $S$-th convolutional layer a convolutional layer $C_{S+1}$ containing only one convolution kernel produces the output of the trend part, $(X_{Tre})^{(S+1)}$; similarly, the space-time convolutional network branch of the periodic part is constructed by the same operations to obtain the output $(X_{Per})^{(S+1)}$;
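To make the layer operation concrete, here is a single-channel sketch of one convolutional layer. Zero padding that keeps the output at the input's $I \times J$ size, the ReLU activation and the function name `conv2d_same` are assumptions of this sketch. The layers described above use several kernels each, which would repeat this operation per kernel and sum over input channels.

```python
def conv2d_same(x, kernel, bias=0.0):
    """Slide one kernel over grid x with zero padding, then apply ReLU (assumed)."""
    rows, cols = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = bias
            for u in range(kh):
                for v in range(kw):
                    r, c = i + u - ph, j + v - pw
                    # Cells outside the grid count as zero (zero padding).
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += kernel[u][v] * x[r][c]
            out[i][j] = max(acc, 0.0)
    return out


# A 4 x 4 grid of ones and a 3 x 3 kernel of ones: each output cell counts
# how many neighbouring cells (including itself) lie inside the grid.
grid = [[1.0] * 4 for _ in range(4)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
feature_map = conv2d_same(grid, ones_kernel)
```

Keeping the output the same size as the input means the result of the final single-kernel layer can be compared cell by cell with the flow grid.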

step B3: constructing the external influence factor network; the external influence factors considered by the prediction model comprise the month attribute and whether the month contains holidays; the corresponding external influence factor feature vector $E_t$ is obtained for the $t$-th prediction time point, and a two-layer fully connected neural network branch is used as the external influence factor network; the first layer can be regarded as an embedding layer that quantifies the external factors and feeds them into the prediction model, and the second layer maps the features from the first layer to a high-dimensional tensor whose size equals that of $X_t$, so that it can be fused with the output of the space-time convolutional network, giving the output $X_{Ext}$;
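The two-layer branch of step B3 can be sketched in a few lines. The hidden width, the ReLU activation after the first layer, the weight values and the names `dense` and `external_branch` are assumptions chosen only so that the arithmetic is easy to follow.

```python
def dense(vec, weights, bias):
    """Fully connected layer y = W x + b, with weights given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def external_branch(e_t, w1, b1, w2, b2, rows, cols):
    """Map the 13-bit vector E_t to a rows x cols grid the same size as X_t."""
    hidden = [max(h, 0.0) for h in dense(e_t, w1, b1)]  # embedding layer, ReLU assumed
    flat = dense(hidden, w2, b2)                        # rows * cols outputs
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# Toy weights (hypothetical): hidden width 2, output grid 2 x 2.
e_t = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
w1, b1 = [[0.5] * 13, [0.5] * 13], [0.0, 0.0]
w2, b2 = [[1.0, 1.0] for _ in range(4)], [0.0] * 4
x_ext = external_branch(e_t, w1, b1, w2, b2, rows=2, cols=2)
# x_ext == [[2.0, 2.0], [2.0, 2.0]]
```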

step B4: fusing the results to obtain the prediction; the outputs $(X_{Tre})^{(S+1)}$ and $(X_{Per})^{(S+1)}$ of the space-time convolutional network are assigned different weight matrices in the form of learnable parameters and aggregated into a weighted output, calculated as:

$$X_{Con} = W_{Tre} \circ (X_{Tre})^{(S+1)} + W_{Per} \circ (X_{Per})^{(S+1)}$$

where $\circ$ denotes the Hadamard product, and $W_{Tre}$ and $W_{Per}$ are the weights of the trend part and the periodic part, that is, the degree to which each of the two outputs influences the final prediction; the sum of $X_{Con}$ and the external influence factor network output $X_{Ext}$ is then mapped to $[-1, 1]$ by the tanh function to obtain the final prediction $X_K$, calculated as:

$$X_K = \tanh(X_{Con} + X_{Ext})$$
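The fusion in step B4 is element-wise, which the following sketch spells out with nested lists. The function name `fuse` and the toy numbers are illustrative assumptions; in the model the weight matrices are learned rather than fixed.

```python
import math


def fuse(x_tre, x_per, w_tre, w_per, x_ext):
    """X_K = tanh(W_Tre o X_Tre + W_Per o X_Per + X_Ext), element by element."""
    rows, cols = len(x_tre), len(x_tre[0])
    return [
        [
            math.tanh(
                w_tre[i][j] * x_tre[i][j]
                + w_per[i][j] * x_per[i][j]
                + x_ext[i][j]
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Toy 1 x 2 example with made-up values.
x_k = fuse(
    x_tre=[[1.0, -2.0]],
    x_per=[[0.5, 1.0]],
    w_tre=[[0.5, 0.5]],
    w_per=[[1.0, 1.0]],
    x_ext=[[0.0, 0.0]],
)
# First cell: tanh(0.5 * 1.0 + 1.0 * 0.5) = tanh(1.0); second cell: tanh(0.0) = 0.0
```

Because tanh is bounded, every cell of the fused output lies in $[-1, 1]$, matching the range of data scaled to that interval.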

the prediction model is trained with the objective of minimizing the mean squared error between the prediction matrix $X_K$ and the true-value matrix, where $\theta$ denotes all learnable parameters of the model.
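One way to write this objective explicitly is given below. The symbol $Y_t$ for the true-value grid of month $t$ and the normalisation by $k \cdot I \cdot J$ are assumptions introduced here for illustration; the predicted grids $X_t$, $t = m+1, \ldots, m+k$, are those collected in $X_K$.

```latex
\mathcal{L}(\theta) = \frac{1}{k\,I\,J} \sum_{t=m+1}^{m+k} \sum_{i=1}^{I} \sum_{j=1}^{J}
  \left( [X_t]_{ij} - [Y_t]_{ij} \right)^2 ,
\qquad
\theta^{*} = \arg\min_{\theta} \mathcal{L}(\theta)
```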

The OD market aviation passenger flow prediction method based on a space-time convolutional network of the invention solves the technical problems of predicting aviation passenger flow with a space-time convolutional network and of simultaneously predicting the passenger flow of a plurality of OD markets from a plurality of departure airports in the same region to the same destination airport. The invention treats the convolution kernels of the convolutional network as the medium for capturing the spatial correlation of OD passenger flow, so that the multiple convolution kernels in one convolutional layer can capture the complex latent feature relations of OD passenger flow in the spatial dimension, which improves the prediction effect. The method jointly considers the time dependence and spatial correlation of OD passenger flow and the effect of external influence factors. With RMSE as the evaluation criterion and the OD passenger flow data from 16 major airports in the Yangtze River Delta to Guangzhou Baiyun International Airport as an example, the prediction model fits better than ARIMA (autoregressive integrated moving average model), SVR (support vector regression), the Elman neural network and LSTM (long short-term memory network).

Drawings

FIG. 1 is a general flow chart of the method for predicting the air passenger flow of the OD market based on the space-time convolution network of the present invention;

FIG. 2 is a model framework for predicting the air passenger flow volume in the OD market based on the space-time convolution network;

FIG. 3 is a schematic diagram of the construction of an OD passenger flow grid in an embodiment of the invention;

FIG. 4 shows the passenger flow fitting results of the model proposed in the embodiment of the present invention and of other models in different OD markets;

FIG. 5 shows the passenger flow fitting results of the model proposed in the embodiment of the present invention and of other models at different prediction intervals.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in FIGS. 1-5, an OD market aviation passenger flow prediction method based on a space-time convolutional network includes:

The destination airport is an airport outside the multi-airport system.

The data set refers to the aviation passenger flow data from each departure airport in the regional multi-airport system to the destination airport (that is, the aviation passenger flow of a plurality of OD markets); different regional multi-airport systems form different data sets, but the data type is always passenger flow volume.

Hyper-parameters are model parameters that are set manually before training of the prediction model begins.

"Departure airport" refers to all or some of the airports within the selected regional multi-airport system, such as all airports in the Yangtze River Delta region.

"Destination airport" refers to an airport outside the selected regional multi-airport system; only one such airport is selected, for example Beijing Capital International Airport.

Preferably, preprocessing the aviation passenger flow data comprises wavelet threshold denoising and data normalization, wherein the wavelet threshold denoising uses soft thresholding with a two-level decomposition.
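The normalization scheme is not specified. Since the model's final output passes through tanh, one plausible choice, assumed here, is min-max scaling to $[-1, 1]$ with the inverse transform applied to predictions. The function names are hypothetical.

```python
def fit_minmax(values):
    """Return the (min, max) of the training values used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be scaled")
    return lo, hi


def scale(value, lo, hi):
    """Map a raw passenger count into [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def unscale(scaled, lo, hi):
    """Map a model output in [-1, 1] back to a passenger count."""
    return (scaled + 1.0) / 2.0 * (hi - lo) + lo


# Made-up passenger counts for illustration.
lo, hi = fit_minmax([100.0, 200.0, 300.0])
```

Fitting `lo` and `hi` on the training data only, and reusing them for the test data, avoids leaking test-period information into training.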

As shown in FIG. 3, each grid cell is assigned in advance to a departure airport $o$, and the value of grid $(i, j)$ is the passenger flow in a given month from the departure airport $o$ represented by that cell to the destination airport $d$; the grid map is thus equivalent to a flow matrix that serves as the input to the model.

In this embodiment, the 16 departure airports are mapped to a 4 × 4 grid map according to the longitude, latitude and relative geographic distribution of the departure airports in the Yangtze River Delta multi-airport system; the mapped positions of the departure airports in the grid map are shown in the lower left corner of FIG. 3.
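The sketch below fills a 4 × 4 flow matrix for one month from a cell layout. The airport labels `A01` to `A16`, the row-major layout and the flow values are placeholders; the actual positions are those shown in FIG. 3 and are not reproduced here.

```python
def build_grid(flows, layout, rows, cols):
    """Place each airport's monthly flow into its assigned (row, col) cell."""
    grid = [[0.0] * cols for _ in range(rows)]
    for airport, (i, j) in layout.items():
        grid[i][j] = float(flows.get(airport, 0.0))
    return grid


# Placeholder airports and a row-major layout (hypothetical, not the FIG. 3 mapping).
airports = [f"A{n:02d}" for n in range(1, 17)]
layout = {code: divmod(k, 4) for k, code in enumerate(airports)}
# Made-up monthly flows to the destination airport.
flows = {code: 1000 * (k + 1) for k, code in enumerate(airports)}

x_t = build_grid(flows, layout, rows=4, cols=4)
```

Stacking one such matrix per month gives the sequence of grid maps from which the trend and periodic segments are later taken.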

Step A2: define two external influence factor features, namely the month attribute and whether the month contains holidays, and encode them as a 0-1 vector, that is, the month attribute of the predicted month and whether it contains holidays are each represented by 0 or 1, where the first 12 bits correspond to January through December and the 13th bit indicates whether the month contains holidays; for example, when the passenger flow of October 2018 is predicted, the vector is [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1];

Preferably, the prediction model server builds the OD market aviation passenger flow prediction model based on the space-time convolutional network using the functional modules of the Keras neural network library for Python; the framework of the prediction model is shown in FIG. 2;

given $m$ known historical observations $X_M = \{X_t \mid t = 1, 2, \ldots, m\}$, the model predicts the aviation passenger flow of several OD markets over the next $k$ months, $X_K = \{X_t \mid t = m+1, m+2, \ldots, m+k\}$;

Early-stopping strategy: 1) divide the original training data set into a training set and a validation set; 2) train the model on the training set and compute its error on the validation set in every epoch; 3) stop training when the error of the model on the validation set is worse than the previous training result; 4) use the parameters from the last iteration result as the final parameters of the model.

That is, the best training effect is achieved by balancing training time against generalization error, and model overfitting is avoided.
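The four steps of the early-stopping strategy can be sketched as a generic loop. The callables `train_epoch` and `validate` are hypothetical stand-ins for the real training and validation code. Returning the parameters from the epoch just before the validation error rose is one reading of "the last iteration result" and is an assumption of this sketch.

```python
def train_with_early_stop(train_epoch, validate, max_epochs):
    """train_epoch() returns the parameters after one more epoch;
    validate(params) returns the validation error for those parameters."""
    kept_params, prev_error = None, float("inf")
    for epoch in range(1, max_epochs + 1):
        params = train_epoch()
        error = validate(params)
        if error > prev_error:
            # Validation error got worse: stop and keep the earlier parameters.
            return kept_params, epoch - 1
        kept_params, prev_error = params, error
    return kept_params, max_epochs


# Simulated run: the validation error rises at the fourth epoch (made-up values).
errors = iter([0.9, 0.7, 0.6, 0.65, 0.5])
state = {"epoch": 0}


def fake_train_epoch():
    state["epoch"] += 1
    return state["epoch"]      # the epoch number stands in for model parameters


def fake_validate(params):
    return next(errors)


params, epochs_kept = train_with_early_stop(fake_train_epoch, fake_validate, 100)
# params == 3 and epochs_kept == 3
```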

In this embodiment, the learning rate is fixed at 0.002, the batch size is 3 and the number of training epochs is 100. Taking the OD passenger flow data from 16 major airports in the Yangtze River Delta to Guangzhou Baiyun International Airport as an example, the number of convolutional layers in the network structure is chosen from {1, 2, 3}, the convolution kernel size from {(2, 2), (3, 3)} and the number of convolution kernels from {16, 32, 64, 128}, and the sample lengths of the trend segment and the periodic segment in the data features are chosen from {2, 3, 4}. The data from January 2010 to December 2017 are used as training data, the data from January 2018 to December 2018 as test data, and the prediction interval is 1 month.
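The candidate settings listed in this embodiment can be enumerated as follows. How the search is actually carried out is not described, so exhaustive enumeration, treating the trend and periodic lengths as independent choices, and the dictionary key names are all assumptions of this sketch.

```python
from itertools import product

# Candidate values taken from the embodiment; key names are hypothetical.
search_space = {
    "conv_layers": [1, 2, 3],
    "kernel_size": [(2, 2), (3, 3)],
    "n_kernels": [16, 32, 64, 128],
    "l_tre": [2, 3, 4],
    "l_per": [2, 3, 4],
}
# Settings held fixed across all runs in the embodiment.
fixed = {"learning_rate": 0.002, "batch_size": 3, "epochs": 100}

configs = [
    dict(zip(search_space, combo), **fixed)
    for combo in product(*search_space.values())
]
# 3 * 2 * 4 * 3 * 3 = 216 candidate configurations under these assumptions
```

Each entry of `configs` could then be trained and scored on held-out data, keeping the configuration with the lowest error.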

With the absolute error Δ, MAE (mean absolute error) and RMSE (root mean square error) as evaluation criteria, the comparison of aviation passenger flow predictions by different algorithm models is shown in Table 1, the fitting results of the proposed model and other models in different OD markets are shown in FIG. 4, and the fitting results at different prediction intervals are shown in FIG. 5:

TABLE 1
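For reference, the two aggregate criteria named above can be computed as in this sketch. The sample values are made up and are not results from the embodiment.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Made-up values for illustration only.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
sample_mae = mae(actual, predicted)    # (10 + 10 + 30) / 3
sample_rmse = rmse(actual, predicted)  # sqrt((100 + 100 + 900) / 3)
```

RMSE penalizes the single large miss (30) more heavily than MAE does, which is why the two criteria can rank models differently.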

The prediction model with the determined hyper-parameter settings is then used to predict the aviation passenger flow of a plurality of OD markets over a period of time from historical aviation passenger flow data.

Step B4: fusing the results to obtain the prediction; the outputs $(X_{Tre})^{(S+1)}$ and $(X_{Per})^{(S+1)}$ of the space-time convolutional network are assigned different weight matrices in the form of learnable parameters and aggregated into a weighted output, calculated as:

$$X_{Con} = W_{Tre} \circ (X_{Tre})^{(S+1)} + W_{Per} \circ (X_{Per})^{(S+1)}$$

where $\circ$ denotes the Hadamard product, and $W_{Tre}$ and $W_{Per}$ are the weights of the trend part and the periodic part, that is, the degree to which each of the two outputs influences the final prediction; the sum of $X_{Con}$ and the external influence factor network output $X_{Ext}$ is then mapped to $[-1, 1]$ by the tanh function to obtain the final prediction $X_K$, calculated as:

$$X_K = \tanh(X_{Con} + X_{Ext})$$

The prediction model is trained with the objective of minimizing the mean squared error between the prediction matrix $X_K$ and the true-value matrix, where $\theta$ denotes all learnable parameters of the model.

In the present invention, any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. An OD market aviation passenger flow prediction method based on a space-time convolutional network, characterized by comprising the following steps:

obtaining, by a prediction model server, aviation passenger flow data, preprocessing the data, constructing an OD passenger flow grid map and external influence factor feature vectors, and building an OD market aviation passenger flow prediction model based on the space-time convolutional network, thereby forming the corresponding training set of the prediction model;

2. The space-time convolutional network-based OD market air passenger flow prediction method of claim 1, characterized in that: a manager inputs, through a client server, the three-letter code of each departure airport in the multi-airport system, the three-letter code of the destination airport, and the monthly passenger flow of each OD market;

3. The space-time convolutional network-based OD market air passenger flow prediction method of claim 1, characterized in that: preprocessing the aviation passenger flow data comprises wavelet threshold denoising and data normalization, wherein the wavelet threshold denoising uses soft thresholding with a two-level decomposition.

4. The space-time convolutional network-based OD market air passenger flow prediction method of claim 1, characterized in that: constructing an OD passenger flow grid diagram and external influence factor characteristic vectors, which specifically comprise the following steps:

step A1: define $O = \{o_1, o_2, \ldots, o_N\}$ as the set of departure airports, where $N$ is the total number of departure airports; $d$ denotes the destination airport; $\langle o, d \rangle$ denotes an OD market, with $o \in O$;

step A2: define two external influence factor features, namely the month attribute and whether the month contains holidays, and encode them as a 0-1 vector, that is, the month attribute of the predicted month and whether it contains holidays are each represented by 0 or 1, where the first 12 bits correspond to January through December and the 13th bit indicates whether the month contains holidays.

5. The space-time convolutional network-based OD market air passenger flow prediction method of claim 4, characterized in that: the prediction model server builds the OD market aviation passenger flow prediction model based on the space-time convolutional network using the functional modules of the Keras neural network library for Python; given $m$ known historical observations $X_M = \{X_t \mid t = 1, 2, \ldots, m\}$, the model predicts the aviation passenger flow of several OD markets over the next $k$ months, $X_K = \{X_t \mid t = m+1, m+2, \ldots, m+k\}$;

6. The space-time convolutional network-based OD market air passenger flow prediction method of claim 5, characterized in that: the specific steps of predicting the aviation passenger flow of a plurality of OD markets by the central server comprise:

step B3: constructing the external influence factor network; the external influence factors considered by the prediction model comprise the month attribute and whether the month contains holidays; the corresponding external influence factor feature vector $E_t$ is obtained for the $t$-th prediction time point, and a two-layer fully connected neural network branch is used as the external influence factor network; the first layer can be regarded as an embedding layer that quantifies the external factors and feeds them into the prediction model, and the second layer maps the features from the first layer to a high-dimensional tensor whose size equals that of $X_t$, so that it can be fused with the output of the space-time convolutional network, giving the output $X_{Ext}$;

$$X_{Con} = W_{Tre} \circ (X_{Tre})^{(S+1)} + W_{Per} \circ (X_{Per})^{(S+1)};$$

$$X_K = \tanh(X_{Con} + X_{Ext});$$

the prediction model is trained with the objective of minimizing the mean squared error between the prediction matrix $X_K$ and the true-value matrix, where $\theta$ denotes all learnable parameters of the model.