CN111178978A - Air ticket price prediction method combining flight information and price sequence - Google Patents

Info

Publication number
CN111178978A
Authority
CN
China
Prior art keywords
price
flight
sequence
model
discrete
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911424222.5A
Other languages
Chinese (zh)
Inventor
张宇光
周煊
陈星�
李悦
范修伟
夏显茁
陈亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Civil Aviation Information Technology Co ltd
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN201911424222.5A
Publication of CN111178978A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Game Theory and Decision Science (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an air ticket price prediction method combining flight information and a price sequence. The method comprises: collecting historical flight features and a recent price sequence; distinguishing continuous flight features, discrete flight features, continuous price features and discrete price features; one-hot encoding the discrete flight features and the discrete price features; building a flight feature prediction model and a price sequence prediction model, each with a machine learning model; inputting the continuous flight features and the encoded discrete flight features into the flight feature prediction model to train it and optimize its flight weight α; inputting the continuous price features and the encoded discrete price features into the price sequence prediction model to train it and optimize its sequence weight β; and constructing a target prediction function from the flight predicted price output by the flight feature prediction model and the sequence predicted price output by the price sequence prediction model, combined with the optimized flight weight α and sequence weight β, to obtain the prediction result.

Description

Air ticket price prediction method combining flight information and price sequence
Technical Field
The invention relates to the field of computer data processing, in particular to a flight ticket price prediction method combining flight information and price sequences.
Background
In recent years, aircraft have become an important travel choice as a fast, convenient, safe and reliable means of transport. According to the latest data, in September 2019 domestic airports handled more than 400,000 departing flights, a year-on-year increase of more than 6%. The price of an air ticket is generally determined by each airline's pricing strategy and is influenced by passenger purchasing behavior, competition among airlines, and airline information, which makes it complex and volatile. Predicting air ticket price trends is therefore challenging. Passengers nevertheless want to know the price trend so that they can buy at a lower price. Because airlines typically treat their pricing strategies as trade secrets, it is difficult for passengers to estimate future price changes. A reasonable prediction of ticket prices can thus help passengers decide when to buy a ticket at a lower price.
Existing research on air ticket price prediction falls into two main branches, according to the information used to build the prediction model: methods based on price sequences and methods based on historical flight features. Price-sequence-based methods, which include the L + +. TS method, use the recent price sequence of the same flight and the recent price sequences of different flights to build a mathematical model of the price sequence over time. Methods based on historical flight features use information such as departure time, arrival time and the number of days from the departure date to build a mathematical model relating historical flight features to ticket price. Price-sequence-based methods can effectively predict short-term ticket prices from the recent price sequence. However, when the forecast date is far from the query date, they can fail because the recent price sequence is missing and errors accumulate. Methods based on historical flight features can predict long-term future ticket prices. However, because they do not incorporate recent price information, they are inferior to price-sequence-based methods at predicting short-term prices when recent prices fluctuate significantly.
Disclosure of Invention
The invention mainly aims to provide a method for predicting the price of an air ticket by combining flight information and a price sequence, and aims to overcome the problems.
In order to achieve the purpose, the invention provides a method for predicting the price of an air ticket by combining flight information and a price sequence, which comprises the following steps:
S10: collecting historical flight features and a recent price sequence; extracting continuous flight features and discrete flight features from the historical flight features, and one-hot encoding the discrete flight features; extracting continuous price features and discrete price features from the recent price sequence, and one-hot encoding the discrete price features;
S20: building a flight feature prediction model and a price sequence prediction model, each with a machine learning model;
S30: inputting the continuous flight features and the encoded discrete flight features into the flight feature prediction model to train it and optimize its flight weight α, and inputting the continuous price features and the encoded discrete price features into the price sequence prediction model to train it and optimize its sequence weight β;
S40: constructing a target prediction function from the flight predicted price $P_{static}$ output by the flight feature prediction model and the sequence predicted price $P_{dynamic}$ output by the price sequence prediction model, combined with the optimized flight weight α and the optimized sequence weight β, to obtain the prediction result.
Preferably, the target prediction function is specifically as follows:
[Formula image BDA0002353143070000031; not reproduced in the text]
where $P_{static}$ is the flight predicted price output by the flight feature prediction model and $P_{dynamic}$ is the sequence predicted price output by the price sequence prediction model.
Preferably, the machine learning model is a multilayer perceptron (MLP) neural network, built as a dynamic neural network with the deep learning framework PyTorch; it uses the multilayer perceptron as the model, the root mean square error (RMSE) as the loss function, and the Adam algorithm as the optimizer of the model's weight coefficients.
Preferably, the method of S30 includes:
S301: aggregating the continuous flight features and the discrete flight features and inputting them into the flight feature prediction model; aggregating the continuous price features and the discrete price features and inputting them into the price sequence prediction model;
S302: the flight feature prediction model uses a multilayer perceptron to represent the mapping function from flight features to flight price, and outputs the flight predicted price; the price sequence prediction model uses a multilayer perceptron to represent the mapping function from price sequence features to flight price, and outputs the sequence predicted price;
S303: computing, by the root mean square error (RMSE), the flight error between the flight predicted price and the actual price, and the sequence error between the sequence predicted price and the actual price;
S304: using the Adam algorithm as the optimizer to back-propagate the flight error and optimize the flight weight α of the flight feature prediction model, and using the Adam algorithm as the optimizer to back-propagate the sequence error and optimize the sequence weight β of the price sequence prediction model.
Preferably, the continuous flight features include at least flight duration, departure month, day of the month of departure, whether the departure day is a holiday, whether the departure day is a weekend, whether the flight crosses into the next day, number of stops, airport construction fee, fuel surcharge, tax, whether the flight is a shared flight, and number of days from the departure date.
Preferably, in addition to the contents of the continuous flight features, the continuous price features include at least the sequence of prices near the query date and, for each price in the sequence, the number of days from the departure date.
Preferably, the discrete price features are the same as the discrete flight features and include at least departure time, arrival time, departure airport, arrival airport, airline, and actual carrier airline; the departure time and arrival time are one-hot encoded by time segment, and the other discrete features are one-hot encoded directly.
Preferably, the machine learning model may instead be a CART regression tree, a type of decision tree.
Preferably, the recent price sequence consists of the sequence of lowest prices of flight F over the T days before the query date Q: $\{P_{Q-T}, P_{Q-(T-1)}, P_{Q-(T-2)}, \ldots, P_{Q-3}, P_{Q-2}, P_{Q-1}\}$.
Preferably, the segmented one-hot encoding of the departure time and the arrival time is performed as follows: the 24 hours of a day are divided into four periods, [0:00-10:00], [10:00-14:00], [14:00-19:00] and [19:00-24:00], and the departure time and the arrival time are each one-hot encoded according to the period in which they fall.
Preferably, the multilayer perceptron comprises:
an input layer with a number of neurons determined by the actual route; in the flight feature prediction model it receives the continuous flight features and the encoded discrete flight features, and in the price sequence prediction model it receives the continuous price features and the encoded discrete price features;
a first hidden layer $h_1$ with 32 neurons, each connected to the input layer, obtained from the input layer by the nonlinear transformation $h_1 = \mathrm{ReLU}(w_1 x + b_1)$, where $w_1$ is the connection coefficient, $b_1$ is the bias, and ReLU is the rectified linear activation function; in the flight feature prediction model, $x$ represents the continuous flight features and the encoded discrete flight features; in the price sequence prediction model, $x$ represents the continuous price features and the encoded discrete price features;
a second hidden layer $h_2$ with 32 neurons, each fully connected to the first hidden layer, obtained from the first hidden layer by the nonlinear transformation $h_2 = \mathrm{ReLU}(w_2 h_1 + b_2)$, where $w_2$ is the connection coefficient, $b_2$ is the bias, and ReLU is the rectified linear activation function;
an output layer $P$ with 1 neuron, fully connected to the second hidden layer, obtained from the second hidden layer by the linear transformation $P = w_3 h_2 + b_3$, where $w_3$ is the connection coefficient, $b_3$ is the bias, and $P$ is the predicted price that is output.
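The layer structure above can be sketched as a forward pass in plain Python. This is a minimal illustration, not the patent's PyTorch implementation: the helper names (`init_mlp`, `forward`), the uniform initialisation range and the input width of 22 used in the usage note are assumptions made for the example.

```python
import random


def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, x):
    """Compute w x + b, where weights holds one row per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (initialisation is an assumption)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def init_mlp(n_inputs, hidden=32, seed=0):
    """Input layer -> 32 -> 32 -> 1, matching the layer sizes in the text."""
    rng = random.Random(seed)
    return [init_layer(n_inputs, hidden, rng),
            init_layer(hidden, hidden, rng),
            init_layer(hidden, 1, rng)]


def forward(params, x):
    """h1 = ReLU(w1 x + b1), h2 = ReLU(w2 h1 + b2), P = w3 h2 + b3."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(linear(w1, b1, x))
    h2 = relu(linear(w2, b2, h1))
    return linear(w3, b3, h2)[0]
```

For instance, `forward(init_mlp(22), features)` returns a single predicted price for a 22-element input vector; the untrained weights make the value meaningless until the model is trained.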
Compared with the prior art, the invention has the following beneficial effects: by combining historical flight features with the price sequence, the method predicts air ticket prices effectively and reduces the prediction error. The flight feature prediction model alleviates the failure of the price-sequence-based prediction model caused by accumulated error. Conversely, when recent prices fluctuate sharply, the price-sequence-based prediction model compensates for the insufficient accuracy of the flight-information-based prediction model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the structures shown in the drawings without creative efforts.
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the multilayer perceptron according to the present invention;
FIG. 3 is an exemplary illustration of flight features according to the present invention;
FIG. 4 is an exemplary illustration of a recent price sequence according to the present invention.
The implementation, functional features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, if directional indications (such as up, down, left, right, front, and back … …) are involved in the embodiment of the present invention, the directional indications are only used to explain the relative positional relationship between the components, the movement situation, and the like in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indications are changed accordingly.
In addition, if there is a description of "first", "second", etc. in an embodiment of the present invention, the description of "first", "second", etc. is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
The invention provides a method for predicting air ticket prices by combining flight information and price sequences, which comprises the following steps:
S10: collecting historical flight features and a recent price sequence; extracting continuous flight features and discrete flight features from the historical flight features, and one-hot encoding the discrete flight features; extracting continuous price features and discrete price features from the recent price sequence, and one-hot encoding the discrete price features;
S20: building a flight feature prediction model and a price sequence prediction model, each with a machine learning model;
S30: inputting the continuous flight features and the encoded discrete flight features into the flight feature prediction model to train it and optimize its flight weight α, and inputting the continuous price features and the encoded discrete price features into the price sequence prediction model to train it and optimize its sequence weight β;
S40: constructing a target prediction function from the flight predicted price $P_{static}$ output by the flight feature prediction model and the sequence predicted price $P_{dynamic}$ output by the price sequence prediction model, combined with the optimized flight weight α and the optimized sequence weight β, to obtain the prediction result.
In this embodiment, for the flight feature prediction model, the flight features and price records of each flight with a fixed departure date are extracted from the historical price data at different numbers of days before departure. The difference in days between the record creation date and the departure date is computed and recorded as the "days from the departure date". The continuous and discrete flight features are combined as the input of the multilayer perceptron, and the corresponding price is used as the output, which yields a set of training samples. For the price sequence prediction model, the price and flight records of each flight with a fixed departure date are extracted for different query dates, and the "days from the departure date" is computed in the same way. The continuous and discrete features of the flight information, the price sequence near the query date Q, and the number of days between each price in the sequence and the departure date are combined as the input of the neural network, and the corresponding price is used as the output, which yields a set of training samples.
Among the flight features, the discrete features include departure time, arrival time, departure airport, arrival airport, airline, and actual carrier airline. For example, if there are four airlines a, b, c and d, the one-hot code for airline a is [1, 0, 0, 0], for airline b [0, 1, 0, 0], for airline c [0, 0, 1, 0], and for airline d [0, 0, 0, 1].
The continuous flight features include flight duration, departure month, day of the month of departure, whether the departure day is a holiday, whether the departure day is a weekend, whether the flight crosses into the next day, number of stops, airport construction fee, fuel surcharge, tax, and whether the flight is a shared (codeshare) flight. For example, if a flight lasts 95 minutes, departs on June 18, does not cross into the next day, has no stops, has an airport construction fee of 50 yuan and a fuel surcharge and tax of 0 yuan, and is a shared flight, its feature vector is set to [95, 6, 18, 0, 0, 50, 0, 0, 1]. After the number of days from the departure date is set, the continuous and discrete features are aggregated and input into the prediction model to obtain the corresponding predicted ticket price $P_{static}$. Setting different numbers of days from the departure date yields predicted ticket prices for different dates.
For example, if the query date is March 5 and the ticket price 7 days after the query date is to be predicted, the "days from the departure date" can be set to 8. The discrete and continuous features of flight F are aggregated as the input ([8, 90, 3, 20, 1, 0, 50, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0]); after this is fed into the prediction model, the predicted ticket price for the 7th day after the query date, 820, is obtained. The discrete features are set by the same steps as in the flight-feature-based prediction method. The discrete feature code for this example is [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0].
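The encoding and aggregation steps can be sketched as follows. This is a minimal illustration that one-hot encodes only the airline feature from the four-airline example; the function names and the choice to place the "days from the departure date" first (as in the example vector) are assumptions, and the full method would append the codes of every discrete feature.

```python
AIRLINES = ["a", "b", "c", "d"]  # the four airlines from the example


def one_hot(value, categories):
    """Return a list with 1 at the position of value and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if c == value else 0 for c in categories]


def aggregate(days_from_departure, continuous, discrete_codes):
    """Concatenate days-from-departure, continuous features and one-hot codes."""
    vector = [days_from_departure] + list(continuous)
    for code in discrete_codes:
        vector.extend(code)
    return vector


# Continuous features of flight F, with a hypothetical airline "d".
flight_f = [90, 3, 20, 1, 0, 50, 0, 0, 1]
model_input = aggregate(8, flight_f, [one_hot("d", AIRLINES)])
```

Because the number of airlines and airports differs by route, the length of `model_input` differs by route as well, which is why the input layer width is route-specific.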
The invention predicts the future price of a flight by combining the flight feature prediction model and the price sequence prediction model. It can give passengers a reference for buying tickets, so that they can plan their purchase according to the predicted price trend. The predicted price trend can also serve as a reference for airlines, helping them adjust their pricing to increase profit.
When the flight feature prediction model or the price sequence prediction model is used to predict a future ticket price, the number of days from the departure date is set, and the corresponding discrete and continuous features are aggregated and input into the model to obtain the flight predicted price or the sequence predicted price for that date. For example, if flight F departs on date D, the query date is Q, and the ticket price i days after Q is to be predicted, the "days from the departure date" is set to D-(Q+i), and the discrete and continuous features of flight F are aggregated and input into the prediction model to obtain the ticket price i days after the query date Q. With a departure date of March 20, a query date of March 5 and i = 7, this gives 8, as in the example of flight F.
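The "days from the departure date" setting can be sketched with the standard `datetime` module. This is a minimal illustration that assumes the value is the number of days between the target date (query date plus i) and the departure date, which is what the worked example of flight F implies; the function name and the year 2020 are hypothetical.

```python
from datetime import date, timedelta


def days_from_departure(departure, query, days_ahead=0):
    """Days between the target date (query + days_ahead) and the departure date."""
    target = query + timedelta(days=days_ahead)
    if target > departure:
        raise ValueError("target date is after the departure date")
    return (departure - target).days


# Flight F: departure March 20, query date March 5 (the year is an assumption).
on_query_date = days_from_departure(date(2020, 3, 20), date(2020, 3, 5))
seven_days_later = days_from_departure(date(2020, 3, 20), date(2020, 3, 5), 7)
```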
The invention constructs a new air ticket price prediction model by linearly combining, with weights, the flight feature prediction model and the price sequence prediction model. This addresses the accumulated error that a price sequence prediction model used alone suffers when predicting long-term future prices, and it compensates for the insufficient accuracy of the flight-information-based prediction model when recent prices fluctuate sharply. The method therefore reduces the error in air ticket price prediction.
Preferably, the target prediction function is specifically as follows:
[Formula image BDA0002353143070000081; not reproduced in the text]
where $P_{static}$ is the flight predicted price output by the flight feature prediction model and $P_{dynamic}$ is the sequence predicted price output by the price sequence prediction model.
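The formula image for the target prediction function is not available in this text. A minimal sketch follows, assuming the function is the weighted linear combination α·P_static + β·P_dynamic that the description of linearly combining the two models with weights suggests; the exact form in the original figure may differ. The function name, the example weights and the example sequence price of 840 are hypothetical.

```python
def combine_predictions(p_static, p_dynamic, alpha, beta):
    """Assumed target prediction function: alpha * P_static + beta * P_dynamic."""
    return alpha * p_static + beta * p_dynamic


# Hypothetical weights and prices, for illustration only.
combined = combine_predictions(p_static=820.0, p_dynamic=840.0, alpha=0.5, beta=0.5)
```

Under this reading, a larger α favours the flight feature model, which suits long horizons, and a larger β favours the price sequence model, which suits short horizons with volatile recent prices.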
Preferably, the machine learning model is a multilayer perceptron (MLP) neural network, built as a dynamic neural network with the deep learning framework PyTorch; it uses the multilayer perceptron as the model, the root mean square error (RMSE) as the loss function, and the Adam algorithm as the optimizer of the model's weight coefficients.
Preferably, the method of S30 includes:
S301: aggregating the continuous flight features and the discrete flight features and inputting them into the flight feature prediction model; aggregating the continuous price features and the discrete price features and inputting them into the price sequence prediction model;
S302: the flight feature prediction model uses a multilayer perceptron to represent the mapping function from flight features to flight price, and outputs the flight predicted price; the price sequence prediction model uses a multilayer perceptron to represent the mapping function from price sequence features to flight price, and outputs the sequence predicted price;
S303: computing, by the root mean square error (RMSE), the flight error between the flight predicted price and the actual price, and the sequence error between the sequence predicted price and the actual price;
S304: using the Adam algorithm as the optimizer to back-propagate the flight error and optimize the flight weight α of the flight feature prediction model, and using the Adam algorithm as the optimizer to back-propagate the sequence error and optimize the sequence weight β of the price sequence prediction model.
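Steps S302 to S304 can be sketched as a training loop in plain Python. To keep the example short, a linear model stands in for the multilayer perceptron, and the gradient of the RMSE loss is written out by hand instead of relying on PyTorch's automatic differentiation. The function names, hyperparameter values and toy data are assumptions; `beta1` and `beta2` are Adam's own decay rates and are unrelated to the weights α and β in the text.

```python
import math


def predict(params, x):
    """Linear stand-in for the MLP: params = [w_1, ..., w_k, b]."""
    *weights, bias = params
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def rmse_and_gradient(params, xs, ys):
    """Return the RMSE loss and its gradient with respect to params."""
    n = len(ys)
    errors = [predict(params, x) - y for x, y in zip(xs, ys)]
    loss = math.sqrt(sum(e * e for e in errors) / n)
    grad = [0.0] * len(params)
    if loss == 0.0:
        return loss, grad
    for x, e in zip(xs, errors):
        scale = e / (n * loss)  # derivative of the RMSE w.r.t. one prediction
        for j, xj in enumerate(x):
            grad[j] += scale * xj
        grad[-1] += scale
    return loss, grad


def train_with_adam(xs, ys, steps=500, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimise the RMSE with Adam updates; returns parameters and loss history."""
    params = [0.0] * (len(xs[0]) + 1)
    m = [0.0] * len(params)  # first-moment estimates
    v = [0.0] * len(params)  # second-moment estimates
    losses = []
    for t in range(1, steps + 1):
        loss, grad = rmse_and_gradient(params, xs, ys)
        losses.append(loss)
        for j, g in enumerate(grad):
            m[j] = beta1 * m[j] + (1 - beta1) * g
            v[j] = beta2 * v[j] + (1 - beta2) * g * g
            m_hat = m[j] / (1 - beta1 ** t)  # bias-corrected moments
            v_hat = v[j] / (1 - beta2 ** t)
            params[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params, losses


# Toy data following y = 2x + 1 (hypothetical, not ticket prices).
fitted, history = train_with_adam([[1.0], [2.0], [3.0]], [3.0, 5.0, 7.0])
```

The same loop would be run twice in the method, once per model, each with its own inputs and error.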
In this embodiment, the multilayer perceptron is taken as an example, and all models are implemented with PyTorch. The multilayer perceptron used here has four layers. The first layer is the input layer. Because different routes have different numbers of departure and arrival airports and different numbers of carrier airlines, the one-hot encoding lengths of the non-numerical features (departure and arrival airports, airlines) differ, so the number of neurons in the input layer is determined by the specific route. The second layer is the first hidden layer, with 32 neurons. The third layer is the second hidden layer, with 32 neurons. The fourth layer is the output layer, with 1 neuron, which outputs the predicted ticket price. Between the input layer and the first hidden layer, and between the two hidden layers, a nonlinear transformation is introduced through a nonlinear activation function; the ReLU activation function is used here. The invention uses the root mean square error (RMSE) as the loss function to evaluate the model's error on the training set. The RMSE is calculated as follows:
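$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - P_{static,i}\right)^2}$$
where $y_i$ denotes the true price of the $i$-th sample, $P_{static,i}$ denotes the corresponding predicted price, and $n$ is the number of samples. After the error is computed, Adam is used as the optimizer to back-propagate the error and update the parameters of the model.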
Preferably, the flight continuity characteristics at least include flight duration, departure date, number of the day of takeoff, whether the day of takeoff is a holiday, whether the day of takeoff is a weekend, whether the flight is across the sky, number of stops, airport construction cost, fuel surcharge, tax, whether the flight is a shared flight, and number of days away from the day of takeoff.
In the embodiment of the invention, the continuous characteristics of the flight schedule include flight duration, takeoff month, the number of the takeoff day, whether flights cross the day, the number of the stop, airport construction cost, fuel oil additional cost, tax and whether the flights are shared flights. The flight F corresponds to a flight duration of 90 minutes, takes off in 3 months and 20 days, spans 1 day, has a number of stops of 0, has an engineering cost of 50 yuan, has a fuel cost and a tax cost of 0 yuan, and is a shared flight (corresponding to 1, not corresponding to 0). Thus, the continuation feature for flight F is set to [90, 3, 20, 1, 0, 50, 0, 0, 1 ].
Preferably, in addition to the contents of the flight continuous features, the price continuous features include at least the sequence of recent prices up to the query date and, for each price in the sequence, the number of days remaining until the departure date.
In this embodiment of the invention, the price continuous features are built by first setting the continuous features as in the flight-feature prediction method and then appending the recent price sequence up to the query date together with the corresponding sequence of day differences. For example, for query date Q, the price sequence of the last 3 days is {P_{Q-2}, P_{Q-1}, P_Q}, and the corresponding numbers of days until the departure date are {d_{Q-2}, d_{Q-1}, d_Q}. In this example, the query date is March 5, the price sequence of the last 3 days is {863, 884, 845}, and the numbers of days until the departure date are {17, 16, 15}. The continuous features of this example are [90, 3, 20, 1, 0, 50, 0, 0, 1, 863, 17, 884, 16, 845, 15].
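The construction in this example can be sketched as follows: each recent price is paired with the number of days between its date and the departure date, and the pairs are appended to the flight's continuous features. The function name `price_continuous_features` is hypothetical, and the year 2019 is an assumption made only so that the dates are concrete; the embodiment gives the month and day only.

```python
from datetime import date, timedelta

def price_continuous_features(flight_vector, recent_prices, query_date, departure_date):
    """Append (price, days until departure) pairs to the flight features.

    recent_prices is ordered oldest first and ends on the query date.
    """
    features = list(flight_vector)
    count = len(recent_prices)
    for offset, price in enumerate(recent_prices):
        # Date on which this price was observed.
        price_date = query_date - timedelta(days=count - 1 - offset)
        days_to_departure = (departure_date - price_date).days
        features.extend([price, days_to_departure])
    return features

price_vector = price_continuous_features(
    [90, 3, 20, 1, 0, 50, 0, 0, 1],   # continuous features of flight F
    [863, 884, 845],                  # prices on March 3, 4 and 5
    query_date=date(2019, 3, 5),      # year is an assumption
    departure_date=date(2019, 3, 20),
)
```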
Preferably, the price discrete features are the same as the flight discrete features and include at least the departure time, arrival time, departure airport, arrival airport, airline, and actual carrier airline, wherein the departure time and the arrival time are encoded with segmented one-hot encoding and the remaining discrete features are encoded with ordinary one-hot encoding.
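A minimal sketch of ordinary one-hot encoding for the non-time discrete features is shown below. The helper `one_hot` and the airline and airport codes are illustrative assumptions; in practice the vocabularies would be collected from the data of the route in question, which is why the encoded length varies by route.

```python
def one_hot(value, vocabulary):
    """Return a 0/1 list with a single 1 at the position of value."""
    if value not in vocabulary:
        raise ValueError(f"{value!r} is not in the vocabulary")
    return [1 if item == value else 0 for item in vocabulary]

# Hypothetical vocabularies for one route.
route_airlines = ["CA", "CZ", "MU"]
route_airports = ["CAN", "PEK"]

# Encode the airline and the departure airport, then concatenate the codes.
discrete_vector = one_hot("CZ", route_airlines) + one_hot("CAN", route_airports)
```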
The machine learning model may also be a model built with a CART regression tree, a type of decision tree.
Preferably, the recent price sequence consists of the lowest prices of flight F over the preceding T days: {P_{Q-T}, P_{Q-(T-1)}, P_{Q-(T-3)}, ..., P_{Q-3}, P_{Q-2}, P_{Q-1}}.
Preferably, the segmented one-hot encoding of the departure time and the arrival time is performed as follows: the 24 hours of a day are divided into four time periods, [0:00-10:00], [10:00-14:00], [14:00-19:00] and [19:00-24:00], and the departure time and the arrival time are each one-hot encoded according to the period in which they fall.
In this embodiment of the invention, the departure time and the arrival time are one-hot encoded after being assigned to one of four time periods (0:00-10:00, 10:00-14:00, 14:00-19:00 and 19:00-24:00). For example, a departure time of 15:35 falls in the third period, so its one-hot code is [0, 0, 1, 0] (the third position is 1 and the remaining positions are 0).
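The segmented encoding can be sketched as a lookup of the time of day against the period boundaries. The function name `segment_one_hot` is hypothetical. The periods as stated share their endpoints, so this sketch assumes each period includes its start and excludes its end (10:00 falls in the second period); the source does not specify this.

```python
from bisect import bisect_right

# Start times, in minutes after midnight, of the 2nd, 3rd and 4th periods.
SEGMENT_STARTS_MIN = [10 * 60, 14 * 60, 19 * 60]

def segment_one_hot(hhmm):
    """One-hot code of the time period containing a time given as 'HH:MM'."""
    hours, minutes = (int(part) for part in hhmm.split(":"))
    total = hours * 60 + minutes
    if not 0 <= total < 24 * 60:
        raise ValueError(f"time out of range: {hhmm}")
    # Number of boundaries at or before the time gives the period index.
    index = bisect_right(SEGMENT_STARTS_MIN, total)
    code = [0, 0, 0, 0]
    code[index] = 1
    return code

departure_code = segment_one_hot("15:35")
```

For the departure time of the example, 15:35, the result is [0, 0, 1, 0].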
Preferably, the multilayer perceptron comprises:
an input layer comprising a plurality of neurons, the number of which is determined by the actual route; for the flight feature prediction model, the inputs are the flight continuous features and the encoded flight discrete features; for the price sequence prediction model, the inputs are the price continuous features and the encoded price discrete features;
a first hidden layer h_1 with 32 neurons, each connected to the input layer, obtained from the input layer by the nonlinear transformation h_1 = ReLU(w_1 x + b_1), where w_1 is the connection coefficient and b_1 is the bias; in the flight feature prediction model, x denotes the flight continuous features and the encoded flight discrete features; in the price sequence prediction model, x denotes the price continuous features and the encoded price discrete features; ReLU is the rectified linear unit activation function;
a second hidden layer h_2 with 32 neurons, each fully connected to the first hidden layer, obtained from the first hidden layer by the nonlinear transformation h_2 = ReLU(w_2 h_1 + b_2), where w_2 is the connection coefficient, b_2 is the bias, and ReLU is the rectified linear unit activation function;
an output layer P with 1 neuron, fully connected to the second hidden layer, obtained from the second hidden layer by the linear transformation P = w_3 h_2 + b_3, where w_3 is the connection coefficient, b_3 is the bias, and P is the output predicted ticket price.
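The forward pass of the four-layer perceptron described above can be sketched in plain Python. This is an illustration of the three transformations only, not the PyTorch implementation named in the embodiment: the helper names, the uniform weight initialisation and the seed are assumptions, and training with RMSE and Adam is omitted.

```python
import random

def relu(vector):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, value) for value in vector]

def linear(weights, bias, x):
    """Compute w x + b for a weight matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    # Small random weights and zero biases (initialisation is an assumption).
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def init_mlp(n_inputs, rng, hidden=32):
    """Input layer -> 32 -> 32 -> 1, as in the described architecture."""
    return [init_layer(n_inputs, hidden, rng),
            init_layer(hidden, hidden, rng),
            init_layer(hidden, 1, rng)]

def forward(params, x):
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(linear(w1, b1, x))    # h_1 = ReLU(w_1 x + b_1)
    h2 = relu(linear(w2, b2, h1))   # h_2 = ReLU(w_2 h_1 + b_2)
    return linear(w3, b3, h2)[0]    # P = w_3 h_2 + b_3

# Untrained network applied to the continuous features of flight F.
mlp_params = init_mlp(9, random.Random(0))
untrained_price = forward(mlp_params, [90, 3, 20, 1, 0, 50, 0, 0, 1])
```

Because the weights are random and untrained, `untrained_price` carries no meaning as a price; the sketch only shows how the input size follows the feature vector while the hidden and output sizes stay fixed at 32, 32 and 1.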
Compared with the prior art, the invention has the following advantages and technical effects:
The method combines two types of prediction models and draws on the advantages of both. The flight-feature-based prediction model addresses the accumulated error that the price-sequence-based prediction model incurs when predicting ticket prices far into the future, and the combination overcomes the insufficient accuracy of the flight-information-based prediction model when recent prices fluctuate sharply.
By combining the two types of prediction models to predict the future price of a flight, the invention gives passengers a reference for buying tickets and helps them choose a better purchasing plan according to the predicted price trend. The predicted price trend can also serve as a reference for airlines, allowing them to adjust their pricing and increase profit.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for predicting the price of an air ticket by combining flight information and a price sequence is characterized by comprising the following steps:
S10, collecting historical flight features and a recent price sequence; extracting flight continuous features and flight discrete features from the historical flight features, and one-hot encoding the flight discrete features; extracting price continuous features and price discrete features from the recent price sequence according to the continuous features, and one-hot encoding the price discrete features;
S20, establishing a flight feature prediction model and a price sequence prediction model, each using a machine learning model;
S30, inputting the flight continuous features and the encoded flight discrete features into the flight feature prediction model to train and optimize the flight weight α of the model, and inputting the price continuous features and the encoded price discrete features into the price sequence prediction model to train and optimize the sequence weight β of the model;
S40, constructing an objective prediction function from the flight predicted price P_static output by the flight feature prediction model and the sequence predicted price P_dynamic output by the price sequence prediction model, combined with the optimized flight weight α and the optimized sequence weight β, to obtain a prediction result.
2. The method of predicting the price of an air ticket in combination with flight information and price sequences as claimed in claim 1, wherein the objective prediction function is specifically as follows:
Figure FDA0002353143060000011
where P_static is the flight predicted price output by the flight feature prediction model, and P_dynamic is the sequence predicted price output by the price sequence prediction model.
3. The method for predicting the price of an air ticket by combining flight information and a price sequence according to claim 1, wherein the machine learning model is a model built with a multilayer perceptron of a neural network, the neural network being a dynamic neural network built with the deep learning framework PyTorch, in which the multilayer perceptron serves as the model, the root mean square error (RMSE) serves as the loss function of the model, and the Adam algorithm serves as the optimizer of the model's weight coefficients.
4. The method for predicting the price of an air ticket based on flight information and price sequence as claimed in claim 3, wherein the method of S30 comprises:
S301, aggregating the flight continuous features and the flight discrete features and inputting them into the flight feature prediction model; aggregating the price continuous features and the price discrete features and inputting them into the price sequence prediction model;
S302, the flight feature prediction model represents, through a multilayer perceptron of a neural network, a mapping function from flight features to flight prices, and outputs the flight predicted price; the price sequence prediction model represents, through a multilayer perceptron of a neural network, a mapping function from price sequence features to flight prices, and outputs the sequence predicted price;
S303, calculating, by the root mean square error (RMSE), the flight error between the flight predicted price and the actual price and the sequence error between the sequence predicted price and the actual price;
S304, using the Adam algorithm as the optimizer to back-propagate the flight error so as to optimize the flight weight α of the flight feature prediction model, and using the Adam algorithm as the optimizer to back-propagate the sequence error so as to optimize the sequence weight β of the price sequence prediction model.
5. The method for predicting the price of an air ticket by combining flight information and a price sequence according to claim 1, wherein the flight continuous features include at least the flight duration, departure month, day of the month of departure, whether the departure day is a holiday, whether the departure day falls on a weekend, whether the flight crosses into the next day, number of stops, airport construction fee, fuel surcharge, tax, whether the flight is a codeshare flight, and number of days until the departure date; and the price continuous features include, in addition to the contents of the flight continuous features, at least the sequence of recent prices up to the query date and, for each price in the sequence, the number of days remaining until the departure date.
6. The method of claim 1, wherein the price discrete features are the same as the flight discrete features and include at least the departure time, arrival time, departure airport, arrival airport, airline, and actual carrier airline, wherein the departure time and the arrival time are encoded with segmented one-hot encoding and the remaining discrete features are encoded with ordinary one-hot encoding.
7. The method of claim 1, wherein the machine learning model is a model built with a CART regression tree, a type of decision tree.
8. The method of predicting ticket prices by combining flight information and a price sequence according to claim 1, wherein the recent price sequence consists of the lowest prices of flight F over the preceding T days: {P_{Q-T}, P_{Q-(T-1)}, P_{Q-(T-3)}, ..., P_{Q-3}, P_{Q-2}, P_{Q-1}}.
9. The method for predicting the price of an air ticket by combining flight information and a price sequence as claimed in claim 7, wherein the segmented one-hot encoding of the departure time and the arrival time is performed as follows: the 24 hours of a day are divided into four time periods, [0:00-10:00], [10:00-14:00], [14:00-19:00] and [19:00-24:00], and the departure time and the arrival time are each one-hot encoded according to the period in which they fall.
10. The method of predicting the price of an air ticket by combining flight information and a price sequence according to claim 1, wherein the multilayer perceptron comprises:
an input layer comprising a plurality of neurons, the number of which is determined by the actual route; for the flight feature prediction model, the inputs are the flight continuous features and the encoded flight discrete features; for the price sequence prediction model, the inputs are the price continuous features and the encoded price discrete features;
a first hidden layer h_1 with 32 neurons, each connected to the input layer, obtained from the input layer by the nonlinear transformation h_1 = ReLU(w_1 x + b_1), where w_1 is the connection coefficient and b_1 is the bias; in the flight feature prediction model, x denotes the flight continuous features and the encoded flight discrete features; in the price sequence prediction model, x denotes the price continuous features and the encoded price discrete features; ReLU is the rectified linear unit activation function;
a second hidden layer h_2 with 32 neurons, each fully connected to the first hidden layer, obtained from the first hidden layer by the nonlinear transformation h_2 = ReLU(w_2 h_1 + b_2), where w_2 is the connection coefficient, b_2 is the bias, and ReLU is the rectified linear unit activation function;
an output layer P with 1 neuron, fully connected to the second hidden layer, obtained from the second hidden layer by the linear transformation P = w_3 h_2 + b_3, where w_3 is the connection coefficient, b_3 is the bias, and P is the output predicted ticket price.
CN201911424222.5A 2019-12-31 2019-12-31 Air ticket price prediction method combining flight information and price sequence Pending CN111178978A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911424222.5A CN111178978A (en) 2019-12-31 2019-12-31 Air ticket price prediction method combining flight information and price sequence

Publications (1)

Publication Number Publication Date
CN111178978A true CN111178978A (en) 2020-05-19

Family

ID=70654331

Country Status (1)

Country Link
CN (1) CN111178978A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111798275A (en) * 2020-07-06 2020-10-20 深圳市活力天汇科技股份有限公司 Domestic flight price prediction method
CN111798275B (en) * 2020-07-06 2023-09-22 深圳市活力天汇科技股份有限公司 Domestic flight price prediction method
CN112132323A (en) * 2020-08-25 2020-12-25 汉海信息技术(上海)有限公司 Method and device for predicting value amount of commodity object and electronic equipment
CN112418559A (en) * 2020-12-09 2021-02-26 贵州优策网络科技有限公司 User selection behavior prediction method and device
CN112418559B (en) * 2020-12-09 2024-05-07 贵州优策网络科技有限公司 User selection behavior prediction method and device
CN113643076A (en) * 2021-10-13 2021-11-12 中航信移动科技有限公司 Air ticket price prediction method and device, computer equipment and storage medium
CN115294671A (en) * 2022-08-08 2022-11-04 杭州哲达科技股份有限公司 Air compressor outlet pressure prediction method and prediction system
CN117252631A (en) * 2023-11-14 2023-12-19 北京嗨飞科技有限公司 Method, device and equipment for predicting price trend of air ticket

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200702

Address after: Room 409, 4th floor, general building, 233 Kaifa Avenue, Guangzhou Economic and Technological Development Zone, Guangdong Province 510730

Applicant after: Guangzhou Civil Aviation Information Technology Co.,Ltd.

Applicant after: SUN YAT-SEN University

Address before: 510275 Xingang West Road, Guangdong, Guangzhou, No. 135, No.

Applicant before: SUN YAT-SEN University

RJ01 Rejection of invention patent application after publication

Application publication date: 20200519