CN116502776B - Flight recovery modeling method, electronic equipment and storage medium - Google Patents

Flight recovery modeling method, electronic equipment and storage medium

Info

Publication number
CN116502776B
CN116502776B (application CN202310763116.XA)
Authority
CN
China
Prior art keywords
flight
scheduling
state
delay
delayed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310763116.XA
Other languages
Chinese (zh)
Other versions
CN116502776A (en)
Inventor
丁建立
刘德康
王静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Civil Aviation University of China
Original Assignee
Civil Aviation University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Civil Aviation University of China filed Critical Civil Aviation University of China
Priority to CN202310763116.XA priority Critical patent/CN116502776B/en
Publication of CN116502776A publication Critical patent/CN116502776A/en
Application granted granted Critical
Publication of CN116502776B publication Critical patent/CN116502776B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a flight recovery modeling method, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information; constructing an initial flight scheduling prediction model; acquiring a state transition sequence TS based on the initial scheduling state and the initial flight scheduling prediction model; randomly acquiring h state transition sequences from TS as target state transition sequences, and taking the scheduling states corresponding to the target state transition sequences as training samples; and training the initial flight scheduling prediction model with the training samples to obtain a target flight scheduling prediction model. The invention can improve the efficiency and accuracy of flight scheduling.

Description

Flight recovery modeling method, electronic equipment and storage medium
Technical Field
The invention relates to the field of civil aviation flight recovery and deep learning research, in particular to a flight recovery modeling method, electronic equipment and a storage medium.
Background
The delayed-flight recovery problem is a real-time optimization problem with numerous and highly complex constraints, and it is NP-hard. The solution complexity of such dynamic optimization problems grows exponentially with the number of decision and state variables, a computational challenge known as the curse of dimensionality. In recent years, research on flight delay recovery scheduling algorithms has focused mainly on integer programming with column generation, meta-heuristic optimization algorithms, reinforcement learning, and the like.
Integer programming and column generation algorithms build integer programming models from the constraints to generate a recovery scheduling scheme, but they often do not consider minimizing the comprehensive delay loss. Meta-heuristic optimization algorithms establish an objective function and iteratively optimize it to approach the optimal solution, but they frequently become trapped in local optima. Deep reinforcement learning formulates the problem as a Markov decision process and uses the strong fitting capability of deep neural networks to iteratively learn an optimal policy rather than a single optimal solution, so that after convergence the trained network tends to solve the same class of problems faster and more accurately.
Disclosure of Invention
In view of the above technical problems, the invention adopts the following technical solution:
the embodiment of the invention provides a flight recovery modeling method, which comprises the following steps:
s100, acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information, wherein the initial scheduling state comprises the flight delay information and a flight take-off sequence;
s200, an initial flight scheduling prediction model is built; its input is a scheduling state and its output is the probability prediction value of each decision action executed on the input scheduling state, a decision action being the operation of exchanging the take-off order of any two flights in the input scheduling state; wherein the decision action executed on the input scheduling state satisfies the set constraint conditions;
s300, acquiring a state transition sequence TS = {TS_1, TS_2, ..., TS_j, ..., TS_m} based on the initial flight scheduling state and the initial flight scheduling prediction model; wherein the j-th state transition sequence TS_j = (S_j, a_j, r_j, φ_j), S_j is the scheduling state corresponding to the j-th state transition sequence, a_j is the target decision action corresponding to S_j, r_j is the return value corresponding to a_j, and φ_j is the termination flag corresponding to the j-th state transition sequence, where φ_1 to φ_{m-1} take a first set value and φ_m takes a second set value; of the scheduling states corresponding to any two adjacent state transition sequences in TS, the latter is obtained by executing the corresponding target decision action on the former; r_j = DL_{j+1} - DL_j, where DL_{j+1} is the delay loss corresponding to the next scheduling state S_{j+1} obtained after executing a_j on S_j, and DL_j is the delay loss corresponding to S_j; j ranges from 1 to m, and m is the number of state transition sequences;
s400, randomly acquiring h state transition sequences from TS as target state transition sequences, and taking the scheduling states corresponding to the target state transition sequences as training samples;
s500, inputting training samples of a current batch into a current flight scheduling prediction model for training to obtain a maximum probability prediction value corresponding to each sample;
s600, acquiring a target decision action corresponding to each sample based on a probability prediction value corresponding to each sample, acquiring a next scheduling state corresponding to each sample based on the acquired target decision action, and inputting the acquired next scheduling state into a current flight scheduling prediction model for training to acquire a maximum probability prediction value corresponding to the next scheduling state corresponding to each sample;
s700, acquiring a current loss function value based on a maximum probability prediction value and a return value of a training sample of a current batch and a maximum probability prediction value corresponding to a next scheduling state corresponding to the training sample of the current batch, judging whether the current loss function value accords with a preset model training ending condition, if so, taking the current flight scheduling prediction model as a target flight scheduling prediction model, and if not, adjusting parameters of the current flight scheduling prediction model, and taking the training sample of the next batch as the training sample of the current batch, and executing S500.
Embodiments of the present invention provide a non-transitory computer readable storage medium having stored therein at least one instruction or at least one program loaded and executed by a processor to implement the foregoing method.
An embodiment of the present invention provides an electronic device including a processor and the aforementioned non-transitory computer-readable storage medium.
The invention has at least the following beneficial effects:
according to the flight recovery modeling method provided by the embodiment of the invention, starting from an initial flight delay state, flight scheduling is performed using reinforcement learning: the flight scheduling problem is treated as a sequential decision process, the current schedule is treated as a state, and exchanging the take-off order of two flights, together with the aircraft assigned to them, is treated as a decision action. For a given scheduling state, a decision action is either selected randomly with probability ε under an ε-greedy strategy, or the agent selects two flights (and the aircraft assigned to them) to exchange according to the delay loss of the current schedule, thereby changing the current scheduling state; a return value is given at the same time to judge whether the action is good or bad. The agent is trained according to the return values, and the converged agent serves as the target agent: when a similar situation occurs again, the trained target agent can be used directly for scheduling prediction to obtain an optimal scheduling scheme, improving scheduling efficiency and accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a flight recovery modeling method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
When large-area delays occur at an airport, a large number of aircraft are stranded there, resulting in economic losses. In order to reduce the economic loss caused by flight delays and improve the efficiency of delay recovery and flight scheduling, the embodiment of the invention provides a deep-learning-based flight recovery modeling method, which schedules delayed flights and stranded aircraft with a deep reinforcement learning algorithm so as to achieve flight delay recovery.
The flight recovery modeling method provided by the embodiment of the invention, as shown in fig. 1, can include the following steps:
s100, acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information, wherein the initial scheduling state comprises the flight delay information and a flight take-off sequence.
In the embodiment of the invention, the flight delay information can be acquired through the information release platform of the target airport. In an exemplary embodiment of the present invention, the flight delay information may include at least an ID of the delayed flight, a delay time of the delayed flight, a flight time of the delayed flight, an average riding cost of the delayed flight, a number of users of the delayed flight, that is, a number of passengers of the delayed flight, a maximum user load capacity of an aircraft corresponding to the delayed flight, that is, a maximum passenger load number, an aircraft unit delay loss corresponding to the delayed flight, and a user unit delay loss corresponding to the delayed flight.
In the embodiment of the invention, the aircraft unit delay loss corresponding to each delayed flight refers to the loss incurred by the corresponding aircraft for each unit of time the flight is delayed. In an embodiment of the present invention, the unit of time may be minutes. The user unit delay loss corresponding to each delayed flight refers to the loss brought to the users for each unit of time the flight is delayed; for example, for some important users, a delay generates a corresponding loss. In one exemplary embodiment, the aircraft unit delay loss and the user unit delay loss may be obtained from existing relevant literature, for example the method published in "Mixed particle swarm algorithm for flight delay recovery scheduling", Journal of Transportation Engineering, Volume 8, Issue 2, April 2008. In another exemplary embodiment, the aircraft unit delay loss and the user unit delay loss may be derived from historical data: for example, for an aircraft, the revenue it has generated since entering service can be obtained, and dividing that revenue by the total service time gives the corresponding aircraft unit delay loss. For a given flight, the total user delay loss and the total number of delayed users since it began operating can be obtained, and their ratio is the user unit delay loss corresponding to that flight.
In the embodiment of the invention, the initial flight scheduling state can be randomly generated, and the delay information corresponding to each flight, together with its take-off order, is taken as one feature vector in the state. For example, the initial scheduling state may be represented as S_0 = (F_1, F_2, ..., F_s, ..., F_n), where F_s is the delay information of the s-th delayed flight in the target airport, F_s = (Dt_s, t_s, v_s, vm_s, w_s, Ca_s, Cp_s, d_s); Dt_s is the delay time of delayed flight s, t_s is the flight time of delayed flight s, v_s is the number of delayed users of flight s, vm_s is the maximum user load of delayed flight s, w_s is the average riding cost of delayed flight s, i.e. the average fare, Ca_s is the aircraft unit delay loss corresponding to delayed flight s, Cp_s is the user unit delay loss corresponding to delayed flight s, and d_s is the take-off order of delayed flight s; s ranges from 1 to n, where n is the number of delayed flights in the target airport.
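As an illustrative sketch only (not part of the patented method), the scheduling state described above can be held as an array with one feature vector per delayed flight; the field names and the helper below are hypothetical.

```python
import numpy as np

# Hypothetical sketch: build the initial scheduling state S_0 as an array of
# per-flight feature vectors (Dt, t, v, vm, w, Ca, Cp, d) described above.
def build_initial_state(delayed_flights):
    """delayed_flights: list of dicts holding the delay-information fields."""
    return np.array([
        [f["delay_time"], f["flight_time"], f["num_users"], f["max_users"],
         f["avg_fare"], f["aircraft_unit_loss"], f["user_unit_loss"],
         f["takeoff_order"]]
        for f in delayed_flights
    ], dtype=np.float32)          # shape: (number of delayed flights, 8 features)
```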
S200, an initial flight scheduling prediction model is built; its input is a scheduling state and its output is the probability prediction value of each decision action executed on the input scheduling state. The probability prediction value represents the preference for the corresponding decision action: the larger the probability prediction value, the stronger the preference. A decision action is the operation of exchanging the take-off order of any two flights in the input scheduling state, and the decision action executed on the input scheduling state satisfies the set constraint conditions.
In an embodiment of the present invention, the framework of the initial flight scheduling prediction model may be a neural network structure. In one exemplary embodiment, the neural network structure may be ANN, RNN, LSTM, transformer or the like.
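As a hedged sketch of one possible realization (the patent leaves the exact architecture open; the feed-forward layout, layer sizes, and names below are assumptions, not the specified design):

```python
import torch.nn as nn

class SchedulingPredictor(nn.Module):
    """Maps a flattened scheduling state to one predicted value per candidate
    decision action, i.e. per unordered pair of flights whose take-off order
    could be swapped."""
    def __init__(self, n_flights: int, n_features: int = 8):
        super().__init__()
        n_actions = n_flights * (n_flights - 1) // 2   # one action per flight pair
        self.net = nn.Sequential(
            nn.Linear(n_flights * n_features, 256),
            nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, state):                  # state: (batch, n_flights, n_features)
        return self.net(state.flatten(1))      # (batch, n_actions)
```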
In the embodiment of the invention, when a decision action is executed on the input scheduling state, the corresponding feature vectors are updated accordingly, because changing the take-off order changes the corresponding delay times; this keeps every feature vector in every scheduling state accurate.
In an embodiment of the present invention, the set constraint condition may at least include the following conditions:
condition 1: the departure time of the flight cannot be earlier than the planned departure time;
condition 2: the number of users delayed by a flight cannot be larger than the maximum user bearing capacity of the aircraft corresponding to the flight;
condition 3: the aircraft can only execute one flight task at the same time;
condition 4: each flight can only be executed once.
In the embodiment of the invention, requiring that the decision action executed on the input scheduling state satisfy the set constraint conditions removes actions that cannot be executed, which can be realized through a screening mechanism function f(x). If a decision action executed by the model on the input scheduling state satisfies the set constraint conditions, the probability prediction value of that action is the value computed by the model, unchanged; otherwise, the probability prediction value corresponding to that action is 0, i.e. the action is not selected.
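A minimal sketch of such a screening function, assuming the model's outputs arrive as a vector with one score per candidate swap; `pairs` and `is_feasible` are hypothetical helpers standing in for the action list and the four constraint checks:

```python
import numpy as np

def screen_actions(prob_pred, state, pairs, is_feasible):
    """Zero out the probability prediction value of every infeasible action.
    prob_pred: (n_actions,) model outputs; pairs[a] = (i, j) flight indices of
    action a; is_feasible(state, i, j) encodes constraint conditions 1-4."""
    screened = np.asarray(prob_pred, dtype=np.float64).copy()
    for a, (i, j) in enumerate(pairs):
        if not is_feasible(state, i, j):
            screened[a] = 0.0      # infeasible swap is never selected
    return screened
```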
S300, acquiring a state transition sequence TS = {TS_1, TS_2, ..., TS_j, ..., TS_m} based on the initial flight scheduling state and the initial flight scheduling prediction model; wherein the j-th state transition sequence TS_j = (S_j, a_j, r_j, φ_j), S_j is the scheduling state corresponding to the j-th state transition sequence, a_j is the target decision action corresponding to S_j, i.e. the decision action with the maximum probability prediction value among the decision actions executable on S_j; r_j is the return value corresponding to a_j, and φ_j is the termination flag corresponding to the j-th state transition sequence, where φ_1 to φ_{m-1} take a first set value, e.g. 0, and φ_m takes a second set value, e.g. 1; of the scheduling states corresponding to any two adjacent state transition sequences in TS, the latter is obtained by executing the corresponding target decision action on the former; r_j = DL_{j+1} - DL_j, where DL_{j+1} is the delay loss corresponding to the next scheduling state S_{j+1} obtained after executing a_j on S_j, and DL_j is the delay loss corresponding to S_j; j ranges from 1 to m, and m is the number of state transition sequences.
Further, S300 specifically includes:
S301, set j = 1 and c = 0, where c is a counter recording the number of iterations;
S302, input the current scheduling state S_j into the initial flight scheduling prediction model to obtain the corresponding output X_j = (X_j1, X_j2, ..., X_jh, ..., X_jn), where X_jh is the probability prediction value obtained by the model when the h-th decision action is executed on S_j; obviously S_1 is the initial scheduling state, i.e. S_1 = S_0;
S303, take the decision action corresponding to the maximum probability prediction value in X_j as the target decision action for S_j, obtain the next scheduling state S_{j+1}, and obtain r_j;
S304, set c = c + 1; if S_j is the optimal scheduling state or c = C_0, set φ_j = 1, add TS_j to the current TS, and exit the current procedure; otherwise, set φ_j = 0, add TS_j to the current TS, and execute S305; the initial value of TS is the empty set, and C_0 is the set iteration-count threshold, which may be an empirical value.
In the embodiment of the invention, S_j is the optimal scheduling state when the following condition is satisfied: the return value obtained by every decision action executed on S_j is negative.
S305, set j = j + 1 and execute S302.
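A hedged sketch of the rollout in S301 to S305, simplified to stop only at the iteration cap C_0 (the optimal-state test is omitted); `model`, `delay_loss`, `apply_swap`, and `pairs` are assumed helpers, not names from the patent:

```python
import numpy as np

def collect_transitions(initial_state, model, delay_loss, apply_swap, pairs, C0):
    """Greedy rollout recording the tuples (S_j, a_j, r_j, phi_j) of TS."""
    transitions, state, c = [], initial_state, 0
    while True:
        scores = np.asarray(model(state))          # one value per swap action
        a = int(scores.argmax())                   # target decision action a_j
        next_state = apply_swap(state, *pairs[a])  # exchange two take-off slots
        r = delay_loss(next_state) - delay_loss(state)   # r_j = DL_{j+1} - DL_j
        c += 1
        phi = 1 if c >= C0 else 0                  # termination flag phi_j
        transitions.append((state, a, r, phi))
        if phi == 1:
            return transitions
        state = next_state
```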
In the embodiment of the invention, the return value measures the future quality of a decision action and is mainly related to the delay loss of the flights: if the action increases the delay loss, the return is negative; otherwise it is positive. The magnitude of the return is related not only to the state reached after the action is performed, but also to the best state that can possibly be reached from that state.
Further, in the embodiment of the present invention, the delay loss corresponding to each scheduling state satisfies: DL_j = Σ_{i=1}^{n} (1 + γ_ji)(P_ji + k_ji1 × C_jif + k_ji2 × C_jia + k_ji3 × C_jip), where γ_ji is the importance coefficient of the delayed flight ST_ji corresponding to the ID of the i-th delayed flight in S_j, P_ji is the invisible loss corresponding to ST_ji, C_jif is the profit loss corresponding to ST_ji, i.e. the loss of the operating airline, C_jia is the aircraft delay loss corresponding to ST_ji, C_jip is the user delay loss corresponding to ST_ji, k_ji1, k_ji2 and k_ji3 are the weights corresponding to C_jif, C_jia and C_jip respectively, and n is the number of delayed flights. In the embodiment of the invention, γ_ji, k_ji1, k_ji2 and k_ji3 can be determined according to the actual situation. For example, the more important a flight is, the larger its importance coefficient. As another example, if a flight's profit loss has a greater impact than its user delay loss, the weight corresponding to the profit loss is larger than the weight corresponding to the user delay loss.
In the embodiment of the invention, the invisible loss is mainly determined by the probability that passengers will no longer choose civil aviation travel because of the flight delay. Specifically, P_ji = v_ji × w_ji × β_ji, where v_ji is the number of delayed users of ST_ji, w_ji is the average riding cost of ST_ji, and β_ji is the user disappointment rate function corresponding to ST_ji. In one exemplary embodiment, β_ji = [(ΔLF_ji/60)^2]^(1/3)/29, with 0 ≤ β_ji ≤ 1, where ΔLF_ji is the aircraft unit delay loss corresponding to ST_ji.
Further, C_jif = v_ji × Pf_ji × w_ji × Dt_ji / t_ji, where Pf_ji is the average yield of ST_ji, which can be obtained from the corresponding profit record table, Dt_ji is the delay time of ST_ji, and t_ji is the flight time of ST_ji.
Further, C_jia = Dt_ji × ΔLF_ji and C_jip = Dt_ji × ΔLP_ji, where Dt_ji is the delay time of ST_ji and ΔLP_ji is the user unit delay loss corresponding to ST_ji.
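A hedged sketch of the delay loss DL_j for one scheduling state, following the formulas above; the dictionary field names are hypothetical, and for simplicity a single importance coefficient `gamma` and single weights `k1`, `k2`, `k3` are applied to every flight:

```python
def delay_loss(flights, gamma=0.0, k1=1.0, k2=1.0, k3=1.0):
    """flights: list of dicts with the delay-information fields of each ST_ji."""
    total = 0.0
    for f in flights:
        # User disappointment rate beta, as defined in the description
        # (in terms of the aircraft unit delay loss), clamped to [0, 1].
        beta = min(((f["aircraft_unit_loss"] / 60.0) ** 2) ** (1.0 / 3.0) / 29.0, 1.0)
        P = f["num_users"] * f["avg_fare"] * beta                      # invisible loss
        C_f = (f["num_users"] * f["avg_yield"] * f["avg_fare"]
               * f["delay_time"] / f["flight_time"])                   # profit loss
        C_a = f["delay_time"] * f["aircraft_unit_loss"]                # aircraft delay loss
        C_p = f["delay_time"] * f["user_unit_loss"]                    # user delay loss
        total += (1.0 + gamma) * (P + k1 * C_f + k2 * C_a + k3 * C_p)
    return total
```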
S400, randomly acquiring h state transition sequences from the TS to serve as target state transition sequences, and taking a scheduling state corresponding to the target state transition sequences as a training sample.
S500, inputting the training samples of the current batch into the current flight scheduling prediction model for training, and obtaining the maximum probability prediction value corresponding to each sample.
S600, based on the probability prediction value corresponding to each sample, acquiring a target decision action corresponding to each sample, acquiring a next scheduling state corresponding to each sample based on the acquired target decision action, and inputting the acquired next scheduling state into a current flight scheduling prediction model for training to obtain a maximum probability prediction value corresponding to the next scheduling state corresponding to each sample.
S700, acquiring a current loss function value based on a maximum probability prediction value and a return value of a training sample of a current batch and a maximum probability prediction value corresponding to a next scheduling state corresponding to the training sample of the current batch, judging whether the current loss function value accords with a preset model training ending condition, if so, taking the current flight scheduling prediction model as a target flight scheduling prediction model, and if not, adjusting parameters of the current flight scheduling prediction model, and taking the training sample of the next batch as the training sample of the current batch, and executing S500.
In the embodiment of the present invention, the current loss function value satisfies the following condition:
L = X - (R - λ × X_n), where L is the current loss function value, R is the sum of the return values of the training samples of the current batch, X is the sum of the maximum probability prediction values of all training samples of the current batch, X_n is the sum of the maximum probability prediction values corresponding to the next scheduling states of the training samples of the current batch, and λ is the discount factor, taking a value between 0 and 1.
In the embodiment of the invention, the current loss function value takes the long-term return into account, which avoids sacrificing long-term return by blindly selecting the action with the largest immediate return. Furthermore, introducing a discount factor balances the estimation bias and variance caused by over-simplifying future rewards.
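A hedged sketch of this loss over one sampled batch (tensor shapes and helper names are assumptions; the model is taken to return one predicted value per action):

```python
import torch

def batch_loss(model, states, rewards, next_states, lam=0.9):
    """L = X - (R - lam * X_n): X and X_n are the sums of the maximum predicted
    values over the batch states and their successor states, R is the sum of
    the batch return values, and lam is the discount factor in (0, 1)."""
    X = model(states).max(dim=1).values.sum()
    with torch.no_grad():                       # successor values used as a target
        X_n = model(next_states).max(dim=1).values.sum()
    R = rewards.sum()
    return X - (R - lam * X_n)
```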
In the embodiment of the present invention, the preset model training ending condition may be that L is smaller than the set loss threshold, or the training iteration number is larger than the set iteration number.
In the embodiment of the present invention, the set loss threshold and the set iteration number may be empirical values. Those skilled in the art know that if the number of training iterations reaches the set iteration number but the loss function has not yet converged, the training parameters were set unreasonably and the number of iterations needs to be increased; the specific handling may follow the prior art.
Further, the method provided by the embodiment of the invention can further comprise the following steps:
S800, input the received scheduling state to be processed into the target flight scheduling prediction model, and obtain and display the corresponding target decision action.
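A short usage sketch of this inference step, reusing the hypothetical helpers introduced earlier (`pairs`, `is_feasible`); the trained model is assumed to return a plain vector of per-action values:

```python
import numpy as np

def predict_swap(state, model, pairs, is_feasible):
    """Hypothetical realization of S800: score every feasible swap for the
    received scheduling state and report the preferred decision action."""
    scores = np.asarray(model(state), dtype=np.float64).copy()
    for a, (i, j) in enumerate(pairs):
        if not is_feasible(state, i, j):
            scores[a] = 0.0
    i, j = pairs[int(scores.argmax())]
    return f"Exchange the take-off order of delayed flights {i} and {j}"
```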
According to the flight recovery modeling method provided by the embodiment of the invention, starting from an initial flight delay state, flight scheduling is performed using reinforcement learning: the flight scheduling problem is treated as a sequential decision process, the current schedule is treated as a state, and exchanging the take-off order of two flights, together with the aircraft assigned to them, is treated as a decision action. For a given scheduling state, a decision action is either selected randomly with probability ε under an ε-greedy strategy, or the agent selects two flights (and the aircraft assigned to them) to exchange according to the delay loss of the current schedule, thereby changing the current scheduling state; a return value is given at the same time to judge whether the action is good or bad. The agent is trained according to the return values, and the converged agent serves as the target agent: when a similar situation occurs again, the trained target agent can be used directly for scheduling prediction to obtain an optimal scheduling scheme, improving scheduling efficiency and accuracy.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing one of the method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.

Claims (3)

1. A method of modeling flight recovery, the method comprising the steps of:
s100, acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information, wherein the initial scheduling state comprises the flight delay information and a flight take-off sequence;
s200, an initial flight scheduling prediction model is built; its input is a scheduling state and its output is the probability prediction value of each decision action executed on the input scheduling state, a decision action being the operation of exchanging the take-off order of any two flights in the input scheduling state; wherein the decision action executed on the input scheduling state satisfies the set constraint conditions;
s300, based on the initial flight scheduling state and the initial flight scheduling prediction model, acquiring a state transition sequence TS = {TS_1, TS_2, ..., TS_j, ..., TS_m}; wherein the j-th state transition sequence TS_j = (S_j, a_j, r_j, φ_j), S_j is the scheduling state corresponding to the j-th state transition sequence, a_j is the target decision action corresponding to S_j, r_j is the return value corresponding to a_j, and φ_j is the termination flag corresponding to the j-th state transition sequence, where φ_1 to φ_{m-1} take a first set value and φ_m takes a second set value; of the scheduling states corresponding to any two adjacent state transition sequences in TS, the latter is obtained by executing the corresponding target decision action on the former; r_j = DL_{j+1} - DL_j, where DL_{j+1} is the delay loss corresponding to the next scheduling state S_{j+1} obtained after executing a_j on S_j, and DL_j is the delay loss corresponding to S_j; j ranges from 1 to m, and m is the number of state transition sequences;
s400, randomly acquiring h state transition sequences from TS as target state transition sequences, and taking the scheduling states corresponding to the target state transition sequences as training samples;
s500, inputting training samples of a current batch into a current flight scheduling prediction model for training to obtain a maximum probability prediction value corresponding to each sample;
s600, acquiring a target decision action corresponding to each sample based on a probability prediction value corresponding to each sample, acquiring a next scheduling state corresponding to each sample based on the acquired target decision action, and inputting the acquired next scheduling state into a current flight scheduling prediction model for training to acquire a maximum probability prediction value corresponding to the next scheduling state corresponding to each sample;
s700, acquiring a current loss function value based on a maximum probability prediction value and a return value of a training sample of a current batch and a maximum probability prediction value corresponding to a next scheduling state corresponding to the training sample of the current batch, judging whether the current loss function value accords with a preset model training ending condition, if so, taking a current flight scheduling prediction model as a target flight scheduling prediction model, if not, adjusting parameters of the current flight scheduling prediction model, and taking the training sample of the next batch as the training sample of the current batch, and executing S500;
the flight delay information at least comprises an ID of the delayed flight, delay time of the delayed flight, flight time of the delayed flight, average riding cost of the delayed flight, number of users of the delayed flight, maximum user bearing capacity of an aircraft corresponding to the delayed flight, aircraft unit delay loss corresponding to the delayed flight and user unit delay loss corresponding to the delayed flight;
wherein DL_j = Σ_{i=1}^{n} (1 + γ_ji)(P_ji + k_ji1 × C_jif + k_ji2 × C_jia + k_ji3 × C_jip), where γ_ji is the importance coefficient of the delayed flight ST_ji corresponding to the ID of the i-th delayed flight in S_j, P_ji is the invisible loss corresponding to ST_ji, C_jif is the profit loss corresponding to ST_ji, C_jia is the aircraft delay loss corresponding to ST_ji, C_jip is the user delay loss corresponding to ST_ji, k_ji1, k_ji2 and k_ji3 are the weights corresponding to C_jif, C_jia and C_jip respectively, and n is the number of delayed flights;
P_ji = v_ji × w_ji × β_ji, where v_ji is the number of delayed users of ST_ji, w_ji is the average riding cost of ST_ji, and β_ji is the user disappointment rate function corresponding to ST_ji;
C_jif = v_ji × Pf_ji × w_ji × Dt_ji / t_ji, where v_ji is the number of delayed users of ST_ji, w_ji is the average riding cost of ST_ji, Pf_ji is the average yield of ST_ji, Dt_ji is the delay time of ST_ji, and t_ji is the flight time of ST_ji;
C_jia = Dt_ji × ΔLF_ji and C_jip = Dt_ji × ΔLP_ji, where Dt_ji is the delay time of ST_ji, ΔLF_ji is the aircraft unit delay loss corresponding to ST_ji, and ΔLP_ji is the user unit delay loss corresponding to ST_ji;
the current loss function value satisfies the following condition:
L = X - (R - λ × X_n), where L is the current loss function value, R is the sum of the return values of the training samples of the current batch, X is the sum of the maximum probability prediction values of all training samples of the current batch, X_n is the sum of the maximum probability prediction values corresponding to the next scheduling states of the training samples of the current batch, and λ is the discount factor, taking a value between 0 and 1;
the set constraint conditions at least comprise the following conditions:
condition 1: the departure time of the flight cannot be earlier than the planned departure time;
condition 2: the number of users delayed by a flight cannot be larger than the maximum user bearing capacity of the aircraft corresponding to the flight;
condition 3: the aircraft can only execute one flight task at the same time;
condition 4: each flight can only be executed once.
2. A non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the method of claim 1.
3. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 2.
CN202310763116.XA 2023-06-27 2023-06-27 Flight recovery modeling method, electronic equipment and storage medium Active CN116502776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310763116.XA CN116502776B (en) 2023-06-27 2023-06-27 Flight recovery modeling method, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310763116.XA CN116502776B (en) 2023-06-27 2023-06-27 Flight recovery modeling method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116502776A (en) 2023-07-28
CN116502776B (en) 2023-08-25

Family

ID=87320584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310763116.XA Active CN116502776B (en) 2023-06-27 2023-06-27 Flight recovery modeling method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116502776B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663262A (en) * 2012-04-27 2012-09-12 中国南方航空股份有限公司 Flight wave property cost accounting method based on immune algorithm
CN108875128A (en) * 2018-05-03 2018-11-23 西安理工大学 A kind of flight recovery modeling method with decision factor
CN109872074A (en) * 2019-03-04 2019-06-11 中国民航大学 Air net delay propagation model and method for building up based on SIS
CN115310732A (en) * 2022-10-12 2022-11-08 珠海翔翼航空技术有限公司 Flight delay prediction method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Itinerary recovery method for delayed passengers based on air-rail intermodal transport; 陆溪 (Lu Xi); 《科学技术与工程》 (Science Technology and Engineering); full text *

Also Published As

Publication number Publication date
CN116502776A (en) 2023-07-28

Similar Documents

Publication Publication Date Title
Feng et al. Heuristic hybrid game approach for fleet condition-based maintenance planning
CN111191934B (en) Multi-target cloud workflow scheduling method based on reinforcement learning strategy
JP7486507B2 (en) Reinforcement learning system and method for inventory management and optimization
CN112685165B (en) Multi-target cloud workflow scheduling method based on joint reinforcement learning strategy
Shou et al. Multi-agent reinforcement learning for Markov routing games: A new modeling paradigm for dynamic traffic assignment
CN111553118B (en) Multi-dimensional continuous optimization variable global optimization method based on reinforcement learning
Chen An intelligent hybrid system for wafer lot output time prediction
JP2020144483A (en) Reinforcement learning method, reinforcement learning program, and reinforcement learning system
CN115081936B (en) Method and device for scheduling observation tasks of multiple remote sensing satellites under emergency condition
CN109063870B (en) Q learning-based combined service strategy optimization method and system
JP7315007B2 (en) LEARNING DEVICE, LEARNING METHOD AND LEARNING PROGRAM
CN114661466A (en) Task unloading method for intelligent workflow application in edge computing environment
CN116451737A (en) PG-W-PSO method for improving particle swarm based on reinforcement learning strategy gradient
Qi et al. Integrating prediction/estimation and optimization with applications in operations management
Wang et al. Logistics-involved task scheduling in cloud manufacturing with offline deep reinforcement learning
Ghosh et al. A deep ensemble method for multi-agent reinforcement learning: A case study on air traffic control
CN116502776B (en) Flight recovery modeling method, electronic equipment and storage medium
KR20220142846A (en) Reinforcement Learning method for load equalization in Gantt planning
Hing et al. Reinforcement learning versus heuristics for order acceptance on a single resource
CN113220437B (en) Workflow multi-target scheduling method and device
CN114648178B (en) Operation and maintenance strategy optimization method of electric energy metering device based on DDPG algorithm
CN115865914A (en) Task unloading method based on federal deep reinforcement learning in vehicle edge calculation
CN115271130A (en) Dynamic scheduling method and system for maintenance order of ship main power equipment
EP3910560A1 (en) Information processing apparatus, method of solving, and solving program
CN114861318A (en) Automatic driving control parameter model training method, parameter obtaining method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant