CN116502776B - Flight recovery modeling method, electronic equipment and storage medium - Google Patents
Flight recovery modeling method, electronic equipment and storage medium
- Publication number
- CN116502776B (application CN202310763116.XA)
- Authority
- CN
- China
- Prior art keywords
- flight
- scheduling
- state
- delay
- delayed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
Abstract
The invention provides a flight recovery modeling method, electronic equipment and a storage medium. The method comprises the following steps: acquiring flight delay information of a target airport, and obtaining an initial scheduling state based on the acquired flight delay information; constructing an initial flight scheduling prediction model; acquiring a state transition sequence set TS based on the initial flight scheduling state and the initial flight scheduling prediction model; randomly selecting h state transition sequences from TS as target state transition sequences, with the scheduling states corresponding to the target state transition sequences serving as training samples; and training the initial flight scheduling prediction model with the training samples to obtain a target flight scheduling prediction model. The invention can improve the efficiency and accuracy of flight scheduling.
Description
Technical Field
The invention relates to the field of civil aviation flight recovery and deep learning research, in particular to a flight recovery modeling method, electronic equipment and a storage medium.
Background
The delayed-flight recovery problem is a real-time optimization problem with numerous and highly complex constraint conditions, and belongs to the class of NP-hard problems. The solution complexity of such dynamic optimization problems grows exponentially with the number of decision and state variables, a computational challenge known as the curse of dimensionality. In recent years, research on flight delay recovery scheduling algorithms has mainly focused on integer programming with column generation, meta-heuristic optimization algorithms, reinforcement learning, and the like.
Integer programming and column generation algorithms build an integer programming model from the constraint conditions to generate a recovery scheduling scheme, but often do not minimize the comprehensive delay loss. Meta-heuristic optimization algorithms establish an objective function and iteratively optimize it to approach the optimal solution, but frequently become trapped in local optima. Deep reinforcement learning instead formulates a Markov decision process and exploits the strong fitting capability of deep neural networks, iteratively learning an optimal policy rather than a single optimal solution; after convergence, the trained neural network tends to solve the same class of problem with faster speed and higher accuracy.
Disclosure of Invention
Aiming at the technical problems, the invention adopts the following technical scheme:
the embodiment of the invention provides a flight recovery modeling method, which comprises the following steps:
s100, acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information, wherein the initial scheduling state comprises the flight delay information and a flight take-off sequence;
S200, an initial flight scheduling prediction model is built; its input is a scheduling state, and its output is the probability prediction value of each decision action executed on the input scheduling state, where a decision action is the operation of exchanging the take-off order of any two flights in the input scheduling state; the decision action executed on the input scheduling state must satisfy the set constraint conditions;
S300, acquiring a state transition sequence set TS = {TS_1, TS_2, …, TS_j, …, TS_m} based on the initial flight scheduling state and the initial flight scheduling prediction model; wherein the j-th state transition sequence TS_j = (S_j, a_j, r_j, φ_j): S_j is the scheduling state corresponding to the j-th state transition sequence, a_j is the target decision action corresponding to S_j, r_j is the return value corresponding to a_j, and φ_j is the termination flag corresponding to the j-th state transition sequence, where φ_1 to φ_(m-1) take a first set value and φ_m takes a second set value. For any two adjacent state transition sequences in TS, the latter scheduling state is obtained by executing the corresponding target decision action on the former scheduling state; r_j = DL_(j+1) − DL_j, where DL_(j+1) is the delay loss corresponding to the next scheduling state S_(j+1) obtained after executing a_j on S_j, and DL_j is the delay loss corresponding to S_j; j ranges from 1 to m, and m is the number of state transition sequences;
S400, randomly acquiring h state transition sequences from TS as target state transition sequences, and taking the scheduling states corresponding to the target state transition sequences as training samples;
s500, inputting training samples of a current batch into a current flight scheduling prediction model for training to obtain a maximum probability prediction value corresponding to each sample;
s600, acquiring a target decision action corresponding to each sample based on a probability prediction value corresponding to each sample, acquiring a next scheduling state corresponding to each sample based on the acquired target decision action, and inputting the acquired next scheduling state into a current flight scheduling prediction model for training to acquire a maximum probability prediction value corresponding to the next scheduling state corresponding to each sample;
s700, acquiring a current loss function value based on a maximum probability prediction value and a return value of a training sample of a current batch and a maximum probability prediction value corresponding to a next scheduling state corresponding to the training sample of the current batch, judging whether the current loss function value accords with a preset model training ending condition, if so, taking the current flight scheduling prediction model as a target flight scheduling prediction model, and if not, adjusting parameters of the current flight scheduling prediction model, and taking the training sample of the next batch as the training sample of the current batch, and executing S500.
Embodiments of the present invention provide a non-transitory computer readable storage medium having stored therein at least one instruction or at least one program loaded and executed by a processor to implement the foregoing method.
An embodiment of the present invention provides an electronic device including a processor and the aforementioned non-transitory computer-readable storage medium.
The invention has at least the following beneficial effects:
according to the flight recovery modeling method provided by the embodiment of the invention, for an initial flight delay state, the flight scheduling is performed by using reinforcement learning, the flight scheduling problem is regarded as a sequence decision process, the condition of the flight scheduling is regarded as a state, and the take-off sequence of exchanging certain two flights or the allocation of aircrafts is regarded as a decision action. For a given scheduling state, a decision action is selected randomly with probability through an epsilon-schedule strategy, or an intelligent agent selects two flights to exchange with the aircraft allocated by the flights according to delay loss of the current scheduling, so that the current scheduling state is changed, and meanwhile, the return value is given to judge whether the action is good or bad. Training the intelligent agent according to the return value, wherein the converged intelligent agent can be used as a target intelligent agent, when similar conditions occur again, the trained target intelligent agent can be directly used for scheduling prediction, an optimal scheduling scheme is obtained, and scheduling efficiency and accuracy can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a flight recovery modeling method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
When large-area delays occur at an airport, a large number of flight vehicles remain at the airport, resulting in economic losses. In order to reduce economic loss caused by flight delay and improve the efficiency of delayed recovery and flight scheduling, the embodiment of the invention provides a deep learning-based flight recovery modeling method, which is used for scheduling delayed flights and detained aircrafts based on a deep reinforcement learning algorithm so as to achieve the purpose of delayed recovery of flights.
The flight recovery modeling method provided by the embodiment of the invention, as shown in fig. 1, can include the following steps:
s100, acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information, wherein the initial scheduling state comprises the flight delay information and a flight take-off sequence.
In the embodiment of the invention, the flight delay information can be acquired through the information release platform of the target airport. In an exemplary embodiment, the flight delay information may include at least: the ID of each delayed flight, its delay time, its flight time, its average riding cost (average fare), its number of users (i.e., number of passengers), the maximum user load capacity of the corresponding aircraft (i.e., maximum passenger number), the aircraft unit delay loss corresponding to the delayed flight, and the user unit delay loss corresponding to the delayed flight.
In the embodiment of the invention, the aircraft unit delay loss corresponding to each delayed flight is the loss incurred by the corresponding aircraft for each unit of time the flight is delayed; the unit of time may be minutes. The user unit delay loss corresponding to each delayed flight is the loss brought to users for each unit of time the flight is delayed; for some important users, for example, a delay produces a corresponding loss. In one exemplary embodiment, aircraft unit delay loss and user unit delay loss may be obtained from the existing literature, for example the method published in "Mixed particle swarm algorithm for flight delay recovery scheduling", Journal of Transportation Engineering, Vol. 8, No. 2, April 2008. In another exemplary embodiment, they may be derived from historical data: for an aircraft, the benefit gained in service divided by the total length of service gives the corresponding aircraft unit delay loss; for a given flight, the ratio of the total user delay loss caused by delays since it entered operation to the total number of delayed users gives the user unit delay loss corresponding to that flight.
In the embodiment of the invention, the initial flight scheduling state can be randomly generated, with the delay information and take-off order of each flight forming one feature vector of the state. For example, the initial state may be represented as S_0 = (F_1, F_2, …, F_s, …, F_n), where F_s is the delay information of the s-th delayed flight in the target airport, F_s = (Dt_s, t_s, v_s, vm_s, w_s, Ca_s, Cp_s, d_s): Dt_s is the delay time of delayed flight s, t_s its flight time, v_s its number of delayed users, vm_s its maximum user load capacity, w_s its average riding cost (i.e., average fare), Ca_s the aircraft unit delay loss corresponding to delayed flight s, Cp_s the user unit delay loss corresponding to delayed flight s, and d_s the take-off order of delayed flight s; s ranges from 1 to n, and n is the number of delayed flights in the target airport.
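As an illustration, the state vector S_0 described above can be assembled as follows. This is a minimal sketch, not the patent's implementation; the dictionary field names (`delay_min`, `passengers`, etc.) are hypothetical labels for the eight features F_s = (Dt_s, t_s, v_s, vm_s, w_s, Ca_s, Cp_s, d_s):

```python
import random

def make_initial_state(delayed_flights, seed=None):
    """Build S_0 = (F_1, ..., F_n): one 8-feature tuple per delayed flight,
    with a randomly generated initial take-off order d_s."""
    rng = random.Random(seed)
    order = list(range(1, len(delayed_flights) + 1))
    rng.shuffle(order)  # random initial take-off sequence
    state = []
    for f, d in zip(delayed_flights, order):
        # (Dt_s, t_s, v_s, vm_s, w_s, Ca_s, Cp_s, d_s)
        state.append((f["delay_min"], f["flight_min"], f["passengers"],
                      f["max_capacity"], f["avg_fare"],
                      f["aircraft_unit_loss"], f["passenger_unit_loss"], d))
    return state
```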
S200, an initial flight scheduling prediction model is built, the input of the initial flight scheduling prediction model is a scheduling state, the probability prediction value of the decision action executed on the input scheduling state is output, the probability prediction value represents the preference degree of the corresponding decision action, and the larger the probability prediction value is, the larger the preference degree of the corresponding decision action is represented. The decision action is the operation of exchanging the take-off sequence of any two flights in the input scheduling state; wherein, the decision action executed on the input shift state meets the set constraint condition.
In an embodiment of the present invention, the framework of the initial flight scheduling prediction model may be a neural network. In one exemplary embodiment, the network structure may be an ANN, RNN, LSTM, Transformer, or the like.
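A minimal sketch of such a prediction model is shown below, using a one-hidden-layer feed-forward network in NumPy rather than any of the recurrent architectures named above (an assumption for compactness). The input is the flattened state (8 features per flight) and the output is one probability prediction value per pairwise swap action:

```python
import numpy as np

def init_model(n_flights, hidden=32, seed=0):
    rng = np.random.default_rng(seed)
    n_actions = n_flights * (n_flights - 1) // 2  # one action per flight pair
    d_in = n_flights * 8                          # 8 features per flight
    return {"W1": rng.normal(0.0, 0.1, (d_in, hidden)), "b1": np.zeros(hidden),
            "W2": rng.normal(0.0, 0.1, (hidden, n_actions)), "b2": np.zeros(n_actions)}

def predict(model, state):
    """Map a scheduling state to a probability prediction value per decision action."""
    x = np.asarray(state, dtype=float).ravel()
    h = np.tanh(x @ model["W1"] + model["b1"])
    z = h @ model["W2"] + model["b2"]
    e = np.exp(z - z.max())       # softmax over swap actions
    return e / e.sum()
```

The softmax output matches the description's reading of the value as a preference degree: larger values indicate preferred decision actions.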
In the embodiment of the invention, when a decision action is executed on the input scheduling state, the corresponding feature vectors are updated accordingly: changing the take-off order changes the corresponding delay times, and updating keeps every feature vector in every scheduling state accurate.
In an embodiment of the present invention, the set constraint condition may at least include the following conditions:
condition 1: the departure time of the flight cannot be earlier than the planned departure time;
condition 2: the number of users delayed by a flight cannot be larger than the maximum user bearing capacity of the aircraft corresponding to the flight;
condition 3: the aircraft can only execute one flight task at the same time;
condition 4: each flight can only be executed once.
In the embodiment of the invention, requiring that decision actions executed on the input scheduling state satisfy the set constraint conditions removes actions that cannot be executed, and can be implemented through a screening mechanism function f(x). If a decision action on the input scheduling state satisfies the set constraint conditions, its probability prediction value is the value computed by the model, unchanged; otherwise its probability prediction value is set to 0, i.e., the action is not selected.
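The screening function f(x) can be sketched as a mask that zeroes the probability prediction value of any action violating conditions 1-4. The `valid` flags are assumed to be computed elsewhere from those constraints; this is an illustrative sketch, not the patent's f(x):

```python
def mask_invalid_actions(probs, valid):
    """f(x): keep the model's value for feasible actions, zero it otherwise."""
    return [p if ok else 0.0 for p, ok in zip(probs, valid)]

def best_valid_action(probs, valid):
    """Index of the feasible action with the maximum probability prediction value."""
    masked = mask_invalid_actions(probs, valid)
    return max(range(len(masked)), key=masked.__getitem__)
```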
S300, acquiring a state transition sequence set TS = {TS_1, TS_2, …, TS_j, …, TS_m} based on the initial flight scheduling state and the initial flight scheduling prediction model; wherein the j-th state transition sequence TS_j = (S_j, a_j, r_j, φ_j): S_j is the scheduling state corresponding to the j-th state transition sequence; a_j is the target decision action corresponding to S_j, i.e., among the decision actions executable on S_j, the one with the maximum probability prediction value; r_j is the return value corresponding to a_j; φ_j is the termination flag corresponding to the j-th state transition sequence, where φ_1 to φ_(m-1) take a first set value, e.g., 0, and φ_m takes a second set value, e.g., 1. For any two adjacent state transition sequences in TS, the latter scheduling state is obtained by executing the corresponding target decision action on the former; r_j = DL_(j+1) − DL_j, where DL_(j+1) is the delay loss of the next scheduling state S_(j+1) obtained after executing a_j on S_j and DL_j is the delay loss of S_j; j ranges from 1 to m, and m is the number of state transition sequences.
Further, S300 specifically includes:
S301, setting j = 1 and c = 0, where c is a counter recording the number of iterations;
S302, inputting the current scheduling state S_j into the initial flight scheduling prediction model to obtain the corresponding output X_j = (X_j1, X_j2, …, X_jh, …, X_jn), where X_jh is the probability prediction value obtained by the model for executing the h-th decision action on S_j; obviously S_1 is the initial scheduling state, i.e., S_1 = S_0.
S303, taking the decision action corresponding to the maximum probability prediction value in X_j as the target decision action for S_j, obtaining the next scheduling state S_(j+1), and obtaining r_j.
S304, setting c = c + 1; if S_j is the optimal scheduling state or c = C_0, set φ_j = 1, add TS_j to the current TS, and exit the current procedure; otherwise set φ_j = 0, add TS_j to the current TS, and execute S305. The initial value of TS is the empty set, and C_0 is the set iteration-count threshold, which may be an empirical value.
In the embodiment of the invention, S_j is the optimal scheduling state when the following condition is satisfied: the return value obtained for every decision action executed on S_j is negative.
S305, setting j = j + 1, and executing S302.
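Steps S301-S305 amount to a greedy rollout. The sketch below passes the model, the environment step, and the delay-loss computation in as functions (all hypothetical names), and checks only the c = C_0 termination condition; the optimal-state test of S304 is omitted for brevity:

```python
def collect_transitions(predict_fn, step_fn, delay_loss_fn, s0, c0):
    """Roll out from S_1 = S_0, taking the argmax decision action each step,
    and record TS_j = (S_j, a_j, r_j, phi_j) until c reaches C_0."""
    ts, s, c = [], s0, 0
    while True:
        probs = predict_fn(s)
        a = max(range(len(probs)), key=probs.__getitem__)
        s_next = step_fn(s, a)
        # r_j = DL_{j+1} - DL_j, per the sign convention stated in S300
        r = delay_loss_fn(s_next) - delay_loss_fn(s)
        c += 1
        phi = 1 if c >= c0 else 0
        ts.append((s, a, r, phi))
        if phi:
            return ts
        s = s_next
```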
In the embodiment of the invention, the return value measures the future quality of a decision action and is mainly related to flight delay loss: if an action increases the delay loss, the return is negative; otherwise it is positive. The magnitude of the return relates not only to the state reached after the action is executed, but also to the best state reachable from that state.
Further, in the embodiment of the present invention, the delay loss corresponding to each scheduling state satisfies: DL_j = Σ_{i=1}^{n} (1 + γ_ji)(P_ji + k_ji1×C_jif + k_ji2×C_jia + k_ji3×C_jip), where γ_ji is the importance coefficient of the delayed flight ST_ji corresponding to the ID of the i-th delayed flight in S_j; P_ji is the invisible loss corresponding to ST_ji; C_jif is the profit loss (of the operating airline) corresponding to ST_ji; C_jia is the aircraft delay loss corresponding to ST_ji; C_jip is the user delay loss corresponding to ST_ji; k_ji1, k_ji2 and k_ji3 are the weights of C_jif, C_jia and C_jip respectively; and n is the number of delayed flights. In the embodiment of the invention, γ_ji, k_ji1, k_ji2 and k_ji3 can be determined according to the actual situation. For example, the more important a flight, the larger its importance coefficient. As another example, if a flight's profit loss matters more than its user delay loss, the weight of the profit loss is set larger than that of the user delay loss.
In the embodiment of the invention, the invisible loss is mainly determined by the probability that passengers will no longer choose civil aviation travel because of the flight delay; specifically, P_ji = v_ji × w_ji × β_ji, where v_ji is the number of delayed users of ST_ji, w_ji is the average riding cost of ST_ji, and β_ji is the user disappointment rate function corresponding to ST_ji. In one exemplary embodiment, β_ji = [(ΔLF_ji/60)^2]^(1/3)/29, with 0 ≤ β_ji ≤ 1, where ΔLF_ji is the aircraft unit delay loss corresponding to ST_ji.
Further, C_jif = v_ji × Pf_ji × w_ji × Dt_ji / t_ji, where Pf_ji is a profit coefficient of ST_ji that can be obtained from the corresponding profit record table, Dt_ji is the delay time of ST_ji, and t_ji is the flight time of ST_ji.
Further, C_jia = Dt_ji × ΔLF_ji and C_jip = Dt_ji × ΔLP_ji, where Dt_ji is the delay time of ST_ji and ΔLP_ji is the user unit delay loss corresponding to ST_ji.
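Putting the formulas above together, the delay loss of a scheduling state can be computed as in the following sketch. The dictionary field names and the way the coefficients γ and k_1-k_3 are passed in are assumptions made for illustration:

```python
def disappointment_rate(dlf):
    """beta = [(dLF/60)^2]^(1/3) / 29, clipped to the interval [0, 1]."""
    return min(1.0, ((dlf / 60.0) ** 2) ** (1.0 / 3.0) / 29.0)

def delay_loss(flights, k1, k2, k3):
    """DL = sum_i (1 + gamma_i) * (P_i + k1*C_f + k2*C_a + k3*C_p)."""
    total = 0.0
    for f in flights:
        beta = disappointment_rate(f["dlf"])
        p_invisible = f["v"] * f["w"] * beta                 # P = v * w * beta
        c_f = f["v"] * f["pf"] * f["w"] * f["dt"] / f["t"]   # airline profit loss
        c_a = f["dt"] * f["dlf"]                             # aircraft delay loss
        c_p = f["dt"] * f["dlp"]                             # user delay loss
        total += (1.0 + f["gamma"]) * (p_invisible + k1 * c_f + k2 * c_a + k3 * c_p)
    return total
```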
S400, randomly acquiring h state transition sequences from the TS to serve as target state transition sequences, and taking a scheduling state corresponding to the target state transition sequences as a training sample.
S500, inputting the training samples of the current batch into the current flight scheduling prediction model for training, and obtaining the maximum probability prediction value corresponding to each sample.
S600, based on the probability prediction value corresponding to each sample, acquiring a target decision action corresponding to each sample, acquiring a next scheduling state corresponding to each sample based on the acquired target decision action, and inputting the acquired next scheduling state into a current flight scheduling prediction model for training to obtain a maximum probability prediction value corresponding to the next scheduling state corresponding to each sample.
S700, acquiring a current loss function value based on a maximum probability prediction value and a return value of a training sample of a current batch and a maximum probability prediction value corresponding to a next scheduling state corresponding to the training sample of the current batch, judging whether the current loss function value accords with a preset model training ending condition, if so, taking the current flight scheduling prediction model as a target flight scheduling prediction model, and if not, adjusting parameters of the current flight scheduling prediction model, and taking the training sample of the next batch as the training sample of the current batch, and executing S500.
In the embodiment of the present invention, the current loss function value satisfies the following condition:
L = X − (R − λ×X_n), where L is the current loss function value, R is the sum of the return values of the training samples of the current batch, X is the sum of the maximum probability prediction values of all training samples of the current batch, X_n is the sum of the maximum probability prediction values corresponding to the next scheduling states of the training samples of the current batch, and λ is a discount factor with a value between 0 and 1.
In the embodiment of the invention, the current loss function value takes the long-term return into account, which avoids sacrificing long-term return by blindly selecting the action with the largest immediate return. Furthermore, introducing the discount factor balances the bias and variance of the estimate caused by over-simplifying future rewards.
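As a literal transcription of the formula above, the batch loss of step S700 can be sketched as follows (the function name and the default λ are illustrative assumptions):

```python
def batch_loss(max_probs, returns, next_max_probs, lam=0.9):
    """L = X - (R - lambda * X_n), with X, R, X_n summed over the current batch."""
    X = sum(max_probs)          # max probability prediction values of the batch
    R = sum(returns)            # return values of the batch
    Xn = sum(next_max_probs)    # max values for the next scheduling states
    return X - (R - lam * Xn)
```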
In the embodiment of the present invention, the preset model training ending condition may be that L is smaller than the set loss threshold, or the training iteration number is larger than the set iteration number.
In the embodiment of the present invention, the set loss threshold and the set iteration count may be empirical values. As those skilled in the art know, if the training count reaches the set iteration number but the loss function has not yet converged, the training parameters were set unreasonably and the iteration count needs to be increased; the specific implementation may follow the prior art.
Further, the method provided by the embodiment of the invention can further comprise the following steps:
S800, inputting a received scheduling state to be processed into the target flight scheduling prediction model, and obtaining and displaying the corresponding target decision action.
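At inference time, the argmax output of the target model must be decoded back into the pair of flights whose take-off order is exchanged. One possible flat indexing of the n(n−1)/2 pair actions is sketched below; the patent does not specify this encoding, so it is an illustrative assumption:

```python
def action_to_pair(a, n):
    """Map a flat action index to the flight pair (i, j), i < j, to swap."""
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            if k == a:
                return (i, j)
            k += 1
    raise IndexError(f"action {a} out of range for {n} flights")

def apply_swap(order, a):
    """Execute the decision action: exchange the take-off order of two flights."""
    i, j = action_to_pair(a, len(order))
    new_order = list(order)
    new_order[i], new_order[j] = new_order[j], new_order[i]
    return new_order
```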
According to the flight recovery modeling method provided by the embodiment of the invention, starting from an initial flight delay state, flight scheduling is performed with reinforcement learning: the scheduling problem is treated as a sequential decision process, the current scheduling situation is treated as a state, and exchanging the take-off order of two flights (or the aircraft assigned to them) is treated as a decision action. For a given scheduling state, a decision action is either selected randomly with some probability via an ε-greedy exploration strategy, or the agent selects two flights whose take-off order and assigned aircraft are exchanged according to the delay loss of the current schedule, thereby changing the current scheduling state; at the same time, a return value is given to judge whether the action was good or bad. The agent is trained on these return values, and the converged agent serves as the target agent: when a similar situation occurs again, the trained target agent can be used directly for scheduling prediction to obtain an optimal scheduling scheme, improving scheduling efficiency and accuracy.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium, which may be disposed in an electronic device to store at least one instruction or at least one program, the at least one instruction or at least one program being loaded and executed by a processor to implement the method provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.
Claims (3)
1. A method of modeling flight recovery, the method comprising the steps of:
s100, acquiring flight delay information of a target airport, and acquiring an initial scheduling state based on the acquired flight delay information, wherein the initial scheduling state comprises the flight delay information and a flight take-off sequence;
s200, constructing an initial flight scheduling prediction model, the input of which is a scheduling state and the output of which is the probability prediction value of a decision action executed on the input scheduling state, the decision action being an operation of exchanging the take-off order of any two flights in the input scheduling state; wherein the decision action executed on the input scheduling state meets the set constraint conditions;
s300, based on the initial scheduling state and the initial flight scheduling prediction model, acquiring a state transition sequence set TS = {TS_1, TS_2, …, TS_j, …, TS_m}; wherein the j-th state transition sequence TS_j = (S_j, a_j, r_j, φ_j), S_j is the scheduling state corresponding to the j-th state transition sequence, a_j is the target decision action corresponding to S_j, r_j is the return value corresponding to a_j, and φ_j is the termination flag corresponding to the j-th state transition sequence, where φ_1 to φ_{m−1} take a first set value and φ_m takes a second set value; wherein, for any two adjacent state transition sequences in TS, the latter scheduling state is obtained by executing the corresponding target decision action on the former scheduling state, r_j = DL_{j+1} − DL_j, where DL_{j+1} is the delay loss corresponding to the next scheduling state S_{j+1} obtained after executing a_j on S_j, and DL_j is the delay loss corresponding to S_j; j ranges from 1 to m, and m is the number of state transition sequences;
s400, randomly acquiring h state transition sequences from TS as target state transition sequences, and taking the scheduling states corresponding to the target state transition sequences as training samples;
s500, inputting training samples of a current batch into a current flight scheduling prediction model for training to obtain a maximum probability prediction value corresponding to each sample;
s600, acquiring a target decision action corresponding to each sample based on a probability prediction value corresponding to each sample, acquiring a next scheduling state corresponding to each sample based on the acquired target decision action, and inputting the acquired next scheduling state into a current flight scheduling prediction model for training to acquire a maximum probability prediction value corresponding to the next scheduling state corresponding to each sample;
s700, acquiring a current loss function value based on a maximum probability prediction value and a return value of a training sample of a current batch and a maximum probability prediction value corresponding to a next scheduling state corresponding to the training sample of the current batch, judging whether the current loss function value accords with a preset model training ending condition, if so, taking a current flight scheduling prediction model as a target flight scheduling prediction model, if not, adjusting parameters of the current flight scheduling prediction model, and taking the training sample of the next batch as the training sample of the current batch, and executing S500;
the flight delay information at least comprises an ID of the delayed flight, delay time of the delayed flight, flight time of the delayed flight, average riding cost of the delayed flight, number of users of the delayed flight, maximum user bearing capacity of an aircraft corresponding to the delayed flight, aircraft unit delay loss corresponding to the delayed flight and user unit delay loss corresponding to the delayed flight;
wherein DL_j = ∑_{i=1}^{n} (1 + γ_{ji})(P_{ji} + k_{ji1}×C_{jif} + k_{ji2}×C_{jia} + k_{ji3}×C_{jip}), where γ_{ji} is the importance coefficient of the delayed flight ST_{ji} corresponding to the ID of the i-th delayed flight in S_j, P_{ji} is the invisible loss corresponding to ST_{ji}, C_{jif} is the profit loss corresponding to ST_{ji}, C_{jia} is the aircraft delay loss corresponding to ST_{ji}, C_{jip} is the user delay loss corresponding to ST_{ji}, k_{ji1}, k_{ji2} and k_{ji3} are the weights corresponding to C_{jif}, C_{jia} and C_{jip} respectively, and n is the number of delayed flights;
P_{ji} = v_{ji} × w_{ji} × β_{ji}, where v_{ji} is the number of delayed users of ST_{ji}, w_{ji} is the average riding cost of ST_{ji}, and β_{ji} is the user disappointment rate function corresponding to ST_{ji};
C_{jif} = v_{ji} × Pf_{ji} × w_{ji} × Dt_{ji} / t_{ji}, where v_{ji} is the number of delayed users of ST_{ji}, w_{ji} is the average riding cost of ST_{ji}, Pf_{ji} is the average profit rate of ST_{ji}, Dt_{ji} is the delay time of ST_{ji}, and t_{ji} is the flight time of ST_{ji};
C_{jia} = Dt_{ji} × ΔLF_{ji}, C_{jip} = Dt_{ji} × ΔLP_{ji}, where Dt_{ji} is the delay time of ST_{ji}, ΔLF_{ji} is the aircraft unit delay loss corresponding to ST_{ji}, and ΔLP_{ji} is the user unit delay loss corresponding to ST_{ji};
the current loss function value satisfies the following condition:
L = X − (R − λ×X_n), where L is the current loss function value, R is the sum of the return values of the training samples of the current batch, X is the sum of the maximum probability prediction values of all the training samples of the current batch, X_n is the sum of the maximum probability prediction values corresponding to the next scheduling states of the training samples of the current batch, and λ is a discount factor with a value between 0 and 1;
the set constraint conditions at least comprise the following conditions:
condition 1: the departure time of the flight cannot be earlier than the planned departure time;
condition 2: the number of users delayed by a flight cannot be larger than the maximum user bearing capacity of the aircraft corresponding to the flight;
condition 3: the aircraft can only execute one flight task at the same time;
condition 4: each flight can only be executed once.
2. A non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the method of claim 1.
3. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 2.
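The batch loss condition in claim 1 can be sketched as follows. This is a minimal sketch of the stated condition L = X − (R − λ×X_n) only; the prediction model that produces the maximum probability prediction values is not reproduced, and the function and parameter names (`batch_loss`, `q_max`, `q_next_max`, `lam`) are assumptions.

```python
def batch_loss(q_max, returns, q_next_max, lam=0.9):
    """Current loss function value per the claim: L = X - (R - lam * X_n),
    where X sums the maximum probability predictions of the current batch,
    R sums the return values, and X_n sums the maximum predictions for the
    corresponding next scheduling states; lam is the discount factor in (0, 1).
    """
    X = sum(q_max)
    R = sum(returns)
    Xn = sum(q_next_max)
    return X - (R - lam * Xn)
```

For example, with batch predictions summing to 3, returns summing to 1, next-state predictions summing to 2, and λ = 0.5, the loss is 3 − (1 − 1) = 3.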
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310763116.XA CN116502776B (en) | 2023-06-27 | 2023-06-27 | Flight recovery modeling method, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116502776A CN116502776A (en) | 2023-07-28 |
CN116502776B true CN116502776B (en) | 2023-08-25 |
Family
ID=87320584
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310763116.XA Active CN116502776B (en) | 2023-06-27 | 2023-06-27 | Flight recovery modeling method, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116502776B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102663262A (en) * | 2012-04-27 | 2012-09-12 | 中国南方航空股份有限公司 | Flight wave property cost accounting method based on immune algorithm |
CN108875128A (en) * | 2018-05-03 | 2018-11-23 | 西安理工大学 | A kind of flight recovery modeling method with decision factor |
CN109872074A (en) * | 2019-03-04 | 2019-06-11 | 中国民航大学 | Air net delay propagation model and method for building up based on SIS |
CN115310732A (en) * | 2022-10-12 | 2022-11-08 | 珠海翔翼航空技术有限公司 | Flight delay prediction method and system |
Non-Patent Citations (1)
Title |
---|
Itinerary recovery method for delayed passengers based on air-rail intermodal transport; Lu Xi; Science Technology and Engineering; full text *
Also Published As
Publication number | Publication date |
---|---|
CN116502776A (en) | 2023-07-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||