CN115688971A - Line network passenger flow control and train adjustment collaborative optimization method under train delay - Google Patents

Publication number
CN115688971A
Authority
CN
China
Prior art keywords
train
delay
flow control
passenger flow
station
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211163621.2A
Other languages
Chinese (zh)
Other versions
CN115688971B (en)
Inventor
郭建媛
卢伟康
秦勇
贾利民
孙方
孙琦
王月玥
唐雨昕
李�杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University
Priority to CN202211163621.2A
Publication of CN115688971A
Application granted
Publication of CN115688971B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Train Traffic Observation, Control, And Security (AREA)

Abstract

The invention provides a collaborative optimization method for line network passenger flow control and train adjustment under train delay. The method comprises: acquiring the time, position and duration characteristics of train delays, clustering the delay scenarios of the railway line network, and iteratively generating random delay scenarios; constructing a railway line network passenger flow control offline model and performing station-entry offline reinforcement learning training on it with the delay scenarios to obtain an optimized offline model; and generating an online training environment from the actual delay scenario and performing online reinforcement learning training on the optimized offline model in that environment to obtain a collaborative optimization scheme for railway line network passenger flow control and train operation adjustment. When an uncertain train delay occurs, the invention accounts for passenger behavior and train operation plans at the network level, provides a specific scheme for passenger flow control and train operation, and improves the service level of network operation.

Description

Line network passenger flow control and train adjustment collaborative optimization method under train delay
Technical Field
The invention relates to the technical field of urban rail transit operation organization, in particular to a cooperative optimization method for network passenger flow control and train adjustment under train delay.
Background
Because it is fast and environmentally friendly, urban rail transit has developed rapidly and now attracts and carries a large share of urban trips. During operation, trains may be disrupted and delayed for various reasons, such as faults in train components, signals or the power supply. Passenger volumes in urban rail transit are large, train frequency is high and loads are heavy, so a delayed train disturbs the normal operation of other trains on the line, easily causes passengers to accumulate, and creates safety problems.
At present, the conventional catch-up operation adjustment method for delayed trains in the prior art only compresses running time and shortens dwell time, while neglecting passenger flow characteristics. In recent years, adjustment strategies including skip-stop operation and passenger flow control have gradually been adopted in urban rail transit operation and have achieved good practical results.
The drawbacks of the conventional catch-up operation adjustment method for delayed trains in the prior art include the following: planning relies mainly on dispatcher experience and qualitative contingency plans, and is seriously deficient in global scope and accuracy. At present, most research on the collaborative optimization of passenger flow control and train skip-stop operation focuses on daily morning and evening peak conditions, and the linear programming, quadratic programming and nonlinear combinatorial optimization models currently used suit only a specific scenario or example, so the conditions for applying these models are hard to satisfy when a train delay actually occurs.
Disclosure of Invention
The embodiment of the invention provides a cooperative optimization method for network passenger flow control and train adjustment under train delay, so as to effectively improve the railway network operation service level.
In order to achieve the purpose, the invention adopts the following technical scheme.
A method for cooperative optimization of network passenger flow control and train adjustment under train delay comprises the following steps:
acquiring time, position and duration characteristics of train delay occurrence, performing clustering iteration on a railway line network delay scene according to the time, position and duration characteristics of the train delay occurrence, and randomly generating a delay scene;
constructing a railway line network passenger flow control offline model, and performing station-entering offline reinforcement learning training on the railway line network passenger flow control offline model by using the delay scene to obtain an optimized railway line network passenger flow control offline model;
and generating an online training environment according to an actual delay occurrence scene, and performing reinforcement learning online training on the optimized railway line network passenger flow control offline model by using the online training environment to obtain a railway line network passenger flow control and operation adjustment collaborative optimization scheme.
Preferably, the obtaining of the characteristics of time, position and duration of occurrence of train delay includes:
according to the specific time at which the train delay occurs, determining which peak period it belongs to by using the peak-period set HL = {morning peak, secondary morning peak, evening peak, secondary evening peak, midday peak, off-peak};
according to the actual position of the train delay, calculating the distance from the delay location to the starting station and the terminal station of the delayed line and the station number of the delay location, and determining the running direction at the delay location, where the up direction is D = 1 and the down direction is D = 2;
according to the time t from the onset of the train delay to the recovery of operation, determining the duration level of the delay by using the k duration levels $TL = \{\, t \le t_1,\ t_1 < t \le t_2,\ t_2 < t \le t_3,\ \ldots,\ t_k < t \,\}$.
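The three delay features above can be sketched as a small classifier. This is only an illustration: the clock boundaries in `PEAK_WINDOWS`, the thresholds in `DURATION_THRESHOLDS`, and all function and class names are hypothetical, since the text names the categories but gives no numeric values.

```python
from bisect import bisect_left
from dataclasses import dataclass

# Hypothetical peak windows in minutes after midnight; the text lists the
# categories of HL but not their clock boundaries.
PEAK_WINDOWS = [
    (7 * 60, 9 * 60, "morning peak"),
    (9 * 60, 10 * 60, "secondary morning peak"),
    (11 * 60 + 30, 13 * 60, "midday peak"),
    (17 * 60, 19 * 60, "evening peak"),
    (19 * 60, 20 * 60, "secondary evening peak"),
]

# Hypothetical thresholds t_1 < t_2 < t_3 in minutes.
DURATION_THRESHOLDS = [5, 10, 20]


@dataclass(frozen=True)
class DelayFeatures:
    peak: str
    station_index: int
    direction: int       # D = 1 for up, D = 2 for down
    duration_level: int  # 0 means t <= t_1, the last level means t > t_k


def peak_period(minute_of_day):
    """Map a clock time to one of the HL categories."""
    for start, end, label in PEAK_WINDOWS:
        if start <= minute_of_day < end:
            return label
    return "off-peak"


def duration_level(t, thresholds=DURATION_THRESHOLDS):
    """Index of the interval of TL that contains t (upper bounds inclusive)."""
    return bisect_left(thresholds, t)


def extract_features(minute_of_day, station_index, is_up, duration):
    return DelayFeatures(
        peak=peak_period(minute_of_day),
        station_index=station_index,
        direction=1 if is_up else 2,
        duration_level=duration_level(duration),
    )
```

For example, `extract_features(8 * 60, 7, False, 6)` describes a down-direction delay of 6 minutes at station 7 during the assumed morning peak window.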
Preferably, the clustering iteration of the delay scenes of the railway line network is performed according to the characteristics of the time, the position and the duration of the occurrence of train delay, and the random generation of the delay scenes comprises:
randomly setting passenger travel and train delay according to a probability function, constructing initial train adjustment, and constructing an off-line training environment;
setting constraint conditions of passenger flow control and train adjustment according to the time, position and duration characteristics of the occurrence of train delay, repeatedly interacting the initial train adjustment and an offline training environment based on the constraint conditions of the passenger flow control and the train adjustment, performing reinforcement learning offline training, and outputting an offline model;
and the reinforcement learning offline training is iterated repeatedly under different offline training environments, and a delay scene is generated randomly.
Preferably, the randomly setting passenger travel and train delay according to the probability function includes:
according to the characteristics of passenger arrivals at a station, the number of passengers arriving at the service facility within a given period is modelled, and the randomness of the number of arrivals at the station is described by a Poisson distribution, whose probability function is:
$$P(X = k) = \frac{\lambda^{k}}{k!}\, e^{-\lambda}$$
where the parameter λ is the expected number of occurrences of the random event per unit time, used here to describe the average number of passengers arriving at the station per unit time, and k is the number of passengers;
according to the station-entry walking and transfer walking processes of passengers, the station-entry walking time and the transfer walking time of passengers in the rail transit station are described by a normal distribution, whose probability density function is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)$$
where x is a random variable following a normal distribution with mathematical expectation μ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$; μ is the expected walking time of the passenger, and x is distributed within [μ − v, μ + v];
according to the uncertainty of the train delay duration, when a train breaks down at a station and causes a delay, the delay time of the train at that position follows a normal distribution, whose probability density function is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)$$
where x is a random variable following a normal distribution with mathematical expectation μ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$; μ is the expected delay time of the train at the station, and x is distributed within [μ − ω, μ + ω].
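A minimal sketch of how these random inputs might be sampled when generating a training scenario. The function names and the numeric parameters are assumptions for illustration; the restriction of the normal variables to [μ − v, μ + v] and [μ − ω, μ + ω] is implemented here by simple rejection sampling, which the text does not prescribe.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_bounded_normal(mu, sigma, half_width, rng):
    """Draw from N(mu, sigma^2), rejecting values outside mu +/- half_width."""
    while True:
        x = rng.gauss(mu, sigma)
        if abs(x - mu) <= half_width:
            return x


rng = random.Random(0)
# Hypothetical parameters: 4 arrivals per unit time on average, a 90 s walk
# within +/- 30 s, and a 300 s delay within +/- 120 s.
arrivals = [sample_poisson(4.0, rng) for _ in range(2000)]
walk_time = sample_bounded_normal(90.0, 20.0, 30.0, rng)
delay_time = sample_bounded_normal(300.0, 60.0, 120.0, rng)
```

Knuth's method is adequate for the small arrival rates used here; for large λ a different sampler would be preferable because the loop length grows with λ.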
Preferably, the constraint conditions of passenger flow control and train regulation include:
(1) Passenger flow control constraints
Figure BDA0003861249800000041
where
Figure BDA0003861249800000042
in order to count the number of passengers arriving at the station,
Figure BDA0003861249800000043
gamma is the minimum passenger flow control coefficient for the number of passengers arriving at the station;
(2) Train capacity constraint
Figure BDA0003861249800000044
where
Figure BDA0003861249800000045
is the maximum load factor and $C_i$ is the rated passenger capacity of train i;
(3) Train skip-stop constraints
Figure BDA0003861249800000046
Figure BDA0003861249800000047
where M is the set of time periods, $N_0$ is the set of stations at which skip-stop operation is allowed, and I is the set of trains;
(4) Train operation constraints
Figure BDA0003861249800000048
Figure BDA0003861249800000049
where
Figure BDA00038612498000000410
is the dwell time of train i at station j+1,
Figure BDA00038612498000000411
is the departure time of train i at station j,
Figure BDA00038612498000000412
is the minimum running time from station j to station j+1, and
Figure BDA00038612498000000413
is the minimum dwell time of the train at station j;
(5) Platform infrastructure capacity constraints
Figure BDA00038612498000000414
where $s_n$ is the effective area of the platform at station n, ρ is the maximum passenger flow density, and η is the maximum capacity coefficient of the platform;
(6) Passenger transfer time constraints
Figure BDA0003861249800000051
where
Figure BDA0003861249800000052
is the time at which the passenger alights, and
Figure BDA0003861249800000053
is the walking time of the passenger;
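The constraint formulas themselves are not reproduced in this text, so the following is only a sketch of how three of them could be checked in code, assuming the usual forms: dwell and running times must not fall below their minimums, the number of passengers on board must not exceed the maximum load factor times the rated capacity, and the number of waiting passengers must not exceed the product of platform area, maximum density and capacity coefficient. All function names are hypothetical.

```python
def operation_violations(arrive, depart, min_run, min_dwell):
    """List violated dwell/running-time minimums for one train (seconds)."""
    problems = []
    for j in range(len(arrive)):
        if depart[j] - arrive[j] < min_dwell[j]:
            problems.append(f"dwell at station {j} below minimum")
        if j + 1 < len(arrive) and arrive[j + 1] - depart[j] < min_run[j]:
            problems.append(f"run {j}->{j + 1} below minimum")
    return problems


def within_train_capacity(onboard, rated_capacity, max_load_factor):
    """Assumed form: onboard <= max load factor * rated capacity."""
    return onboard <= max_load_factor * rated_capacity


def within_platform_capacity(waiting, platform_area, max_density, capacity_coeff):
    """Assumed form: waiting <= eta * rho * s_n."""
    return waiting <= capacity_coeff * max_density * platform_area
```

A timetable produced by any adjustment step can be passed through `operation_violations` before it is handed to the training environment; an empty list means the assumed operation constraints hold.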
Preferably, repeatedly interacting the initial train adjustment with the offline training environment based on the constraint conditions of passenger flow control and train adjustment to perform the reinforcement learning offline training includes:
step 1: setting the initial training count n = 0 and starting the training;
step 2: initializing the delayed train operation plan, the time period m, the state s and the reward r;
step 3: setting m = m + 1, and jumping to step 8 when m equals M;
step 4: selecting a station on the line network, traversing the trains arriving at the station, selecting an action according to the current state, and storing the actions as an action packet;
step 5: inputting the current action packet and the current state, interacting with the environment, and obtaining the next state, the reward value and the number of passengers exceeding the platform limit from the environment function;
step 6: recording the current state s, the action a, the next state s' and the reward value r in the memory bank;
step 7: feeding the data recorded in the memory bank into the network for training and updating the state to obtain the reward value of the corresponding time period, adding the reward value to the reward r, and jumping to step 3;
step 8: setting n = n + 1; if n < G, jumping to step 2; otherwise, ending the training.
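The control flow of steps 1 to 8 can be sketched as follows. `ToyEnv` and `RandomAgent` are hypothetical stand-ins so that the loop runs on its own: the environment is a single station with made-up numbers, and the agent picks random actions and only counts its learning calls instead of updating a network. Step 4 is collapsed to one station and one action per period.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical one-station environment, not the patent's simulator."""

    def reset(self):
        return 100  # state: passengers wishing to enter in the first period

    def step(self, state, action):
        admitted = int(state * (1 - action))    # action = share held outside
        over_limit = max(0, admitted - 80)      # made-up platform limit of 80
        reward = -over_limit - 0.1 * (state - admitted)
        next_state = 100 + (state - admitted)   # held passengers re-queue
        return next_state, reward, over_limit


class RandomAgent:
    """Placeholder agent: random actions, learn() only counts its calls."""

    ACTIONS = (0.0, 0.1, 0.2, 0.3)

    def __init__(self, rng):
        self.rng = rng
        self.learn_calls = 0

    def act(self, state):
        return self.rng.choice(self.ACTIONS)

    def learn(self, batch):
        self.learn_calls += 1  # a real agent would update its network here


def train_offline(env, agent, G, M, memory_size, batch_size, rng):
    memory = deque(maxlen=memory_size)   # oldest records are dropped first
    episode_rewards = []
    n = 0                                                   # step 1
    while n < G:
        state, m, total = env.reset(), 0, 0.0               # step 2
        while True:
            m += 1                                          # step 3
            if m == M:
                break
            action = agent.act(state)                       # step 4
            next_state, reward, _ = env.step(state, action)  # step 5
            memory.append((state, action, next_state, reward))  # step 6
            if len(memory) >= batch_size:                   # step 7
                agent.learn(rng.sample(list(memory), batch_size))
            state = next_state
            total += reward
        episode_rewards.append(total)
        n += 1                                              # step 8
    return episode_rewards, memory


rng = random.Random(0)
agent = RandomAgent(rng)
rewards, memory = train_offline(ToyEnv(), agent, G=3, M=5,
                                memory_size=10, batch_size=4, rng=rng)
```

Because step 3 exits as soon as m equals M, each training round performs M − 1 interactions with the environment.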
Preferably, the constructing of the railway line network passenger flow control offline model includes:
constructing a railway line network passenger flow control offline model, which comprises the corresponding delay scenario, the neural network structures of the current network and the target network, the parameter matrices of each layer, the loss function and the optimizer learning rate, as well as the data stored in the memory bank and the total reward value obtained by summation after each training round.
Preferably, the generating an online training environment according to an actual delay occurrence scenario, and performing online training of reinforcement learning on the optimized railway line network passenger flow control offline model by using the online training environment includes:
according to the characteristics of an actual delay scene, initializing an online model according to a stored optimized railway line passenger flow control offline model, wherein the online model inherits all parameters of the offline model and comprises data stored in a memory base;
according to the characteristics of the online environment, setting the update frequency, the training set size and the exploration rate of the online model, wherein the target network of the online model is used only for evaluation and its parameters are not updated as the number of training iterations increases;
training the online model for a small number of iterations, in which the online model takes actions to interact with the environment, the environment returns a reward value and the next state, and only the current network parameters are updated during training, so as to generate an accurate passenger flow control and train operation scheme.
Preferably, the obtaining of the railway line network passenger flow control and operation adjustment collaborative optimization scheme includes:
applying passenger flow control to passengers entering rail transit platforms station by station and period by period, and giving the passenger flow control rate at each specific time and place;
when a delay occurs, updating the train timetable by using a strategy of compressing the tracking interval and the dwell time, with the aim of recovering normal operation as soon as possible, wherein trains run according to the updated timetable and, at each station where skip-stop operation is allowed, it is decided whether the train skips the stop.
According to the technical scheme provided by the embodiment of the invention, when an uncertain train delay occurs, passenger behavior and the train operation plan are considered at the network level, a specific scheme for passenger flow control and train operation is provided, technical support is provided for organizing and dispersing passenger flow under delay, and the service level of network operation is improved.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram of the implementation principle of a collaborative optimization method for line network passenger flow control and train adjustment under train delay according to an embodiment of the present invention;
Fig. 2 is a flowchart of initial train adjustment according to an embodiment of the present invention;
Fig. 3 is a framework diagram of a simulation environment according to an embodiment of the present invention;
Fig. 4 is a flowchart of offline reinforcement learning according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the online reinforcement learning training process according to an embodiment of the present invention;
Fig. 6 is an example diagram of a passenger flow control result according to an embodiment of the present invention;
Fig. 7 is a diagram of a train adjustment result according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are exemplary only for explaining the present invention and are not construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.
For train delay scenarios, the invention provides a reinforcement-learning-based collaborative optimization of line network passenger flow control and train adjustment, so as to reduce the passenger accumulation caused by delays. Unlike conventional passenger flow control and train adjustment, the method of the embodiment accounts for uncertain passenger travel and train delay scenarios at the network scale and computes an accurate passenger flow control and train operation adjustment scheme.
The processing flow of the cooperative optimization method for network passenger flow control and train adjustment under train delay provided by the embodiment of the invention is shown in figure 1, and comprises the following steps:
step 1, clustering the delay scenes of the railway line network according to the characteristics of time, position, duration and the like of occurrence of train delay, and iteratively and randomly generating the delay scenes according to clustering results;
step 2, constructing a railway line network passenger flow control offline model, and performing station entry offline reinforcement learning training on the railway line network passenger flow control offline model by using a delay scene to obtain an optimized railway line network passenger flow control offline model;
step 3, generating an online training environment according to an actual delay occurrence scene, and performing reinforcement learning online training on the optimized railway line network passenger flow control offline model by using the online training environment to obtain a railway line network passenger flow control and operation adjustment collaborative optimization scheme.
The flow chart of the train initial adjustment provided by the embodiment of the invention is shown in fig. 2, and the specific processing steps are as follows:
step 1: constructing a time matrix of a train operation diagram, and extracting arrival time, residence time and departure time of the train at each station in the planned operation diagram to form initial planned operation diagram information;
step 2: acquiring specific information of train delay, including train delay time, delay place and delay duration;
step 3: starting from the initial station, judging from the delay information and the initial planned operation diagram information whether a delay occurs at the current station; if so, extracting the information of the delayed train services; otherwise, jumping to step 5;
step 4: according to the extracted train service information, applying a strategy of compressing dwell time and running time to the subsequent operation of those services, changing the arrival and departure times of the trains while satisfying the train operation constraints;
step 5: judging whether the current station is the terminal station; if so, ending the procedure; otherwise, adding 1 to the station number and jumping to step 3.
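A minimal sketch of the compression idea in steps 3 to 5 for a single train, assuming the common rule that after a delay the train runs and dwells at the minimum allowed times until it is back on its planned timetable, and never runs ahead of the plan. The function name and the sample numbers are hypothetical.

```python
def propagate_delay(plan_arr, plan_dep, delay_station, delay, min_run, min_dwell):
    """Shift one train's timetable after a delay and recover by compression.

    Times are in seconds. min_run[j] is the minimum running time from
    station j to station j + 1; min_dwell[j] is the minimum dwell at station j.
    """
    arr, dep = list(plan_arr), list(plan_dep)
    dep[delay_station] = plan_dep[delay_station] + delay
    for j in range(delay_station, len(arr) - 1):
        # Run as fast as allowed, but never ahead of the planned timetable.
        arr[j + 1] = max(plan_arr[j + 1], dep[j] + min_run[j])
        dep[j + 1] = max(plan_dep[j + 1], arr[j + 1] + min_dwell[j + 1])
    return arr, dep


# Planned timetable for four stations, and a 60 s delay at the first one.
plan_arr = [0, 300, 600, 900]
plan_dep = [30, 330, 630, 930]
new_arr, new_dep = propagate_delay(
    plan_arr, plan_dep, delay_station=0, delay=60,
    min_run=[240, 240, 240], min_dwell=[20, 20, 20, 20],
)
```

With these numbers the train arrives 30 s late at the second station and is back on the planned timetable from the third station onward.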
The framework of a simulation environment provided by the embodiment of the invention is shown in fig. 3. The simulation environment can simulate passenger arrival, boarding and alighting, and transfer at each station on a subway line network, and generally comprises four sub-processes: arriving at the station, entering the station, boarding and alighting, and transferring and leaving. If the control rate of the station-entry volume per unit time is greater than 0, passengers restricted from entering the platform wait outside the station and enter in the next period according to that period's control rate and the arrival order of the passengers left waiting from the previous period; if a train skips a station, the passengers at the skipped station have to take the next train. The environment has two main kinds of input: data, and the current state together with the action taken. The inputs and outputs of the environment are as follows.
Environment input: station information data, including the spatial distribution of each station on the rail transit network and the effective area and maximum capacity of each platform; passenger flow OD (origin-destination) data, including the names and numbers of the origin and destination stations of each subway trip and the card-swipe times; train delay and operation data, including the time, position and duration of the train delay, the initial train operation timetable, train formation, train load factor and rated passenger capacity; the current state set, which is the station-entry demand of each station in period m, namely the number of people wishing to enter the station; and the actions taken, namely the passenger flow control rate and whether the current train skips the current station.
Environment output: the state of the next period, namely the number of people wishing to enter the station in the next period, and the current reward value.
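The station-entry part of the environment can be sketched as a first-come-first-served queue. This sketch assumes that the control rate is the share of waiting demand held outside the station in a period; the function name and the passenger labels are hypothetical.

```python
from collections import deque


def admit_passengers(waiting, new_arrivals, control_rate):
    """Admit passengers in arrival order under a flow control rate.

    waiting: passengers held outside from earlier periods (oldest first).
    Returns (admitted, still_waiting).
    """
    queue = deque(waiting)
    queue.extend(new_arrivals)           # earlier periods keep their priority
    n_admit = int(len(queue) * (1 - control_rate))
    admitted = [queue.popleft() for _ in range(n_admit)]
    return admitted, queue


# Period 1: two passengers already waiting, four arrive, half are held back.
admitted_1, held = admit_passengers(deque(["a", "b"]), ["c", "d", "e", "f"], 0.5)
# Period 2: control is lifted, so everyone still waiting gets in.
admitted_2, held_2 = admit_passengers(held, ["g"], 0.0)
```

The size of the returned queue plus the next period's arrivals gives the number of people wishing to enter in the next period, which is the state the environment outputs.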
A flowchart of offline reinforcement learning according to an embodiment of the present invention is shown in fig. 4, and the specific processing includes: the number of passengers arriving at each station in the initial period is input into the double deep Q-network as the observation state s; actions are selected and their values predicted over the action space; the selected action a interacts with the environment to yield the observation state s' and the reward value r, and s' replaces s as the state input for the next period. The memory bank has a fixed capacity and stores the observation state s, the selected action a, the reward r and the observation state s', with new memories displacing the oldest ones; once the memory bank reaches its maximum capacity, n records are drawn from it and used to update the network parameters.
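The step that distinguishes a double deep Q-network from a plain deep Q-network is how the learning target is formed: the current network chooses the next action and the target network evaluates it. A minimal sketch with plain lists standing in for network outputs; the discount factor `gamma` and the sample values are assumptions, not taken from the text.

```python
def double_dqn_target(reward, next_q_current, next_q_target, gamma, done=False):
    """Learning target for one transition (s, a, r, s').

    next_q_current: Q-values of s' from the current (online) network.
    next_q_target:  Q-values of s' from the target network.
    """
    if done:
        return reward
    # Select the action with the current network ...
    best = max(range(len(next_q_current)), key=next_q_current.__getitem__)
    # ... and evaluate that action with the target network.
    return reward + gamma * next_q_target[best]


target = double_dqn_target(
    reward=1.0,
    next_q_current=[0.2, 0.9, 0.5],
    next_q_target=[1.0, 0.4, 2.0],
    gamma=0.5,
)
```

Here the current network prefers action 1, so the target uses the target network's value 0.4 for that action rather than its own maximum of 2.0, which is how the double network limits overestimation of action values.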
An online reinforcement learning schematic diagram provided by the embodiment of the present invention is shown in fig. 5, and the specific processing steps are as follows:
step 1: according to the actual delay scenario, searching the offline model library for a suitable offline model by using the time, position and duration of the train delay, and initializing the online model parameters from it;
step 2: according to the characteristics of the actual environment, reducing the update frequency of the online model, reducing the size of the training set, and setting the exploration rate to 0;
step 3: the agent selects the corresponding action from the current network according to the input initial state and feeds the state and action into the environment to simulate the interaction between passengers and trains, obtaining the next state and reward, which are fed back into the network; this iteration continues until G updates have been completed, where G is not more than 50 because only the current network parameters are updated.
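Steps 1 and 2 can be sketched as a lookup plus a small configuration object. The matching rule below (same peak period first, then closest duration level, then closest station) is an assumption, as are the names `ScenarioKey`, `OnlineConfig` and `select_offline_model`; the text only says that a suitable model is found from the delay time, position and duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioKey:
    peak: str
    station_index: int
    duration_level: int


@dataclass(frozen=True)
class OnlineConfig:
    exploration_rate: float = 0.0        # no exploration online
    max_updates: int = 50                # G is not more than 50
    update_target_network: bool = False  # target network only evaluates

    def __post_init__(self):
        if self.max_updates > 50:
            raise ValueError("G must not exceed 50")


def select_offline_model(library, actual):
    """Return the stored model whose scenario is closest to the actual one."""
    same_peak = [k for k in library if k.peak == actual.peak]
    candidates = same_peak or list(library)
    best = min(
        candidates,
        key=lambda k: (abs(k.duration_level - actual.duration_level),
                       abs(k.station_index - actual.station_index)),
    )
    return library[best]


library = {
    ScenarioKey("morning peak", 3, 1): "model_a",
    ScenarioKey("morning peak", 10, 2): "model_b",
    ScenarioKey("off-peak", 4, 1): "model_c",
}
chosen = select_offline_model(library, ScenarioKey("morning peak", 5, 1))
```

In a real system the dictionary values would be stored network parameters and memory bank contents rather than strings.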
An example of a passenger flow control result provided by an embodiment of the present invention is shown in fig. 6. Taking the Changping Line and Line 13 as an example, when a delay occurs, the delay is divided into six time periods and passenger flow control is applied at some of the stations, giving precise passenger flow control times, locations and intensities.
A train adjustment result provided by the embodiment of the invention is shown in fig. 7. Train 138 is delayed for 5 minutes at Wudaokou Station on Line 13, which causes knock-on delays to the subsequent trains, and a reasonable train operation plan is obtained through the optimization method.
In summary, the invention provides a method different from conventional passenger flow control and train adjustment. Under train delay, it accounts for uncertain passenger travel and train delay scenarios at the network scale and computes a more accurate passenger flow control and train operation adjustment scheme, thereby making passenger transport organization more scientific and reasonable and relieving the heavy passenger accumulation caused by train delays.
The method considers the collaborative optimization of network-scale passenger flow control and train operation adjustment under train delay, and improves the combined effectiveness of network-level measures for dispersing large passenger flows; the reinforcement learning method, which accounts for the uncertainty of passenger travel and train delay, gives the optimized passenger flow control and train operation adjustment results multi-stage dynamic decision-making and robustness properties, and improves the usability of the results.
Those of ordinary skill in the art will understand that: the figures are schematic representations of one embodiment, and the blocks or processes shown in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for apparatus or system embodiments, since they are substantially similar to method embodiments, they are described in relative terms, as long as they are described in partial descriptions of method embodiments. The above-described embodiments of the apparatus and system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A method for collaborative optimization of network passenger flow control and train adjustment under train delay, characterized by comprising the following steps:
acquiring time, position and duration characteristics of train delay occurrence, performing clustering iteration on a railway line network delay scene according to the time, position and duration characteristics of the train delay occurrence, and randomly generating a delay scene;
constructing a railway line network passenger flow control offline model, and performing station-entering offline reinforcement learning training on the railway line network passenger flow control offline model by using the delay scene to obtain an optimized railway line network passenger flow control offline model;
and generating an online training environment according to an actual delay occurrence scene, and performing reinforcement learning online training on the optimized railway line network passenger flow control offline model by using the online training environment to obtain a railway line network passenger flow control and operation adjustment collaborative optimization scheme.
2. The method according to claim 1, wherein the obtaining of the characteristics of time, position and duration of occurrence of train delay comprises:
according to the specific time at which the train delay occurs, determining which time period class the delay belongs to, using the classes HL = {morning peak, sub-morning peak, evening peak, sub-evening peak, noon peak, off-peak};
according to the actual position of the train delay, calculating the distance and the number of stations between the delay location and the origin and terminal stations of the delayed line, and determining whether the delay location is in the up or down direction, wherein the up direction is D = 1 and the down direction is D = 2;
according to the time $t$ from the occurrence of the train delay to the recovery of operation, determining the duration grade of the delay using the k duration grades $TL = \{t \le t_1,\ t_1 < t \le t_2,\ t_2 < t \le t_3,\ \dots,\ t_k < t\}$.
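The feature extraction of claim 2 can be illustrated with a minimal Python sketch. The clock-time windows in `PEAK_WINDOWS`, the threshold values, and the names `peak_class`, `duration_grade` and `delay_features` are assumptions made for illustration only; the claim names the classes and grades but does not give their boundaries.

```python
from bisect import bisect_left
from dataclasses import dataclass

# Hypothetical clock-time windows (in hours) for each class of HL.
PEAK_WINDOWS = [
    (7.0, 9.0, "morning peak"),
    (9.0, 10.0, "sub-morning peak"),
    (11.5, 13.0, "noon peak"),
    (16.0, 17.0, "sub-evening peak"),
    (17.0, 19.0, "evening peak"),
]


def peak_class(hour: float) -> str:
    """Map the clock time of the delay to one class of HL."""
    for start, end, label in PEAK_WINDOWS:
        if start <= hour < end:
            return label
    return "off-peak"


def duration_grade(t: float, thresholds: list[float]) -> int:
    """Return 0 if t <= t_1, 1 if t_1 < t <= t_2, ..., k if t > t_k."""
    return bisect_left(thresholds, t)


@dataclass
class DelayFeatures:
    peak: str
    direction: int       # D = 1 for the up direction, D = 2 for the down direction
    station_count: int   # number of stations between the origin and the delay location
    grade: int


def delay_features(hour, station_count, is_up, duration_min, thresholds):
    return DelayFeatures(
        peak=peak_class(hour),
        direction=1 if is_up else 2,
        station_count=station_count,
        grade=duration_grade(duration_min, thresholds),
    )


# A 12-minute delay at 08:15 in the up direction, with assumed thresholds of 5, 10 and 20 minutes.
features = delay_features(8.25, 5, True, 12.0, [5.0, 10.0, 20.0])
```

Using `bisect_left` keeps the upper bound of each grade inclusive, which matches the `t_1 < t <= t_2` form of the grades in the claim.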
3. The method according to claim 1, wherein the clustering iteration of the railway network delay scenario is performed according to the time, location and duration characteristics of the occurrence of the train delay, and the delay scenario is randomly generated, and comprises:
randomly setting passenger flow travel and train delay according to a probability function, constructing train initial adjustment, and constructing an off-line training environment;
setting constraint conditions of passenger flow control and train adjustment according to the time, position and duration characteristics of the occurrence of train delay, repeatedly interacting the train initial adjustment and an off-line training environment based on the constraint conditions of the passenger flow control and the train adjustment, performing reinforcement learning off-line training, and outputting an off-line model;
and the reinforcement learning offline training is iterated repeatedly under different offline training environments, and a delay scene is generated randomly.
4. The method of claim 3, wherein randomly setting passenger travel and train delays according to a probability function comprises:
according to the characteristics of passenger arrivals at a station, which describe the number of people arriving at a service facility within a given time, the randomness of the number of arrivals at the station is described by a Poisson distribution, whose probability function is:
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
where the parameter $\lambda$ is the expected number of occurrences of the random event per unit time, used to describe the average number of passengers arriving at the station per unit time, and $k$ is the number of passengers;
according to the processes of passengers walking into the station and walking to transfer, the entry walking time and the transfer walking time of passengers in a rail transit station are described by the probability function of the normal distribution, which is as follows:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)$$
where $x$ is a random variable following a normal distribution with mathematical expectation $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$; $\mu$ represents the expected value of the passenger's walking time, and $x$ is distributed within $[\mu - v, \mu + v]$;
according to the uncertainty of the train delay duration, when a train breaks down at a station and causes a delay, the delay time of the train at that location follows a normal distribution, whose probability function is as follows:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)$$
where $x$ is a random variable following a normal distribution with mathematical expectation $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$; $\mu$ represents the expected value of the delay time of the train at the station, and $x$ is distributed within $[\mu - \omega, \mu + \omega]$.
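The random settings of claim 4 could be sampled as in the following Python sketch. The function names are hypothetical, the numeric parameters are illustrative, and restricting the normal draws to the stated interval by rejection sampling is an assumption; the claim states the interval but not how it is enforced.

```python
import math
import random


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that exactly k passengers arrive in one unit of time."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson-distributed arrival count (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def sample_bounded_normal(mu: float, sigma: float, half_width: float,
                          rng: random.Random) -> float:
    """Draw from N(mu, sigma^2), redrawing until the value lies in
    [mu - half_width, mu + half_width] (assumed rejection scheme)."""
    while True:
        x = rng.gauss(mu, sigma)
        if abs(x - mu) <= half_width:
            return x


rng = random.Random(0)
arrivals = sample_poisson(30.0, rng)                        # passengers arriving in one period
walk_time = sample_bounded_normal(120.0, 30.0, 45.0, rng)   # seconds, bounded by v = 45
delay = sample_bounded_normal(600.0, 120.0, 240.0, rng)     # seconds, bounded by omega = 240
```

A fixed seed makes a generated delay scene reproducible, which is convenient when the same scene has to be replayed across training runs.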
5. The method of claim 3, wherein the constraints of passenger flow control and train adjustment include:
(1) Passenger flow control constraints
Figure FDA0003861249790000032
where
Figure FDA0003861249790000033
is the number of passengers arriving at the station,
Figure FDA0003861249790000034
is the number of passengers arriving at the station, and $\gamma$ is the minimum passenger flow control coefficient;
(2) Train capacity constraint
Figure FDA0003861249790000035
where
Figure FDA0003861249790000036
is the maximum load factor, and $C_i$ is the rated passenger capacity of the train;
(3) Train stop-skipping constraint
Figure FDA0003861249790000037
Figure FDA0003861249790000038
where $M$ is the set of time periods, $N_0$ is the set of stations at which stop-skipping is allowed, and $I$ is the set of trains;
(4) Train operation constraint
Figure FDA0003861249790000039
Figure FDA00038612497900000310
where
Figure FDA0003861249790000041
is the dwell time of train i at station j+1,
Figure FDA0003861249790000042
is the departure time of train i at station j,
Figure FDA0003861249790000043
is the minimum running time from station j to station j+1, and
Figure FDA0003861249790000044
is the minimum dwell time of the train at station j;
(5) Platform infrastructure capacity constraints
Figure FDA0003861249790000045
where the symbols are $s_n$, $\rho$ (the maximum passenger flow density) and $\eta$ (the maximum capacity coefficient of the platform);
(6) Passenger transfer time constraints
Figure FDA0003861249790000046
where
Figure FDA0003861249790000047
is the time at which the passenger alights, and
Figure FDA0003861249790000048
is the walking time of the passenger.
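Because the constraint formulas of claim 5 survive here only as image placeholders, the following Python sketch shows just one plausible way such constraints might be checked. Every inequality form in it is an assumption based on the symbol definitions in the text, not the patent's actual formula, and the names `Limits` and `violated_constraints` are hypothetical. The stop-skipping and platform capacity constraints are omitted because too little of them is recoverable.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    gamma: float             # minimum passenger flow control coefficient
    max_load_factor: float   # maximum load factor
    rated_capacity: int      # C_i, rated passenger capacity of the train
    min_run_time: float      # minimum running time from station j to station j+1
    min_dwell_time: float    # minimum dwell time at station j


def violated_constraints(admitted, arrived, onboard, run_time, dwell_time,
                         alight_time, walk_time, connecting_departure, limits):
    """Return the names of the (assumed) constraints that the given values break."""
    violated = []
    # Assumed form: at least a share gamma of arriving passengers is admitted.
    if not (limits.gamma * arrived <= admitted <= arrived):
        violated.append("passenger flow control")
    # Assumed form: load on board stays within load factor times rated capacity.
    if onboard > limits.max_load_factor * limits.rated_capacity:
        violated.append("train capacity")
    # Assumed form: running and dwell times respect their minimum values.
    if run_time < limits.min_run_time or dwell_time < limits.min_dwell_time:
        violated.append("train operation")
    # Assumed form: a transfer passenger reaches the platform before departure.
    if alight_time + walk_time > connecting_departure:
        violated.append("passenger transfer time")
    return violated


limits = Limits(gamma=0.5, max_load_factor=1.2, rated_capacity=1000,
                min_run_time=90.0, min_dwell_time=20.0)
problems = violated_constraints(admitted=40, arrived=100, onboard=1300,
                                run_time=100.0, dwell_time=25.0,
                                alight_time=600.0, walk_time=120.0,
                                connecting_departure=700.0, limits=limits)
```

In a reinforcement learning setting, a check of this kind could be used either to mask infeasible actions or to add a penalty to the reward; the text does not say which approach is taken.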
6. The method of claim 3, wherein repeatedly interacting the initial train adjustment with the offline training environment, based on the constraints of passenger flow control and train adjustment, to perform reinforcement learning offline training comprises:
step 1: initializing the number of training runs n = 0 and starting training;
step 2: initializing the train operation plan under delay, the time period m, the state s and the reward r;
step 3: m = m + 1; when m equals M, jumping to step 8;
step 4: selecting a station on the line network, traversing the trains arriving at the station, selecting an action according to the current state, and storing the action in an action packet;
step 5: inputting the current action packet and state, interacting with the environment, and obtaining the next state, the reward value and the number of passengers exceeding the platform limit from the environment function;
step 6: recording the current state s, the action a, the next state s' and the reward value r in the memory bank;
step 7: feeding the data recorded in the memory bank into the network for training and updating the state, obtaining the reward value of the corresponding time period, adding it to the reward r, and jumping to step 3;
step 8: n = n + 1; if n < G, jumping to step 2; otherwise, ending the training.
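The control flow of steps 1 to 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: a tabular Q-function stands in for the current and target neural networks, `toy_env_step` is a hypothetical one-station environment, the per-station and per-train traversal of step 4 is collapsed into a single action, and the stopping test of step 8 is read as n < G.

```python
import random
from collections import deque


def toy_env_step(state, action):
    """Hypothetical environment: state is a platform crowding level 0..4;
    action 1 applies passenger flow control, action 0 does not."""
    next_state = max(0, state - 1) if action == 1 else min(4, state + 1)
    over_limit = max(0, next_state - 3)           # passengers over the platform limit
    reward = -float(over_limit) - 0.2 * action    # control itself carries a small cost
    return next_state, reward, over_limit


def train_offline(G=50, M=10, alpha=0.1, discount=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(5) for a in (0, 1)}
    memory = deque(maxlen=500)                    # memory bank
    run_rewards = []
    n = 0                                         # step 1
    while n < G:
        state, m, total = 2, 0, 0.0               # step 2
        while True:
            m += 1                                # step 3
            if m == M:
                break
            if rng.random() < epsilon:            # step 4: epsilon-greedy action
                action = rng.choice((0, 1))
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])
            next_state, reward, _ = toy_env_step(state, action)    # step 5
            memory.append((state, action, next_state, reward))     # step 6
            batch = rng.sample(list(memory), min(8, len(memory)))  # step 7
            for s, a, s2, r in batch:
                best_next = max(q[(s2, 0)], q[(s2, 1)])
                q[(s, a)] += alpha * (r + discount * best_next - q[(s, a)])
            state = next_state
            total += reward
        run_rewards.append(total)
        n += 1                                    # step 8
    return q, run_rewards


q_table, rewards = train_offline()
```

Sampling a batch from the memory bank instead of learning only from the latest transition is the experience-replay idea that steps 6 and 7 describe; the batch size of 8 is arbitrary.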
7. The method of claim 3, wherein constructing the railway line network passenger flow control offline model comprises:
constructing a railway line network passenger flow control offline model, wherein the model comprises the corresponding delay scene, the neural network structures of a current network and a target network, the parameter matrices of each layer, a loss function and an optimizer learning rate, and further comprises the data stored in the memory bank and the summed reward value obtained after each training run.
8. The method according to claim 1, wherein the generating of the online training environment according to the actual delay occurrence scenario, and performing online training of reinforcement learning on the optimized railway line network passenger flow control offline model by using the online training environment, comprises:
according to the characteristics of the actual delay scene, initializing an online model from the stored optimized railway line network passenger flow control offline model, wherein the online model inherits all parameters of the offline model, including the data stored in the memory bank;
according to the characteristics of the online environment, setting the update frequency, the training set and the exploration rate of the online model, wherein the target network of the online model is used only for evaluation and its parameters are not updated as the number of training runs increases;
training the online model for only a small number of runs, wherein the online model takes actions to interact with the environment, the environment returns a reward value and the next state, and only the current network parameters are updated during training and are used to generate an accurate passenger flow control and train operation scheme.
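The online stage of claim 8 might look like the following Python sketch: the online model starts from a copy of the offline parameters and memory, a frozen copy serves as the target used only for evaluation, and only the current parameters change over a few runs. Dictionaries stand in for the neural networks, and `fine_tune_online` and `toy_step` are hypothetical names introduced for illustration.

```python
import copy
import random


def fine_tune_online(offline_q, offline_memory, env_step, start_state, actions,
                     runs=3, periods=5, alpha=0.05, discount=0.9,
                     exploration=0.05, seed=0):
    rng = random.Random(seed)
    current = copy.deepcopy(offline_q)   # current network: inherits offline parameters
    target = copy.deepcopy(offline_q)    # target network: evaluation only, never updated
    memory = list(offline_memory)        # inherits the stored transitions
    for _ in range(runs):                # deliberately few training runs
        state = start_state
        for _ in range(periods):
            if rng.random() < exploration:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: current[(state, a)])
            next_state, reward = env_step(state, action)
            memory.append((state, action, next_state, reward))
            s, a, s2, r = rng.choice(memory)
            best_next = max(target[(s2, b)] for b in actions)
            current[(s, a)] += alpha * (r + discount * best_next - current[(s, a)])
            state = next_state
    return current, target


def toy_step(state, action):
    """Hypothetical platform crowding level 0..2; action 1 applies flow control."""
    next_state = max(0, state - 1) if action == 1 else min(2, state + 1)
    return next_state, (-1.0 if next_state == 2 else 0.0)


offline_q = {(s, a): 0.0 for s in range(3) for a in (0, 1)}
online_q, frozen_target = fine_tune_online(offline_q, [], toy_step,
                                           start_state=1, actions=(0, 1))
```

Keeping the target fixed means every update bootstraps from the offline estimate, which limits drift when only a handful of online samples are available. The low exploration rate of 0.05 is likewise an illustrative choice.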
9. The method of claim 1, wherein obtaining a coordinated optimization scheme for railway line network traffic control and operation adjustment comprises:
carrying out passenger flow control on passengers entering rail transit platforms station by station and period by period, and giving the passenger flow control rate at each specific time and place;
when a delay occurs, updating the train timetable with the aim of recovering normal operation as soon as possible by compressing the tracking interval and the dwell time, running the trains according to the updated timetable, and choosing whether to skip the stop at each station where stop-skipping is allowed.
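The timetable update described above can be illustrated with a small Python sketch that propagates a delay while compressing running times, dwell times and the tracking interval to their minimum values. The function name, the data layout and all numeric values are assumptions; stop-skipping and arrival headways are not modelled.

```python
def recover_timetable(planned, delay, delayed_train, delayed_station,
                      min_run, min_dwell, min_headway):
    """planned[i][j] = (arrival, departure) of train i at station j, in seconds.
    Returns the earliest feasible timetable that never runs ahead of plan."""
    updated = []
    for i, row in enumerate(planned):
        new_row = []
        for j, (plan_arr, plan_dep) in enumerate(row):
            arr = plan_arr
            if j > 0:
                # Cannot arrive earlier than previous departure plus minimum running time.
                arr = max(arr, new_row[j - 1][1] + min_run[j - 1])
            dep = max(plan_dep, arr + min_dwell[j])
            if i == delayed_train and j == delayed_station:
                dep = max(dep, plan_dep + delay)     # the initial delay itself
            if i > 0:
                # Keep the minimum tracking interval behind the preceding train.
                dep = max(dep, updated[i - 1][j][1] + min_headway)
            new_row.append((arr, dep))
        updated.append(new_row)
    return updated


planned = [
    [(0, 30), (150, 180), (300, 330)],      # train 0
    [(180, 210), (330, 360), (480, 510)],   # train 1
]
updated = recover_timetable(planned, delay=120, delayed_train=0, delayed_station=0,
                            min_run=[100, 100], min_dwell=[20, 20, 20],
                            min_headway=120)
```

In this example train 0 leaves the first station 120 s late and is still 60 s late at the last station, while the following train is held back by the tracking interval at first and is back on its planned departure time by the last station.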
CN202211163621.2A 2022-09-23 2022-09-23 Network passenger flow control and train adjustment collaborative optimization method under train delay Active CN115688971B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211163621.2A CN115688971B (en) 2022-09-23 2022-09-23 Network passenger flow control and train adjustment collaborative optimization method under train delay

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211163621.2A CN115688971B (en) 2022-09-23 2022-09-23 Network passenger flow control and train adjustment collaborative optimization method under train delay

Publications (2)

Publication Number Publication Date
CN115688971A true CN115688971A (en) 2023-02-03
CN115688971B CN115688971B (en) 2023-08-01

Family

ID=85062022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211163621.2A Active CN115688971B (en) 2022-09-23 2022-09-23 Network passenger flow control and train adjustment collaborative optimization method under train delay

Country Status (1)

Country Link
CN (1) CN115688971B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016045195A1 (en) * 2014-09-22 2016-03-31 北京交通大学 Passenger flow estimation method for urban rail network
WO2021089864A1 (en) * 2019-11-08 2021-05-14 Telefonaktiebolaget Lm Ericsson (Publ) Computer-implemented training of a policy model for specifying a configurable parameter of a telecommunications network, such as an antenna elevation degree of a network node, by smoothed-loss inverse propensity
CN112977553A (en) * 2021-03-05 2021-06-18 北京交通大学 Automatic train operation adjusting method
CN113361917A (en) * 2021-06-04 2021-09-07 中南大学 High-speed train rescheduling method based on dynamic passenger flow under strong wind condition
CN113525462A (en) * 2021-08-06 2021-10-22 中国科学院自动化研究所 Timetable adjusting method and device under delay condition and electronic equipment
CN113793042A (en) * 2021-09-18 2021-12-14 大连交通大学 Passenger flow control scheme compilation method for rail traffic line station
CN114298385A (en) * 2021-12-17 2022-04-08 南京理工大学 Subway train delay adjustment method considering passenger flow influence and regenerative braking energy utilization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIANYUAN GUO等: ""Short-Term Abnormal Passenger Flow Prediction Based on the Fusion of SVR and LSTM"", 《IEEE ACCESS》, vol. 7, pages 42946, XP011719370, DOI: 10.1109/ACCESS.2019.2907739 *

Also Published As

Publication number Publication date
CN115688971B (en) 2023-08-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant