CN113562039B

CN113562039B - Multi-vehicle cooperation oriented automatic operation diagram adjusting method and device

Info

Publication number: CN113562039B
Application number: CN202111054857.8A
Authority: CN
Inventors: 王荣笙; 张琦; 张秀广; 张涛; 王涛; 丁舒忻
Original assignee: China Academy of Railway Sciences Corp Ltd CARS; Signal and Communication Research Institute of CARS; Beijing Ruichi Guotie Intelligent Transport Systems Engineering Technology Co Ltd; Beijing Huatie Information Technology Co Ltd
Current assignee: China Academy of Railway Sciences Corp Ltd CARS; Signal and Communication Research Institute of CARS; Beijing Ruichi Guotie Intelligent Transport Systems Engineering Technology Co Ltd; Beijing Huatie Information Technology Co Ltd
Priority date: 2021-09-09
Filing date: 2021-09-09
Publication date: 2022-04-29
Anticipated expiration: 2041-09-09
Also published as: CN113562039A

Abstract

The invention discloses a multi-train-cooperation-oriented method and device for automatically adjusting a train operation diagram. The method comprises the following steps: acquiring line speed limit information; calculating a train operation analysis result from the line speed limit information; performing a multi-train cooperative adjustment operation according to the train operation analysis result and generating scheduling information; and adjusting the train operation diagram according to the scheduling information. The invention addresses the technical problems that existing adjustment mainly changes the arrival and departure times of individual trains, that the temporal coordination among train operation lines is not fully considered, and that few theoretical methods cooperatively adjust the train operation diagram on the basis of indexes such as the minimum section running time of a train, the tracking train interval time between train operation lines, the section buffer time, and the maximum traction and braking capacity of the train.

Description

Multi-vehicle cooperation oriented automatic operation diagram adjusting method and device

Technical Field

The invention relates to the field of railway transportation scheduling, in particular to a method and a device for automatically adjusting a running chart facing multi-vehicle cooperation.

Background

With the continuous development of intelligent technology, people increasingly use intelligent devices in daily life, work and study; these technologies have improved quality of life and made study and work more efficient.

With the continuous growth of China's railway operating mileage, the railway transportation dispatching and command system urgently needs to become digital and intelligent. When emergencies such as wind, rain, snow, foreign-object intrusion or abnormal passenger flow occur on a railway line, trains are often delayed to varying degrees, and the dispatcher is responsible for recovery and adjustment after the delay under the supervision of the decentralized autonomous Centralized Traffic Control (CTC) system. At present, adjustment of the train operation diagram mainly changes the arrival and departure times of individual trains; the temporal coordination among train operation lines is not fully considered, and few theoretical methods cooperatively adjust the train operation diagram on the basis of indexes such as the minimum section running time of a train, the tracking train interval time between train operation lines, the section buffer time, and the maximum traction and braking capacity of the train. Therefore, for scenarios in which multiple trains are delayed, a multi-train-cooperation-oriented automatic adjustment method for the operation diagram urgently needs to be studied.

The invention provides a multi-train-cooperation-oriented automatic adjustment method for the operation diagram that takes the minimum total train delay time as the optimization target. On the one hand, it adjusts the section running time of trains and their departure times at stations based on deep reinforcement learning. On the other hand, with the continuous departure interval time unchanged, it reduces the minimum tracking train interval time of trains in each temporary speed-limit section, recovers the section buffer time, and reduces delay propagation on the line.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the invention provides a multi-train-cooperation-oriented method and device for automatically adjusting an operation diagram, which at least solve the technical problems that, in the prior art, adjustment mainly changes the arrival and departure times of individual trains, the temporal coordination among train operation lines is not fully considered, and few theoretical methods cooperatively adjust the train operation diagram on the basis of indexes such as the minimum section running time of a train, the tracking train interval time between train operation lines, the section buffer time, and the maximum traction and braking capacity of the train.

According to an aspect of the embodiment of the invention, an automatic adjustment method for a running chart facing multi-vehicle cooperation is provided, and comprises the following steps: acquiring line speed limit information; calculating to obtain a train operation analysis result according to the line speed limit information; performing multi-train cooperative adjustment operation according to the train operation analysis result, and generating scheduling information; and adjusting the train operation diagram according to the scheduling information.

Optionally, the line speed limit information includes: temporary speed limit information and maximum allowable driving length.

Optionally, the calculating to obtain a train operation analysis result according to the line speed limit information includes: dividing each interval according to the line speed limit information to obtain a temporary speed limit section; and calculating the train operation analysis result based on the maximum traction curve and the maximum braking curve of the temporary speed-limiting section under a preset condition by taking the left boundary point and the right boundary point of the temporary speed-limiting section as starting points.

Optionally, after performing multi-train cooperative adjustment operation according to the train operation analysis result and generating scheduling information, the method further includes: and performing reinforcement learning training according to the scheduling information to obtain a training model.

According to another aspect of the embodiments of the present invention, there is also provided an automatic adjustment device for a multiple vehicle cooperation oriented operation diagram, including: the acquisition module is used for acquiring the line speed limit information; the calculation module is used for calculating to obtain a train operation analysis result according to the line speed limit information; the generating module is used for performing multi-train cooperative adjustment operation according to the train operation analysis result and generating scheduling information; and the adjusting module is used for adjusting the train operation diagram according to the scheduling information.

Optionally, the calculation module includes: the dividing unit is used for dividing each interval according to the line speed limit information to obtain a temporary speed limit section; and the calculating unit is used for calculating the train operation analysis result based on the maximum traction and maximum braking curves of the temporary speed-limiting section under a preset condition by taking the left and right boundary points of the temporary speed-limiting section as starting points.

Optionally, the apparatus further comprises: and the training module is used for carrying out reinforcement learning training according to the scheduling information to obtain a training model.

According to another aspect of the embodiment of the invention, a nonvolatile storage medium is further provided, and the nonvolatile storage medium includes a stored program, wherein the program controls a device in which the nonvolatile storage medium is located to execute an automatic adjustment method of an operation diagram facing multi-vehicle cooperation when running.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a processor and a memory; the memory is stored with computer readable instructions, and the processor is used for executing the computer readable instructions, wherein the computer readable instructions execute an automatic adjustment method for a multi-vehicle cooperation oriented operation diagram when running.

In the embodiment of the invention, line speed limit information is acquired; a train operation analysis result is calculated from the line speed limit information; a multi-train cooperative adjustment operation is performed according to the train operation analysis result and scheduling information is generated; and the train operation diagram is adjusted according to the scheduling information. This solves the technical problems that, in the prior art, adjustment mainly changes the arrival and departure times of individual trains, the temporal coordination among train operation lines is not fully considered, and few theoretical methods cooperatively adjust the train operation diagram on the basis of indexes such as the minimum section running time of a train, the tracking train interval time between train operation lines, the section buffer time, and the maximum traction and braking capacity of the train.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a schematic flow chart of an automatic adjustment method for a multi-vehicle cooperation oriented operation diagram according to an embodiment of the invention;

FIG. 2 is a schematic diagram of a system structure and data flow for automatic adjustment of a multi-vehicle cooperation oriented operation diagram according to an embodiment of the invention;

FIG. 3 is a schematic diagram of a train operation condition analysis scenario according to an embodiment of the invention;

FIG. 4 is a schematic diagram of interval buffering time according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a multi-vehicle collaborative automatic tuning scenario according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a "broken-line" train operation diagram according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an automatic train operation diagram adjusting method facing multi-train cooperation based on deep reinforcement learning according to an embodiment of the invention;

FIG. 8 is a flow chart of a method for automatically adjusting a multiple vehicle cooperation oriented operation diagram according to an embodiment of the invention;

FIG. 9 is a block diagram of an automatic adjustment device for a multi-vehicle cooperation oriented operation diagram according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In accordance with an embodiment of the present invention, a method embodiment of an automatic adjustment method for a multi-vehicle cooperation oriented operation diagram is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps may be executed in an order different from the one illustrated or described herein.

Example one

Fig. 8 is a flowchart of an automatic adjustment method for a multi-vehicle cooperation-oriented operation diagram according to an embodiment of the present invention, as shown in fig. 8, the method includes the following steps:

Step S802: acquiring the line speed limit information.

Step S804: calculating a train operation analysis result according to the line speed limit information.

Step S806: performing a multi-train cooperative adjustment operation according to the train operation analysis result, and generating scheduling information.

Step S808: adjusting the train operation diagram according to the scheduling information.

Specifically, the embodiment of the present invention provides an automatic adjustment method for a multi-vehicle cooperation oriented operation diagram, a schematic flow diagram of which is shown in FIG. 1 and a schematic system structure and data flow diagram of which is shown in FIG. 2. Step S1 is divided into the following 5 sub-steps:

S1-1: Emergencies on the line such as wind, rain, snow or equipment failure cause a temporary speed limit on part of a line section. The train dispatching desk manages the temporary speed limit information centrally and sends the temporary speed limit command to the Temporary Speed Restriction Server (TSRS).

S1-2: The temporary speed limit information is transmitted through the TSRS and the Radio Block Center (RBC) to the multi-train cooperative automatic adjustment module in the RBC database.

S1-3: The RBC sends train control and speed limit information to the train.

S1-4: The RBC acquires line and train operation data (position, speed), Movement Authority (MA) requests and other information within its jurisdiction, and allocates more free routes to the train according to the route information ahead sent by the interlocking and the maximum MA length permitted for the train.

S1-5: The RBC forwards the updated line and train operation data to the RBC database as input to the multi-train cooperative automatic adjustment module.

Step S1 extends the functions of the RBC database. First, line and train operation data and temporary speed limit information are collected in real time. Second, the added multi-train cooperative automatic adjustment module calculates, in step S3, the arrival time and speed of each train at each temporary speed-limit section boundary point, providing arrival time predictions for train operation diagram adjustment.

The train operation condition analysis of step S2 and the multi-train cooperative automatic adjustment of step S3 are both calculated in the multi-train cooperative automatic adjustment module of the RBC database. The scenario of step S2 is shown in FIG. 3, and the calculation proceeds as follows:

S2-1 Temporary speed-limit section division: let there be $M$ temporary speed-limit section boundary points in the section $(i, i+1)$ between stations $i$ and $i+1$: $m_{i,1}, m_{i,2}, \ldots, m_{i,q}, m_{i,q+1}, \ldots, m_{i,M-1}, m_{i,M}$. The section $(i, i+1)$ then contains $M-1$ temporary speed-limit sections: $(m_{i,1}, m_{i,2}), \ldots, (m_{i,q}, m_{i,q+1}), \ldots, (m_{i,M-1}, m_{i,M})$.

S2-2 Temporary speed limit grade division: according to the 8 temporary speed limit grades of the TSRS, the train speed limit value is denoted $v_l$. For $l = 1, 2, 3, 4, 5, 6, 7, 8$, $v_l = 250, 200, 160, 120, 100, 80, 60, 45$ km/h respectively. The $q$-th temporary speed-limit section of section $(i, i+1)$ is therefore written $(m_{i,q}, m_{i,q+1}, v_l)$, $q \in \{1, 2, \ldots, M-1\}$.
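The section division of S2-1 and the grade table of S2-2 can be sketched as a small data structure. This is only an illustration: the names `TsrSection` and `split_sections`, and the use of metres for boundary positions, are assumptions and do not come from the patent.

```python
from dataclasses import dataclass

# Speed limit values v_l (km/h) for the 8 TSRS temporary speed limit grades l = 1..8.
TSR_GRADE_SPEED_KMH = {1: 250, 2: 200, 3: 160, 4: 120, 5: 100, 6: 80, 7: 60, 8: 45}


@dataclass(frozen=True)
class TsrSection:
    """One temporary speed-limit section (m_{i,q}, m_{i,q+1}, v_l). Hypothetical type."""
    left_m: float   # position of boundary point m_{i,q} (metres, assumed unit)
    right_m: float  # position of boundary point m_{i,q+1}
    grade: int      # temporary speed limit grade l

    @property
    def speed_limit_kmh(self) -> int:
        return TSR_GRADE_SPEED_KMH[self.grade]


def split_sections(boundaries, grades):
    """Turn M boundary points and M-1 grades into M-1 temporary speed-limit sections."""
    if len(grades) != len(boundaries) - 1:
        raise ValueError("M boundary points require exactly M-1 grades")
    return [TsrSection(a, b, g) for a, b, g in zip(boundaries, boundaries[1:], grades)]
```

For example, four boundary points with grades 1, 4 and 8 yield three sections limited to 250, 120 and 45 km/h.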

S2-3 Maximum traction and maximum braking curve calculation: the maximum traction curve and the maximum braking curve of the train are drawn taking the left and right boundary points of each temporary speed-limit section as starting points; their expressions are given by formula (1) and formula (2).

where $F_{max}$ and $B_{max}$ denote the maximum traction force and the maximum braking force respectively, and $\alpha_1, \alpha_2, v_1, v_2, v_3, v_4$ are constants related to the train model. Under maximum traction and maximum braking, the speeds of train $g$ at the temporary speed-limit section boundary points $q$ and $q+1$ are respectively

and

The actual speed of train $g$ at the boundary points $q$ and $q+1$ of the temporary speed-limit section is denoted

Then

S2-4 Minimum section running time calculation: within each temporary speed-limit section, train $g$ should run at the minimum of the maximum-traction speed, the maximum-braking speed and the temporary speed limit; the minimum running time of train $g$ in section $(i, i+1)$ follows from this speed profile.
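The rule in S2-4 can be illustrated with a deliberately simplified sketch. It assumes a single constant speed per temporary speed-limit section, whereas the patent works with traction and braking curves from formulas (1) and (2); the function name and the tuple layout are hypothetical.

```python
def min_section_running_time_s(sections):
    """Sum of per-section running times in seconds.

    `sections` is a list of tuples
        (length_m, v_traction_kmh, v_braking_kmh, v_limit_kmh)
    where the three speeds are the speed reachable under maximum traction,
    the speed allowed by the maximum braking curve, and the temporary speed
    limit. Simplifying assumption: each section is run at one constant speed.
    """
    total = 0.0
    for length_m, v_traction, v_braking, v_limit in sections:
        # The train runs at the minimum of the three candidate speeds.
        v_kmh = min(v_traction, v_braking, v_limit)
        if v_kmh <= 0:
            raise ValueError("running speed must be positive")
        total += length_m / (v_kmh / 3.6)  # km/h -> m/s
    return total
```

With a 1000 m section limited to 120 km/h and a 2000 m section where traction only allows 72 km/h, the sketch gives about 30 s + 100 s = 130 s.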

In addition, the basic principle of the multi-train cooperative automatic adjustment in step S3 is that, with the departure interval between two trains unchanged, the trains in each temporary speed-limit section run cooperatively at the minimum tracking train interval time. In essence, this recovers as much section buffer time as possible, absorbs the delay caused by a fault on the line, and reduces delay propagation. The section buffer time is defined as the difference between the continuous departure interval time of two trains and their tracking train interval time, with the continuous departure interval time unchanged; its function is to suppress delay propagation between trains, as shown in FIG. 4. According to FIG. 4, buffer time and delay propagation are related as follows. If the buffer time between two trains is greater than or equal to the delay of the following train, the following train can run ahead of schedule using the buffer time, and the effect of the delay on it is eliminated completely. Conversely, if the buffer time between two trains is less than the delay of the following train, the buffer time eliminates only part of the delay: the following train cannot fully eliminate the delay even if it speeds up and uses all of the buffer time. The remaining delay is passed on to the next train, and subsequent pairs of trains continue to absorb it with their buffer times in the remaining sections until the delay on the line is completely eliminated and subsequent trains run on time.

Based on the above, the key is to allocate the section buffer time to the delayed trains reasonably and accurately according to the size of their delays, to calculate the actual running time of each train in each temporary speed-limit section, to have the trains run cooperatively under the minimum tracking train interval time constraint, and to calculate the time and speed of each train at each temporary speed-limit section boundary point together with its arrival time, for use in automatic adjustment of the train operation diagram.
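The relationship between buffer time and delay propagation described above can be sketched as follows. This is a minimal model under stated assumptions: only the knock-on delay is propagated, following trains have no initial delay of their own, and each buffer time absorbs delay one-for-one. The function names are hypothetical.

```python
def buffer_time(departure_interval_s, tracking_interval_s):
    """Section buffer time: departure interval minus tracking train interval."""
    return departure_interval_s - tracking_interval_s


def propagate_delays(first_delay_s, buffers_s):
    """Knock-on delay of each train, starting from the first delayed train.

    buffers_s[k] is the buffer time between train k and train k+1.
    Assumed rule: the delay passed on is whatever the buffer cannot absorb.
    """
    delays = [first_delay_s]
    for buf in buffers_s:
        delays.append(max(0, delays[-1] - buf))
    return delays
```

For a 300 s initial delay and three buffer times of 120 s, the delays are 300, 180, 60 and 0 s, so the fourth train runs on time.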

A basic flowchart of step S3 is shown in fig. 5, and specifically includes:

S3-1 Initial statistics of buffer time and delay time

S3-1-1 Delay time initialization: let the $B-1$ sections affected by a fault on the line be $(1,2), (2,3), \ldots, (i, i+1), \ldots, (B-1, B)$, let the total number of affected trains be $N$, and let the total delay time be $w$. When the delay occurs, the delay time of train $g$ in the $q$-th temporary speed-limit section $(m_{i,q}, m_{i,q+1}, v_l)$ of section $(i, i+1)$ is

where section $i \in \{1, 2, \ldots, B-1\}$, temporary speed-limit section $q \in \{1, 2, \ldots, M-1\}$, and train $g \in \{1, 2, \ldots, N\}$. Let the delay time of train $g$ in section $(i, i+1)$ be

If train $g$ is the faulty train, its initial delay time is denoted

S3-1-2 Buffer time initialization: let the total buffer time of the $N$ trains be $b$; there are then $N-1$ buffer times. Let the buffer time of trains $g$ and $g+1$ in section $(i, i+1)$ be

Then

The minimum tracking train interval time is

The continuous departure interval time is

The three are related as follows:

and

where

are the departure times of train $g$ at stations $i+1$ and $i$.

S3-2 Buffer time allocation

S3-2-1 Delay propagation analysis.

As discussed for FIG. 4, given the initial delay time and the buffer time, the delay of the following train must be optimized so that the effect of the delay on it is eliminated. When a fault occurs, the delay propagates from the first train ahead of the fault to the subsequent trains, and the buffer time between a preceding train and a following train is used to absorb the delay of the following train. For trains running in section $(i, i+1)$: if

that is, the delay time of the preceding train $g$ is greater than the buffer time between the two trains, the delay of train $g$ affects the operation of train $g+1$, and the delay time of train $g+1$ is as follows:

where

denotes the initial delay time of train $g+1$.

If

that is, the delay time of train $g$ is no greater than the buffer time between the two trains, the delay of train $g$ does not affect the operation of train $g+1$, and the delay time of train $g+1$

can be eliminated by running ahead of schedule in section $(i, i+1)$ using the buffer time

Once it is eliminated, subsequent trains return to on-time running. The delay propagation equation for section $(i, i+1)$ on the line is therefore as follows:

S3-2-2 Calculation of total section running time: when the delay value of some train on the line

equals 0, the delay no longer propagates to the following trains. The delay runs from the first train before the fault, $g = 1$, to the last train, $g = C$. Combining equations (3), (4), (5), (8) and (9), the total running time of all trains $g = 1, 2, \ldots, C$ on the line over all sections $(1,2), (2,3), \ldots, (i, i+1), \ldots, (B-1, B)$ is

where

is the minimum running time of train $g$ in its section $(i, i+1)$, and

is the delay time of train $g$ in its section $(i, i+1)$.

S3-2-3 Running time allocation for each temporary speed-limit section: taking the temporary speed-limit sections of section $(i, i+1)$ as an example, let $\mu = [\mu_1, \mu_2, \ldots, \mu_{M-1}]$ be the proportions of the total section running time taken by each temporary speed-limit section. The running time of the $q$-th temporary speed-limit section $(m_{i,q}, m_{i,q+1}, v_l)$ is then:

$\mu$ is determined from the planned running time of each temporary speed-limit section

and the actual running time after the delay

as the ratio

S3-3 Calculation of the time and speed at temporary speed-limit section boundary points: let the arrival and departure times of the train at stations $i$ and $i+1$ be $a_{g,i}$, $d_{g,i}$ and $a_{g,i+1}$, $d_{g,i+1}$ respectively. Taking into account the elimination of delay by the buffer time, the time and speed of the train at temporary speed-limit boundary point $q$ are as follows:

S3-4 Cooperative running of multiple trains at the minimum tracking train interval time: multiple trains running in each temporary speed-limit section must run under the temporary speed limit. Here

denotes two adjacent positions of the two adjacent trains $g$ and $g+1$ in the temporary speed-limit section $(m_{i,q}, m_{i,q+1}, v_l)$, and the minimum tracking train interval time of the temporary speed-limit section is

The two adjacent trains should satisfy the following constraints:

very Small constant (18)

where

denotes the times corresponding to the two adjacent positions

and

respectively;

denotes the real-time positions of the two trains $g$ and $g+1$ at the same time instant; and

denotes the minimum tracking train interval time of the two trains under the temporary speed limit of the section. Equation (18) states that the distance $\eta$ between any two adjacent positions is very small, so that the time taken over this short distance approximates the real-time running time of the train and the average speed over it approximates the real-time speed of the train. Equation (19) ensures that the real-time tracking distance between the two trains is greater than the safe braking distance $s$. The speeds of the train at positions $p$ and $p+1$,

and

should satisfy:

S3-5 Train arrival time calculation: combining formulas (10), (11) and (12), the time $a_{g,i+1}$ at which train $g$ arrives at station $i+1$ is expressed as follows:

where $s_{g,i}$ is the stop time of train $g$ at station $i$.
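Steps S3-2-3, S3-3 and S3-5 can be illustrated together. The sketch below assumes that the running time of each temporary speed-limit section is its proportion $\mu_q$ times the total section running time, and that the arrival time at station $i+1$ equals the arrival time at station $i$ plus the stop time plus the total section running time. Both relations and all function names are assumptions made for illustration, since the patent's own formulas are not reproduced in this text.

```python
def boundary_times(departure_s, total_running_s, ratios):
    """Passing times at the M boundary points of one section.

    ratios is mu = [mu_1, ..., mu_{M-1}] and must sum to 1.
    Returns M time stamps; the first is the departure time.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("proportions mu must sum to 1")
    times = [departure_s]
    for mu in ratios:
        times.append(times[-1] + mu * total_running_s)
    return times


def arrival_at_next_station(arrival_s, stop_s, total_running_s):
    """Assumed relation: a_{g,i+1} = a_{g,i} + s_{g,i} + section running time."""
    return arrival_s + stop_s + total_running_s
```

For a 600 s section running time split 50 % / 25 % / 25 %, a train departing at t = 0 passes the boundary points at 0, 300, 450 and 600 s.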

S3-6 Generation of the broken-line train operation diagram: a schematic of the "broken-line" train operation diagram is shown in FIG. 6. The original "straight-line" train operation diagram is converted into a "broken-line" train operation diagram according to the time and speed at the temporary speed-limit section boundary points. On the one hand, the train can generate the required driving strategy online in real time from the time and speed at the boundary points; on the other hand, the train dispatcher can see the speed and position of multiple trains in the section in real time from the broken-line train operation diagram, which helps the dispatcher issue acceleration and deceleration commands to trains running in the section. In addition, the method ensures that multiple trains run cooperatively at the minimum tracking train interval time in each temporary speed-limit section of the section, recovers the section buffer time to the greatest extent, and suppresses delay propagation on the line. The multi-train cooperative automatic adjustment module sends the static line information, the train state information and the broken-line train operation diagram to the train dispatching desk through the RBC.

Step S4: with the actual arrival times of the trains known from step S3 and the minimum total train delay time as the optimization target, a deep reinforcement learning method is used on the one hand to determine the departure times at stations of the delayed trains on the line, and on the other hand to ensure that trains run cooperatively at the minimum tracking train interval time in each temporary speed-limit section of the section. A schematic of the deep-reinforcement-learning-based automatic train operation diagram adjustment method oriented to multi-train cooperation is shown in FIG. 7. The agent and the environment are set up as follows: the environment comprises a state space, an action space, a state transition probability and a reward function.

The state space S comprises 2 kinds of variables:

The section state space variable is set to the tracking train interval time of each temporary speed-limit section

The tracking train interval time of trains in the section is adjusted automatically, subject to the minimum running time from step S2 and the speed and time at the temporary speed-limit section boundary points from step S3.

The station state space variable is set to the train departure time $d_{g,i}$, $g \in \{1, 2, \ldots, N\}$, $i \in \{1, 2, \ldots, B-1\}$. The departure time of each train at each station is adjusted on the basis of the train arrival time from step S3-5 and the "broken-line" train operation diagram from step S3-6.

The action space A is set according to the state space S: reduce or increase the tracking train interval time in a section, and advance or postpone departure at a station, that is, change the values of

and $d_{g,i}$.

The state transition probability $P$ denotes the set of probabilities of transitioning from the current state-action pair to the next state, $P(s' \mid s, a) = P[S_{t+1} = s' \mid S_t = s, A_t = a]$, that is, the probability that the adjustment at the current station moves on to the next station.

The reward function R is the optimization target of automatic operation diagram adjustment. It is set to minimize the total delay time of trains on the line and comprises 2 components: minimum total train delay time at stations, and minimum tracking train interval time under the temporary speed limit within sections. They are set as follows:

For minimum total delay time at stations, the station reward function $R_{station}$ is set as follows:

where

denote the scheduled arrival and departure times of the train.

For minimum section tracking train interval time, the section reward function $R_{block}$ is set as follows:

Equations (23) and (24) show that the larger the reinforcement learning reward, the smaller the total delay time of each train at each station and the smaller the tracking train interval time in each temporary speed-limit section, and hence the smaller the total delay time of trains on the line.
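Equations (23) and (24) are not reproduced in this text, so the following is only one plausible shape for the two reward components, chosen to agree with the stated property that a larger reward means less delay and shorter tracking intervals. The negative-sum form and the function names are assumptions, not the patent's formulas.

```python
def station_reward(actual_times_s, scheduled_times_s):
    """Assumed form of R_station: negative total lateness.

    Both lists hold arrival and departure times (seconds) in the same
    order; only positive deviations (lateness) are penalised.
    """
    return -sum(max(0, a - p) for a, p in zip(actual_times_s, scheduled_times_s))


def block_reward(tracking_intervals_s):
    """Assumed form of R_block: negative sum of tracking train interval times."""
    return -sum(tracking_intervals_s)


def total_reward(actual_times_s, scheduled_times_s, tracking_intervals_s):
    """Sum of the two components; equal weighting is an assumption."""
    return station_reward(actual_times_s, scheduled_times_s) + block_reward(tracking_intervals_s)
```

Under this form, a train that departs 30 s late contributes -30 to the station reward, and two tracking intervals of 180 s and 200 s contribute -380 to the section reward.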

The agent is the decision unit that automatically adjusts the train operation diagram under multi-train cooperation. Offline deep reinforcement learning training proceeds as follows. In the environment of a multi-train delay scenario, the agent learns the environment by deep reinforcement learning, with the minimum total delay time of trains on the line as the optimization target. The environment accurately simulates the multi-train delay scenario, defines the state space, action space, state transition probability and reward function, and passes the current state to the agent. The agent updates its value function, performs policy evaluation and improvement, and returns the maximum reward value and the optimal action to the environment. By interacting with the environment continuously, the agent learns and infers until the global optimization target is reached. The offline training of step S4 runs in the background of the train dispatching desk.
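The agent-environment loop described above can be illustrated with a toy example. Note the substitutions: the sketch uses tabular Q-learning rather than deep reinforcement learning, and a one-train environment whose state is a delay of 0 to 5 minutes. The environment, the action set and all hyperparameters are invented for illustration and are not taken from the patent.

```python
import random

MAX_DELAY = 5     # toy state space: delay of 0..5 minutes
ACTIONS = (0, 1)  # 0 = keep the plan, 1 = recover one minute using buffer time


def step(delay, action):
    """Toy environment: returns (next_delay, reward, done)."""
    next_delay = max(0, delay - 1) if action == 1 else delay
    return next_delay, -float(next_delay), next_delay == 0


def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(MAX_DELAY + 1)]
    for _ in range(episodes):
        delay = rng.randint(1, MAX_DELAY)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[delay][a])  # exploit
            next_delay, reward, done = step(delay, action)
            # Value function update (policy evaluation and improvement in one step).
            target = reward if done else reward + gamma * max(q[next_delay])
            q[delay][action] += alpha * (target - q[delay][action])
            delay = next_delay
            if done:
                break
    return q


def greedy_policy(q):
    """Best action for each delayed state 1..MAX_DELAY."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(1, MAX_DELAY + 1)]
```

After training, `greedy_policy(train())` should select the recovery action in every delayed state, mirroring how the reward steers the agent toward less total delay. A real implementation would replace the table with a neural network and the toy environment with the multi-train delay simulation.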

In step S5, after the optimization target has been generated by the offline training of step S4, the parameters of the trained model are stored for online automatic adjustment of the train operation diagram, and the automatic adjustment result is displayed on the staring adjustment interface of the train dispatching desk.
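A minimal sketch of storing trained parameters offline and reusing them online is given below. It assumes the parameters are a small value table saved as JSON; the file name, the table contents, and the helper names are hypothetical, and a deep model would normally use its framework's own checkpoint format instead.

```python
import json
import os
import tempfile
from typing import List


def save_parameters(q_table: List[List[float]], path: str) -> None:
    """Persist trained parameters (here a small value table) as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"q_table": q_table}, f)


def load_parameters(path: str) -> List[List[float]]:
    """Reload the stored parameters for online use."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["q_table"]


def online_action(q_table: List[List[float]], state: int) -> int:
    """Online adjustment: pick the action with the highest stored value."""
    row = q_table[state]
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical trained values: rows are delay states, columns are actions.
trained = [[0.0, 0.0], [-1.0, 0.0], [-2.9, -1.0]]
model_path = os.path.join(tempfile.mkdtemp(), "model_params.json")
save_parameters(trained, model_path)
restored = load_parameters(model_path)
```

Separating training from inference in this way keeps the online adjustment to a cheap table (or network) lookup.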

In summary, the train dispatching desk in the invention issues temporary speed limit commands and temporary speed limit information to the RBC through the TSRS, and the RBC sends speed limit and train control information and the MA length to the train. The RBC database receives the temporary speed limit information and the line and train operation data, and a multi-train cooperative automatic adjustment module designed in it calculates the minimum running time of each section. According to the effect of delays on the allocation of section buffer time, and adopting a multi-train cooperative running mode, the module calculates the time and speed of each train at each temporary speed-limit section boundary point of each section and the arrival time of each train at each station, and transmits this information to the train dispatching desk. Finally, an operation diagram adjustment result oriented to multi-vehicle cooperation is generated based on deep reinforcement learning, reducing the total delay time of the trains on the line.
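The minimum section running time mentioned above can be illustrated with a simplified sketch. It uses the eight temporary speed limit grades listed in step S2-2 and assumes a constant speed within each temporary speed-limit section, equal to the minimum of a traction-limited speed, a braking-limited speed, and the temporary limit. The traction and braking speeds are passed in as plain numbers, since formulas (1) and (2) are not reproduced here. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Temporary speed limit grades l = 1..8 and their limits in km/h (step S2-2).
TSR_GRADE_KMH = {1: 250, 2: 200, 3: 160, 4: 120, 5: 100, 6: 80, 7: 60, 8: 45}


def split_sections(boundaries_km: Sequence[float]) -> List[Tuple[float, float]]:
    """M boundary points yield M-1 temporary speed-limit sections."""
    return list(zip(boundaries_km[:-1], boundaries_km[1:]))


def min_running_time_s(boundaries_km: Sequence[float],
                       grades: Sequence[int],
                       traction_kmh: Sequence[float],
                       braking_kmh: Sequence[float]) -> float:
    """Minimum running time in seconds, assuming constant speed per section."""
    total_h = 0.0
    sections = split_sections(boundaries_km)
    for (start, end), grade, v_trac, v_brake in zip(
            sections, grades, traction_kmh, braking_kmh):
        # The running speed is the smallest of the three candidate speeds.
        v = min(v_trac, v_brake, TSR_GRADE_KMH[grade])
        total_h += (end - start) / v
    return total_h * 3600.0


# Hypothetical 30 km section with a grade-4 (120 km/h) limit in the middle.
example_time_s = min_running_time_s(
    [0.0, 10.0, 20.0, 30.0], [1, 4, 1], [300.0] * 3, [300.0] * 3)
```

For the example input, the two outer sections run at 250 km/h and the middle one at 120 km/h, giving roughly 588 seconds in total.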

Through the above embodiment, the following technical problems of the prior art are solved: the arrival and departure times of a single train are mainly adjusted in isolation, the time coordination among train operation lines is not fully considered, and there are few theoretical methods for cooperatively adjusting the train operation diagram based on indexes such as the minimum section running time of a train, the train tracking interval time between train operation lines, the section buffer time, and the maximum traction and braking capacity of the train.

Example two

Fig. 9 is a block diagram of an automatic adjustment apparatus for a multi-vehicle cooperation oriented operation diagram according to an embodiment of the present invention. As shown in Fig. 9, the apparatus includes:

an obtaining module 90, configured to obtain line speed limit information;

a calculating module 92, configured to calculate a train operation analysis result according to the line speed limit information;

a generating module 94, configured to perform a multi-train cooperative adjustment operation according to the train operation analysis result and generate scheduling information; and

an adjusting module 96, configured to adjust the train operation diagram according to the scheduling information.
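The data flow between the four modules can be pictured with the following sketch. The class names follow the module names of Fig. 9, but every field, method signature, and numeric value is a hypothetical placeholder, and the per-module logic is reduced to one line each.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpeedLimitInfo:
    section_km: float   # length of the speed-limited section
    limit_kmh: float    # temporary speed limit


class ObtainingModule:
    def obtain(self) -> SpeedLimitInfo:
        # Hypothetical fixed input; a real system would read it from the RBC database.
        return SpeedLimitInfo(section_km=20.0, limit_kmh=120.0)


class CalculatingModule:
    def calculate(self, info: SpeedLimitInfo) -> float:
        """Train operation analysis result: running time in minutes."""
        return info.section_km * 60.0 / info.limit_kmh


class GeneratingModule:
    def generate(self, run_min: float,
                 departures_min: List[float]) -> Dict[str, List[float]]:
        """Scheduling information: new arrival time of each train."""
        return {"arrivals_min": [d + run_min for d in departures_min]}


class AdjustingModule:
    def adjust(self, diagram: Dict[str, List[float]],
               schedule: Dict[str, List[float]]) -> Dict[str, List[float]]:
        """Return a copy of the diagram with the scheduled values applied."""
        updated = dict(diagram)
        updated.update(schedule)
        return updated


diagram = {"departures_min": [0.0, 5.0], "arrivals_min": [6.0, 11.0]}
info = ObtainingModule().obtain()
run_min = CalculatingModule().calculate(info)
schedule = GeneratingModule().generate(run_min, diagram["departures_min"])
adjusted = AdjustingModule().adjust(diagram, schedule)
```

Each module consumes only the output of the previous one, which is the ordering described for modules 90, 92, 94, and 96.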

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims

1. A multi-vehicle cooperation oriented automatic operation diagram adjusting method, characterized by comprising the following steps:

S1: acquiring line speed limit information, comprising the following steps:

S1-1: when an emergency on the line imposes a temporary speed limit on a certain section of a line section, the train dispatching desk centrally manages the temporary speed limit information and sends a temporary speed limit command to the temporary speed limit server (TSRS);

S1-2: the temporary speed limit information is transmitted through the TSRS and the radio block center (RBC) to a multi-train cooperative automatic adjustment module in the RBC database;

S1-3: the RBC sends train control and speed limit information to the train;

S1-4: the RBC acquires the line and train operation data and the movement authority (MA) request information within its jurisdiction, and allocates more idle routes to the train according to the information on the routes ahead sent by the interlocking and the maximum MA length allowed for the train;

S1-5: the RBC forwards the updated line and train operation data to the RBC database as input information for the multi-train cooperative automatic adjustment module;

S2: calculating a train operation analysis result according to the line speed limit information, comprising the following steps:

S2-1: temporary speed-limit section division: M temporary speed-limit section boundary points are set in the section (i, i+1) between stations i and i+1: m_{i,1}, m_{i,2}, …, m_{i,q}, m_{i,q+1}, …, m_{i,M-1}, m_{i,M}; then M-1 temporary speed-limit sections exist in the section (i, i+1): (m_{i,1}, m_{i,2}), …, (m_{i,q}, m_{i,q+1}), …, (m_{i,M-1}, m_{i,M});

S2-2: temporary speed limit grading: according to the 8 temporary speed limit grades of the TSRS, the train speed limit value is set as v_l; for l = 1, 2, 3, 4, 5, 6, 7, 8, v_l = 250, 200, 160, 120, 100, 80, 60, 45 respectively, in km/h; therefore, the q-th temporary speed-limit section of the section (i, i+1) is denoted (m_{i,q}, m_{i,q+1}, v_l), q ∈ {1, 2, …, M-1};

S2-3: maximum traction and maximum braking curve calculation: taking the left and right boundary points of each temporary speed-limit section as starting points, the maximum traction curve and the maximum braking curve of the train are drawn respectively, with expressions given by formula (1) and formula (2);

wherein F_max and B_max respectively represent the maximum traction force and the maximum braking force, and α₁, α₂, v₁, v₂, v₃, v₄ are constants related to the train model; the speeds of train g under maximum traction and under maximum braking at the boundary points q and q+1 of the temporary speed-limit section are respectively

,

and

,

; the actual speeds of train g at the boundary points q and q+1 of the temporary speed-limit section are set as

,

; then:

S2-4: calculation of the minimum section running time: the running speed of train g in each temporary speed-limit section should be the minimum of the speed under maximum traction, the speed under maximum braking, and the temporary speed limit, so that the minimum running time of train g in the section (i, i+1) is:

S3: performing a multi-train cooperative adjustment operation according to the train operation analysis result and generating scheduling information, comprising the following steps:

S3-1: initial statistics of buffer time and delay time:

S3-1-1: delay time initialization: the B-1 sections affected by a fault on the line are set as (1,2), (2,3), …, (i, i+1), …, (B-1, B); the total number of affected trains is N and the total delay time is w; when a delay occurs, the delay time of train g in the q-th temporary speed-limit section (m_{i,q}, m_{i,q+1}, v_l) of the section (i, i+1) is

wherein the section i ∈ {1, 2, …, B-1}, the temporary speed-limit section q ∈ {1, 2, …, M-1}, and the train g ∈ {1, 2, …, N}; the delay time of train g in the section (i, i+1) is set as

if train g is the faulty train, its initial delay time is set as

;

S3-1-2: buffer time initialization: the total buffer time of the N trains is b, so there are N-1 buffer times; the buffer time between trains g and g+1 in the section (i, i+1) is set as

then

the minimum train tracking interval time is

the consecutive departure interval time is

and the relationship among the three is as follows:

and

wherein

are the departure times of train g at stations i+1 and i;

S3-2: buffer time allocation:

S3-2-1: delay propagation analysis:

the delay time of the following train is optimized under the given initial delay time and buffer time, so as to eliminate the influence of the delay on subsequent trains; when a fault occurs, the delay propagates from the first train before the fault to the subsequent trains, and the buffer time between the preceding and following trains is used to absorb the delay time of the following train; then, for trains running in the section (i, i+1): if

that is, the delay time of the preceding train g is greater than the buffer time between the preceding and following trains, the delay of train g affects the running of train g+1, and the delay time of train g+1 is as follows:

wherein

represents the initial delay time of train g+1;

if

that is, the delay time of the preceding train g does not exceed the buffer time between the preceding and following trains, the running of train g+1 is not affected by the delay of train g, and the delay time of train g+1 is eliminated by the buffer time

, so that the subsequent trains return to on-time running; therefore, the delay propagation equation for the section (i, i+1) on the line is as follows:

S3-2-2: calculation of the total section running time: suppose the delay value of a certain train on the line

is equal to 0, so that the delay is no longer propagated to subsequent trains, and the delay extends from the first train g = 1 before the fault to the last train g = C; combining equations (3), (4), (5), (8) and (9), the total running time of all trains g = 1, 2, …, C on the line in all sections (1,2), (2,3), …, (i, i+1), …, (B-1, B) is:

(10)

wherein

is the delay time of train g in the section (i, i+1);

S3-2-3: running time allocation for each temporary speed-limit section: taking the temporary speed-limit sections of the section (i, i+1) as an example, let μ = [μ_1, μ_2, …, μ_{M-1}] be the proportions of the total section running time assigned to each temporary speed-limit section; the running time of the q-th temporary speed-limit section (m_{i,q}, m_{i,q+1}, v_l) is:

μ is determined by the ratio of the planned running time of each temporary speed-limit section

to the actual running time after the delay

, as follows:

S3-3: calculation of the time and speed at temporary speed-limit section boundary points: the arrival and departure times of the train at stations i and i+1 are a_{g,i}, d_{g,i} and a_{g,i+1}, d_{g,i+1} respectively; considering the elimination of the delay by the buffer time, the time and speed of the train at the temporary speed-limit boundary point q are as follows:

S3-4: multi-train cooperative running under the minimum train tracking interval time: multiple trains running in each temporary speed-limit section must run under the temporary speed limit;

denote two adjacent, closely spaced positions of the two adjacent trains g and g+1 in the temporary speed-limit section (m_{i,q}, m_{i,q+1}, v_l); under the minimum train tracking interval time of the temporary speed-limit section:

wherein

represent the times corresponding to the two adjacent positions p and p+1,

represent the real-time positions of the two trains g and g+1 at the same moment, and

represents the minimum train tracking interval time of the two trains in the temporary speed-limit section; equation (18) states that the distance η between any two adjacent positions is very small, so that the short time interval approximates the real-time instant of the running train and the average speed over the short segment approximates the real-time speed of the train; equation (19) ensures that the real-time tracking distance between the two trains is greater than the safety braking distance s; the speeds of the train at positions p and p+1

should satisfy:

S3-5: train arrival time calculation: combining equations (10), (11) and (12), the time a_{g,i+1} at which train g arrives at station i+1 is expressed as follows:

(22)

wherein s_{g,i} is the dwell time of train g at station i;

S3-6: generation of the broken-line train operation diagram: the original "straight-line" train operation diagram is converted into a "broken-line" train operation diagram according to the time and speed at the temporary speed-limit section boundary points; on the one hand, the train can generate the required driving strategy online in real time according to the time and speed at the boundary points; on the other hand, the train dispatcher can learn the speed and position of multiple trains in the section in real time from the broken-line train operation diagram, which helps the dispatcher issue acceleration and deceleration dispatching commands to trains running in the section; in addition, the multiple trains are ensured to run cooperatively at the minimum train tracking interval time in each temporary speed-limit section of the section, so that the section buffer time is recovered to the maximum extent and delay propagation along the line is suppressed; the multi-train cooperative automatic adjustment module sends the line static information, the train state information, and the broken-line train operation diagram to the train dispatching desk through the RBC;

S4: on the basis of the actual arrival times of the trains obtained in step S3, and taking the minimum total delay time of the trains as the optimization target, on the one hand the departure times of delayed trains at the stations on the line are determined by a deep reinforcement learning method, and on the other hand the trains are ensured to run cooperatively at the minimum train tracking interval time in each temporary speed-limit section of the section;

S5: after the optimization target is generated by the offline training of step S4, the parameters of the trained model are stored for online automatic adjustment of the train operation diagram, and the automatic adjustment result is displayed on the staring adjustment interface of the train dispatching desk.
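The delay propagation rule of step S3-2-1 can be illustrated as follows. Since the propagation equation itself is not reproduced in this text, the sketch assumes that only the part of a preceding train's delay that exceeds the buffer time is passed on, and that it adds to the following train's own initial delay. The function name and the sample values are hypothetical.

```python
from typing import List, Sequence


def propagate_delays(initial_delays: Sequence[float],
                     buffers: Sequence[float]) -> List[float]:
    """Delay of each train in one section after buffer times absorb knock-on delay.

    initial_delays[g] is the initial delay of train g (minutes);
    buffers[g] is the buffer time between train g and train g + 1.
    """
    if len(buffers) != len(initial_delays) - 1:
        raise ValueError("N trains require N-1 buffer times")
    delays = [float(initial_delays[0])]
    for g, buffer_time in enumerate(buffers):
        # Assumption: only the delay exceeding the buffer reaches the next train.
        knock_on = max(0.0, delays[g] - buffer_time)
        delays.append(float(initial_delays[g + 1]) + knock_on)
    return delays


# Hypothetical case: train 1 is 10 minutes late, each buffer absorbs 3 minutes.
delays = propagate_delays([10.0, 0.0, 0.0, 0.0], [3.0, 3.0, 3.0])
```

In this example the delay shrinks by the buffer time at each following train (10, 7, 4, 1 minutes), and a train whose predecessor's delay does not exceed the buffer returns to on-time running.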

2. The method of claim 1, wherein the line speed limit information comprises: temporary speed limit information and the maximum allowed movement authority (MA) length.

3. The method of claim 1, wherein calculating the train operation analysis result according to the line speed limit information comprises:

dividing each section according to the line speed limit information to obtain temporary speed-limit sections;

and calculating the train operation analysis result under a preset condition based on the maximum traction curve and the maximum braking curve of each temporary speed-limit section, taking the left and right boundary points of the temporary speed-limit section as starting points.

4. The method according to claim 1, wherein after performing a multi-train cooperative adjustment operation and generating scheduling information according to the train operation analysis result, the method further comprises:

and performing reinforcement learning training according to the scheduling information to obtain a training model.

5. An automatic adjustment device for a multi-vehicle cooperation oriented operation diagram, comprising the automatic adjustment method for the multi-vehicle cooperation oriented operation diagram of any one of claims 1 to 4, characterized by comprising:

the acquisition module is used for acquiring the line speed limit information;

the calculation module is used for calculating to obtain a train operation analysis result according to the line speed limit information;

the generating module is used for performing multi-train cooperative adjustment operation according to the train operation analysis result and generating scheduling information;

and the adjusting module is used for adjusting the train operation diagram according to the scheduling information.

6. The apparatus of claim 5, wherein the line speed limit information comprises: temporary speed limit information and the maximum allowed movement authority (MA) length.

7. The apparatus of claim 5, wherein the computing module comprises:

the dividing unit is used for dividing each interval according to the line speed limit information to obtain a temporary speed limit section;

and the calculating unit is used for calculating the train operation analysis result based on the maximum traction and maximum braking curves of the temporary speed-limiting section under a preset condition by taking the left and right boundary points of the temporary speed-limiting section as starting points.

8. The apparatus of claim 5, further comprising:

and the training module is used for carrying out reinforcement learning training according to the scheduling information to obtain a training model.

9. A non-volatile storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the non-volatile storage medium is located to perform the method of any one of claims 1 to 4.

10. An electronic device comprising a processor and a memory; the memory has stored therein computer readable instructions for execution by the processor, wherein the computer readable instructions when executed perform the method of any one of claims 1 to 4.