WO2022205175A1 - Train operation optimization method and apparatus - Google Patents

Train operation optimization method and apparatus

Info

Publication number
WO2022205175A1
Authority
WO
WIPO (PCT)
Prior art keywords
train
model
virtual scene
state
current
Prior art date
Application number
PCT/CN2021/084680
Other languages
English (en)
French (fr)
Inventor
杜峰
吴剑强
Original Assignee
西门子股份公司
西门子(中国)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西门子股份公司, 西门子(中国)有限公司
Priority to PCT/CN2021/084680 (WO2022205175A1)
Priority to CN202180093314.9A (CN116888030A)
Publication of WO2022205175A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B61: RAILWAYS
    • B61L: GUIDING RAILWAY TRAFFIC; ENSURING THE SAFETY OF RAILWAY TRAFFIC
    • B61L27/00: Central railway traffic control systems; Trackside control; Communication systems specially adapted therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • the present disclosure relates to the technical field of rail transportation, and more particularly, to a train operation optimization method, apparatus, computing device, computer-readable storage medium, and program product.
  • the entire traction power supply system includes multiple components such as trains, power supply networks, stations, and environments.
  • it is necessary to reduce the energy consumption of the traction power supply system as much as possible, that is, the total amount of electricity consumed by the entire rail transit line in a unit of time (such as peak hours, a full day, or a year).
  • pre-calculated train operation diagrams are generally used to control the operation of trains in rail transit lines.
  • the train operation diagram usually includes the train stop time, the number of trains, the train interval time, and the running direction and section, which are used to control the time, position and speed of the train running.
  • FIG. 1(a) shows a schematic diagram of a train running in a coasting mode in an ideal situation in the prior art.
  • a train traveling between two platforms separated by a distance $S_A$ passes through four stages: 0 to $S_1$ is the acceleration stage, in which the train accelerates at the maximum acceleration under traction until it reaches the maximum speed $V_1$; $S_1$ to $S_2$ is the constant-speed stage, in which the train runs at the maximum speed $V_1$ under traction; $S_2$ to $S_3$ is the coasting stage, in which the train, without traction, decelerates to $V_2$ due to resistance; $S_3$ to $S_A$ is the deceleration stage, in which the train decelerates at the maximum deceleration until its speed is 0.
  • FIG. 1(b) shows a schematic diagram of a train running in a coasting mode with speed limits superimposed in the prior art. Compared with FIG. 1(a), FIG. 1(b) takes into account speed restrictions on the rail transit line.
  • the prior art operation optimization method does not consider conduction energy consumption (such as wire loss) and regenerative energy generated by train braking. That is to say, the operation optimization method of the prior art only starts from the perspective of minimizing the traction energy consumption of a single train, but does not consider the energy consumption of the entire traction power supply system.
  • the first embodiment of the present disclosure proposes a train operation optimization method, including: obtaining a virtual scene model of the traction power supply system of the train, the virtual scene model corresponding to an operation scene of the traction power supply system; establishing an operation optimization model, the operation optimization model being used to determine the corresponding train action according to the train state of each train in the virtual scene model, and the train action being used to update the train state of each train; and using reinforcement learning to iteratively adjust the model parameters of the operation optimization model according to the simulated power of the virtual scene model in the updated train state, so as to train the operation optimization model.
  • factors related to energy consumption such as geographic information, train resistance, and train power characteristics, in the virtual scene model are closer to the actual situation.
  • this method starts from the energy consumption of the entire traction power supply system, including conduction energy consumption, regenerative energy generated by train braking, etc., rather than only considering the traction energy consumption of a single train, making the operation optimization more comprehensive and accurate.
  • This makes it possible to use the virtual scene model to train the operation optimization model even under complex operation scenarios, so as to obtain the optimal train operation diagram with minimum energy consumption.
  • self-learning model training can be achieved through reinforcement learning, with little reliance on human experience.
  • a second embodiment of the present disclosure proposes a train operation optimization device, including: a scene model obtaining unit configured to obtain a virtual scene model of the traction power supply system of the train, the virtual scene model corresponding to an operation scene of the traction power supply system; an optimization model establishment unit configured to establish an operation optimization model, the operation optimization model being used to determine the corresponding train action according to the train state of each train in the virtual scene model, and the train action being used to update the train state of each train; and an optimization model training unit configured to iteratively adjust the model parameters of the operation optimization model according to the simulated power of the virtual scene model in the updated train state, so as to train the operation optimization model.
  • factors related to energy consumption such as geographic information, train resistance, and train power characteristics, in the virtual scene model are closer to the actual situation.
  • this method starts from the energy consumption of the entire traction power supply system, including conduction energy consumption, regenerative energy generated by train braking, etc., rather than only considering the traction energy consumption of a single train, making the operation optimization more comprehensive and accurate.
  • This makes it possible to use the virtual scene model to train the operation optimization model even under complex operation scenarios, so as to obtain the optimal train operation diagram with minimum energy consumption.
  • self-learning model training can be achieved through reinforcement learning, with little reliance on human experience.
  • a third embodiment of the present disclosure proposes a computing device comprising: a processor; and a memory for storing computer-executable instructions that, when executed, cause the processor to perform the method of the first embodiment.
  • a fourth embodiment of the present disclosure proposes a computer-readable storage medium having computer-executable instructions stored thereon for performing the method of the first embodiment.
  • a fifth embodiment of the present disclosure proposes a computer program product tangibly stored on a computer-readable storage medium and comprising computer-executable instructions that, when executed, cause at least one processor to execute the method of the first embodiment.
  • Figure 1(a) shows a schematic diagram of a train running in a coasting mode in an ideal situation in the prior art
  • Figure 1(b) shows a schematic diagram of a vehicle running in a coasting mode under the condition of superimposed speed limit in the prior art
  • FIG. 2 shows a flowchart of a method for optimizing train operation according to one embodiment of the present disclosure
  • Fig. 3 shows the flow chart of training the operation optimization model in the embodiment of Fig. 2;
  • Fig. 4 shows the flow chart of calculating the current simulation power of the virtual scene model in the embodiment of Fig. 2;
  • Fig. 5 (a) shows the network topology structure of an exemplary virtual scene model at the current moment according to the embodiment of Fig. 2;
  • Fig. 5(b) shows the network topology of the exemplary virtual scene model in Fig. 5(a) at the next moment;
  • FIG. 6 shows a schematic block diagram of a train operation optimization system according to one embodiment of the present disclosure
  • FIG. 7 shows a schematic block diagram of a train operation optimization apparatus according to an embodiment of the present disclosure.
  • FIG. 8 shows a schematic block diagram of a computing device for train operation optimization in accordance with one embodiment of the present disclosure.
  • the terms “including”, “comprising” and similar terms are open-ended, i.e., “including but not limited to”, meaning that other content may also be included.
  • the term “based on” is “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment” and so on.
  • FIG. 2 shows a flowchart of a method for optimizing train operation according to one embodiment of the present disclosure.
  • method 200 begins at step 21 .
  • a virtual scene model of the traction power supply system of the train is obtained, and the virtual scene model corresponds to the operation scene of the traction power supply system.
  • the entire traction power supply system of a rail transit line includes multiple components such as trains, power supply networks, stations, and environments, and each component has its specific parameters or configurations.
  • the operating scenario refers to the situation in which the traction power supply system operates under a set of parameters or configurations. Therefore, there are many different operating scenarios for a traction power supply system.
  • Each virtual scenario model simulates the traction power system under a specific operating scenario.
  • a virtual scene model corresponding to the target operation scene can be established for the traction power supply system in advance.
  • the virtual scene model created each time can be saved in the database. With the continuous expansion and accumulation of the virtual scene models in the database, the required virtual scene models can be searched from the database when needed subsequently.
  • in step 22, an operation optimization model is established. The operation optimization model is used to determine the corresponding train action according to the train state of each train in the virtual scene model, and the train action is used to update the train state of each train.
  • the train state may include the position and speed of the train, and the train action may include the acceleration of the train.
  • reinforcement learning is used to train the operation optimization model.
  • the operation optimization model may select the corresponding train action according to the train state of each train in the virtual scene model in a value-based, policy gradient-based or a combination manner.
  • the operational optimization model includes a deep neural network.
  • the structure of the deep neural network can be designed as needed to achieve end-to-end training. Any suitable continuous or discrete deep reinforcement learning method can be used, such as DQN or DDPG, etc.
  • in step 23, using reinforcement learning, the model parameters of the operation optimization model are adjusted iteratively according to the simulated power of the virtual scene model in the updated train state, so as to train the operation optimization model. Since the goal is to minimize the energy consumption of the traction power supply system in a given operating scenario (for example, a specific passenger load factor or train interval time), the simulated energy consumption of the virtual scene model corresponding to that operating scenario can be used to train the operation optimization model.
  • the training process is a continuous interaction between the operation optimization model and the virtual scene model.
  • the operation optimization model is used to update the train state of each train in the virtual scene model, the virtual scene model under that train state is simulated to obtain the simulated power, and the model parameters of the operation optimization model are then adjusted according to the simulated power.
  • the adjusted operation optimization model is again used to update the train state of each train in the virtual scene model, the virtual scene model under the updated train state is simulated to obtain a new simulated power, and the model parameters of the operation optimization model are adjusted further according to that simulated power.
  • the above process is performed iteratively in this way until the operation optimization model converges.
  • step 23 includes sub-steps 231 to 235.
  • in sub-step 231, using the operation optimization model, the train action corresponding to the previous train state is determined for each train in the virtual scene model. Initially, the initial train state of each train in the virtual scene model is input into the operation optimization model, and the output of the operation optimization model is a set of initial train actions, one for each train. Thereafter, the operation optimization model is used each time to determine the train action corresponding to the input train state.
  • in sub-step 232, the previous train state of each train is updated to the current train state according to the previous train state of each train and the determined train action, and the current train state is provided to the operation optimization model and the virtual scene model.
  • the train state includes the position and speed of the train
  • the train action includes the acceleration of the train.
  • the current train state of each train can be calculated by the following formulas (1)-(2).
  • $v_{t_j}$ and $s_{t_j}$ respectively represent the train speed and train position at time $t_j$, that is, the current train state.
  • $v_{t_{j-1}}$ and $s_{t_{j-1}}$ respectively represent the train speed and train position at time $t_{j-1}$, the time step preceding $t_j$, that is, the previous train state.
  • $a_{t_{j-1}}$ represents the train acceleration corresponding to the previous train state, as output by the operation optimization model.
  • the train acceleration can be positive or negative, or zero. When the acceleration is positive, it means that the train is accelerating; when the acceleration is negative, it means that the train is decelerating; when the acceleration is 0, it means that the train is running at a constant speed.
  • on the one hand, the calculated current train state is fed back to the operation optimization model for updating the next train state; on the other hand, it is provided to the virtual scene model for the simulated power calculation.
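Formulas (1)-(2) are not reproduced in this text, so the following is only a minimal sketch of one plausible state update for sub-step 232. It assumes constant acceleration over a fixed time step `dt`; the names `TrainState`, `update_state`, and `dt`, as well as the clamp at zero speed, are illustrative assumptions and not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class TrainState:
    position: float  # s, metres along the line (assumed unit)
    speed: float     # v, metres per second (assumed unit)


def update_state(prev: TrainState, accel: float, dt: float) -> TrainState:
    """Advance one train by one time step, assuming constant acceleration."""
    # Assumption: a braking command cannot drive the speed below zero.
    new_speed = max(0.0, prev.speed + accel * dt)
    # Trapezoidal position update: mean of old and new speed times dt.
    new_position = prev.position + 0.5 * (prev.speed + new_speed) * dt
    return TrainState(new_position, new_speed)
```

A positive `accel` corresponds to traction, a negative one to braking, and zero to constant-speed running, matching the sign convention described above.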
  • the current simulation power of the virtual scene model is calculated according to the current train state of each train.
  • the virtual scene model can reflect the situation of the entire traction power supply system in the actual operation scene to the greatest extent. Therefore, the sum of the inlet power of all traction substations in the virtual scene model is the simulated power of the virtual scene model in the current train state.
  • sub-step 233 further includes sub-steps 2331-2333.
  • in sub-step 2331, the network topology of the virtual scene model in the current train state is converted into an equivalent circuit, whose power supply includes at least one traction substation in the virtual scene model. Since the virtual scene model includes all the information of the traction power supply system in the corresponding operating scenario, including but not limited to the power supply network parameters, train parameters, operating route and geographic information, additional load parameters, and train scheduling information of the traction power supply system, this information, together with the train state of each train, can be used to convert the network topology of the virtual scene model at each moment into an equivalent circuit.
  • in sub-step 2332, the inlet power of each of the at least one traction substation is calculated using the node voltage method.
  • the node voltage method is used to set up the nonlinear equations, and the Newton iteration method is used to solve them, yielding the voltage of each node and the current of each branch in the equivalent circuit, from which the total current and voltage at the inlet of each traction substation are obtained.
  • the current and voltage at the inlet of each traction substation are multiplied to obtain its inlet power at time $t_j$, namely $P_{TPS_i, t_j}$, where $TPS_i$ denotes the $i$-th traction substation.
  • in sub-step 2333, the calculated inlet powers of all traction substations are added to obtain the current simulated power of the virtual scene model, that is, the sum of $P_{TPS_i, t_j}$ over all traction substations $i$.
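The node voltage calculation described above can be illustrated on a deliberately tiny circuit. The sketch below assumes two traction substations modelled as ideal DC sources `e1` and `e2`, one train modelled as a constant-power load `p_train`, and lumped feeder resistances `r1` and `r2`; all of these names and the circuit itself are hypothetical simplifications. A real network has many nodes and would need a matrix Jacobian rather than a scalar derivative.

```python
def solve_train_node_voltage(e1, r1, e2, r2, p_train, tol=1e-9, max_iter=50):
    """Newton iteration on Kirchhoff's current law at the single train node.

    f(V) = (e1 - V)/r1 + (e2 - V)/r2 - p_train/V = 0
    """
    v = max(e1, e2)  # start from the no-load voltage
    for _ in range(max_iter):
        f = (e1 - v) / r1 + (e2 - v) / r2 - p_train / v
        df = -1.0 / r1 - 1.0 / r2 + p_train / (v * v)
        step = f / df
        v -= step
        if abs(step) < tol:
            return v
    raise RuntimeError("Newton iteration did not converge")


def simulated_power(e1, r1, e2, r2, p_train):
    """Return (sum of substation inlet powers, node voltage, feeder currents)."""
    v = solve_train_node_voltage(e1, r1, e2, r2, p_train)
    i1 = (e1 - v) / r1  # current delivered by substation 1
    i2 = (e2 - v) / r2  # current delivered by substation 2
    inlet_powers = [e1 * i1, e2 * i2]
    return sum(inlet_powers), v, (i1, i2)
```

Because the inlet powers are measured at the substations, the total includes the resistive losses in the feeders in addition to the power drawn by the train, which is the system-level quantity the method sets out to minimize.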
  • Figure 5(a) shows the network topology of an exemplary virtual scene model at the current moment.
  • Figure 5(b) shows the network topology of the exemplary virtual scene model at the next moment.
  • trains 521-523 run in the upward direction (rightward in the figure), and trains 524-526 run in the downward direction (leftward in the figure).
  • the trains 521 to 522 and 524 to 525 are accelerated by the traction force, while the trains 523 and 526 are braked and decelerated, and the acceleration values are different from each other.
  • the positions of the trains 521-526 have all changed.
  • the equivalent circuits of the network topologies 500 and 501 also change.
  • the two traction substations 510 and 511 supply power to the contact line 531 in the upward direction and the contact line 532 in the downward direction through the wires 541-544, respectively.
  • the return rail 533 in the upward direction is connected to the traction substations 510 and 511 through wires 551 and 553, and the return rail 534 in the downward direction is connected to the traction substations 510 and 511 through wires 552 and 554, thus forming a current loop.
  • in the equivalent circuit, the traction substations 510 and 511 are equivalent to power sources, the trains 521-526 are equivalent to power elements, and the contact lines, return rails and wires are equivalent to resistances.
  • the operating state of the trains 521-526 determines their traction power or braking power in the equivalent circuit. They consume power during traction acceleration and provide power during braking deceleration. Traction power or braking power can be calculated according to the following formulas (3) or (4):
  • P train is the traction power or braking power of the trains 521-526.
  • $V$ is the train speed, $F$ is the traction force or braking force at that speed according to the traction characteristic curve or braking characteristic curve of the train, and the efficiency term is the corresponding conversion efficiency.
  • $v_{train}$ is the voltage across the trains 521-526, and $i_{train}$ is the current flowing through the trains 521-526.
  • since both traction substations 510 and 511 supply power to contact lines 531 and 532, the power at the inlets $A_1$ and $A_2$ of these two traction substations needs to be calculated. The powers at the two inlets $A_1$ and $A_2$ are then added to obtain the simulated power of the virtual scene model under the network topology.
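Formulas (3) and (4) are missing from this text. As a hedged illustration of how the quantities defined above could combine, the sketch below assumes that traction power is mechanical power divided by the conversion efficiency and that braking power is mechanical power multiplied by it, returned with a negative sign to indicate power fed back to the network. The function name, the sign convention, and the placement of the efficiency are assumptions, not the disclosure's formulas.

```python
def train_power(force_n: float, speed_mps: float, efficiency: float,
                braking: bool = False) -> float:
    """Electrical power of one train in watts (positive = drawn from the network)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    mechanical = force_n * speed_mps  # F * V at the wheel
    if braking:
        # Assumption: only a fraction of the braking power is regenerated.
        return -mechanical * efficiency
    # Assumption: traction needs more electrical than mechanical power.
    return mechanical / efficiency
```

The result plays the role of the constant-power element that each train contributes to the equivalent circuit.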
  • in sub-step 234, a reward value is calculated from the current simulated power and the current train state using the set reward and punishment function, and is provided to the operation optimization model.
  • the reward and punishment function is set according to the comparison result between the current train operating condition and the preset train operating condition and the simulation power.
  • Train operating conditions include any one or more of the following: speed, running time, and arrival time.
  • the current train operating conditions may be generated based on the current train state.
  • the following formula (5) shows an example of a reward and punishment function.
  • the simulated power is inversely proportional to the reward value $R_{t_j}$ at time $t_j$. That is, the smaller the simulated power, the larger the reward value $R_{t_j}$, and vice versa.
  • the preset train operating conditions include the train speed limit along the line, the running time limit between platforms, and the arrival time limit. When the speed of the train at time $t_j$ violates the speed limit, or the running time of the train between platforms violates the running time limit, the reward value $R_{t_j}$ is a negative constant; when the train stops at the target platform at the correct time, the reward value $R_{t_j}$ is a positive constant.
  • the above formula (5) is only a simple example of the reward and punishment function, and those skilled in the art should understand that the function may be set according to one or more additional optimization objectives. If there are multiple optimization objectives at the same time, a separate scoring function can be designed for each objective, and the scoring functions can be combined with different weights to form the final reward and punishment function. The calculated reward value is provided to the operation optimization model.
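Formula (5) itself is not reproduced here. The sketch below shows one way a reward and punishment function with the three described branches could look; the constants `k`, `penalty`, `bonus`, and `p_floor`, the argument names, and the order in which the branches are checked are illustrative assumptions.

```python
def reward(sim_power_w: float, speed: float, speed_limit: float,
           run_time: float, run_time_limit: float, arrived_on_time: bool,
           k: float = 1.0e6, penalty: float = -10.0, bonus: float = 10.0,
           p_floor: float = 1.0) -> float:
    """Reward value for one time step (hypothetical constants)."""
    # Constraint violation: negative constant.
    if speed > speed_limit or run_time > run_time_limit:
        return penalty
    # Train stopped at the target platform at the correct time: positive constant.
    if arrived_on_time:
        return bonus
    # Otherwise inversely proportional to the simulated power.
    # Assumption: p_floor guards against zero or negative power (regeneration).
    return k / max(sim_power_w, p_floor)
```

With several optimization objectives, each could get its own scoring function of this shape, combined as a weighted sum.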
  • in sub-step 235, the model parameters of the operation optimization model are adjusted according to the reward value.
  • the current train state obtained in sub-step 232 is fed back to the operation optimization model as the train state input to the operation optimization model in the next iteration.
  • the train action is determined using the adjusted operation optimization model, after which sub-steps 232-235 are executed again.
  • the above sub-steps 231-235 are performed iteratively until the operation optimization model converges.
  • the operation optimization model repeatedly outputs train actions that update the train states in the virtual scene model, so that the operation optimization model is trained through continued interaction with the virtual scene model and can learn the optimal train driving mode and train operation diagram.
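The interaction loop of sub-steps 231-235 can be sketched with a toy example. The code below swaps in tabular Q-learning for the deep methods (DQN, DDPG) named earlier, and replaces the virtual scene model with a hypothetical one-train `step` function whose "power" is simply traction acceleration times speed. Every constant and name here is an assumption made for illustration only.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)              # toy acceleration choices
GOAL, V_MAX, MAX_STEPS = 20, 4, 40


def step(state, a):
    """Toy stand-in for the virtual scene model: next state, reward, done."""
    s, v = state
    v2 = min(V_MAX, max(0, v + a))          # sub-step 232: update speed
    s2 = s + v2                             # sub-step 232: update position
    power = max(a, 0) * v2                  # sub-step 233: toy simulated power
    done = s2 >= GOAL
    r = -power - 1.0 + (50.0 if done else 0.0)  # sub-step 234: toy reward
    return (min(s2, GOAL), v2), r, done


def train(episodes=500, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)                  # Q-table plays the role of the model parameters
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(MAX_STEPS):
            # sub-step 231: choose a train action (epsilon-greedy)
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(state, x)])
            nxt, r, done = step(state, a)
            # sub-step 235: adjust the parameters from the reward (Q-learning update)
            best_next = 0.0 if done else max(q[(nxt, x)] for x in ACTIONS)
            q[(state, a)] += alpha * (r + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_rollout(q):
    """Run the learned policy once without exploration."""
    state, total, path = (0, 0), 0.0, []
    for _ in range(MAX_STEPS):
        a = max(ACTIONS, key=lambda x: q[(state, x)])
        state, r, done = step(state, a)
        total += r
        path.append(a)
        if done:
            break
    return state, total, path
```

Calling `greedy_rollout(train())` returns the final state, the accumulated reward, and the sequence of accelerations chosen by the learned table; in the disclosed method, the corresponding output would be the actions of a trained deep network interacting with the full virtual scene model.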
  • factors related to energy consumption such as geographic information, train resistance, and train power characteristics in the virtual scene model are closer to the actual situation.
  • the method starts from the energy consumption of the entire traction power supply system, including conduction energy consumption and regenerative energy generated by train braking, rather than only considering the traction energy consumption of a single train, making the operation optimization more comprehensive and accurate.
  • This makes it possible to use the virtual scene model to train the operation optimization model even under complex operation scenarios, so as to obtain the optimal train operation diagram with minimum energy consumption.
  • self-learning model training can be achieved through reinforcement learning, with little reliance on human experience.
  • step 21 further includes: collecting raw data related to the virtual scene model; performing data processing on the raw data according to preset rules to obtain modeling data; and establishing the virtual scene model based on the modeling data.
  • the raw data includes all relevant data needed to model the virtual scene for the traction power supply system, such as at least one of the following: power supply network parameters of the traction power supply system, train parameters, operating routes and geographic information, additional load parameters, and train scheduling information.
  • Power supply network parameters include but are not limited to rectifier parameters (such as short-circuit current, wire type, load loss, coupling factor, etc.), circuit breaker parameters (such as connection relationship, rated insulation voltage, rated impulse withstand voltage, etc.), as well as contact line and return rail parameters (such as feed distance, wire type, wire impedance, inner diameter, outer diameter, resistivity, wear, temperature coefficient, joint type, feed point, etc.).
  • Train parameters include but are not limited to maximum acceleration, train class, length, dead weight, rotating mass, maximum load, maximum speed, inverter parameters, motor parameters, etc.
  • the running route and geographic information include, but are not limited to, running direction, station number and physical coordinates, marshalling arrangement, tunnel factor, route terrain information (such as gradient value), etc.
  • Additional load parameters include, but are not limited to, vehicle-mounted equipment (such as ventilation and lighting equipment, display equipment) parameters, platform equipment (such as elevators, ventilation and lighting equipment, communication equipment) parameters, etc.
  • the train scheduling information includes, but is not limited to, train interval time, stop time at each station, and the like.
  • Raw data typically comes from different data sources, including offline data, such as data collected from various databases and data entered by a user via a user interface, and online data, such as data received from data collection devices in the traction power supply system. These data usually take different forms, such as photos, tables, and text. Therefore, after the raw data is collected, it must be converted from its various formats into the target format and subjected to processing such as data filtering to obtain the modeling data. The raw data can be processed according to preset rules (e.g., format conversion rules) using any data processing technique known in the art. Afterwards, at least one virtual scene model is established based on the modeling data.
  • the established virtual scene model can be a plane model or a three-dimensional model.
  • FIG. 6 shows a schematic block diagram of a train operation optimization system according to one embodiment of the present disclosure. After the operation optimization model has been trained, it can be used to control multiple trains actually running in the corresponding operating scenario.
  • the train operation optimization system 600 in FIG. 6 includes a central controller 601 and an operation optimization module 602.
  • the central controller 601 communicates with the on-board communication module of each train via a communication module (not shown in FIG. 6).
  • the operation optimization module 602 uses the operation optimization model trained through the above steps to output a list of acceleration values of each train at each moment in real time.
  • each train senses its own position and speed in the environment, that is, the train state, in real time through the on-board camera, and sends it to the central controller 601 via the on-board communication module.
  • after receiving the train state of each train, the central controller 601 sends it to the operation optimization module 602.
  • the operation optimization module 602 uses the trained operation optimization model to output the corresponding acceleration value according to the train state of each train. After that, the operation optimization module 602 returns the acceleration value of each train to the central controller 601 .
  • the central controller 601 sends the acceleration value to the onboard control module of the corresponding train via the communication module, so as to realize the speed control of the train.
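One control cycle of the system in FIG. 6 can be sketched as a mapping from reported train states to acceleration commands. In the sketch below, `policy` stands in for the trained operation optimization model, and the clamp to `a_max` is an added safety assumption that the text does not describe; the function and parameter names are hypothetical.

```python
from typing import Callable, Dict, Tuple

State = Tuple[float, float]  # (position, speed) reported by a train


def control_cycle(reported_states: Dict[str, State],
                  policy: Callable[[float, float], float],
                  a_max: float = 1.0) -> Dict[str, float]:
    """Return one acceleration command per train for the current moment."""
    commands = {}
    for train_id, (position, speed) in reported_states.items():
        accel = policy(position, speed)  # role of operation optimization module 602
        # Assumption: commands are limited to the train's maximum acceleration.
        commands[train_id] = max(-a_max, min(a_max, accel))
    return commands
```

The central controller would call such a function each time it has received the states of all trains and then forward each command to the on-board control module of the corresponding train.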
  • FIG. 7 shows a schematic block diagram of a train operation optimization apparatus according to an embodiment of the present disclosure.
  • Each unit in FIG. 7 may be implemented by software, hardware (eg, integrated circuit, FPGA, etc.), or a combination of software and hardware.
  • the apparatus 700 includes a scene model obtaining unit 701 , an optimization model establishing unit 702 and an optimization model training unit 703 .
  • the scene model obtaining unit 701 is configured to obtain a virtual scene model of the traction power supply system of the train, where the virtual scene model corresponds to the operation scene of the traction power supply system.
  • the optimization model establishing unit 702 is configured to establish an operation optimization model, the operation optimization model is used to determine the corresponding train action according to the train state of each train in the virtual scene model, and the train action is used to update the train state of each train.
  • the optimization model training unit 703 is configured to iteratively adjust the model parameters of the operation optimization model according to the simulated power of the virtual scene model in the updated train state, so as to train the operation optimization model.
  • the optimization model training unit 703 further includes a train action determination unit, a train state update unit, a simulation power calculation unit, a reward value calculation unit, and a model parameter adjustment unit (not shown in FIG. 7).
  • the train action determination unit is configured to use the operation optimization model to determine, for each train in the virtual scene model, a train action corresponding to its previous train state.
  • the train state update unit is configured to update the previous train state of each train to the current train state based on the previous train state of each train and the determined train action.
  • the simulation power calculation unit is configured to calculate the current simulation power of the virtual scene model according to the current train state of each train.
  • the reward value calculation unit is configured to use the set reward and punishment function to calculate the reward value according to the current simulation power and the current train state and provide it to the operation optimization model.
  • the model parameter adjustment unit is configured to adjust the model parameters of the operation optimization model according to the reward value.
  • the current train state is used to generate the current train operating conditions; the reward and punishment function is set according to the simulated power and the result of comparing the current train operating conditions with the preset train operating conditions; the simulated power is inversely proportional to the reward value; and the train operating conditions include any one or more of the following: speed, running time, and arrival time.
  • the simulation power calculation unit is further configured to: convert the network topology of the virtual scene model in the current train state into an equivalent circuit, the power supply of the equivalent circuit including at least one traction substation in the virtual scene model; calculate the inlet power of each of the at least one traction substation using the node voltage method; and add the calculated inlet powers of all traction substations to obtain the current simulated power of the virtual scene model.
  • the train state includes the position and speed of the train
  • the train action includes the acceleration of the train
  • the operational optimization model includes a deep neural network.
  • the operation optimization apparatus 700 further includes a train operation control unit (not shown in FIG. 7 ).
  • the train operation control unit is configured to utilize the trained operation optimization model to control each train actually operating in the operation scenario.
  • the train operation control unit is further configured to iteratively execute the following steps: receiving the current train state of each train actually running; using the trained operation optimization model to determine a corresponding train action for each train according to the current train state; and sending the determined train action to the corresponding train.
  • the scene model obtaining unit is further configured to: collect raw data related to the virtual scene model; perform data processing on the raw data according to preset rules to serve as modeling data; and building a virtual scene model based on the modeling data.
  • a computing device 800 for operational optimization of rail transit includes a central processing unit (CPU) 801 (e.g., a processor) and a memory 802 coupled to the central processing unit (CPU) 801.
  • the memory 802 is used to store computer-executable instructions which, when executed, cause the central processing unit (CPU) 801 to execute the methods in the above embodiments.
  • the central processing unit (CPU) 801 and the memory 802 are connected to each other through a bus, to which an input/output (I/O) interface is also connected.
  • the computing device 800 may also include a number of components (not shown in FIG. 8) connected to the I/O interface, including but not limited to: an input unit, such as a keyboard or mouse; an output unit, such as various types of displays and speakers; a storage unit, such as a magnetic disk or optical disc; and a communication unit, such as a network card, modem, or wireless communication transceiver.
  • the communication unit allows the computing device 800 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • a computer-readable storage medium carries computer-readable program instructions for carrying out various embodiments of the present disclosure.
  • a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of more specific examples of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves having instructions stored thereon, and any suitable combination of the above.
  • Computer-readable storage media, as used here, are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.
  • the present disclosure proposes a computer-readable storage medium having computer-executable instructions stored thereon for performing the methods of the various embodiments of the present disclosure.
  • the present disclosure proposes a computer program product tangibly stored on a computer-readable storage medium and comprising computer-executable instructions which, when executed, cause at least one processor to perform the methods of the various embodiments of the present disclosure.
  • the various example embodiments of the present disclosure may be implemented in hardware or special purpose circuits, software, firmware, logic, or any combination thereof. Certain aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device. While aspects of the embodiments of the present disclosure are illustrated or described as block diagrams, flowcharts, or using some other graphical representation, it is to be understood that the blocks, apparatus, systems, techniques, or methods described herein may be implemented, as non-limiting examples, in hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controllers or other computing devices, or some combination thereof.
  • Computer-readable program instructions or computer program products for executing various embodiments of the present disclosure can also be stored in the cloud. When they need to be invoked, users can access, through the mobile Internet, a fixed network, or another network, the computer-readable program instructions stored in the cloud for executing an embodiment of the present disclosure, thereby implementing the technical solutions disclosed in the various embodiments of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Mechanical Engineering (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Train Traffic Observation, Control, And Security (AREA)

Abstract

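A train operation optimization method, comprising: obtaining a virtual scene model of a traction power supply system of trains, the virtual scene model corresponding to an operating scenario of the traction power supply system (21); establishing an operation optimization model, the operation optimization model being used to determine a corresponding train action according to the train state of each train in the virtual scene model, the train action being used to update the train state of each train (22); and using reinforcement learning to iteratively adjust model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model (23). In this method, energy-related factors such as geographic information, train resistance, and train power characteristics are closer to actual conditions, and the optimization starts from the energy consumption of the entire traction power supply system rather than only the traction energy consumption of a single train, which makes the operation optimization more comprehensive and accurate.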

Description

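Train operation optimization method and apparatus
Technical Field
The present disclosure relates to the technical field of rail transportation, and more particularly, to a train operation optimization method, apparatus, computing device, computer-readable storage medium, and program product.
Background
In a rail transit line, the entire traction power supply system comprises multiple components such as trains, the power supply network, stations, and the environment. For economic and environmental reasons, the energy consumption of the traction power supply system, that is, the total electricity consumed by the entire rail transit line per unit of time (such as a peak hour, a day and night, or a year), needs to be reduced as far as possible. In addition, a pre-computed train timetable (train working diagram) is now commonly used to control train operation on a rail transit line. The timetable typically includes train dwell times, the number of trains, train headways, and running directions and sections, and is used to control the time, position, and speed of train operation.
To reduce the energy consumption of the traction power supply system during train operation, engineers usually need to adjust the train timetable based on experience under given conditions such as specific geographic information, speed limits, and/or train motion characteristics, in order to reach an optimization result with minimum traction energy consumption. At present, it is generally believed that running a train in "coasting mode" achieves minimum traction energy consumption. FIG. 1(a) is a schematic diagram of a prior-art train running in coasting mode under ideal conditions. In FIG. 1(a), the run of a train between two platforms separated by a distance $S_A$ is divided into four phases: 0 to $S_1$ is the acceleration phase, in which the train accelerates continuously at maximum acceleration under traction until it reaches the maximum speed $V_1$; $S_1$ to $S_2$ is the cruising phase, in which the train runs at the constant maximum speed $V_1$ under traction; $S_2$ to $S_3$ is the coasting phase, in which the train coasts without traction and is slowed by resistance to $V_2$; and $S_3$ to $S_A$ is the braking phase, in which the train decelerates at maximum deceleration until its speed is 0. FIG. 1(b) is a schematic diagram of a prior-art train running in coasting mode with speed limits superimposed. Compared with FIG. 1(a), FIG. 1(b) takes into account some speed limits on the rail transit line.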
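Summary
Prior-art operation optimization methods work under ideal conditions (for example, ignoring gradient information and assuming linear train resistance and linear train power characteristics) or, at most, take speed limits into account. In reality, train resistance (such as friction resistance, rolling resistance, sliding resistance, vibration resistance, and air resistance) varies nonlinearly with running speed, and under certain conditions (such as gradients and tunnels) nonlinear additional resistance also arises, so the resistance a train experiences while running is nonlinear. In addition, the traction and braking characteristics of a train are also nonlinear. Because of these nonlinear factors, prior-art operation optimization methods cannot obtain an optimal train timetable with minimum traction energy consumption. Moreover, prior-art methods do not consider conduction losses (such as conductor losses) or the regenerative energy produced by train braking. In other words, prior-art methods only aim to minimize the traction energy consumption of a single train and do not consider the energy consumption of the entire traction power supply system.
A first embodiment of the present disclosure provides a train operation optimization method, comprising: obtaining a virtual scene model of a traction power supply system of trains, the virtual scene model corresponding to an operating scenario of the traction power supply system; establishing an operation optimization model, the operation optimization model being used to determine a corresponding train action according to the train state of each train in the virtual scene model, the train action being used to update the train state of each train; and using reinforcement learning to iteratively adjust model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model.
In this embodiment, energy-related factors in the virtual scene model, such as geographic information, train resistance, and train power characteristics, are closer to actual conditions. Moreover, the method starts from the energy consumption of the entire traction power supply system, including conduction losses and the regenerative energy produced by train braking, rather than considering only the traction energy consumption of a single train, which makes the operation optimization more comprehensive and accurate. As a result, even in complex operating scenarios, the virtual scene model can be used to train the operation optimization model and thereby obtain an optimal train timetable with minimum energy consumption. In addition, reinforcement learning enables self-learning model training that depends very little on manual experience.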
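A second embodiment of the present disclosure provides a train operation optimization apparatus, comprising: a scene model obtaining unit configured to obtain a virtual scene model of a traction power supply system of trains, the virtual scene model corresponding to an operating scenario of the traction power supply system; an optimization model establishing unit configured to establish an operation optimization model, the operation optimization model being used to determine a corresponding train action according to the train state of each train in the virtual scene model, the train action being used to update the train state of each train; and an optimization model training unit configured to iteratively adjust model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model.
In this embodiment, energy-related factors in the virtual scene model, such as geographic information, train resistance, and train power characteristics, are closer to actual conditions. Moreover, the approach starts from the energy consumption of the entire traction power supply system, including conduction losses and the regenerative energy produced by train braking, rather than considering only the traction energy consumption of a single train, which makes the operation optimization more comprehensive and accurate. As a result, even in complex operating scenarios, the virtual scene model can be used to train the operation optimization model and thereby obtain an optimal train timetable with minimum energy consumption. In addition, reinforcement learning enables self-learning model training that depends very little on manual experience.
A third embodiment of the present disclosure provides a computing device, comprising: a processor; and a memory for storing computer-executable instructions which, when executed, cause the processor to perform the method of the first embodiment.
A fourth embodiment of the present disclosure provides a computer-readable storage medium having computer-executable instructions stored thereon, the computer-executable instructions being used to perform the method of the first embodiment.
A fifth embodiment of the present disclosure provides a computer program product tangibly stored on a computer-readable storage medium and comprising computer-executable instructions which, when executed, cause at least one processor to perform the method of the first embodiment.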
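Brief Description of the Drawings
The features, advantages, and other aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, which show several embodiments of the present disclosure by way of example and not limitation. In the drawings:
FIG. 1(a) is a schematic diagram of a prior-art train running in coasting mode under ideal conditions;
FIG. 1(b) is a schematic diagram of a prior-art train running in coasting mode with speed limits superimposed;
FIG. 2 is a flowchart of a train operation optimization method according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of training the operation optimization model in the embodiment of FIG. 2;
FIG. 4 is a flowchart of calculating the current simulated power of the virtual scene model in the embodiment of FIG. 2;
FIG. 5(a) shows the network topology, at the current moment, of an exemplary virtual scene model according to the embodiment of FIG. 2;
FIG. 5(b) shows the network topology, at the next moment, of the exemplary virtual scene model of FIG. 5(a);
FIG. 6 is a schematic block diagram of a train operation optimization system according to an embodiment of the present disclosure;
FIG. 7 is a schematic block diagram of a train operation optimization apparatus according to an embodiment of the present disclosure; and
FIG. 8 is a schematic block diagram of a computing device for train operation optimization according to an embodiment of the present disclosure.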
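Detailed Description
Various exemplary embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. Although the exemplary methods and apparatuses described below include, among other components, software and/or firmware executed on hardware, it should be noted that these examples are merely illustrative and should not be regarded as limiting. For example, it is contemplated that any or all of the hardware, software, and firmware components may be implemented exclusively in hardware, exclusively in software, or in any combination of hardware and software. Accordingly, although exemplary methods and apparatuses are described below, those skilled in the art will readily appreciate that the examples provided are not intended to limit the ways in which these methods and apparatuses may be implemented.
In addition, the flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of methods and systems according to various embodiments of the present disclosure. It should be noted that the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
As used herein, the terms "include", "comprise", and similar terms are open-ended, that is, "including but not limited to", meaning that other content may also be included. The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and so on.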
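The present disclosure is described below with reference to one embodiment. FIG. 2 shows a flowchart of a train operation optimization method according to an embodiment of the present disclosure. Referring to FIG. 2, method 200 begins at step 21. In step 21, a virtual scene model of the traction power supply system of the trains is obtained, the virtual scene model corresponding to an operating scenario of the traction power supply system. As described above, the entire traction power supply system of a rail transit line comprises multiple components such as trains, the power supply network, stations, and the environment, each with its own parameters or configuration. Some parameters or configurations are fixed once the line is built, such as maximum train acceleration, length, tare weight, maximum load, the geographic information of each station and tunnel (such as gradient information), and the number and locations of traction substations. Others can change, such as train headway, passenger load factor, number of trains, and whether the rectifiers in a traction substation are working properly. An operating scenario is the situation in which the traction power supply system runs under one set of parameters or configurations, so a traction power supply system can have many different operating scenarios. Each virtual scene model simulates the traction power supply system under one specific operating scenario. A virtual scene model corresponding to a target operating scenario can be built for the traction power supply system in advance, and each model built can be saved in a database. As the virtual scene models in the database accumulate, the required model can later be looked up in the database when needed.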
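The idea of saving each virtual scene model and looking it up later by operating scenario can be sketched as follows. This is only an illustration under stated assumptions: the `Scenario` fields (`headway_s`, `load_factor`, `train_count`) and the in-memory `SceneModelStore` are hypothetical names chosen here, and the source does not specify how the database is keyed or implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical key for one operating scenario (variable parameters only)."""
    headway_s: int       # train headway in seconds
    load_factor: float   # passenger load factor, 0.0 to 1.0
    train_count: int     # number of trains on the line


class SceneModelStore:
    """Minimal in-memory stand-in for the scene model database."""

    def __init__(self):
        self._models = {}

    def save(self, scenario, model):
        # A frozen dataclass is hashable, so it can be used as a dict key.
        self._models[scenario] = model

    def find(self, scenario):
        # Returns None when no model has been built for this scenario yet.
        return self._models.get(scenario)
```

A frozen dataclass is used so that two scenarios with the same parameter values resolve to the same stored model.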
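Next, in step 22, an operation optimization model is established. The operation optimization model is used to determine a corresponding train action according to the train state of each train in the virtual scene model, and the train action is used to update the train state of each train. The train state may include the position and speed of the train, and the train action may include the acceleration of the train. In this embodiment, the operation optimization model is trained by reinforcement learning. The model may select the train action corresponding to each train's state in a value-based manner, a policy-gradient-based manner, or a combination of the two. In this embodiment, the operation optimization model includes a deep neural network, whose structure can be designed as needed to allow end-to-end training. Any suitable continuous or discrete deep reinforcement learning method may be used, such as DQN or DDPG.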
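To make the state-to-action mapping concrete, the following is a minimal sketch of a policy that maps a train state (position, speed) to a bounded acceleration. It is not the network of the disclosure: `TinyPolicy`, its single hidden layer, the `a_max` bound, and the assumption that inputs are pre-normalized are all hypothetical, and no training rule (DQN, DDPG, or otherwise) is implemented here.

```python
import math
import random


class TinyPolicy:
    """Hypothetical stand-in for the operation optimization model."""

    def __init__(self, hidden=8, a_max=1.0, seed=0):
        rng = random.Random(seed)
        self.a_max = a_max  # assumed acceleration bound in m/s^2
        # One hidden layer: 2 inputs (position, speed) -> hidden -> 1 output.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def act(self, position, speed):
        """Return an acceleration in [-a_max, a_max] for a normalized state."""
        h = [math.tanh(w[0] * position + w[1] * speed + b)
             for w, b in zip(self.w1, self.b1)]
        out = sum(w * x for w, x in zip(self.w2, h)) + self.b2
        # tanh squashes the output so the action respects the bound.
        return self.a_max * math.tanh(out)
```

Bounding the output with `tanh` is one simple way to keep a continuous action within the train's acceleration limits; a discrete method such as DQN would instead choose among a fixed set of acceleration values.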
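Finally, in step 23, reinforcement learning is used to iteratively adjust the model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model. Because the goal is to minimize the energy consumption of the traction power supply system in a given operating scenario (for example, a specific passenger load factor and train headway), the simulated energy consumption of the virtual scene model corresponding to that scenario can be used to train the operation optimization model. Training is a continuous interaction between the operation optimization model and the virtual scene model. During training, the operation optimization model is used to update the train state of each train in the virtual scene model, the virtual scene model is simulated under those train states to obtain the simulated power, and the model parameters are then adjusted according to the simulated power. The adjusted model is used again to update the train states, the virtual scene model is simulated under the newly updated train states to obtain a new simulated power, and the parameters are adjusted again. This process is repeated until the operation optimization model converges.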
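The training process is described in detail below with reference to FIG. 3, which shows a flowchart of training the operation optimization model in the embodiment of FIG. 2. In FIG. 3, step 23 includes sub-steps 231 to 235. In sub-step 231, the operation optimization model is used to determine, for each train in the virtual scene model, the train action corresponding to its previous train state. Initially, the initial train state of each train in the virtual scene model is input to the operation optimization model, and the model outputs a set of initial train actions for the trains. Thereafter, the model is used each time to determine the train action corresponding to the input train state. In sub-step 232, the previous train state of each train is updated to the current train state according to the previous train state and the determined train action, and the current train state is provided to both the operation optimization model and the virtual scene model. In this embodiment, the train state includes the position and speed of the train, and the train action includes the acceleration of the train. The current train state of each train can be calculated with formulas (1) and (2) below.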
$v_{t_j} = v_{t_{j-1}} + \Delta t \times a_{t_{j-1}} \quad (1)$
Figure PCTCN2021084680-appb-000001
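In formulas (1) and (2), $v_{t_j}$ and $s_{t_j}$ denote the train speed and train position at time $t_j$, that is, the current train state. $v_{t_{j-1}}$ and $s_{t_{j-1}}$ denote the train speed and train position at time $t_{j-1}$, the moment preceding $t_j$, that is, the previous train state. $a_{t_{j-1}}$ denotes the train acceleration that the operation optimization model outputs for the previous train state. Note that the acceleration may be positive, negative, or zero: a positive value means the train is accelerating, a negative value means it is decelerating, and zero means it is running at constant speed. The calculated current train state is fed back to the operation optimization model for the next state update, and is also provided to the virtual scene model for the simulated power calculation.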
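The state update of sub-step 232 can be sketched in a few lines. The speed update follows formula (1) as given in the text. Formula (2) survives only as an image in this copy, so the position update below is an assumption (constant acceleration over the step), not a transcription of the original formula; `update_state` and `dt` are names chosen for illustration.

```python
def update_state(position, speed, accel, dt):
    """Advance one train by one time step of length dt (seconds)."""
    # Formula (1): v_tj = v_tj-1 + dt * a_tj-1
    new_speed = speed + dt * accel
    # Assumed position update (constant acceleration during the step);
    # the source's formula (2) is not legible in this copy.
    new_position = position + speed * dt + 0.5 * accel * dt * dt
    return new_position, new_speed
```

For example, a train at position 0 m moving at 10 m/s with an acceleration of 1 m/s² reaches 12 m/s after a 2 s step under this assumed update.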
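In sub-step 233, the current simulated power of the virtual scene model is calculated according to the current train state of each train. As noted above, the virtual scene model reflects, to the greatest possible extent, the behavior of the entire traction power supply system in the actual operating scenario. The sum of the inlet powers of all traction substations in the virtual scene model is therefore the simulated power of the model under the current train states.
The calculation of the current simulated power is described below with reference to FIG. 4. In FIG. 4, sub-step 233 further includes sub-steps 2331 to 2333. In sub-step 2331, the network topology of the virtual scene model under the current train states is converted into an equivalent circuit, whose power sources include at least one traction substation in the virtual scene model. Because the virtual scene model contains all the information about the traction power supply system in the corresponding operating scenario, including but not limited to power supply network parameters, train parameters, running line and geographic information, additional load parameters, and train scheduling information, the network topology at each moment can be converted into an equivalent circuit from this information and the train states. Note that because train positions change at every moment, the network topology and the structure and parameters of its equivalent circuit change as well. In sub-step 2332, the node voltage method is used to calculate the inlet power of each of the at least one traction substation. After the network topology has been converted into an equivalent circuit, the node voltage method is used to set up a system of nonlinear equations, which is solved by Newton iteration to obtain the node voltages and branch currents of the equivalent circuit and, finally, the total current and voltage at the inlet of each traction substation. Multiplying the current and voltage at the inlet of each traction substation gives its inlet power at time $t_j$, denoted $P_{TPS_i, t_j}$, where $TPS_i$ is the $i$-th traction substation. In sub-step 2333, the calculated inlet powers of the traction substations are summed to obtain the current simulated power of the virtual scene model, that is,
$\sum_i P_{TPS_i, t_j}$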
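Sub-steps 2331 to 2333 can be illustrated with a deliberately simplified sketch. It assumes a single DC line whose nodes are ordered by position, substations modeled as an ideal source `u0` behind a resistance `rs`, trains modeled as constant-power loads (negative when regenerating), and one lumped line resistance per kilometre. These modeling choices, all parameter values, and the function names are assumptions made here; the equivalent circuit in the disclosure (separate up and down contact lines, return rails, feeder conductors) is richer.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def simulated_power(positions, loads, substations, u0=1500.0, rs=0.02,
                    r_per_km=0.03, tol=1e-6, max_iter=50):
    """positions: distinct node positions in km, sorted ascending.
    loads[k]: power in W drawn at node k (negative if regenerating, 0 at a bus).
    substations: indices of the nodes that are substation buses."""
    n = len(positions)
    g = [[0.0] * n for _ in range(n)]   # nodal conductance matrix
    inj = [0.0] * n                     # source current injections
    for k in range(n - 1):
        gk = 1.0 / (r_per_km * (positions[k + 1] - positions[k]))
        g[k][k] += gk
        g[k + 1][k + 1] += gk
        g[k][k + 1] -= gk
        g[k + 1][k] -= gk
    for s in substations:
        g[s][s] += 1.0 / rs
        inj[s] += u0 / rs
    v = [u0] * n                        # initial guess: nominal voltage
    for _ in range(max_iter):
        # Nodal residual: G v - injections + P / v (constant-power loads).
        f = [sum(g[i][j] * v[j] for j in range(n)) - inj[i] + loads[i] / v[i]
             for i in range(n)]
        if max(abs(x) for x in f) < tol:
            break
        jac = [row[:] for row in g]
        for i in range(n):
            jac[i][i] -= loads[i] / v[i] ** 2
        dv = solve_linear(jac, [-x for x in f])   # Newton step
        v = [vi + d for vi, d in zip(v, dv)]
    # Inlet power of each substation: bus voltage times current fed to the line.
    inlet = [v[s] * (u0 - v[s]) / rs for s in substations]
    return sum(inlet), v
```

In this sketch the returned total equals the train loads plus the line losses, which is why minimizing it accounts for conduction losses and regenerated power rather than for one train's traction energy alone.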
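FIG. 5(a) shows the network topology of an exemplary virtual scene model at the current moment, and FIG. 5(b) shows the network topology of the same model at the next moment. In the network topology 500 shown in FIG. 5(a), trains 521-523 run in the up direction (to the right in the figure) and trains 524-526 run in the down direction (to the left in the figure). At the current moment, trains 521-522 and 524-525 are accelerating under traction, while trains 523 and 526 are braking, and their acceleration values differ from one another. As the network topology 501 in FIG. 5(b) shows, the positions of trains 521-526 have all changed at the next moment. Under the influence of factors such as conductor impedance, geographic information, and train traction characteristics, the equivalent circuits of network topologies 500 and 501 change as well. Taking network topology 500 as an example, the two traction substations 510 and 511 feed the up-direction contact line 531 and the down-direction contact line 532 through conductors 541-544. The up-direction return rail 533 is connected to traction substations 510 and 511 through conductors 551 and 553, and the down-direction return rail 534 is connected to them through conductors 552 and 554, forming the current loop. When network topology 500 is converted into an equivalent circuit, traction substations 510 and 511 are modeled as power sources, trains 521-526 are modeled as power elements, and the contact lines, return rails, and conductors contribute impedance. The running state of trains 521-526 determines their traction power or braking power in the equivalent circuit: they consume power when accelerating under traction and supply power when braking. The traction power or braking power can be calculated with formula (3) or (4) below: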
$P_{train} = \eta \times F \times V \quad (3)$
$P_{train} = v_{train} \times i_{train} \quad (4)$
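In formulas (3) and (4), $P_{train}$ is the traction power or braking power of trains 521-526. In formula (3), $V$ is the train speed, $F$ is the traction force or braking force at that speed taken from the train's traction or braking characteristic curve, and $\eta$ is the corresponding conversion efficiency. In formula (4), $v_{train}$ is the voltage across trains 521-526 and $i_{train}$ is the current flowing through them. Because traction substations 510 and 511 both feed contact lines 531 and 532, the power at the inlets $A_1$ and $A_2$ of the two substations must be calculated. Adding the powers at inlets $A_1$ and $A_2$ gives the simulated power of the virtual scene model under this network topology.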
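Formula (3) can be sketched as a lookup on a characteristic curve followed by a multiplication. The piecewise-linear interpolation, the curve points, and the function names below are assumptions for illustration; real traction and braking curves, and the way efficiency applies during braking, would come from the train data.

```python
from bisect import bisect_right


def force_at_speed(curve, speed):
    """Linearly interpolate force (N) on a curve of (speed m/s, force N) points
    sorted by speed; values outside the curve are clamped to its end points."""
    speeds = [s for s, _ in curve]
    if speed <= speeds[0]:
        return curve[0][1]
    if speed >= speeds[-1]:
        return curve[-1][1]
    i = bisect_right(speeds, speed)
    (s0, f0), (s1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (speed - s0) / (s1 - s0)


def train_power(curve, speed, efficiency):
    """Formula (3): P = eta * F * V, with F read from the characteristic curve."""
    return efficiency * force_at_speed(curve, speed) * speed
```

With a hypothetical curve of 300 kN up to 10 m/s falling to 150 kN at 20 m/s and an efficiency of 0.9, a train at 15 m/s would draw roughly 3.04 MW under this sketch.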
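Returning to FIG. 3, sub-step 234 includes using the set reward and punishment function to calculate a reward value according to the current simulated power and the current train state, and providing the reward value to the operation optimization model. When the train actions produced by the operation optimization model move the simulation result of the virtual scene model toward the optimization objective, the model is given a positive reward; otherwise it is given a negative reward. In this embodiment, the reward and punishment function is set according to the simulated power and the result of comparing the current train operating condition with the preset train operating condition. The train operating condition includes any one or more of the following: speed, running time, and arrival time. The current train operating condition can be generated from the current train state. Formula (5) below shows one example of the reward and punishment function.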
Figure PCTCN2021084680-appb-000003
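As formula (5) shows, the simulated power (the sum of the traction substation inlet powers at time $t_j$) is inversely proportional to the reward value $R_{t_j}$ at time $t_j$. That is, the smaller the simulated power, the larger the reward value $R_{t_j}$, and vice versa. The preset train operating conditions include speed limits along the line, running time limits between platforms, and arrival time limits. When the speed of a train at time $t_j$ violates the speed limit, or its running time between platforms violates the running time limit, the reward value $R_{t_j}$ is a negative constant; when the train stops at the target platform at the correct time, the reward value $R_{t_j}$ is a positive constant. Note that formula (5) is only a simple example of a reward function. Those skilled in the art will understand that the reward function can be set according to one or more additional optimization objectives. If several optimization objectives exist at once, a separate scoring function can be designed for each, and the scoring functions can be combined with different weights to form the final reward function. The calculated reward value is provided to the operation optimization model.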
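The reward rules described for formula (5) can be sketched as below. Formula (5) itself is only an image in this copy, so the constants `penalty` and `bonus`, the `scale` factor, the floor on the power, and the order in which the conditions are checked are all assumptions made for illustration.

```python
def reward(sim_power_w, speed, speed_limit, run_time, run_time_limit,
           arrived_on_time, penalty=-10.0, bonus=10.0, scale=1.0e6):
    """Hypothetical reward in the spirit of formula (5)."""
    # Violating the speed limit or the running time limit: negative constant.
    if speed > speed_limit or run_time > run_time_limit:
        return penalty
    # Stopping at the target platform at the correct time: positive constant.
    if arrived_on_time:
        return bonus
    # Otherwise the reward is inversely proportional to the simulated power.
    # The floor of 1 W (an assumption) avoids division by zero or a sign flip
    # when regeneration drives the net power to zero or below.
    return scale / max(sim_power_w, 1.0)
```

Under these assumed constants, halving the simulated power from 2 MW to 1 MW doubles the reward from 0.5 to 1.0, which is the behavior the text attributes to the inverse relationship.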
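Then, in sub-step 235, the model parameters of the operation optimization model are adjusted according to the reward value. As described above, the current train state obtained in sub-step 232 is fed back to the operation optimization model as the train state input for the next iteration. In the next iteration, the adjusted operation optimization model is used in sub-step 231 to determine the train actions, after which sub-steps 232 to 235 are executed again. Sub-steps 231 to 235 are executed iteratively until the operation optimization model converges. Because the operation optimization model keeps updating the train actions, and thereby the train states in the virtual scene model, the operation optimization model and the virtual scene model are trained together iteratively, so that the operation optimization model learns the optimal train driving mode and train timetable.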
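The iteration over sub-steps 231 to 235 can be summarized as a skeleton loop. This sketch only fixes the order of the calls: the callables `act`, `step`, `simulate_power`, `reward_fn`, and `adjust` are hypothetical placeholders supplied by the caller, a fixed iteration count stands in for the convergence test, and no particular learning rule is implied.

```python
def train_loop(params, act, step, simulate_power, reward_fn, adjust,
               initial_states, iterations):
    """Run the 231-235 cycle a fixed number of times and return the result."""
    states = list(initial_states)
    for _ in range(iterations):
        # 231: choose an action for each train from its previous state.
        actions = [act(params, s) for s in states]
        # 232: update each train's state with its action.
        states = [step(s, a) for s, a in zip(states, actions)]
        # 233: simulate the scene model under the updated states.
        power = simulate_power(states)
        # 234: turn simulated power and states into a reward value.
        r = reward_fn(power, states)
        # 235: adjust the model parameters according to the reward.
        params = adjust(params, states, actions, r)
    return params, states
```

Passing the pieces in as callables keeps the loop independent of how the scene model is simulated and of which reinforcement learning method performs the parameter update.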
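In the above embodiment, energy-related factors in the virtual scene model, such as geographic information, train resistance, and train power characteristics, are closer to actual conditions. Moreover, the method starts from the energy consumption of the entire traction power supply system, including conduction losses and the regenerative energy produced by train braking, rather than considering only the traction energy consumption of a single train, which makes the operation optimization more comprehensive and accurate. As a result, even in complex operating scenarios, the virtual scene model can be used to train the operation optimization model and thereby obtain an optimal train timetable with minimum energy consumption. In addition, reinforcement learning enables self-learning model training that depends very little on manual experience.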
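In an embodiment according to the present disclosure, step 21 further includes: collecting raw data related to the virtual scene model; processing the raw data according to preset rules to obtain modeling data; and building the virtual scene model based on the modeling data. The raw data include all relevant data needed to build a virtual scene model for the traction power supply system, for example at least one of the following: power supply network parameters of the traction power supply system, train parameters, running line and geographic information, additional load parameters, and train scheduling information. Power supply network parameters include, but are not limited to, rectifier parameters (such as short-circuit current, conductor type, load loss, and coupling factor), circuit breaker parameters (such as connection relationships, rated insulation voltage, and rated impulse withstand voltage), and contact line and return rail parameters (such as feeding distance, conductor type, conductor impedance, inner diameter, outer diameter, resistivity, wear, temperature coefficient, joint type, and feeding points). Train parameters include, but are not limited to, maximum acceleration, train class, length, tare weight, rotating mass, maximum load, maximum speed, inverter parameters, and motor parameters. Running line and geographic information include, but are not limited to, running direction, number and physical coordinates of stations, train formation, tunnel factor, and line terrain information (such as gradient values). Additional load parameters include, but are not limited to, parameters of on-board equipment (such as ventilation and lighting equipment and display equipment) and of platform equipment (such as elevators, ventilation and lighting equipment, and communication equipment). Train scheduling information includes, but is not limited to, train headway and dwell time at each station. Those skilled in the art will understand that the above lists only part of the data needed to build a virtual scene model for the traction power supply system, and is given by way of example rather than limitation.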
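Raw data usually come from different data sources. They include offline data, such as data collected from various databases and data entered by users through a user interface, and online data, such as data received from data acquisition devices in the traction power supply system. These data usually take different forms, such as photographs, tables, and text. After the raw data are collected, they therefore need to be converted from their different formats into a target format and processed further, for example by data filtering, to serve as modeling data. The raw data can be processed according to preset rules (such as format conversion rules) using any data processing technique known in the art. At least one virtual scene model is then built based on the modeling data. The virtual scene model may be a planar model or a three-dimensional model.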
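FIG. 6 shows a schematic block diagram of a train operation optimization system according to an embodiment of the present disclosure. Once the operation optimization model has been trained, it can be used to control the multiple trains actually running in the corresponding operating scenario. The train operation optimization system 600 in FIG. 6 includes a central controller 601 and an operation optimization module 602. The central controller 601 communicates with the on-board communication module of each train through a communication module (not shown in FIG. 6). The operation optimization module 602 uses the operation optimization model trained by the above steps to output, in real time, a list of acceleration values for the trains at each moment. Specifically, during automatic driving control, each train senses its own position and speed in the environment, that is, its train state, in real time through an on-board camera and sends it to the central controller 601 through the on-board communication module. After receiving the train states, the central controller 601 forwards them to the operation optimization module 602. The operation optimization module 602 uses the trained operation optimization model to output the corresponding acceleration value for each train's state and returns the acceleration values to the central controller 601. The central controller 601 then sends each acceleration value through the communication module to the on-board control module of the corresponding train, thereby controlling the train's speed.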
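One cycle of the exchange between the trains, the central controller, and the operation optimization module can be sketched as follows. The function name `control_cycle`, the dictionary of states keyed by train identifier, and the three callables are hypothetical; the source does not define message formats or interfaces.

```python
def control_cycle(receive_states, model, send_action):
    """Run one receive / decide / send cycle and return the chosen actions.

    receive_states() -> dict mapping train id to (position, speed)
    model(state)     -> acceleration for that state (the trained model)
    send_action(train_id, acceleration) delivers the command to the train
    """
    states = receive_states()
    actions = {}
    for train_id, state in states.items():
        # The trained model maps each reported state to an acceleration.
        actions[train_id] = model(state)
        # The controller forwards the command to the matching train.
        send_action(train_id, actions[train_id])
    return actions
```

In operation this cycle would be repeated at each control interval, with the trained model used for inference only.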
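FIG. 7 shows a schematic block diagram of a train operation optimization apparatus according to an embodiment of the present disclosure. The units in FIG. 7 may be implemented in software, in hardware (for example, integrated circuits or FPGAs), or in a combination of software and hardware. Referring to FIG. 7, apparatus 700 includes a scene model obtaining unit 701, an optimization model establishing unit 702, and an optimization model training unit 703. The scene model obtaining unit 701 is configured to obtain a virtual scene model of a traction power supply system of trains, the virtual scene model corresponding to an operating scenario of the traction power supply system. The optimization model establishing unit 702 is configured to establish an operation optimization model, which is used to determine a corresponding train action according to the train state of each train in the virtual scene model, the train action being used to update the train state of each train. The optimization model training unit 703 is configured to iteratively adjust the model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model.
Optionally, in an embodiment according to the present disclosure, the optimization model training unit 703 further includes a train action determination unit, a train state update unit, a simulated power calculation unit, a reward value calculation unit, and a model parameter adjustment unit (not shown in FIG. 7). The train action determination unit is configured to use the operation optimization model to determine, for each train in the virtual scene model, the train action corresponding to its previous train state. The train state update unit is configured to update the previous train state of each train to the current train state according to the previous train state and the determined train action. The simulated power calculation unit is configured to calculate the current simulated power of the virtual scene model according to the current train state of each train. The reward value calculation unit is configured to use the set reward and punishment function to calculate a reward value according to the current simulated power and the current train state and to provide the reward value to the operation optimization model. The model parameter adjustment unit is configured to adjust the model parameters of the operation optimization model according to the reward value.
Optionally, in an embodiment according to the present disclosure, the current train state is used to generate the current train operating condition; the reward and punishment function is set according to the simulated power and the result of comparing the current train operating condition with the preset train operating condition; the simulated power is inversely proportional to the reward value; and the train operating condition includes any one or more of the following: speed, running time, and arrival time.
Optionally, in an embodiment according to the present disclosure, the simulated power calculation unit is further configured to: convert the network topology of the virtual scene model under the current train states into an equivalent circuit, the power sources of the equivalent circuit including at least one traction substation in the virtual scene model; calculate the inlet power of each of the at least one traction substation using the node voltage method; and sum the calculated inlet powers of the traction substations to obtain the current simulated power of the virtual scene model.
Optionally, in an embodiment according to the present disclosure, the train state includes the position and speed of the train, and the train action includes the acceleration of the train.
Optionally, in an embodiment according to the present disclosure, the operation optimization model includes a deep neural network.
Optionally, in an embodiment according to the present disclosure, the operation optimization apparatus 700 further includes a train operation control unit (not shown in FIG. 7). The train operation control unit is configured to use the trained operation optimization model to control each train actually running in the operating scenario.
Optionally, in an embodiment according to the present disclosure, the train operation control unit is further configured to iteratively perform the following steps: receiving the current train state of each train actually running; using the trained operation optimization model to determine a corresponding train action for each train according to the current train state; and sending the determined train action to the corresponding train.
Optionally, in an embodiment according to the present disclosure, the scene model obtaining unit is further configured to: collect raw data related to the virtual scene model; process the raw data according to preset rules to obtain modeling data; and build the virtual scene model based on the modeling data.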
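FIG. 8 shows a schematic block diagram of a computing device for train operation optimization according to an embodiment of the present disclosure. As FIG. 8 shows, the computing device 800 for operation optimization of rail transit includes a central processing unit (CPU) 801 (for example, a processor) and a memory 802 coupled to the CPU 801. The memory 802 is used to store computer-executable instructions which, when executed, cause the CPU 801 to perform the methods of the above embodiments. The CPU 801 and the memory 802 are connected to each other through a bus, to which an input/output (I/O) interface is also connected. The computing device 800 may also include a number of components (not shown in FIG. 8) connected to the I/O interface, including but not limited to: an input unit, such as a keyboard or mouse; an output unit, such as various types of displays and speakers; a storage unit, such as a magnetic disk or optical disc; and a communication unit, such as a network card, modem, or wireless communication transceiver. The communication unit allows the computing device 800 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.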
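Alternatively, the above method can be implemented by means of a computer-readable storage medium. The computer-readable storage medium carries computer-readable program instructions for carrying out the various embodiments of the present disclosure. The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media, as used here, are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber optic cables), or electrical signals transmitted through wires.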
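Accordingly, in another embodiment, the present disclosure provides a computer-readable storage medium having computer-executable instructions stored thereon, the computer-executable instructions being used to perform the methods of the various embodiments of the present disclosure.
In another embodiment, the present disclosure provides a computer program product tangibly stored on a computer-readable storage medium and comprising computer-executable instructions which, when executed, cause at least one processor to perform the methods of the various embodiments of the present disclosure.
In general, the various example embodiments of the present disclosure may be implemented in hardware or special purpose circuits, software, firmware, logic, or any combination thereof. Certain aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device. While aspects of the embodiments of the present disclosure are illustrated or described as block diagrams, flowcharts, or using some other graphical representation, it will be understood that the blocks, apparatuses, systems, techniques, or methods described herein may be implemented, as non-limiting examples, in hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controllers or other computing devices, or some combination thereof.
Computer-readable program instructions or computer program products for executing the various embodiments of the present disclosure can also be stored in the cloud. When they need to be invoked, users can access, through the mobile Internet, a fixed network, or another network, the computer-readable program instructions stored in the cloud for executing an embodiment of the present disclosure, thereby implementing the technical solutions disclosed in the various embodiments of the present disclosure.
Although the embodiments of the present disclosure have been described with reference to several specific embodiments, it should be understood that the embodiments of the present disclosure are not limited to the specific embodiments disclosed. The embodiments of the present disclosure are intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.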

Claims (21)

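  1. A train operation optimization method, comprising:
    obtaining a virtual scene model of a traction power supply system of trains, the virtual scene model corresponding to an operating scenario of the traction power supply system;
    establishing an operation optimization model, the operation optimization model being used to determine a corresponding train action according to the train state of each train in the virtual scene model, the train action being used to update the train state of each train; and
    using reinforcement learning to iteratively adjust model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model.
  2. The method according to claim 1, wherein using reinforcement learning to iteratively adjust the model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model, further comprises:
    iteratively performing the following steps until the operation optimization model converges:
    using the operation optimization model to determine, for each train in the virtual scene model, the train action corresponding to its previous train state;
    updating the previous train state of each train to a current train state according to the previous train state of each train and the determined train action, and providing the current train state to the operation optimization model and the virtual scene model;
    calculating a current simulated power of the virtual scene model according to the current train state of each train;
    using a set reward and punishment function to calculate a reward value according to the current simulated power and the current train state, and providing the reward value to the operation optimization model; and
    adjusting the model parameters of the operation optimization model according to the reward value.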
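  3. The method according to claim 2, wherein the current train state is used to generate a current train operating condition, the reward and punishment function is set according to the simulated power and the result of comparing the current train operating condition with a preset train operating condition, the simulated power is inversely proportional to the reward value, and the train operating condition includes any one or more of the following: speed, running time, and arrival time.
  4. The method according to claim 2, wherein calculating the current simulated power of the virtual scene model according to the current train state of each train further comprises:
    converting the network topology of the virtual scene model under the current train state into an equivalent circuit, the power sources of the equivalent circuit including at least one traction substation in the virtual scene model;
    calculating the inlet power of each traction substation of the at least one traction substation using the node voltage method; and
    summing the calculated inlet powers of the traction substations to obtain the current simulated power of the virtual scene model.
  5. The method according to claim 2, wherein the train state includes the position and speed of the train, and the train action includes the acceleration of the train.
  6. The method according to claim 1, wherein the operation optimization model includes a deep neural network.
  7. The method according to claim 1, further comprising: using the trained operation optimization model to control each train actually running in the operating scenario.
  8. The method according to claim 7, wherein using the trained operation optimization model to control each train actually running in the operating scenario further comprises:
    iteratively performing the following steps:
    receiving the current train state of each train actually running;
    using the trained operation optimization model to determine a corresponding train action for each train according to the current train state; and
    sending the determined train action to the corresponding train.
  9. The method according to claim 1, wherein obtaining the virtual scene model of the traction power supply system of rail transit further comprises:
    collecting raw data related to the virtual scene model;
    processing the raw data according to preset rules to obtain modeling data; and
    building the virtual scene model based on the modeling data.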
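  10. A train operation optimization apparatus, comprising:
    a scene model obtaining unit configured to obtain a virtual scene model of a traction power supply system of trains, the virtual scene model corresponding to an operating scenario of the traction power supply system;
    an optimization model establishing unit configured to establish an operation optimization model, the operation optimization model being used to determine a corresponding train action according to the train state of each train in the virtual scene model, the train action being used to update the train state of each train; and
    an optimization model training unit configured to iteratively adjust model parameters of the operation optimization model according to the simulated power of the virtual scene model under the updated train states, so as to train the operation optimization model.
  11. The apparatus according to claim 10, wherein the optimization model training unit further comprises:
    a train action determination unit configured to use the operation optimization model to determine, for each train in the virtual scene model, the train action corresponding to its previous train state;
    a train state update unit configured to update the previous train state of each train to a current train state according to the previous train state of each train and the determined train action;
    a simulated power calculation unit configured to calculate a current simulated power of the virtual scene model according to the current train state of each train;
    a reward value calculation unit configured to use a set reward and punishment function to calculate a reward value according to the current simulated power and the current train state and to provide the reward value to the operation optimization model; and
    a model parameter adjustment unit configured to adjust the model parameters of the operation optimization model according to the reward value.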
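  12. The apparatus according to claim 11, wherein the current train state is used to generate a current train operating condition, the reward and punishment function is set according to the simulated power and the result of comparing the current train operating condition with a preset train operating condition, the simulated power is inversely proportional to the reward value, and the train operating condition includes any one or more of the following: speed, running time, and arrival time.
  13. The apparatus according to claim 11, wherein the simulated power calculation unit is further configured to:
    convert the network topology of the virtual scene model under the current train state into an equivalent circuit, the power sources of the equivalent circuit including at least one traction substation in the virtual scene model;
    calculate the inlet power of each traction substation of the at least one traction substation using the node voltage method; and
    sum the calculated inlet powers of the traction substations to obtain the current simulated power of the virtual scene model.
  14. The apparatus according to claim 11, wherein the train state includes the position and speed of the train, and the train action includes the acceleration of the train.
  15. The apparatus according to claim 10, wherein the operation optimization model includes a deep neural network.
  16. The apparatus according to claim 10, further comprising a train operation control unit configured to: use the trained operation optimization model to control each train actually running in the operating scenario.
  17. The apparatus according to claim 16, wherein the train operation control unit is further configured to iteratively perform the following steps:
    receiving the current train state of each train actually running;
    using the trained operation optimization model to determine a corresponding train action for each train according to the current train state; and
    sending the determined train action to the corresponding train.
  18. The apparatus according to claim 10, wherein the scene model obtaining unit is further configured to:
    collect raw data related to the virtual scene model;
    process the raw data according to preset rules to obtain modeling data; and
    build the virtual scene model based on the modeling data.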
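  19. A computing device, comprising:
    a processor; and
    a memory for storing computer-executable instructions which, when executed, cause the processor to perform the method according to any one of claims 1-9.
  20. A computer-readable storage medium having computer-executable instructions stored thereon, the computer-executable instructions being used to perform the method according to any one of claims 1-9.
  21. A computer program product tangibly stored on a computer-readable storage medium and comprising computer-executable instructions which, when executed, cause at least one processor to perform the method according to any one of claims 1-9.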
PCT/CN2021/084680 2021-03-31 2021-03-31 列车运行优化方法及装置 WO2022205175A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/084680 WO2022205175A1 (zh) 2021-03-31 2021-03-31 列车运行优化方法及装置
CN202180093314.9A CN116888030A (zh) 2021-03-31 2021-03-31 列车运行优化方法及装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/084680 WO2022205175A1 (zh) 2021-03-31 2021-03-31 列车运行优化方法及装置

Publications (1)

Publication Number Publication Date
WO2022205175A1 true WO2022205175A1 (zh) 2022-10-06

Family

ID=83457708

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084680 WO2022205175A1 (zh) 2021-03-31 2021-03-31 列车运行优化方法及装置

Country Status (2)

Country Link
CN (1) CN116888030A (zh)
WO (1) WO2022205175A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115892140A (zh) * 2022-11-23 2023-04-04 北京交通大学 基于状态预测和群体智能的再生能任务预分配方法及系统
CN117682429A (zh) * 2024-02-01 2024-03-12 华芯(嘉兴)智能装备有限公司 一种物料控制系统的天车搬运指令调度方法及装置
CN117681932A (zh) * 2024-01-02 2024-03-12 北京交通大学 一种基于虚拟连挂的重载列车控制方法、系统及存储介质
CN118194710A (zh) * 2024-03-20 2024-06-14 华东交通大学 一种用于磁悬浮列车的多目标优化方法及系统

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102706274B1 (ko) * 2023-11-29 2024-09-13 주식회사 디메타 (D-meta,corp.) 강화학습 기반의 열차 제어 방법 및 이를 수행하는 시스템

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140012454A1 (en) * 2012-07-09 2014-01-09 General Electric Company Method and system for timetable optimization utilizing energy consumption factors
CN106651009A (zh) * 2016-11-23 2017-05-10 北京交通大学 城市轨道交通任意多车协作的节能优化控制方法
CN106774275A (zh) * 2017-01-16 2017-05-31 湖南中车时代通信信号有限公司 可视化列车运行监控装置的控制功能的测试系统和方法
CN110497943A (zh) * 2019-09-03 2019-11-26 西南交通大学 一种基于强化学习的城轨列车节能运行策略在线优化方法
CN110562301A (zh) * 2019-08-16 2019-12-13 北京交通大学 基于q学习的地铁列车节能驾驶曲线计算方法
CN112116156A (zh) * 2020-09-18 2020-12-22 中南大学 基于深度强化学习的混动列车的能量管理方法及系统

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115892140A (zh) * 2022-11-23 2023-04-04 北京交通大学 基于状态预测和群体智能的再生能任务预分配方法及系统
CN117681932A (zh) * 2024-01-02 2024-03-12 北京交通大学 一种基于虚拟连挂的重载列车控制方法、系统及存储介质
CN117682429A (zh) * 2024-02-01 2024-03-12 华芯(嘉兴)智能装备有限公司 一种物料控制系统的天车搬运指令调度方法及装置
CN117682429B (zh) * 2024-02-01 2024-04-05 华芯(嘉兴)智能装备有限公司 一种物料控制系统的天车搬运指令调度方法及装置
CN118194710A (zh) * 2024-03-20 2024-06-14 华东交通大学 一种用于磁悬浮列车的多目标优化方法及系统

Also Published As

Publication number Publication date
CN116888030A (zh) 2023-10-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21933840

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180093314.9

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21933840

Country of ref document: EP

Kind code of ref document: A1