CN114550456B - Urban traffic jam scheduling method based on reinforcement learning - Google Patents

Info

Publication number
CN114550456B
Authority
CN
China
Prior art keywords
intersection
traffic
traffic light
reinforcement learning
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210188427.3A
Other languages
Chinese (zh)
Other versions
CN114550456A (en)
Inventor
肖友
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Changan Automobile Co Ltd
Original Assignee
Chongqing Changan Automobile Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-02-28
Filing date
2022-02-28
Publication date
2023-07-04

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/048 Detecting movement of traffic to be counted or controlled with provision for compensation of environmental or other condition, e.g. snow, vehicle stopped at detector
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/07 Controlling traffic signals
    • G08G1/08 Controlling traffic signals according to detected number or speed of vehicles
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/09 Arrangements for giving variable traffic instructions
    • G08G1/095 Traffic lights
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Chemical & Material Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses an urban traffic jam scheduling method based on reinforcement learning. The method comprises: acquiring real-time data on vehicle quantity, vehicle queuing, and traffic light states at an urban road intersection through an image sensor and an inductance sensor; using a machine learning algorithm to form intersection road condition state data, which serve as scheduling model training data, from that real-time data combined with intersection prior knowledge (road section restrictions and lane information) obtained from image information and stored structured data; training the scheduling model with a reinforcement learning algorithm based on the intersection road condition state data and the intersection traffic safety criterion, where the reward signal is calculated from a reward function and the traffic effect of each lane of the intersection as fed back by the environment; and taking the intersection road condition state data as input and outputting a traffic light state instruction and the corresponding traffic light control signal through the trained scheduling model.

Description

Urban traffic jam scheduling method based on reinforcement learning
Technical Field
The invention relates to the field of intelligent traffic, in particular to an urban traffic jam scheduling method based on reinforcement learning.
Background
With rising living standards and continuing urbanization, automobiles have become the most important means of transportation and have entered countless households, and urban traffic congestion has grown more serious. Traffic jams reduce social productivity, cause substantial economic losses, consume fuel resources, and cause serious carbon dioxide emissions. Improving urban traffic efficiency and optimizing traffic scheduling therefore occupy an important position in modern transportation, and signalized intersections are the most common traffic efficiency bottleneck on urban road sections.
Existing traffic light control methods fall into two major categories. The first is traditional rule-based signal control, such as algorithms based on fixed durations, traffic flow, or lane occupancy; such methods perceive the scene only partially, cope poorly with traffic flow scheduling in complex scenes, and yield low vehicle passing efficiency. The second is adaptive control based on machine learning, such as traffic light scheduling based on reinforcement learning. Reinforcement learning has performed well in fields such as games and optimized scheduling, and its capacity for self-learning and for improving its decisions has drawn attention in traffic light control in recent years.
Disclosure of Invention
In view of the above deficiencies in the prior art, the technical problem to be solved by the invention is how to provide a reinforcement-learning-based urban traffic jam scheduling method that improves urban vehicle traffic efficiency and relieves traffic jams.
In order to solve the above technical problem, the invention adopts the following technical solution:
An urban traffic jam scheduling method based on reinforcement learning comprises the following steps:
(1) Acquiring real-time data on vehicle quantity, vehicle queuing, and traffic light states at an urban road intersection through an image sensor and an inductance sensor;
(2) Using a machine learning algorithm to form intersection road condition state data, which serve as scheduling model training data, from the real-time data on vehicle quantity, vehicle queuing, and traffic light states, combined with intersection prior knowledge (road section restrictions and lane information) obtained from image information and stored structured data;
(3) Adopting a reinforcement learning algorithm in which, at a given moment, the scheduling model selects a traffic light state switching action from the action space of traffic light state switching according to the intersection road condition state data and the intersection traffic safety criterion, a reward signal is calculated from the reward function and the traffic effect of each lane of the intersection as fed back by the environment, and, over multiple iterations, the reward of the actions selected by the model is maximized, thereby training the scheduling model;
(4) Taking the intersection road condition state data as input and outputting a traffic light state instruction and the corresponding traffic light control signal through the trained scheduling model.
As an optimization, in step (1), the running speed of vehicles approaching the intersection is also acquired through a lidar, and environmental state information of the intersection is also acquired through a temperature sensor and a humidity sensor.
As an optimization, in step (2), the real-time data on vehicle quantity, vehicle queuing, and traffic light states are first preprocessed by data cleaning and feature construction, and then any one of the CNN, MLP, GBDT, or SVM machine learning algorithms is used to extract structured real-time road condition features as input to the scheduling model.
As an optimization, in step (2), the intersection prior knowledge includes the road section speed limit, steering restrictions, number of lanes, lane category, and traffic light switching duration.
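As an illustration of how intersection prior knowledge and real-time measurements might be combined into one state vector for the scheduling model, the following Python sketch may help. It is only a minimal example under stated assumptions: the class names, field names, units, and the flat feature layout are hypothetical and are not specified by the patent, and only a subset of the listed prior knowledge is included.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntersectionPrior:
    """Static intersection knowledge (hypothetical field names and units)."""
    speed_limit_kmh: float
    lane_count: int
    switch_duration_s: float


@dataclass
class LaneObservation:
    """Real-time measurements for one lane (hypothetical field names)."""
    vehicle_count: int
    queue_length: int
    avg_speed_kmh: float


def build_state(prior: IntersectionPrior,
                lanes: List[LaneObservation],
                light_state: int) -> List[float]:
    """Flatten prior knowledge and real-time data into one feature vector."""
    if len(lanes) != prior.lane_count:
        raise ValueError("observation count does not match lane count")
    # Prior knowledge and the current light state come first.
    features = [prior.speed_limit_kmh, float(prior.lane_count),
                prior.switch_duration_s, float(light_state)]
    # Then three real-time features per lane, in lane order.
    for lane in lanes:
        features += [float(lane.vehicle_count), float(lane.queue_length),
                     lane.avg_speed_kmh]
    return features
```

With two lanes the vector has 4 + 2 × 3 = 10 entries; a learned feature extractor such as a CNN or MLP, as named in the text, could replace this hand-built layout.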
As an optimization, in step (3), the reinforcement learning algorithm comprises Q-learning or a temporal-difference algorithm. The input features of the reinforcement learning algorithm and the variables of the reward function are obtained from the intersection road condition state data of step (2). The input features comprise, for each lane of the intersection, the average vehicle speed, the number of vehicles, the vehicle positions, the number of lanes, the lane type, the weather state, the accident state, and the traffic efficiency, where the traffic efficiency is calculated by formula (1). The variables of the reward function comprise the number of vehicles passed, the vehicle waiting time, the difference in average vehicle speed before and after passing, and whether the traffic light is switched.
[Formula (1): equation image BDA0003524528730000021, not reproduced in the text]
where $\mathit{efficiency}$ is the overall passing efficiency of the vehicles, $v_{car\_avg}$ is the average speed of vehicles at the intersection, and $v_{lane\_speed\_limit}$ is the upper speed limit of the intersection.
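Formula (1) relates the passing efficiency to the average vehicle speed and the speed limit, but only its symbols are defined in the text here. The Python sketch below assumes, purely for illustration, that the efficiency is the ratio of the two speeds clipped to the range [0, 1]; this form and the clipping are assumptions, not the patent's confirmed formula.

```python
def passing_efficiency(v_car_avg: float, v_lane_speed_limit: float) -> float:
    """Assumed form: average speed divided by the speed limit, clipped to [0, 1].

    The ratio and the clipping are illustrative assumptions; the source text
    defines only the two input quantities.
    """
    if v_lane_speed_limit <= 0:
        raise ValueError("speed limit must be positive")
    ratio = v_car_avg / v_lane_speed_limit
    return max(0.0, min(1.0, ratio))
```

Under this assumption, vehicles averaging 30 km/h on a 60 km/h road section give an efficiency of 0.5.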
As an optimization, in step (3), the intersection traffic safety criterion is a basic constraint on safe passage through the intersection, ensuring that traffic in different lanes does not collide.
As an optimization, in step (4), the intersection road condition state data and the intersection prior knowledge are input into the scheduling model to obtain the target state of the traffic light; if the current traffic light state is consistent with the target state, no switching action is performed; otherwise, the traffic light is switched to the target state.
In summary, the beneficial effects of the invention are as follows: by combining current intersection road condition information with a reinforcement learning algorithm, the invention addresses the incomplete strategy input and inflexible control strategy of traditional scheduling algorithms, provides a solution for scheduling complex urban traffic networks, relieves traffic jams, and improves urban vehicle passing efficiency.
Drawings
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings, in which:
FIG. 1 is an overall control flow chart of intersection vehicle scheduling in the present invention;
FIG. 2 is an information flow chart of the reinforcement learning model in the present invention;
FIG. 3 is a diagram of the effective state space of a traffic light in the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, the urban traffic jam scheduling method based on reinforcement learning in this embodiment includes the following steps:
(1) Acquiring real-time data on vehicle quantity, vehicle queuing, and traffic light states at an urban road intersection through an image sensor and an inductance sensor;
(2) Using a machine learning algorithm to form intersection road condition state data, which serve as scheduling model training data, from the real-time data on vehicle quantity, vehicle queuing, and traffic light states, combined with intersection prior knowledge (road section restrictions and lane information) obtained from image information and stored structured data;
(3) Adopting a reinforcement learning algorithm in which, at a given moment, the scheduling model selects a traffic light state switching action from the action space of traffic light state switching according to the intersection road condition state data and the intersection traffic safety criterion, a reward signal is calculated from the reward function and the traffic effect of each lane of the intersection as fed back by the environment, and, over multiple iterations, the reward of the actions selected by the model is maximized, thereby training the scheduling model;
(4) Taking the intersection road condition state data as input and outputting a traffic light state instruction and the corresponding traffic light control signal through the trained scheduling model.
In this embodiment, in step (1), the running speed of vehicles approaching the intersection is also acquired through a lidar, and environmental state information of the intersection is also acquired through a temperature sensor and a humidity sensor.
In this embodiment, in step (2), the real-time data on vehicle quantity, vehicle queuing, and traffic light states are first preprocessed by data cleaning and feature construction, and then any one of the CNN, MLP, GBDT, or SVM machine learning algorithms is used to extract structured real-time road condition features as input to the scheduling model.
In this embodiment, in step (2), the intersection prior knowledge includes the road section speed limit, steering restrictions, number of lanes, lane category, and traffic light switching duration.
In this embodiment, in step (3), the reinforcement learning algorithm comprises Q-learning or a temporal-difference algorithm. The input features of the reinforcement learning algorithm and the variables of the reward function are obtained from the intersection road condition state data of step (2). The input features comprise, for each lane of the intersection, the average vehicle speed, the number of vehicles, the vehicle positions, the number of lanes, the lane type, the weather state, the accident state, and the traffic efficiency, where the traffic efficiency is calculated by formula (1). The variables of the reward function comprise the number of vehicles passed, the vehicle waiting time, the difference in average vehicle speed before and after passing, and whether the traffic light is switched.
[Formula (1): equation image BDA0003524528730000041, not reproduced in the text]
where $\mathit{efficiency}$ is the overall passing efficiency of the vehicles, $v_{car\_avg}$ is the average speed of vehicles at the intersection, and $v_{lane\_speed\_limit}$ is the upper speed limit of the intersection.
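The training step names Q-learning and lists the reward variables without giving weights or hyperparameters. The following Python sketch shows one tabular Q-learning update with a reward built from those four variables; the weights, the learning rate `alpha`, the discount `gamma`, and the function names are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

QTable = DefaultDict[Tuple[Hashable, int], float]


def reward(passed: int, wait_s: float, speed_gain: float, switched: bool) -> float:
    """Combine the four reward variables with hypothetical weights."""
    switch_penalty = 2.0 if switched else 0.0
    return 1.0 * passed - 0.1 * wait_s + 0.5 * speed_gain - switch_penalty


def q_update(q: QTable, state: Hashable, action: int, r: float,
             next_state: Hashable, n_actions: int,
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one Q-learning step and return the updated Q-value."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    td_target = r + gamma * best_next
    # Move the current estimate toward the temporal-difference target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q_table: QTable = defaultdict(float)
r = reward(passed=4, wait_s=10.0, speed_gain=2.0, switched=True)
updated = q_update(q_table, "s0", 1, r, "s1", n_actions=8)
```

In a real deployment the state would be the structured road condition features rather than a string label, and a function approximator would likely replace the table.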
In this embodiment, in step (3), the intersection traffic safety criterion is a basic constraint on safe passage through the intersection, ensuring that traffic in different lanes does not collide. In the scheduling model, the safety criterion can be combined with the traffic light state space. For example, for a standard intersection, the effective state space of the traffic light can be considered to have 8 states, as shown in FIG. 3, so that, given the road condition state input, the model selects the state with the largest reward among the 8 states as the traffic light target state.
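Selecting the target state from the effective state space can be sketched in Python as an argmax restricted to permitted states. The 8-state count comes from the text; the `safe_phases` argument, which allows the permitted set to be narrowed further, and the function name are hypothetical additions for illustration.

```python
from typing import Sequence, Set

N_PHASES = 8  # effective traffic light states for a standard intersection


def select_target_phase(q_values: Sequence[float], safe_phases: Set[int]) -> int:
    """Return the permitted phase with the largest estimated reward."""
    if len(q_values) != N_PHASES:
        raise ValueError("expected one value per effective phase")
    if not safe_phases:
        raise ValueError("at least one phase must be permitted")
    if any(p < 0 or p >= N_PHASES for p in safe_phases):
        raise ValueError("phase index out of range")
    # Sorting makes tie-breaking deterministic (lowest index wins).
    return max(sorted(safe_phases), key=lambda p: q_values[p])
```

Because unsafe combinations never enter the candidate set, the model cannot select a conflicting signal state regardless of the values it has learned.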
In this embodiment, in step (4), the intersection road condition state data and the intersection prior knowledge are input into the scheduling model to obtain the target state of the traffic light; if the current traffic light state is consistent with the target state, no switching action is performed; otherwise, the traffic light is switched to the target state.
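The switching rule of step (4) reduces to a comparison between the current and target states. A minimal Python sketch follows, in which the integer state encoding and the function name are assumptions made for illustration:

```python
from typing import Optional


def switch_command(current_state: int, target_state: int) -> Optional[int]:
    """Return the state to switch to, or None when no switch is needed."""
    if current_state == target_state:
        return None  # light already shows the target state
    return target_state
```

A controller would send a control signal only when the returned value is not `None`.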
Finally, it is noted that the above embodiments illustrate rather than limit the invention, and those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (7)

1. An urban traffic jam scheduling method based on reinforcement learning, characterized in that the method comprises the following steps:
(1) Acquiring real-time data on vehicle quantity, vehicle queuing, and traffic light states at an urban road intersection through an image sensor and an inductance sensor;
(2) Using a machine learning algorithm to form intersection road condition state data, which serve as scheduling model training data, from the real-time data on vehicle quantity, vehicle queuing, and traffic light states, combined with intersection prior knowledge (road section restrictions and lane information) obtained from image information and stored structured data;
(3) Adopting a reinforcement learning algorithm whose input features comprise, for each lane of the intersection, the average vehicle speed, the number of vehicles, the vehicle positions, the number of lanes, the lane type, the weather state, the accident state, and the traffic efficiency, in which, at a given moment, the scheduling model selects a traffic light state switching action from the action space of traffic light state switching according to the intersection road condition state data and the intersection traffic safety criterion, a reward signal is calculated from the reward function and the traffic effect of each lane of the intersection as fed back by the environment, and, over multiple iterations, the reward of the actions selected by the model is maximized, thereby training the scheduling model;
(4) Taking the intersection road condition state data as input and outputting a traffic light state instruction and the corresponding traffic light control signal through the trained scheduling model.
2. The reinforcement learning-based urban traffic jam scheduling method according to claim 1, wherein: in step (1), the running speed of vehicles approaching the intersection is also acquired through a lidar, and environmental state information of the intersection is also acquired through a temperature sensor and a humidity sensor.
3. The reinforcement learning-based urban traffic jam scheduling method according to claim 1, wherein: in step (2), the real-time data on vehicle quantity, vehicle queuing, and traffic light states are first preprocessed by data cleaning and feature construction, and then any one of the CNN, MLP, GBDT, or SVM machine learning algorithms is used to extract structured real-time road condition features as input to the scheduling model.
4. The reinforcement learning-based urban traffic jam scheduling method according to claim 1, wherein: in step (2), the intersection prior knowledge includes the road section speed limit, steering restrictions, number of lanes, lane category, and traffic light switching duration.
5. The reinforcement learning-based urban traffic jam scheduling method according to claim 1, wherein: in step (3), the reinforcement learning algorithm comprises Q-learning or a temporal-difference algorithm; the input features of the reinforcement learning algorithm and the variables of the reward function are obtained from the intersection road condition state data of step (2); the passing efficiency among the input features is calculated by formula (1); and the variables of the reward function comprise the number of vehicles passed, the vehicle waiting time, the difference in average vehicle speed before and after passing, and whether the traffic light is switched;
[Formula (1): equation image FDA0004256796270000011, not reproduced in the text]
where $\mathit{efficiency}$ is the overall passing efficiency of the vehicles, $v_{car\_avg}$ is the average speed of vehicles at the intersection, and $v_{lane\_speed\_limit}$ is the upper speed limit of the intersection.
6. The reinforcement learning-based urban traffic jam scheduling method according to claim 1, wherein: in step (3), the intersection traffic safety criterion is a basic constraint on safe passage through the intersection, ensuring that traffic in different lanes does not collide.
7. The reinforcement learning-based urban traffic jam scheduling method according to claim 1, wherein: in step (4), the intersection road condition state data and the intersection prior knowledge are input into the scheduling model to obtain the target state of the traffic light; if the current traffic light state is consistent with the target state, no switching action is performed; otherwise, the traffic light is switched to the target state.
CN202210188427.3A 2022-02-28 2022-02-28 Urban traffic jam scheduling method based on reinforcement learning Active CN114550456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210188427.3A CN114550456B (en) 2022-02-28 2022-02-28 Urban traffic jam scheduling method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN114550456A CN114550456A (en) 2022-05-27
CN114550456B true CN114550456B (en) 2023-07-04

Family

ID=81678879

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant