CN112258859A - Intersection traffic control optimization method based on time difference learning
- Publication number
- CN112258859A (application CN202011037914.7A)
- Authority
- CN
- China
- Prior art keywords
- intersection
- learning
- time
- data set
- value
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/07—Controlling traffic signals
- G08G1/08—Controlling traffic signals according to detected number or speed of vehicles
Abstract
An intersection signal control optimization method based on temporal difference (TD) learning comprises the following steps: 1) acquiring, in time order, the number of vehicles in every lane of an intersection and the signal light state of each lane; 2) initializing the relevant parameters of TD learning; 3) traversing different learning parameters to obtain different Q-value tables and selecting the optimal Q-value table; and 4) selecting the optimal signal control scheme for the next moment of the intersection signal lights by obtaining the optimal action from the Q-value table. Compared with the prior art, the method adapts to highly random intersection traffic flow through real-time signal control, and the resulting signal timing scheme can improve the traffic efficiency of the intersection relative to a traditional fixed-time control scheme.
Description
Technical Field
The invention relates to the fields of traffic control engineering and applied artificial intelligence, and in particular to temporal difference (TD) learning and a traffic signal control method.
Background
Automobiles are now found in a great many households. However, road capacity has not kept pace with growing vehicle ownership, so traffic congestion in first-tier cities is increasingly severe. In urban road networks, traffic lights control traffic at almost all intersections, and their timing is mostly set by fixed-time schemes. The urban road network, however, is a large complex system in which traffic flow changes randomly. Fixed-time signals ignore the dynamic state of the road network, so vehicles cannot pass efficiently, the travel experience of residents suffers, and a great deal of energy is wasted. In recent years, the rapid development of artificial intelligence has provided theoretical support for signal control, while advances in sensors such as radar and the spread of 5G communication have laid the hardware foundation for it.
Disclosure of Invention
In order to overcome the defects of the prior art and solve the problem that the existing traffic signal timing scheme cannot well deal with the actual situation of road network traffic flow changing in real time, the invention provides an intersection traffic control optimization method based on time difference learning.
The technical scheme adopted by the invention for solving the technical problems is as follows:
an intersection traffic control optimization method based on time difference learning comprises the following steps:
1) Using radar ranging, measure the number of vehicles within the radar range of each lane of a single intersection at each moment, and record the signal light state at the same moment. This yields a data set $\{N_k, S_k\}$ in chronological order, where $S_k$ is the signal light state of each lane of the intersection at time $k$, $N_k$ is the number of vehicles in each lane of the intersection at time $k$, $k = 1, 2, \dots, K$, and $K$ is the number of records in the data set;
2) Initialize the relevant parameters of TD learning:
2.1) Q-value table: all entries are initialized to 0, and each entry of the Q-value table corresponds to one record of the vehicle number-light state data set;
2.2) λ: 0.1, the TD learning parameter, reflecting how far ahead the training process looks;
2.3) γ: 0.99, the discount factor;
2.4) ε: 0.001, the convergence index;
2.5) $r = -V_k$: the reward value for TD learning;
3) Using the vehicle number-light state data set $\{N_k, S_k\}$, train the Q-value table by TD learning until the training index is reached;
4) Based on the trained Q-value table, the TD-learning intersection traffic control scheme is as follows: at the actual intersection, acquire the number of vehicles $N_{now}$ in each lane of the current intersection with a radar sensor, and determine from the Q-value table the signal light state $S_{next}$ that should be executed next.
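The data set and the initial values in step 2) can be pictured with a minimal Python sketch. The tuple encoding of $N_k$ and $S_k$, the `"G"`/`"R"` light codes, and the names `TDParams` and `init_q_table` are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Counts = Tuple[int, ...]        # N_k: vehicles per lane at time k (assumed encoding)
Lights = Tuple[str, ...]        # S_k: signal state per lane at time k (assumed encoding)
Record = Tuple[Counts, Lights]  # one record of the vehicle number-light state data set


@dataclass
class TDParams:
    """Initial values listed in steps 2.2) to 2.4)."""
    lam: float = 0.1     # lambda, the TD learning parameter
    gamma: float = 0.99  # discount factor
    eps: float = 0.001   # convergence index threshold


def init_q_table(dataset: List[Record]) -> Dict[Record, float]:
    """Step 2.1): one zero-valued entry per distinct (N_k, S_k) record."""
    return {record: 0.0 for record in dataset}


# Hypothetical two-lane data set with three time steps.
example = [((3, 0), ("G", "R")), ((1, 4), ("R", "G")), ((3, 0), ("G", "R"))]
q_table = init_q_table(example)
```

Keying the table by the full record keeps the one-entry-per-record correspondence described in step 2.1); repeated records share a single entry.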
Further, the process of step 3) is as follows:
3.1) Traverse the data set in chronological order and update the corresponding entry of the Q-value table using the TD update formula;
3.2) Calculate the convergence index ε as a difference value; if ε is greater than 0.001, continue executing step 3.1);
3.3) Increase the learning parameter λ by 0.1 and repeat until λ = 1, which yields 10 different Q-value tables; select the table with the largest total Q value.
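Steps 3.1) to 3.3) can be sketched as follows. Because the update and convergence formulas are not reproduced in this text, the sketch substitutes a standard tabular SARSA(λ) update with replacing eligibility traces and measures convergence as the largest absolute change of any entry between two passes. The step size `alpha`, the cap `max_sweeps`, the `rewards` list (standing in for $-V_k$) and both function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[Tuple[int, ...], Tuple[str, ...]]  # (N_k, S_k)


def train_q_table(dataset: List[Record], rewards: List[float], lam: float,
                  gamma: float = 0.99, eps: float = 0.001,
                  alpha: float = 0.1, max_sweeps: int = 1000) -> Dict[Record, float]:
    """Repeat chronological passes over the data set until the change is <= eps."""
    q = {record: 0.0 for record in dataset}
    for _ in range(max_sweeps):
        before = dict(q)
        trace = defaultdict(float)
        for k in range(len(dataset) - 1):
            cur, nxt = dataset[k], dataset[k + 1]
            # TD error of the logged transition k -> k+1 (assumed SARSA-style target).
            delta = rewards[k] + gamma * q[nxt] - q[cur]
            trace[cur] = 1.0  # replacing trace for the entry just visited
            for key in list(trace):
                q[key] += alpha * delta * trace[key]
                trace[key] *= gamma * lam
        # Assumed convergence index: largest change of any entry in this pass.
        if max(abs(q[key] - before[key]) for key in q) <= eps:
            break
    return q


def sweep_lambda(dataset: List[Record], rewards: List[float]):
    """Step 3.3): train for lambda = 0.1, 0.2, ..., 1.0 and keep the largest total Q."""
    lams = [round(0.1 * i, 1) for i in range(1, 11)]
    tables = {lam: train_q_table(dataset, rewards, lam) for lam in lams}
    best = max(lams, key=lambda lam: sum(tables[lam].values()))
    return best, tables
```

`rewards[k]` is the reward observed on the transition from record `k` to record `k + 1`, so the list holds one value fewer than the data set. The `max_sweeps` cap is only a safeguard so that the loop terminates even if the threshold is never reached.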
The technical concept of the invention is as follows: first, vehicle number information and signal light information are collected for each lane of an intersection; the resulting data set is then used for training with the TD learning algorithm, which yields the optimal signal light action for each state of the intersection, so that the method can be applied at an actual intersection.
The invention has the beneficial effects that: the invention can effectively improve the vehicle passing condition at the intersection, improve the vehicle passing efficiency, reduce the vehicle delay time and relieve the traffic jam problem.
Drawings
FIG. 1 shows a training flow diagram of a TD learning algorithm applied to intersection signal control optimization;
fig. 2 shows a road network diagram constructed based on simulation software Vissim for example analysis below.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 and 2, an intersection traffic control optimization method based on time difference learning includes the following steps:
1) Using radar ranging, measure the number of vehicles within the radar range of each lane of a single intersection at each moment, and record the signal light state at the same moment. This yields a data set $\{N_k, S_k\}$ in chronological order, where $S_k$ is the signal light state of each lane of the intersection at time $k$, $N_k$ is the number of vehicles in each lane of the intersection at time $k$, $k = 1, 2, \dots, K$, and $K$ is the number of records in the data set;
2) Initialize the relevant parameters of TD learning:
2.1) Q-value table: all entries are initialized to 0, and each entry of the Q-value table corresponds to one record of the vehicle number-light state data set;
2.2) λ: 0.1, the TD learning parameter, reflecting how far ahead the training process looks;
2.3) γ: 0.99, the discount factor;
2.4) ε: 0.001, the convergence index;
2.5) $r = -V_k$: the reward value for TD learning;
3) Using the vehicle number-light state data set $\{N_k, S_k\}$, train the Q-value table by TD learning until the training index is reached. The process is as follows:
3.1) Traverse the data set in chronological order and update the corresponding entry of the Q-value table using the TD update formula;
3.2) Calculate the convergence index ε as a difference value; if ε is greater than 0.001, continue executing step 3.1);
3.3) Increase the learning parameter λ by 0.1 and repeat until λ = 1, which yields 10 different Q-value tables; select the table with the largest total Q value;
4) Based on the trained Q-value table, the TD-learning intersection traffic control scheme is as follows: at the actual intersection, acquire the number of vehicles $N_{now}$ in each lane of the current intersection with a radar sensor, and determine from the Q-value table the signal light state $S_{next}$ that should be executed next.
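Step 4) amounts to a table lookup at run time. The sketch below assumes that the lookup is greedy, that is, it returns the light state with the largest Q value among the entries recorded for $N_{now}$. The fallback to the closest recorded vehicle count for an unseen $N_{now}$ and the function name `next_signal_state` are likewise assumptions.

```python
from typing import Dict, Tuple

Counts = Tuple[int, ...]  # vehicles per lane
Lights = Tuple[str, ...]  # signal state per lane


def next_signal_state(q: Dict[Tuple[Counts, Lights], float], n_now: Counts) -> Lights:
    """Return the light state with the largest Q value for the observed counts."""
    candidates = {s: v for (n, s), v in q.items() if n == n_now}
    if not candidates:
        # Assumed fallback: use the recorded counts closest to n_now (L1 distance).
        known = sorted({n for n, _ in q})
        nearest = min(known, key=lambda n: sum(abs(a - b) for a, b in zip(n, n_now)))
        candidates = {s: v for (n, s), v in q.items() if n == nearest}
    return max(candidates, key=candidates.get)


# Hypothetical trained table for a two-lane intersection.
q_table = {
    ((3, 0), ("G", "R")): -3.0,
    ((3, 0), ("R", "G")): -5.0,
    ((0, 4), ("R", "G")): -1.0,
    ((0, 4), ("G", "R")): -6.0,
}
s_next = next_signal_state(q_table, (3, 0))
```

With this hypothetical table, three vehicles waiting in the first lane lead to green for that lane, because that entry has the larger Q value.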
This embodiment uses vehicle numbers measured at an intersection built in the simulation software Vissim. The intersection traffic control optimization method based on time difference learning comprises the following steps:
1) By calling the Vissim interface, measure the number of vehicles within the radar range of each lane of a single intersection at each moment, and record the signal light state at the same moment. This yields a data set $\{N_k, S_k\}$ in chronological order, where $S_k$ is the signal light state of each lane of the intersection at time $k$, $N_k$ is the number of vehicles in each lane of the intersection at time $k$, $k = 1, 2, \dots, K$, and $K$ is the number of records in the data set;
2) Initialize the relevant parameters of TD learning:
2.1) Q-value table: all entries are initialized to 0, and each entry of the Q-value table corresponds to one record of the vehicle number-light state data set;
2.2) λ: 0.1, the TD learning parameter, reflecting how far ahead the training process looks;
2.3) γ: 0.99, the discount factor;
2.4) ε: 0.001, the convergence index;
2.5) $r = -V_k$: the reward value for TD learning;
3) Using the vehicle number-light state data set $\{N_k, S_k\}$, train the Q-value table by TD learning until the training index is reached. The process is as follows:
3.1) Traverse the data set in chronological order and update the corresponding entry of the Q-value table using the TD update formula;
3.2) Calculate the convergence index ε as a difference value; if ε is greater than 0.001, continue executing step 3.1);
3.3) Increase the learning parameter λ by 0.1 and repeat until λ = 1, which yields 10 different Q-value tables; select the table with the largest total Q value;
4) Based on the trained Q-value table, the TD-learning intersection traffic control scheme is as follows: at the simulated intersection, acquire the number of vehicles $N_{now}$ in each lane of the current intersection through the Vissim interface, and determine from the Q-value table the signal light state $S_{next}$ that should be executed next.
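The data collection in step 1) of the embodiment can be sketched against a simulator interface. The interface below is hypothetical: `IntersectionSim`, `step`, `vehicle_counts` and `signal_states` are placeholder names and are not calls of the Vissim API. `FakeSim` is a toy two-lane stand-in so that the loop can run without any simulator installed.

```python
from typing import List, Protocol, Tuple

Record = Tuple[Tuple[int, ...], Tuple[str, ...]]  # (N_k, S_k)


class IntersectionSim(Protocol):
    """Hypothetical wrapper around a simulator; not the Vissim API."""
    def step(self) -> None: ...
    def vehicle_counts(self) -> Tuple[int, ...]: ...
    def signal_states(self) -> Tuple[str, ...]: ...


def collect_dataset(sim: IntersectionSim, num_steps: int) -> List[Record]:
    """Advance the simulation and log one (N_k, S_k) record per time step."""
    dataset = []
    for _ in range(num_steps):
        sim.step()
        dataset.append((sim.vehicle_counts(), sim.signal_states()))
    return dataset


class FakeSim:
    """Toy model: green switches lanes every 2 steps; queues shrink on green."""
    def __init__(self) -> None:
        self.t = 0
        self.counts = [2, 2]

    def _green_lane(self) -> int:
        return (self.t // 2) % 2

    def step(self) -> None:
        self.t += 1
        for lane in range(2):
            if lane == self._green_lane():
                self.counts[lane] = max(0, self.counts[lane] - 1)
            else:
                self.counts[lane] += 1

    def vehicle_counts(self) -> Tuple[int, ...]:
        return tuple(self.counts)

    def signal_states(self) -> Tuple[str, ...]:
        return tuple("G" if lane == self._green_lane() else "R" for lane in range(2))


data = collect_dataset(FakeSim(), 4)
```

Keeping the simulator behind a small interface like this would let the same collection loop run against radar measurements or a simulation, which matches how the description swaps the radar sensor for the Vissim interface.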
Taking the Vissim simulation data as the embodiment, a traffic signal optimization scheme based on TD learning was obtained with this method. The simulation results show that the average delay time on the road network is 8% shorter than with the traditional fixed-time control method.
While the foregoing has described the preferred embodiments of the present invention, it will be apparent that the invention is not limited to the embodiments described, but can be practiced with modification without departing from the essential spirit of the invention.
Claims (2)
1. An intersection traffic control optimization method based on time difference learning is characterized by comprising the following steps:
1) using radar ranging, measuring the number of vehicles within the radar range of each lane of a single intersection at each moment, recording the signal light state at the same moment, and obtaining a data set $\{N_k, S_k\}$ in chronological order, where $S_k$ is the signal light state of each lane of the intersection at time $k$, $N_k$ is the number of vehicles in each lane of the intersection at time $k$, $k = 1, 2, \dots, K$, and $K$ is the number of records in the data set;
2) initializing the relevant parameters of TD learning:
2.1) Q-value table: all entries are initialized to 0, and each entry of the Q-value table corresponds to one record of the vehicle number-light state data set;
2.2) λ: 0.1, the TD learning parameter, reflecting how far ahead the training process looks;
2.3) γ: 0.99, the discount factor;
2.4) ε: 0.001, the convergence index;
2.5) $r = -V_k$: the reward value for TD learning;
3) using the vehicle number-light state data set $\{N_k, S_k\}$, training the Q-value table by TD learning until the training index is reached;
4) based on the trained Q-value table, the TD-learning intersection traffic control scheme is as follows: at the actual intersection, acquiring the number of vehicles $N_{now}$ in each lane of the current intersection with a radar sensor, and determining from the Q-value table the signal light state $S_{next}$ that should be executed next.
2. The intersection traffic control optimization method based on time difference learning as claimed in claim 1, wherein the process of step 3) is as follows:
3.1) traversing the data set in chronological order and updating the corresponding entry of the Q-value table using the TD update formula;
3.2) calculating the convergence index ε as a difference value, and, if ε is greater than 0.001, continuing to execute step 3.1).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011037914.7A CN112258859A (en) | 2020-09-28 | 2020-09-28 | Intersection traffic control optimization method based on time difference learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112258859A (en) | 2021-01-22 |
Family
ID=74234141
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011037914.7A Pending CN112258859A (en) | 2020-09-28 | 2020-09-28 | Intersection traffic control optimization method based on time difference learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112258859A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090327011A1 (en) * | 2008-06-30 | 2009-12-31 | Autonomous Solutions, Inc. | Vehicle dispatching method and system |
CN108510764A (en) * | 2018-04-24 | 2018-09-07 | 南京邮电大学 | A kind of adaptive phase difference coordinated control system of Multiple Intersections and method based on Q study |
CN108805348A (en) * | 2018-06-05 | 2018-11-13 | 北京京东金融科技控股有限公司 | A kind of method and apparatus of intersection signal timing control optimization |
CN109559530A (en) * | 2019-01-07 | 2019-04-02 | 大连理工大学 | A kind of multi-intersection signal lamp cooperative control method based on Q value Transfer Depth intensified learning |
CN111489568A (en) * | 2019-01-25 | 2020-08-04 | 阿里巴巴集团控股有限公司 | Traffic signal lamp regulation and control method and device and computer readable storage medium |
Non-Patent Citations (1)
Title |
---|
Sun Hao et al.: "Traffic signal control method based on deep reinforcement learning", Computer Science (《计算机科学》) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111260937A (en) | Cross traffic signal lamp control method based on reinforcement learning | |
CN111931905A (en) | Graph convolution neural network model and vehicle track prediction method using same | |
CN111739284B (en) | Traffic signal lamp intelligent timing method based on genetic algorithm optimization fuzzy control | |
CN112216127B (en) | Small road network traffic signal optimization method based on near-end strategy optimization | |
Fang et al. | FTPG: A fine-grained traffic prediction method with graph attention network using big trace data | |
CN108831168B (en) | Traffic signal lamp control method and system based on visual identification of associated intersection | |
CN113257016B (en) | Traffic signal control method and device and readable storage medium | |
CN111243271A (en) | Single-point intersection signal control method based on deep cycle Q learning | |
CN112265546A (en) | Networked automobile speed prediction method based on time-space sequence information | |
CN110718077A (en) | Signal lamp optimization timing method under action-evaluation mechanism | |
CN111081035A (en) | Traffic signal control method based on Q learning | |
Zeng et al. | Training reinforcement learning agent for traffic signal control under different traffic conditions | |
Li et al. | Deep imitation learning for traffic signal control and operations based on graph convolutional neural networks | |
CN114120670B (en) | Method and system for traffic signal control | |
Zhao et al. | Traffic signal control with deep reinforcement learning | |
CN112767680B (en) | Green wave traffic evaluation method based on trajectory data | |
CN114419884A (en) | Self-adaptive signal control method and system based on reinforcement learning and phase competition | |
CN109255948A (en) | A kind of divided lane wagon flow scale prediction method based on Kalman filtering | |
CN110530378B (en) | Vehicle positioning method based on MAP message set of V2X | |
CN112258859A (en) | Intersection traffic control optimization method based on time difference learning | |
CN111507499B (en) | Method, device and system for constructing model for prediction and testing method | |
CN115472023B (en) | Intelligent traffic light control method and device based on deep reinforcement learning | |
Luo et al. | Researches on intelligent traffic signal control based on deep reinforcement learning | |
CN112216126A (en) | Trunk traffic control optimization method based on SARSA | |
CN115331460A (en) | Large-scale traffic signal control method and device based on deep reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210122 |