CN110718077B - Signal lamp optimization timing method under action-evaluation mechanism - Google Patents
Signal lamp optimization timing method under action-evaluation mechanism
- Publication number
- CN110718077B (application CN201911066576.7A)
- Authority
- CN
- China
- Prior art keywords
- intersection
- traffic
- model
- action
- intersections
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/07—Controlling traffic signals
- G08G1/08—Controlling traffic signals according to detected number or speed of vehicles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Development Economics (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Traffic Control Systems (AREA)
Abstract
The invention relates to a signal lamp optimization timing method under an action-evaluation mechanism, which comprises the following steps: establishing a motor vehicle microscopic traffic model that simulates a single motor vehicle according to traffic trajectory data, and establishing an intersection global traffic model that simulates the road traffic condition of the intersection according to traffic flow data and traffic monitoring video; establishing a single-intersection traffic signal control model by taking the motor vehicle microscopic traffic model as the state space, the traffic signal control scheme as the action space, and the intersection global traffic model as the action evaluation index; and performing optimization control with each intersection in the judgment area as a unit, and establishing, from the single-intersection traffic signal control models, a regional traffic signal control model that optimizes the evaluation indexes of all motor vehicles in the judgment area. By extracting hidden microscopic traffic models in units of motor vehicles and intersection global traffic models in units of intersections, the method makes fuller use of multi-modal traffic data.
Description
Technical Field
The invention relates to the technical field of road traffic, and in particular to a signal lamp optimization timing method under an action-evaluation mechanism.
Background
Urban traffic plays an increasingly important role in urbanization, and traffic congestion has gradually become a major problem troubling cities around the world. Finding ways to relieve traffic congestion has become a shared goal of society.
Intersection traffic in large, intricate urban road networks has long been a focus of research on urban congestion at home and abroad. However, most urban intersections in China still use traditional fixed-time signal control. In such a system, the signal phase timers for all directions at each intersection are set to the same fixed duration, so the old signal cycle is maintained even when traffic flows differ greatly between directions; this seriously reduces intersection throughput and causes more severe congestion.
Since the traffic signal lamp was invented in the early 20th century, research on traffic signal control at home and abroad has made considerable progress. The more successful traffic signal control systems are mainly foreign ones such as SCOOT (Split, Cycle and Offset Optimisation Technique, i.e. optimization of green ratio, cycle and phase difference) and SCATS (Sydney Coordinated Adaptive Traffic System). The SCOOT system cannot adjust phases, is tedious to install, and cannot effectively handle traffic flow with large random fluctuations; the SCATS system merely selects the best control scheme from a set of preselected schemes and cannot effectively respond to real-time traffic flow.
Disclosure of Invention
Aiming at the technical problems in the prior art, the invention provides a signal lamp optimization timing method under an action-evaluation mechanism, which solves the problem that prior-art traffic signal control cannot effectively respond to real-time traffic flow.
The technical scheme for solving the technical problems is as follows: a signal lamp optimization timing method under an action-evaluation mechanism comprises the following steps: step 1, establishing a motor vehicle microscopic traffic model simulating a single motor vehicle according to traffic trajectory data, and establishing an intersection global traffic model simulating the road traffic condition of an intersection according to traffic flow data and a traffic monitoring video;
step 2, establishing a single-intersection traffic signal control model by taking the motor vehicle microscopic traffic model as the state space, the traffic signal control scheme as the action space, and the intersection global traffic model as the action evaluation index;
step 3, carrying out optimization control with each intersection in the judgment area as a unit, and establishing, from the single-intersection traffic signal control models, a regional traffic signal control model that optimizes the evaluation indexes of all motor vehicles in the judgment area.
The invention has the following beneficial effects: the action-evaluation mechanism is a self-learning process under unsupervised conditions and can process massive traffic data in real time; by processing the multi-modal traffic data in different ways, a hidden microscopic traffic model in units of motor vehicles and an intersection global traffic model in units of intersections are extracted, so that the multi-modal traffic data are used more fully; a single intersection serves as an Agent, the microscopic models of the motor vehicles passing through the intersection serve as the Agent's state space, the Agent selects an action from the intersection signal control schemes, and the effect of the action is characterized by combining the average delay index of all vehicles with the intersection global traffic model, so that intersection timing optimization is evaluated at both the vehicle level and the road-state level.
On the basis of the technical scheme, the invention can be further improved as follows.
Further, the process of establishing the motor vehicle micro traffic model in the step 1 comprises:
and extracting space-time trajectory characteristic data from the traffic trajectory data by adopting a deep learning method, wherein the space-time trajectory characteristic data comprises the position, the acceleration, the direction angle and the speed of the motor vehicle in the passing state of the intersection.
Further, the intersection global traffic model established in the step 1 also refers to traffic environment data; the traffic environment data comprises climate conditions, date attributes and emergency occurrence conditions;
the establishment process of the intersection global traffic model comprises the following steps: and extracting deep global traffic characteristics of the intersection road section by adopting a deep learning method based on the traffic flow data, the traffic monitoring video and the traffic environment data.
Further, the process of controlling and optimizing the traffic signal of the intersection in the single intersection traffic signal control model in the step 2 is as follows: and optimizing the green signal ratio of each phase under the condition that the phase sequence and the total period duration are fixed.
Further, the step 2 comprises:
step 201, constructing, by a deep learning method, a deep learning model that takes the state $s$ of the intersection controller Agent as input and the value estimates $Q(s, a)$ of all actions $a$ of the intersection controller Agent in state $s$ as output;
step 202, establishing an intersection evaluation index: $c_t = x_1 r_t + x_2 D(q) + x_3 m_t$;
wherein $r_t$ is the intersection average delay index representing the congestion condition of the current intersection, $D(q)$ is the intersection balance index, $m_t$ is the intersection global traffic model, and $x_1, x_2, x_3$ are the respective weights of the intersection average delay index, the intersection balance index and the intersection global traffic model in the intersection evaluation index, set according to different requirements;
step 203, determining the loss function for iterative training of the deep learning model as $L = c_{t+1} + (Q(s_{t+1}, a_{t+1} \mid \theta) - Q(s_t, a_t \mid \theta))$;
wherein $c_{t+1}$ is the intersection evaluation index after the intersection Agent acts, $Q(s_t, a_t \mid \theta)$ is the deep learning model output before the action, and $Q(s_{t+1}, a_{t+1} \mid \theta)$ is the deep learning model output after the action;
step 204, under the original signal control scheme of the intersection, collecting the traffic state $s$ of the current intersection, the controller action $a$ of the current intersection, the current traffic state $c$ and the traffic state $s'$ of the intersection after the action as training samples of the deep learning model, performing iterative training to minimize the loss function, and using the output $Q(s, a)$ of the deep learning model as the evaluation index of the action of the intersection controller Agent.
Further, in step 202, the intersection average delay index $r_t$ is calculated from the actual flow $q$ of the intersection, the saturated flow $q_s$, the signal control cycle length $T$ and the green signal ratio $\lambda$;
the intersection balance index $D(q)$ is the variance of the queue lengths $q_1, q_2, \ldots, q_i, \ldots, q_n$ in the respective directions of the intersection.
Further, the step 3 comprises:
step 301, dividing regions according to the similarity of intersections; the intersection similarity is obtained by calculation according to the intersection distance, the traffic flow variance and the covariance among the intersections; the intersection distance is the minimum number of connected road sections between intersections;
302, supplementing and initializing missing data of the single intersection traffic control model in any divided region based on a transfer learning method;
step 303, carrying out an optimal joint action search based on the collaborative map to obtain the joint action $a_1, a_2, \ldots, a_i, \ldots, a_n$ of the intersections in the area, and determining the regional traffic signal control model by taking the sum of the evaluation indexes $c_i$ of the intersections in the area after the action as the post-action evaluation mechanism; the vertices of the collaborative map are the intersections, its edges are the roads between the intersections, and the collaborative map is solved by iteratively passing the optimal information of the vertices between adjacent vertices of the undirected graph.
Further, the step 301 includes:
step 30101, constructing a one-dimensional traffic flow matrix from the traffic flow parameters in the 24-hour time-series traffic flow of the intersection: $[q_1, q_2, \ldots, q_n]$;
step 30102, calculating the covariance between any two intersections as the covariance of their traffic flow matrices;
step 30103, calculating the traffic flow variance of each of the two intersections as its flow variation condition;
step 30104, calculating the intersection similarity by combining the intersection distance, the traffic flow variance and the covariance between the intersections, and setting an intersection similarity threshold to realize area division based on the intersection distance.
Further, the step 302 includes:
step 30201, randomly selecting an intersection in the area as the basic intersection;
step 30202, supplementing the missing data in the training data of the single intersection traffic control model of the basic intersection based on the relevant data of other intersections;
step 30203, preliminarily training the single intersection traffic control model of the basic intersection to serve as the initial single intersection traffic control model of the other intersections in the area.
Further, in step 303, the optimal information passed from vertex $i$ to the adjacent vertex $j$ in the collaborative map includes the maximum value of the joint function between the two vertices and the optimal information of the sending vertex $i$, and the joint function $f_{ij}(a_i, a_j)$ between adjacent vertices $i$ and $j$ is calculated as: $f_{ij}(a_i, a_j) = Q(s_i, a_i \mid \theta_i) \times p_{ij} + Q(s_j, a_j \mid \theta_j)$;
wherein $Q(s_i, a_i \mid \theta_i)$ is the deep learning model output of the adjacent vertex; $p_{ij}$ is the movement probability of the motor vehicle passing through each vertex; and $Q(s_j, a_j \mid \theta_j)$ is the deep learning model output of the current vertex.
The beneficial effect of adopting the further scheme is that: considering the mutual cooperation and mutual influence relation of the intersections in the area, and constructing the topological relation of each intersection in the area road network based on the cooperation map; based on the motor vehicle position prediction in the motor vehicle movement mode, the movement probability of the vehicle passing through each side is introduced into the optimal information transmission between adjacent vertexes in the collaborative map, and more accurate and precise information transmission in the region is realized.
Drawings
FIG. 1 is a flow chart of a signal lamp optimization timing method under an action-evaluation mechanism according to the present invention;
FIG. 2 is a flow diagram of an embodiment of a method for building a traffic model based on multimodal data provided by the present invention;
FIG. 3 is a flowchart of an embodiment of a method for establishing an intersection global traffic model according to the present invention;
FIG. 4 is a flowchart of an embodiment of a method for creating a regional traffic signal control model according to the present invention;
FIG. 5 is a flowchart of an embodiment of a method for missing data supplement and initialization of a single-intersection traffic control model according to the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a flowchart of the signal lamp optimization timing method under an action-evaluation mechanism according to the present invention. As can be seen from fig. 1, the method includes:
Step 1, establishing a motor vehicle microscopic traffic model simulating a single motor vehicle according to traffic trajectory data, and establishing an intersection global traffic model simulating the road traffic condition of an intersection according to traffic flow data and a traffic monitoring video.
Step 2, establishing a single-intersection traffic signal control model by taking the motor vehicle microscopic traffic model as the state space, the traffic signal control scheme as the action space, and the intersection global traffic model as the action evaluation index.
Step 3, carrying out optimization control with each intersection in the judgment area as a unit, and establishing, from the single-intersection traffic signal control models, a regional traffic signal control model that optimizes the evaluation indexes of all motor vehicles in the judgment area.
In the signal lamp optimization timing method under the action-evaluation mechanism provided by the invention, the action-evaluation mechanism is a self-learning process under unsupervised conditions and can process massive traffic data in real time; by processing the multi-modal traffic data in different ways, a hidden microscopic traffic model in units of motor vehicles and an intersection global traffic model in units of intersections are extracted, so that the multi-modal traffic data are used more fully; a single intersection serves as an Agent, the microscopic models of the motor vehicles passing through the intersection serve as the Agent's state space, the Agent selects an action from the intersection signal control schemes, and the effect of the action is characterized by combining the average delay index of all vehicles with the intersection global traffic model, so that intersection timing optimization is evaluated at both the vehicle level and the road-state level.
Example 1
Specifically, step 1 is the process of building a traffic model based on multi-modal data. The traffic model includes a motor vehicle microscopic traffic model and an intersection global traffic model: the microscopic traffic model simulates a single motor vehicle and its dynamic variables represent microscopic attributes, while the intersection global traffic model serves as the movement environment in the single intersection traffic signal control model. Fig. 2 is a flowchart of an embodiment of the method for building a traffic model based on multi-modal data according to the present invention. As can be seen from fig. 2:
Space-time trajectory feature data are extracted from the traffic trajectory data by a deep learning method; the space-time trajectory feature data include the position, acceleration, direction angle and speed of the motor vehicle while it passes through the intersection.
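The text does not detail the deep learning extractor. Purely as a hand-computed stand-in, the sketch below derives the same four quantities (position, speed, direction angle, acceleration) from a sampled trajectory by finite differences. The function name `trajectory_features` and the input layout (timestamps plus planar x/y coordinates) are assumptions for illustration, not part of the patent.

```python
import numpy as np

def trajectory_features(t, x, y):
    """Per-sample features of one vehicle trajectory (illustrative stand-in).

    t, x, y: 1-D sequences of timestamps (s) and planar coordinates (m).
    Returns an array with columns [x, y, speed, heading, acceleration].
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    vx = np.gradient(x, t)            # velocity components by finite differences
    vy = np.gradient(y, t)
    speed = np.hypot(vx, vy)          # scalar speed (m/s)
    heading = np.arctan2(vy, vx)      # direction angle (rad), 0 = +x axis
    accel = np.gradient(speed, t)     # longitudinal acceleration (m/s^2)
    return np.column_stack([x, y, speed, heading, accel])

features = trajectory_features([0, 1, 2, 3], [0, 2, 4, 6], [0, 0, 0, 0])
```

In the method described here, a learned feature extractor would take the place of this function; the sketch only shows what the named quantities mean.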
The intersection global traffic model also refers to traffic environment data; the traffic environment data includes climate conditions, date attributes, and incident occurrence conditions. The traffic monitoring video and the traffic flow data describe the road passing condition of the intersection on the global level, and the traffic environment data supplement the background knowledge of the road passing of the intersection. Therefore, based on the traffic monitoring video, the traffic flow data and the traffic environment data, the deep global traffic characteristics of the intersection road section are extracted by adopting a deep learning method.
If the dimension of the traffic model is too large, intersection signal control under the action-evaluation mechanism suffers from dimension explosion. A multi-layer deep traffic feature extraction model is therefore constructed for the multi-modal traffic data, which reduces the dimension of the features in the traffic model and captures features that cannot be expressed manually.
Step 2, establishing a single-intersection traffic signal control model by taking the motor vehicle microscopic traffic model as the state space, the traffic signal control scheme as the action space, and the intersection global traffic model as the action evaluation index.
The single-intersection traffic signal control model optimizes the traffic signal control scheme for a single intersection under the action-evaluation mechanism. Fig. 3 is a flowchart of an embodiment of the method for establishing an intersection global traffic model provided by the invention. As can be seen from fig. 3, the traffic model obtained in step 1 serves as the state space of the intersection controller Agent in single-intersection signal control under the action-evaluation mechanism, and describes the traffic state from the perspectives of the individual motor vehicle and of the intersection. Motor vehicles are the moving objects observed by the intersection controller Agent, so all motor vehicles around the intersection are taken into account when the Agent makes decisions. The traffic state around the intersection is likewise a state factor that the Agent must consider when making decisions.
The intersection controller Agent acts as the intelligent agent under the action-evaluation mechanism: after observing the current intersection state, it selects an action from the action space and executes it. Selecting the optimal action at each moment realizes the optimization of intersection traffic signal control.
Preferably, in the embodiment of the present invention, the traffic signal of the intersection is controlled and optimized in the single-intersection traffic signal control model as follows: the green signal ratio of each phase is optimized while the phase sequence and the total cycle duration are fixed. The green signal ratio is the ratio of the green time of a phase to the cycle duration, and reasonable control of the green signal ratio can effectively increase the traffic capacity of the intersection. The green signal ratio $\lambda$ is calculated by the formula $\lambda = t_g / T$, where $t_g$ is the effective green time and $T$ is the cycle duration.
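As a small worked illustration of the green signal ratio (green time of a phase divided by the cycle duration), the sketch below computes the ratio for each phase. The function name `green_ratios` and the example timings are hypothetical.

```python
def green_ratios(effective_green, cycle):
    """Green signal ratio of each phase: effective green time / cycle duration."""
    if cycle <= 0:
        raise ValueError("cycle duration must be positive")
    if any(g < 0 or g > cycle for g in effective_green):
        raise ValueError("each green time must lie within the cycle")
    return [g / cycle for g in effective_green]

# Hypothetical three-phase plan with a 90 s cycle.
ratios = green_ratios([30, 20, 10], 90)
```

Because the phase sequence and cycle are held fixed, changing these ratios is the only degree of freedom the controller has in this scheme.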
Under the action-evaluation mechanism, the evaluation index is the result of an action selection and acts as an incentive for the intersection controller Agent's choice of the optimal action; that is, the intersection controller Agent selects the action with the highest evaluation index.
Therefore, based on the phase order and cycle duration in the original signal control scheme of the intersection, the green signal ratios $\lambda_1, \lambda_2, \ldots, \lambda_i, \ldots, \lambda_n$ of the phases are taken as the control parameters, and these parameters form the action space of the agent. Specifically, step 2 includes:
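One way to picture an action space built from per-phase green signal ratios is to discretize the green times and keep every combination that fills the fixed cycle. The sketch below does this under stated assumptions that are not in the source: lost time is ignored, green times move in steps of `step` seconds, and each phase gets at least `min_green` seconds. The function name `enumerate_actions` is hypothetical.

```python
from itertools import product

def enumerate_actions(n_phases, cycle, step=5, min_green=10):
    """List candidate actions as tuples of green signal ratios summing to 1.

    Assumptions (illustrative): no lost time, discretized green times,
    and a minimum green per phase.
    """
    levels = range(min_green, cycle + 1, step)
    actions = []
    for greens in product(levels, repeat=n_phases):
        if sum(greens) == cycle:                      # cycle duration stays fixed
            actions.append(tuple(g / cycle for g in greens))
    return actions

# Two phases, 60 s cycle, 10 s steps, 20 s minimum green.
actions = enumerate_actions(2, 60, step=10, min_green=20)
```

The number of actions grows quickly with the number of phases and a finer step, which is one reason the value estimate is output for all actions at once by a learned model.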
step 201, a deep learning model is constructed by adopting a deep learning method, wherein the state s of the intersection controller Agent is used as input, and the value estimation Q (s, a) of all actions a of the intersection controller Agent in the state s is used as output.
Step 202, establishing an intersection evaluation index: $c_t = x_1 r_t + x_2 D(q) + x_3 m_t$.
Wherein $r_t$ is the intersection average delay index representing the congestion condition of the current intersection, $D(q)$ is the intersection balance index, $m_t$ is the intersection global traffic model, and $x_1, x_2, x_3$ are the respective weights of the intersection average delay index, the intersection balance index and the intersection global traffic model in the intersection evaluation index. The weights are set according to different requirements, and the intersection evaluation index is obtained as a weighted average.
Specifically, the intersection average delay index $r_t$ is calculated from the actual flow $q$ of the intersection, the saturated flow $q_s$, the signal control cycle length $T$ and the green signal ratio $\lambda$, for example using the Webster delay model or another delay model.
The intersection balance index $D(q)$ is the variance of the queue lengths $q_1, q_2, \ldots, q_i, \ldots, q_n$ in the respective directions of the intersection, and is used to balance the allocation of resources among the directions of the intersection.
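The evaluation index can be sketched numerically as follows. The text names Webster's model only as one option, so this example uses just the uniform-delay term of Webster's formula, $T(1-\lambda)^2 / (2(1 - q/q_s))$, and assumes an undersaturated approach. Treating $m_t$ as a single scalar score, using the population variance for $D(q)$, and all function names and numbers are assumptions for illustration.

```python
import statistics

def webster_uniform_delay(q, q_s, cycle, lam):
    """Uniform-delay term of Webster's formula (s/vehicle), undersaturated case."""
    y = q / q_s                       # flow ratio
    if not 0 <= y < 1:
        raise ValueError("flow must be below saturation flow")
    return cycle * (1 - lam) ** 2 / (2 * (1 - y))

def balance_index(queue_lengths):
    """D(q): variance of the queue lengths over the approach directions."""
    return statistics.pvariance(queue_lengths)

def evaluation_index(r_t, d_q, m_t, weights):
    """c_t = x1 * r_t + x2 * D(q) + x3 * m_t."""
    x1, x2, x3 = weights
    return x1 * r_t + x2 * d_q + x3 * m_t

r_t = webster_uniform_delay(q=450, q_s=1800, cycle=80, lam=0.5)
d_q = balance_index([4, 6, 8, 6])
c_t = evaluation_index(r_t, d_q, m_t=0.5, weights=(0.5, 0.3, 0.2))
```

The three terms have different units, so in practice they would probably need normalizing before weighting; the text leaves that choice to the weights $x_1, x_2, x_3$.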
Step 203, determining the loss function for iterative training of the deep learning model as $L = c_{t+1} + (Q(s_{t+1}, a_{t+1} \mid \theta) - Q(s_t, a_t \mid \theta))$.
Wherein $c_{t+1}$ is the intersection evaluation index after the intersection Agent acts, $Q(s_t, a_t \mid \theta)$ is the deep learning model output before the action, and $Q(s_{t+1}, a_{t+1} \mid \theta)$ is the deep learning model output after the action.
Step 204, under the original signal control scheme of the intersection, collecting the traffic state $s$ of the current intersection, the controller action $a$ of the current intersection, the current traffic state $c$ and the traffic state $s'$ of the intersection after the action as training samples of the deep learning model, performing iterative training to minimize the loss function, and taking the output $Q(s, a)$ of the deep learning model as the evaluation index of the action of the intersection controller Agent.
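To make the training quantity concrete, the sketch below evaluates the loss expression from step 203 on a tiny lookup table that stands in for the deep learning model. The table, the learning rate `alpha`, and the update rule (moving the pre-action estimate so that the expression shrinks toward zero) are assumptions for illustration; the source specifies only the expression itself and that training minimizes it.

```python
import numpy as np

def td_residual(q_table, s, a, c_next, s_next, a_next):
    """L = c_{t+1} + (Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t)) for one sample."""
    return c_next + (q_table[s_next, a_next] - q_table[s, a])

def update_q(q_table, sample, alpha=0.1):
    """One illustrative update step that reduces the magnitude of L."""
    s, a, c_next, s_next, a_next = sample
    q_table[s, a] += alpha * td_residual(q_table, s, a, c_next, s_next, a_next)
    return q_table

q_table = np.zeros((2, 2))                 # 2 states x 2 actions, stand-in model
sample = (0, 0, 1.0, 1, 1)                 # (s, a, c_{t+1}, s', a')
before = td_residual(q_table, *sample)
update_q(q_table, sample)
after = td_residual(q_table, *sample)
```

With a neural network, the same residual would typically be squared and minimized by gradient descent on $\theta$; that choice is likewise an assumption rather than something stated in the text.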
Step 3, carrying out optimization control with each intersection in the judgment area as a unit, and establishing, from the single-intersection traffic signal control models, a regional traffic signal control model that optimizes the evaluation indexes of all motor vehicles in the judgment area. Fig. 4 is a flowchart of an embodiment of the method for establishing a regional traffic signal control model according to the present invention. As can be seen from fig. 4, the method includes:
step 301, dividing regions according to the similarity of intersections; the intersection similarity is obtained by calculation according to the intersection distance, the traffic flow variance and the covariance among the intersections; the intersection distance is the minimum number of connected road sections between intersections.
The intersection similarity is closely related to the distance between intersections and to their traffic flow characteristics. The closer two intersections are, the more orderly and integrated the movement of motor vehicles between them, which indicates a relatively high similarity. Likewise, the greater the traffic flow similarity between two intersections, the stronger the correlation between them. Specifically, step 301 includes:
Step 30101, constructing a one-dimensional traffic flow matrix from the traffic flow parameters in the 24-hour time-series traffic flow of the intersection: $[q_1, q_2, \ldots, q_n]$.
Step 30102, calculating the covariance between any two intersections as the covariance of their traffic flow matrices.
Step 30103, calculating the traffic flow variance of each of the two intersections as its flow variation condition.
Step 30104, calculating the intersection similarity by combining the intersection distance, the traffic flow variance and the covariance between the intersections, and setting a suitable intersection similarity threshold to realize area division based on the intersection distance.
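Steps 30101 to 30104 can be sketched as below. The text does not give the formula that combines distance, variance and covariance, so the combination used here (correlation coefficient divided by one plus the hop distance) is a hypothetical choice, as are the function names and the toy road network. The distance follows the stated definition: the minimum number of connected road sections between two intersections, found by breadth-first search.

```python
from collections import deque
import numpy as np

def hop_distance(adjacency, src, dst):
    """Minimum number of road sections between two intersections (BFS)."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")                     # not connected

def similarity(flow_a, flow_b, distance):
    """Hypothetical similarity: flow correlation damped by hop distance."""
    cov = np.cov(flow_a, flow_b)            # 2x2: variances on the diagonal
    var_a, var_b, cov_ab = cov[0, 0], cov[1, 1], cov[0, 1]
    return float(cov_ab / np.sqrt(var_a * var_b) / (1 + distance))

adjacency = {0: [1], 1: [0, 2], 2: [1]}     # toy network: 0 - 1 - 2
d02 = hop_distance(adjacency, 0, 2)
sim01 = similarity([1, 2, 3, 4], [2, 4, 6, 8], hop_distance(adjacency, 0, 1))
```

Intersections whose pairwise similarity exceeds the chosen threshold would then be grouped into one region.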
Step 302, supplementing the missing data of, and initializing, the single intersection traffic control models in any divided region based on a transfer learning method.
Traffic flows at the intersections within a regional road network are very similar, so the signal control models of intersections in the same regional road network are also highly similar. Fig. 5 is a flowchart of an embodiment of the method for supplementing missing data and initializing a single-intersection traffic control model provided by the present invention. As can be seen from fig. 5, step 302 includes:
Step 30201, randomly selecting an intersection in the area as the basic intersection.
Step 30202, supplementing the missing data in the training data of the single intersection traffic control model of the basic intersection based on the relevant data of other intersections.
Step 30203, preliminarily training the single intersection traffic control model of the basic intersection to serve as the initial single intersection traffic control model of the other intersections in the area.
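A minimal sketch of steps 30202 and 30203 follows. Filling gaps with the mean of the other intersections' values and copying the base model's parameters verbatim are assumptions chosen for simplicity; the source says only that missing data are supplemented from other intersections and that the base model initializes the others. The names `fill_missing` and `init_from_base` and the parameter dictionary are hypothetical.

```python
import copy
import numpy as np

def fill_missing(base_flow, other_flows):
    """Replace NaN entries of the base intersection's series with the
    mean of the other intersections' values at the same time step."""
    base = np.asarray(base_flow, dtype=float)
    fill = np.nanmean(np.asarray(other_flows, dtype=float), axis=0)
    return np.where(np.isnan(base), fill, base)

def init_from_base(base_params, intersection_ids):
    """Give every other intersection its own copy of the base model parameters."""
    return {i: copy.deepcopy(base_params) for i in intersection_ids}

filled = fill_missing([1.0, np.nan, 3.0], [[2.0, 4.0, 6.0], [4.0, 6.0, 8.0]])
base_params = {"weights": np.ones(3)}       # stand-in for a trained model
models = init_from_base(base_params, [1, 2])
```

Each copied model can then be fine-tuned on its own intersection's data, which is the usual point of transferring an initialization.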
Step 303, carrying out optimal joint action search based on the collaborative map to obtain joint action a of each intersection in the area1,a2,…,ai,…,anEvaluation index c of each intersection in the area for actioniAnd the sum is an evaluation mechanism determination area traffic signal control model after action. The vertex of the collaborative map is each intersection, the edge of the collaborative map is a road between the intersections, and the collaborative map is solved by iteratively transferring the optimal information of the vertex between adjacent vertexes of the undirected graph.
Specifically, the optimal information transferred from adjacent vertex i to vertex j in the collaborative map comprises the maximum value of the joint function between the two vertices and the optimal information of the sending vertex i. The joint function $f_{ij}(a_i, a_j)$ between adjacent vertices i and j is calculated as: $f_{ij}(a_i, a_j) = Q(s_i, a_i|\theta_i) \times p_{ij} + Q(s_j, a_j|\theta_j)$.
Wherein, $Q(s_i, a_i|\theta_i)$ is the output of the deep learning model of the adjacent vertex; $p_{ij}$ is the movement probability of motor vehicles passing through each vertex; and $Q(s_j, a_j|\theta_j)$ is the output of the deep learning model of the current vertex.
The motor vehicle movement mode describes the movement state of motor vehicles in the road network and enables motor vehicle movement prediction, from which the movement probability $p_{ij}$ of motor vehicles passing through each vertex in the collaborative map is obtained.
Considering the mutual cooperation and mutual influence of the intersections in the area, the topological relation of the intersections in the regional road network is constructed based on the collaborative map. Based on the motor vehicle position prediction under the motor vehicle movement mode, the movement probability of vehicles passing along each edge is introduced into the optimal information transfer between adjacent vertices in the collaborative map, so that information is transferred more accurately within the region.
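The iterative transfer of optimal information described above resembles max-plus message passing on an undirected graph, which can be sketched in Python as follows. The message update rule, the normalization step, the fallback value for a missing $p_{ij}$ and the name `max_plus` are assumptions for illustration; only the joint function $f_{ij}(a_i, a_j) = Q(s_i, a_i|\theta_i) \times p_{ij} + Q(s_j, a_j|\theta_j)$ is taken from the text. The sketch also assumes every vertex has at least one neighbour.

```python
import numpy as np

def max_plus(q_values, edges, move_prob, iterations=10):
    """Approximate joint action search on a collaborative map.

    q_values: dict vertex -> Q(s_i, a | theta_i) for each action a
    edges: list of undirected (i, j) pairs (roads between intersections)
    move_prob: dict (i, j) -> p_ij; ASSUMPTION: symmetric lookup, default 1.0
    """
    q = {v: np.asarray(vals, dtype=float) for v, vals in q_values.items()}
    neighbours = {v: set() for v in q}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)

    # msg[(i, j)][a_j]: information vertex i sends about each action of j.
    msg = {(i, j): np.zeros(len(q[j])) for i in neighbours for j in neighbours[i]}

    for _ in range(iterations):
        new_msg = {}
        for (i, j) in msg:
            p = move_prob.get((i, j), move_prob.get((j, i), 1.0))
            # Optimal information already received by i from its other neighbours.
            incoming = sum((msg[(k, i)] for k in neighbours[i] if k != j),
                           np.zeros(len(q[i])))
            # f_ij(a_i, a_j) + incoming(a_i), as an (a_i, a_j) table.
            joint = (q[i] * p + incoming)[:, None] + q[j][None, :]
            best = joint.max(axis=0)          # maximise over a_i
            new_msg[(i, j)] = best - best.mean()  # keep values bounded
        msg = new_msg

    actions = {}
    for i in q:
        total = sum((msg[(k, i)] for k in neighbours[i]), np.zeros(len(q[i])))
        actions[i] = int(np.argmax(total))
    return actions
```

Because the joint function as written is a sum of one term in $a_i$ and one term in $a_j$, the messages in this sketch reduce to each vertex's own Q values plus a constant; the loop structure is shown mainly to illustrate how information would propagate between neighbouring intersections.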
The above description merely illustrates preferred embodiments of the present invention and is not to be construed as limiting the invention; any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included within its scope of protection.
Claims (8)
1. A signal lamp optimization timing method under an action-evaluation mechanism, characterized by comprising the following steps:
step 1, establishing a motor vehicle microscopic traffic model simulating a single motor vehicle according to traffic track data, and establishing an intersection global traffic model simulating the road traffic condition of an intersection according to traffic flow data and a traffic monitoring video;
step 2, establishing a traffic signal control model of the single intersection by taking the motor vehicle micro traffic model as a state space, taking a traffic signal control scheme as an action space and taking the intersection global traffic model as an action evaluation index;
step 3, carrying out optimization control by taking each intersection in a judgment area as a unit, and establishing an area traffic signal control model for enabling the evaluation indexes of all motor vehicles in the judgment area to reach the optimal value according to each single intersection traffic signal control model;
the process of optimizing intersection traffic signal control in the single intersection traffic signal control model in step 2 is as follows: optimizing the green signal ratio of each phase under the condition that the phase sequence and the total cycle length are fixed;
the step 3 comprises the following steps:
step 301, dividing regions according to the similarity of intersections; the intersection similarity is obtained by calculation according to the intersection distance, the traffic flow variance and the covariance among the intersections; the intersection distance is the minimum number of connected road sections between intersections;
step 302, supplementing missing data and initializing the single intersection traffic control model in any divided region based on a transfer learning method;
step 303, performing an optimal joint action search based on the collaborative map to obtain the joint action $a_1, a_2, \ldots, a_i, \ldots, a_n$ of the intersections in the area, and determining the regional traffic signal control model with the sum of the evaluation indexes $c_i$ of the intersections in the area after the action as the post-action evaluation mechanism; the vertices of the collaborative map are the intersections, the edges of the collaborative map are the roads between the intersections, and the collaborative map is solved by iteratively transferring the optimal information of the vertices between adjacent vertices of the undirected graph.
2. The method of claim 1, wherein establishing the motor vehicle microscopic traffic model in step 1 comprises:
extracting space-time trajectory characteristic data from the traffic trajectory data by a deep learning method, wherein the space-time trajectory characteristic data comprises the position, the acceleration, the direction angle and the speed of the motor vehicle while passing through the intersection.
3. The method according to claim 1, wherein the intersection global traffic model in step 1 is further established with reference to traffic environment data; the traffic environment data comprises climate conditions, date attributes and the occurrence of emergencies;
the establishment process of the intersection global traffic model comprises: extracting deep global traffic characteristics of the intersection road sections by a deep learning method based on the traffic flow data, the traffic monitoring video and the traffic environment data.
4. The method of claim 1, wherein the step 2 comprises:
step 201, constructing a deep learning model which takes a state s of an intersection controller Agent as input and takes value estimation Q (s, a) of all actions a of the intersection controller Agent under the state s as output by adopting a deep learning method;
step 202, establishing an intersection evaluation index: $c_t = x_1 r_t + x_2 D(q) + x_3 m_t$;
wherein $r_t$ is the intersection average delay index representing the congestion condition of the current intersection, $D(q)$ is the intersection balance index, $m_t$ is the intersection global traffic model, and $x_1, x_2, x_3$ are the respective weights of the intersection average delay index, the intersection balance index and the intersection global traffic model in the intersection evaluation index, set according to different requirements;
step 203, determining the loss function of the iterative training of the deep learning model as $L = c_{t+1} + (Q(s_{t+1}, a_{t+1}|\theta) - Q(s_t, a_t|\theta))$;
wherein $c_{t+1}$ is the intersection evaluation index after the intersection Agent acts, $Q(s_t, a_t|\theta)$ is the output of the deep learning model before the action, and $Q(s_{t+1}, a_{t+1}|\theta)$ is the output of the deep learning model after the action;
step 204, under the original signal control scheme of the intersection, collecting the traffic state s of the current intersection, the controller action a of the current intersection, the current traffic state c and the traffic state s' of the intersection after the action as training samples of the deep learning model, performing iterative training to minimize the loss function, and using the output Q(s, a) of the deep learning model as the evaluation index of the action of the intersection controller Agent.
5. The method according to claim 4, wherein in step 202, the intersection average delay index $r_t$ is calculated from the actual flow $q$ of the intersection, the saturation flow $q_s$, the signal control cycle length $T$ and the green signal ratio $\lambda$;
the intersection balance index $D(q)$ is the variance among the queue lengths $q_1, q_2, \ldots, q_i, \ldots, q_n$ in each direction of the intersection.
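The indices named in claims 4 and 5 can be sketched in Python as follows. The claims do not give the formula for $r_t$, so the uniform delay term of Webster's delay formula is used here only as an assumed stand-in; $m_t$ is assumed to be a scalar output of the intersection global traffic model, and all function names are hypothetical. The loss is written exactly as stated in step 203.

```python
import numpy as np

def average_delay(q, q_s, T, lam):
    """ASSUMED form of r_t: Webster's uniform delay term (seconds per vehicle).

    q: actual flow, q_s: saturation flow, T: cycle length, lam: green signal ratio
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("green signal ratio must lie in (0, 1]")
    if q >= q_s:
        raise ValueError("actual flow must be below the saturation flow")
    return T * (1.0 - lam) ** 2 / (2.0 * (1.0 - q / q_s))

def balance_index(queue_lengths):
    """D(q): variance of the queue lengths q_1..q_n over the directions."""
    return float(np.var(queue_lengths))

def evaluation_index(r_t, d_q, m_t, weights=(1.0, 1.0, 1.0)):
    """c_t = x1 * r_t + x2 * D(q) + x3 * m_t."""
    x1, x2, x3 = weights
    return x1 * r_t + x2 * d_q + x3 * m_t

def loss(c_next, q_next, q_now):
    """L = c_{t+1} + (Q(s_{t+1}, a_{t+1}|theta) - Q(s_t, a_t|theta))."""
    return c_next + (q_next - q_now)
```

For example, a flow of 600 against a saturation flow of 1800 with a 60-second cycle and a green signal ratio of 0.5 gives a delay term of 11.25 seconds under the assumed formula.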
6. The method of claim 1, wherein the step 301 comprises:
step 30101, constructing a one-dimensional traffic flow matrix based on the traffic flow parameters in the time-series traffic flow of the intersection within a set time: $[q_1, q_2, \ldots, q_n]$;
step 30102, calculating the covariance of the traffic flow matrices of any two intersections as the covariance between the intersections;
step 30103, calculating the traffic flow variance of each of the two intersections as its respective flow variation;
step 30104, calculating the intersection similarity by combining the intersection distance, the traffic flow variances and the covariance between the intersections, and setting an intersection similarity threshold to realize area division based on the intersection distance.
7. The method of claim 1, wherein the step 302 comprises:
step 30201, randomly selecting an intersection as a basic intersection in the area;
step 30202, supplementing missing data in the training data of the single intersection traffic control model of the basic intersection based on the relevant data of other intersections;
step 30203, initially training the single intersection traffic control model of the basic intersection to serve as an initial single intersection traffic control model of other intersections in the area.
8. The method according to claim 1, wherein in step 303, the optimal information transferred from adjacent vertex i to vertex j in the collaborative map comprises the maximum value of the joint function between the two vertices and the optimal information of the sending vertex i, and the joint function $f_{ij}(a_i, a_j)$ between adjacent vertices i and j is calculated as: $f_{ij}(a_i, a_j) = Q(s_i, a_i|\theta_i) \times p_{ij} + Q(s_j, a_j|\theta_j)$;
wherein $Q(s_i, a_i|\theta_i)$ is the output of the deep learning model of the adjacent vertex; $p_{ij}$ is the movement probability of motor vehicles passing through each vertex; and $Q(s_j, a_j|\theta_j)$ is the output of the deep learning model of the current vertex.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911066576.7A CN110718077B (en) | 2019-11-04 | 2019-11-04 | Signal lamp optimization timing method under action-evaluation mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110718077A CN110718077A (en) | 2020-01-21 |
CN110718077B (en) | 2020-08-07 |
Family
ID=69214716
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911066576.7A Active CN110718077B (en) | 2019-11-04 | 2019-11-04 | Signal lamp optimization timing method under action-evaluation mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110718077B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111462486B (en) * | 2020-03-31 | 2022-05-31 | 连云港杰瑞电子有限公司 | Intersection similarity measurement method based on traffic signal control |
CN112133109A (en) * | 2020-08-10 | 2020-12-25 | 北方工业大学 | Method for establishing single-cross-port multidirectional space occupancy balance control model |
CN112989715B (en) * | 2021-05-20 | 2021-08-03 | 北京理工大学 | Multi-signal-lamp vehicle speed planning method for fuel cell vehicle |
CN113643528B (en) * | 2021-07-01 | 2024-06-28 | 腾讯科技(深圳)有限公司 | Signal lamp control method, model training method, system, device and storage medium |
CN114464000B (en) * | 2022-02-21 | 2023-04-25 | 上海商汤科技开发有限公司 | Intersection traffic light control method, device, equipment and storage medium |
CN116994444B (en) * | 2023-09-26 | 2023-12-12 | 南京邮电大学 | Traffic light control method, system and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106910351A (en) * | 2017-04-19 | 2017-06-30 | 大连理工大学 | Traffic signal adaptive control method based on deep reinforcement learning |
CN109360429A (en) * | 2018-12-13 | 2019-02-19 | 武汉摩尔数据技术有限公司 | Urban road traffic scheduling method and system based on simulation optimization |
CN109472984A (en) * | 2018-12-27 | 2019-03-15 | 苏州科技大学 | Signal lamp control method, system and storage medium based on deep reinforcement learning |
CN109544913A (en) * | 2018-11-07 | 2019-03-29 | 南京邮电大学 | Traffic light dynamic timing algorithm based on deep Q-network learning |
CN109559530A (en) * | 2019-01-07 | 2019-04-02 | 大连理工大学 | Multi-intersection signal lamp cooperative control method based on Q-value transfer deep reinforcement learning |
CN109670233A (en) * | 2018-12-14 | 2019-04-23 | 南京理工大学 | Multi-traffic-light automatic control method based on deep reinforcement learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10037689B2 (en) * | 2015-03-24 | 2018-07-31 | Donald Warren Taylor | Apparatus and system to manage monitored vehicular flow rate |
CN108470461B (en) * | 2018-03-27 | 2021-02-26 | 北京航空航天大学 | Traffic signal controller control effect online evaluation method and system |
Non-Patent Citations (1)
Title |
---|
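Application of reinforcement learning in urban traffic signal lamp control methods; 刘义 (Liu Yi); 《科技导报》 (Science & Technology Review); 2019-03-31; full text * |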
Also Published As
Publication number | Publication date |
---|---|
CN110718077A (en) | 2020-01-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||