CN109472984A - Traffic signal light control method, system and storage medium based on deep reinforcement learning - Google Patents

Info

Publication number
CN109472984A
Authority
CN
China
Prior art keywords
crossing
traffic
information
movement
deeply
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811616142.5A
Other languages
Chinese (zh)
Inventor
傅启明
吴少波
高振
陈建平
钟珊
陆悠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University of Science and Technology
Original Assignee
Suzhou University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University of Science and Technology
Priority to CN201811616142.5A
Publication of CN109472984A
Legal status: Pending

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/07 Controlling traffic signals

Abstract

The present invention relates to an intelligent traffic signal light control method based on deep reinforcement learning, comprising: selecting a central intersection that is surrounded by multiple peripheral intersections connected to it; obtaining the traffic information and signal light information of each intersection; establishing an intersection congestion/free-flow state model; modeling the traffic signal control problem as a Markov decision process and defining its state, action and immediate reward function; establishing a return value function model; solving for the optimal policy using the DQN deep reinforcement learning algorithm; and controlling the traffic lights of each intersection using the optimal policy. The method can adaptively and dynamically adjust the traffic light control strategy according to real-time traffic information, and it adjusts multiple intersections synchronously so as to make the fullest use of the traffic capacity of each intersection.

Description

Traffic signal light control method, system and storage medium based on deep reinforcement learning
Technical field
The present invention relates to the field of traffic signal light control, and in particular to a traffic signal light control method, system and storage medium based on deep reinforcement learning.
Background
In the early 20th century, electrically operated traffic lights first appeared in the United States. Traffic light technology has developed continuously since then; it made effective control of traffic possible and has had a positive effect on easing traffic flow, improving road capacity and reducing traffic accidents.
With rapid social development and economic growth, living conditions have improved and cars have become common in nearly every household. This has undoubtedly increased the pressure on municipal roads and made urban roads crowded, especially at crossroads. Because traditional traffic signal systems cannot adapt in time to complex and changing road conditions, they often cause congestion at crossroads and waste part of the available transport resources.
As cities keep developing and traffic volume keeps growing, the traffic light control mode currently used in Chinese cities shows several defects. First, when vehicles are released at a crossroad, arterial roads with different traffic volumes are often given the same green time, which easily causes vehicles to accumulate and leads to traffic jams. Second, an arterial road may be given its green time exactly when there are no vehicles on it, creating a control blind spot during that time. Third, when the traffic volume on an arterial road is very large, the traffic light timing cannot be changed to extend that road's passing time, so its vehicles cannot get through and accumulate.
With the continuous development of traffic light technology, today's traffic lights have greatly improved in function compared with the past. A modern traffic signal control system is a real-time networked regional traffic signal control system that integrates computer, communication and control technologies. It can provide real-time control of intersection traffic signals, regional coordinated control, central and local optimized control, real-time query and monitoring of intersection status, traffic signal fault location, real-time upload and download of timing schemes, recording and management of operation logs, and multi-user remote login control and rights management. This has greatly relieved congestion at crossroads and reduced the occurrence of traffic accidents there, bringing great convenience to people's daily travel. However, in adapting to complex and changing road conditions, traditional systems still have many shortcomings: they are insufficiently intelligent, inconvenient to use, inefficient and dependent on manual operation, and cannot truly meet the needs of practical applications.
Summary of the invention
In view of this, and to address the poor adaptive adjustment capability of traditional traffic signal light control methods, it is necessary to provide an intelligent traffic signal light control method based on deep reinforcement learning.
An intelligent traffic signal light control method based on deep reinforcement learning, comprising:
selecting a central intersection, the central intersection being surrounded by multiple peripheral intersections connected to it,
obtaining the traffic information and signal light information of each intersection,
establishing an intersection congestion/free-flow state model,
modeling the traffic signal control problem as a Markov decision process, and defining its state, action and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights of each intersection using the optimal policy.
The above method can adaptively and dynamically adjust the traffic light control strategy according to real-time traffic information. It also adjusts multiple intersections synchronously, making the fullest use of the traffic capacity of each intersection.
In one embodiment, the number of peripheral intersections is 4, and the 4 peripheral intersections are evenly distributed around the central intersection.
In one embodiment, the central intersection and the peripheral intersections are all crossroads.
In one embodiment, the traffic information includes the queue length of vehicles and the average speed of the vehicles.
In one embodiment, establishing the intersection congestion/free-flow state model specifically comprises:
The traffic signal control agent uses a deep reinforcement learning method. It constructs a convolutional neural network Q as the current value network and a network Q* of the same structure as the target value network. The constructed convolutional neural network comprises an input layer, two convolutional layer networks, one fully connected layer and an output layer. The input layer takes the pictures of the current traffic information and signal light information of each intersection. The traffic information picture and the signal light information picture pass through different convolutional layer networks, and the resulting features are fully connected with all possible actions. The output layer gives the value estimate Q(s, a) of every action in the current state s. An experience replay memory pool records all samples <s, s', a, r>, where s denotes the current traffic state, a denotes the action executed in the current traffic state, s' denotes the next state reached after executing action a in state s, and r denotes the immediate reward obtained by executing action a in the current traffic state s.
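The experience replay memory pool described above can be sketched as a fixed-capacity buffer. This is a minimal illustration only: the names Transition and ReplayMemory are hypothetical and do not appear in the patent, and dropping the oldest sample when the pool is full is an assumption, since the text does not say what happens at capacity.

```python
import random
from collections import deque, namedtuple

# One sample <s, s', a, r> as described in the text; the field names are illustrative.
Transition = namedtuple("Transition", ["s", "s_next", "a", "r"])


class ReplayMemory:
    """Fixed-capacity experience replay pool (hypothetical helper)."""

    def __init__(self, capacity):
        # Assumption: a deque with maxlen drops the oldest sample once the pool is full.
        self.buffer = deque(maxlen=capacity)

    def push(self, s, s_next, a, r):
        """Record one transition."""
        self.buffer.append(Transition(s, s_next, a, r))

    def sample(self, batch_size=1):
        """Draw batch_size transitions uniformly at random, without replacement."""
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly from the pool, rather than training on consecutive transitions, is the usual reason for keeping such a buffer in DQN-style training.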
In one embodiment, modeling the traffic signal control problem as a Markov decision process and defining its state, action and immediate reward function specifically comprises:
The state is denoted by s. The current traffic state s is represented by the features that the convolutional neural network extracts from the input traffic information picture and signal light information picture;
The action is denoted by a. Let G denote that the green light is on and R that the red light is on. The straight-ahead and left-turn signal lights of a first direction and a second direction are defined separately, the first direction and the second direction being perpendicular to each other. The action a at time t is written as [first direction straight, first direction left turn, second direction straight, second direction left turn], so the set of actions available to a single intersection at time t is:
A={ [G, R, R, R], [R, G, R, R], [R, R, G, R], [R, R, R, G] };
The immediate reward function is denoted by r. The total number of stationary vehicles at the intersections in state s is counted: each additional stationary vehicle yields a reward of -1, and each stationary vehicle removed yields a reward of +1.
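The action set and the immediate reward defined above can be written down directly. The sketch below assumes that stationary vehicles are counted per intersection before and after an action; the names ACTIONS and immediate_reward are hypothetical.

```python
# Phase order follows the text:
# [first direction straight, first direction left turn,
#  second direction straight, second direction left turn]
G, R = "G", "R"
ACTIONS = [
    [G, R, R, R],
    [R, G, R, R],
    [R, R, G, R],
    [R, R, R, G],
]


def immediate_reward(stopped_before, stopped_after):
    """Return +1 per stationary vehicle removed and -1 per stationary vehicle added.

    Both arguments are lists of stationary-vehicle counts, one entry per
    intersection (an assumed input format).
    """
    return sum(stopped_before) - sum(stopped_after)
```

For example, if the total number of stationary vehicles drops from 10 to 8 after an action, the reward is +2.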
In one embodiment, establishing the return value function model specifically comprises:
Let R(s, a) denote the return obtained by taking action a in state s. The value function Q(s, a) is the expectation of R(s, a), that is, Q(s, a) = E[R(s, a)].
In one embodiment, solving for the optimal policy using the DQN deep reinforcement learning algorithm specifically comprises:
Initializing the replay memory unit with capacity N, used to store training samples;
Initializing the current value network with randomly initialized weight parameters ω;
Initializing the target value network, whose structure and initial weights are the same as those of the current value network;
The pictures showing the road conditions are fed through the current value network to obtain Q(s, a) for any state s. After the current value network has computed the value function, an action a is selected using the ε-greedy policy. Each state transition, that is, each action taken, is counted as one time step t, and the data (s, a, r, s') obtained at each time step is stored in the replay memory unit;
Defining a loss function:
$L(\omega) = \mathbb{E}\big[\big(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\big)^{2}\big]$,
A sample (s, a, r, s') is drawn at random from the replay memory unit, and (s, a), s' and r are passed to the current value network, the target value network and L(ω) respectively. L(ω) is minimized with respect to ω using stochastic gradient descent. The update formula given in the original publication is not reproduced in this text.
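For reference, the standard stochastic gradient descent step for this loss can be written as follows. This is the textbook DQN form and an assumption about what is intended, not a formula taken from the patent. The learning rate α is a symbol not defined in the source, the target network parameters ω⁻ are held fixed during differentiation, and the constant factor 2 from the squared term is absorbed into α.

```latex
\nabla_{\omega} L(\omega)
  = -2\,\mathbb{E}\Big[\big(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\big)\,
      \nabla_{\omega} Q(s, a; \omega)\Big]

\omega \leftarrow \omega
  + \alpha\,\big(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\big)\,
      \nabla_{\omega} Q(s, a; \omega)
```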
A computer storage medium, in which at least one executable instruction is stored, the executable instruction causing a processor to perform the operations corresponding to the intelligent traffic signal light control method based on deep reinforcement learning.
An intelligent traffic signal light control system based on deep reinforcement learning, comprising:
an information acquisition unit, arranged at the central intersection and at the peripheral intersections connected to the central intersection, and used to obtain the traffic information and signal light information of each intersection;
a signal light control unit, used to control the operation of the traffic lights;
a terminal processing unit, communicatively connected to the information acquisition unit and the signal light control unit respectively, the terminal processing unit being able to perform the following operations according to the information obtained by the information acquisition unit:
establishing an intersection congestion/free-flow state model,
modeling the traffic signal control problem as a Markov decision process, and defining its state, action and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights through the signal light control unit of each intersection using the optimal policy.
Description of the drawings
Fig. 1 is a flowchart of the traffic signal light control method of an embodiment of the present invention.
Fig. 2 is a schematic diagram of the 5 crossroads used in the traffic signal light control method of an embodiment of the present invention.
Fig. 3 is a schematic diagram showing the information acquisition unit and the signal light control unit of a single intersection each connected to the terminal processing unit in the signal light control system of an embodiment of the present invention.
Fig. 4 is a schematic diagram of the DQN algorithm training process in the traffic signal light control method of an embodiment of the present invention.
Detailed description
To make the above objects, features and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings. Many specific details are set forth in the following description to provide a full understanding of the present invention. However, the present invention can be implemented in many ways other than those described here, and those skilled in the art can make similar improvements without departing from the spirit of the present invention; the present invention is therefore not limited by the specific embodiments disclosed below.
It should be noted that when an element is referred to as being "fixed to" another element, it can be directly on the other element or an intervening element may be present. When an element is considered to be "connected to" another element, it can be directly connected to the other element or an intervening element may be present at the same time.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the present invention are intended only to describe specific embodiments and are not intended to limit the present invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
As shown in Fig. 1 and Fig. 2, an embodiment of the present invention provides an intelligent traffic signal light control method based on deep reinforcement learning, comprising:
S100, selecting a central intersection, the central intersection being surrounded by multiple peripheral intersections connected to it;
S200, obtaining the traffic information and signal light information of each intersection;
S300, establishing an intersection congestion/free-flow state model;
S400, modeling the traffic signal control problem as a Markov decision process, and defining its state, action and immediate reward function;
S500, establishing a return value function model;
S600, solving for the optimal policy using the DQN deep reinforcement learning algorithm;
S700, controlling the traffic lights of each intersection using the optimal policy.
The above method can adaptively and dynamically adjust the traffic light control strategy according to real-time traffic information. It also adjusts multiple intersections synchronously, making the fullest use of the traffic capacity of each intersection.
Further, as training proceeds and until the training process ends, the policy obtained by the above method becomes progressively more effective at relieving crossroad congestion. The method adapts to the road conditions at the intersections and does not depend on a specific environment model. In particular, it maximizes traffic capacity over the whole region formed by five crossroads centered on one intersection, rather than only at a single intersection.
It can be understood that there may be multiple peripheral intersections, for example 4. The 4 peripheral intersections may also be arranged in various ways. For example, the 4 peripheral intersections are evenly distributed around the central intersection. Fig. 2 shows one embodiment, in which the terminal processing intersection is the central intersection and the 4 peripheral intersections are intersection 1, intersection 2, intersection 3 and intersection 4. The 4 intersections are located due east, due west, due south and due north of the central intersection.
Further, each intersection may take various forms. For example, as shown in Fig. 2, every intersection is a crossroad; that is, it has a first traffic direction and a second traffic direction that are perpendicular to each other. In Fig. 2, the first traffic direction is east-west and the second traffic direction is north-south.
In this embodiment, the traffic information includes the queue length of vehicles and the average speed of the vehicles. The vehicle queue length of each lane is calculated separately, so the queue lengths of the left-turn and straight-ahead lanes can both be calculated. For example, the queue length of the east left-turn lane of intersection 1 is 25 m.
In this embodiment, in step S300, establishing the intersection congestion/free-flow state model specifically comprises:
The traffic signal control agent uses a deep reinforcement learning method. It constructs a convolutional neural network Q as the current value network and a network Q* of the same structure as the target value network. The constructed convolutional neural network comprises an input layer, two convolutional layer networks, one fully connected layer and an output layer. The input layer takes the pictures of the current traffic information and signal light information of each intersection. The traffic information picture and the signal light information picture pass through different convolutional layer networks, and the resulting features are fully connected with all possible actions. The output layer gives the value estimate Q(s, a) of every action in the current state s. An experience replay memory pool records all samples <s, s', a, r>, where s denotes the current traffic state, a denotes the action executed in the current traffic state, s' denotes the next state reached after executing action a in state s, and r denotes the immediate reward obtained by executing action a in the current traffic state s.
In this embodiment, in step S400 above, modeling the traffic signal control problem as a Markov decision process and defining its state, action and immediate reward function specifically comprises:
The state is denoted by s. The current traffic state s is represented by the features that the convolutional neural network extracts from the input traffic information picture and signal light information picture. Specifically, the input traffic information picture is 227*227 pixels, and each 1*1 pixel is defined as follows: if a vehicle is present the region is set to 1, and if no vehicle is present it is set to 0. The traffic information picture is passed through three convolutional layers with kernels of 11*11, 5*5 and 3*3 respectively, giving a final output feature of dimension 8192. These features, together with the features extracted from the signal light information picture, represent the state of the current intersection. Taking two time steps as one group not only describes the traffic state at a given moment but also better reflects the dynamics of the traffic state.
The action is denoted by a. Let G denote that the green light is on and R that the red light is on. The straight-ahead and left-turn signal lights of a first direction and a second direction are defined separately, the first direction and the second direction being perpendicular to each other. The action a at time t is written as [first direction straight, first direction left turn, second direction straight, second direction left turn], so the set of actions available to a single intersection at time t is:
A={ [G, R, R, R], [R, G, R, R], [R, R, G, R], [R, R, R, G] };
Since each of the five intersections can take one of 4 actions, there are 4^5 = 1024 possible joint actions in state s.
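The joint action count can be checked by enumeration. The sketch below assumes that each of the five intersections independently picks one of the four phases; the compact string spelling of the phases and the names SINGLE_ACTIONS and joint_actions are illustrative only.

```python
from itertools import product

# Compact spelling of the four phases [G,R,R,R], [R,G,R,R], [R,R,G,R], [R,R,R,G].
SINGLE_ACTIONS = ["GRRR", "RGRR", "RRGR", "RRRG"]
NUM_INTERSECTIONS = 5  # central intersection plus four peripheral ones

# Each joint action assigns one phase to every intersection.
joint_actions = list(product(SINGLE_ACTIONS, repeat=NUM_INTERSECTIONS))
print(len(joint_actions))  # 1024
```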
The immediate reward function is denoted by r. The total number of stationary vehicles at the intersections in state s is counted: each additional stationary vehicle yields a reward of -1, and each stationary vehicle removed yields a reward of +1. The final goal is to maximize the reward, that is, to minimize the number of stationary vehicles at the five intersections.
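The binary encoding of the traffic information picture used in the state definition above can be sketched as an occupancy grid. How vehicles are detected in the camera picture is not described in the text, so the sketch takes vehicle pixel coordinates as given; the function names and the (row, col) input format are assumptions.

```python
GRID_SIZE = 227  # picture size given in the text


def occupancy_grid(vehicle_cells, size=GRID_SIZE):
    """Return a size x size grid holding 1 where a vehicle is present, else 0.

    vehicle_cells is an iterable of (row, col) pixel coordinates (assumed
    format); coordinates outside the picture are ignored.
    """
    grid = [[0] * size for _ in range(size)]
    for row, col in vehicle_cells:
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = 1
    return grid


def stack_two_steps(grid_prev, grid_curr):
    """Group two consecutive time steps into one input of shape 2 x size x size."""
    return [grid_prev, grid_curr]
```

Stacking two consecutive grids is one simple way to realize the grouping of two time steps mentioned in the text, so that the network input carries motion information as well as a snapshot.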
In this embodiment, in step S500 above, establishing the return value function model specifically comprises:
Let R(s, a) denote the return obtained by taking action a in state s. The value function Q(s, a) is the expectation of R(s, a), that is, Q(s, a) = E[R(s, a)].
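The expectation above can be approximated by averaging observed returns. The tabular sketch below only illustrates the definition Q(s, a) = E[R(s, a)]; the method described in this document estimates Q with a neural network rather than a table, and the class name MeanReturnTable is hypothetical.

```python
from collections import defaultdict


class MeanReturnTable:
    """Tabular estimate of Q(s, a) = E[R(s, a)] by incremental sample mean."""

    def __init__(self):
        self.q = defaultdict(float)  # running mean of returns per (s, a)
        self.n = defaultdict(int)    # number of returns seen per (s, a)

    def update(self, s, a, observed_return):
        """Fold one observed return into the running mean and return the new estimate."""
        key = (s, a)
        self.n[key] += 1
        # Incremental mean: Q <- Q + (R - Q) / n
        self.q[key] += (observed_return - self.q[key]) / self.n[key]
        return self.q[key]
```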
In this embodiment, in step S600 above, solving for the optimal policy using the DQN deep reinforcement learning algorithm specifically comprises:
Initializing the replay memory unit with capacity N, used to store training samples;
Initializing the current value network with randomly initialized weight parameters ω;
Initializing the target value network, whose structure and initial weights are the same as those of the current value network;
The pictures showing the road conditions are fed through the current value network to obtain Q(s, a) for any state s. After the current value network has computed the value function, an action a is selected using the ε-greedy policy. Each state transition, that is, each action taken, is counted as one time step t, and the data (s, a, r, s') obtained at each time step is stored in the replay memory unit;
Defining a loss function:
$L(\omega) = \mathbb{E}\big[\big(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\big)^{2}\big]$,
A sample (s, a, r, s') is drawn at random from the replay memory unit, and (s, a), s' and r are passed to the current value network, the target value network and L(ω) respectively. L(ω) is minimized with respect to ω using stochastic gradient descent. The update formula given in the original publication is not reproduced in this text.
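Two pieces of the procedure above, ε-greedy action selection and the target term inside the loss, can be sketched in a few lines of plain Python. The sketch assumes the Q-values of the four actions are already available as lists of numbers, as they would be at the network outputs; the function names and the parameters epsilon and gamma are illustrative, and no values for them are given in the source.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(r, next_q_values_target, gamma):
    """Target r + gamma * max_a' Q(s', a'; w-), using the target network's outputs."""
    return r + gamma * max(next_q_values_target)


def squared_td_error(q_sa, target):
    """Per-sample loss (target - Q(s, a; w))^2 from the loss function above."""
    return (target - q_sa) ** 2
```

With epsilon set to 0 the selection is purely greedy, and with epsilon set to 1 it is purely random; training schedules typically move between the two, although the source does not specify one.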
An embodiment of the present invention also provides a computer storage medium, in which at least one executable instruction is stored, the executable instruction causing a processor to perform the operations corresponding to the intelligent traffic signal light control method based on deep reinforcement learning.
An embodiment of the present invention also provides an intelligent traffic signal light control system based on deep reinforcement learning, the system comprising:
an information acquisition unit, arranged at the central intersection and at the peripheral intersections connected to the central intersection, and used to obtain the traffic information and signal light information of each intersection;
a signal light control unit, used to control the operation of the traffic lights;
a terminal processing unit, communicatively connected to the information acquisition unit and the signal light control unit respectively, the terminal processing unit being able to perform the following operations according to the information obtained by the information acquisition unit:
establishing an intersection congestion/free-flow state model,
modeling the traffic signal control problem as a Markov decision process, and defining its state, action and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights through the signal light control unit of each intersection using the optimal policy.
The above system treats multiple adjacent intersections as one group, and the intersection at the center of each group can be set as the terminal processing intersection. The traffic information picture of each crossroad and the signal light information picture of each intersection at the same moment are transmitted to the terminal processing unit in groups of two time steps. The time step in the above system can be determined by the actual degree of congestion at the intersection. The degree of congestion, that is, the traffic information, can be defined by the queue length of vehicles at the intersection and the average speed of all vehicles, and the time step can be adjusted dynamically according to the actual situation. For example: if the queue length is greater than 25 m and the average speed is less than 10 km/h, the time step can be set to 5 s; if the queue length is less than 25 m and the average speed is less than 10 km/h, the time step can be set to 5 s; if the queue length is greater than 25 m and the average speed is greater than 10 km/h, the time step can be set to 10 s; if the queue length is less than 25 m and the average speed is greater than 10 km/h, the time step can be set to 10 s.
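The example thresholds above can be written as a small rule. In the four listed cases the time step depends only on the average speed, so the sketch below branches on speed alone; the function name choose_time_step is hypothetical, and the behaviour at exactly 10 km/h, which the text does not specify, is assumed here to fall in the 10 s case.

```python
def choose_time_step(queue_length_m, avg_speed_kmh):
    """Return the time step in seconds for the example thresholds in the text.

    queue_length_m is kept as a parameter because the text defines congestion
    by both quantities, even though the listed examples do not use it.
    Assumption: an average speed of exactly 10 km/h is treated as the 10 s case.
    """
    if avg_speed_kmh < 10:
        return 5
    return 10
```

A deployment could refine the rule so that the queue length also changes the result; the source only states that the step can be adjusted dynamically according to the actual situation.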
Further, the terminal processing unit can compute the optimal policy using the corresponding algorithm. For example, the traffic information and the signal light information are passed through two convolutional neural networks respectively, a Markov decision process is constructed and the optimal policy is solved by reinforcement learning, so that the signal light control system takes the most suitable action according to the optimal policy.
In this embodiment, the traffic information includes the queue length of vehicles and the average speed of the vehicles. The vehicle queue length of each lane can be calculated separately, so the queue lengths of the left-turn and straight-ahead lanes can both be calculated. For example, the queue length of the east left-turn lane of intersection 1 is 25 m.
Further, the terminal processing unit is arranged at the central intersection. This favors large-scale use of the above system; placing the terminal processing unit at the central intersection also minimizes transmission loss during data transmission.
Specifically, taking Fig. 2 as an example, each crossroad in the system may be provided with four information acquisition units and two signal light control units. The terminal processing intersection is additionally provided with a terminal processing unit. Each information acquisition unit includes an electronic camera supporting USB transmission and a first communication module connected to the electronic camera, so that the traffic information of the intersection can be captured in real time. Each signal light control unit includes a traffic signal controller and a second communication module connected to the traffic signal controller. The second communication module and the first communication module are connected over a WiFi network. The terminal processing unit includes a data processing component and a third communication module connected to the data processing component. The third communication module and the second communication module are connected over a WiFi network. The data processing component is connected to the third communication module through a USB interface. It can be understood that the connections between the above elements are not limited to those described; existing interfaces and connection methods can also be used to connect the respective elements.
In this embodiment, the first communication module is an SKW77-WIFI module, and the electronic camera is communicatively connected to the SKW77-WIFI module through a USB interface.
In this embodiment, the second communication module is an SKW77-WIFI module. The second communication module and the first communication module are connected over a WiFi network.
In this embodiment, the traffic signal controller and the second communication module are communicatively connected through a USB interface.
In this embodiment, the third communication module is an SKW77-WIFI module. The third communication module and the second communication module are connected over a WiFi network.
In this embodiment, the data processing component is an NVIDIA Jetson TK1 developer kit. The data processing component and the third communication module are communicatively connected through a USB interface.
The specific workflow of the above system of the present invention is as follows:
The electronic cameras acquire the traffic information and signal light information of their respective intersections in real time.
The first communication module and the second communication module are connected over a WiFi network. The first communication module transmits the traffic information and signal light information to the second communication module.
The second communication module and the third communication module communicate over a WiFi network. The second communication module transmits the traffic information and signal light information to the third communication module.
The third communication module and the data processing component are communicatively connected through a USB interface. The third communication module transmits the traffic information and signal light information to the data processing component.
After receiving the traffic information and signal light information, the data processing component establishes the intersection congestion/free-flow state model according to the traffic information and signal light information of each intersection.
The traffic signal control problem is modeled as a Markov decision process, and its state, action and immediate reward function are modeled.
A return value function model is established.
The optimal policy is solved using the DQN deep reinforcement learning algorithm.
The traffic lights are controlled through the signal light control unit of each intersection using the optimal policy.
The present invention builds an environment model from the received data through the terminal processing unit, and obtains the optimal signal light control scheme according to the DQN algorithm. The traffic lights are adjusted adaptively according to the traffic volume at the current crossroads, with no need to provide learning samples manually. The DQN algorithm learns the optimal adjustment policy online, and the loss function is updated by stochastic gradient descent so that the parameters of the current value network gradually converge. Compared with existing fixed traffic light control systems, the significant advantages of the present invention are: 1) the optimal policy can be dynamically corrected for random and complex road conditions; 2) as training proceeds until the training process ends, the policy obtained by the system becomes better and better at relieving crossroad congestion; 3) the system adapts to the road conditions at the intersections and does not depend on a specific environment model; 4) traffic capacity is maximized over the region formed by five crossroads centered on one intersection, rather than only at a single intersection.
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present invention, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An intelligent traffic signal light control method based on deep reinforcement learning, characterized by comprising:
Selecting a center intersection, around which there are multiple peripheral intersections connected to the center intersection,
Obtaining the traffic information and signal-light information of each intersection,
Establishing a congested/free-flowing state model of the intersections,
Modeling the traffic signal control problem as a Markov decision process, and defining the state, action, and immediate reward function therein,
Establishing a return value function model,
Solving for the optimal policy using the DQN deep reinforcement learning algorithm,
Controlling the traffic lights of each intersection using the optimal policy.
2. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that the number of peripheral intersections is 4, and the 4 peripheral intersections are evenly distributed circumferentially around the center intersection.
3. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 2, characterized in that the center intersection and the peripheral intersections are all crossroads.
4. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that the traffic information includes the queue length of the vehicles and the average speed of each vehicle.
5. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that establishing the congested/free-flowing state model of the intersections specifically comprises:
The traffic signal control Agent uses a deep reinforcement learning method: it constructs a convolutional neural network QV as the current value network and a network Q* of identical structure as the target value network. The constructed convolutional neural network comprises an input layer, two convolutional layer networks, one fully connected layer, and an output layer. The input layer receives pictures of the current traffic information and signal-light information of each intersection; the traffic-information picture and the signal-light picture each pass through a different convolutional layer network, and the resulting features are fully connected to all possible actions. The output layer gives the value estimates Q(s, a) of all actions in the current state s. An experience replay memory pool records all samples <s, s', a, r>, where s denotes the current traffic state, a denotes the action executed in the current traffic state, s' denotes the next state reached after executing action a in state s, and r denotes the immediate reward obtained for executing action a in the current traffic state s.
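The experience replay memory pool of claim 5 can be pictured with a minimal Python sketch. The class and method names (`Transition`, `ReplayMemory`, `record`, `sample`) are hypothetical, and the fixed capacity with oldest-first eviction and the uniform sampling are assumptions for illustration, not details stated in the claim.

```python
import random
from collections import deque
from typing import Any, Deque, List, NamedTuple


class Transition(NamedTuple):
    """One sample <s, s', a, r>, in the order used by the claim."""
    s: Any        # current traffic state
    s_next: Any   # next state s' reached after executing a in s
    a: int        # index of the executed action
    r: float      # immediate reward


class ReplayMemory:
    """Fixed-capacity pool; dropping the oldest sample when full is an assumption."""

    def __init__(self, capacity: int) -> None:
        self._pool: Deque[Transition] = deque(maxlen=capacity)

    def record(self, s: Any, s_next: Any, a: int, r: float) -> None:
        self._pool.append(Transition(s, s_next, a, r))

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform sampling without replacement from the stored samples.
        return rng.sample(list(self._pool), batch_size)

    def __len__(self) -> int:
        return len(self._pool)
```

Sampling stored transitions at random, rather than training on them in arrival order, is the usual reason for keeping such a pool in DQN-style training.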
6. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that modeling the traffic signal control problem as a Markov decision process and defining the state, action, and immediate reward function therein specifically comprises:
The state is denoted by s; the current traffic state s is represented by the features that the convolutional neural network extracts from the input traffic-information picture and signal-light picture;
The action is denoted by a. Let a lit green signal light be G and a lit red signal light be R. The straight-ahead and left-turn signal lights of a first direction and a second direction are defined separately, the first and second directions being perpendicular to each other. The action a at time t is expressed as [first direction straight, first direction left turn, second direction straight, second direction left turn], so the set of actions available to a single intersection at time t is:
A = {[G, R, R, R], [R, G, R, R], [R, R, G, R], [R, R, R, G]};
The immediate reward function is denoted by r: the total number of stationary vehicles at each intersection under state s is counted,
and each additional stationary vehicle yields a reward of -1, while each stationary vehicle fewer yields a reward of +1.
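As a small illustration of the action set and reward of claim 6, the following Python sketch encodes A as four tuples and computes the immediate reward from two consecutive counts of stationary vehicles. Reading the reward as the change in the total count between consecutive time steps is one interpretation of the claim; the names `ACTIONS` and `immediate_reward` are hypothetical.

```python
from typing import List, Tuple

Action = Tuple[str, str, str, str]

# Order: [first-direction straight, first-direction left turn,
#         second-direction straight, second-direction left turn]
ACTIONS: List[Action] = [
    ("G", "R", "R", "R"),
    ("R", "G", "R", "R"),
    ("R", "R", "G", "R"),
    ("R", "R", "R", "G"),
]


def immediate_reward(stationary_before: int, stationary_after: int) -> int:
    """-1 per additional stationary vehicle, +1 per stationary vehicle fewer."""
    return stationary_before - stationary_after
```

For example, if the total number of stationary vehicles falls from 10 to 7 after an action, `immediate_reward(10, 7)` gives 3; if it rises from 5 to 8, the reward is -3.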
7. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that establishing the return value function model specifically comprises:
Let R(s, a) denote the return obtained by taking action a in state s; the value function Q(s, a) is the expectation of R(s, a), i.e. Q(s, a) = E[R(s, a)].
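The definition Q(s, a) = E[R(s, a)] can be made concrete by approximating the expectation with a sample mean of observed returns. The sketch below only illustrates the definition; it is not how the method computes Q, which the claims assign to the DQN network. The function name `estimate_q` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def estimate_q(
    samples: Iterable[Tuple[Hashable, int, float]]
) -> Dict[Tuple[Hashable, int], float]:
    """Average the observed returns for each (state, action) pair."""
    returns: Dict[Tuple[Hashable, int], List[float]] = defaultdict(list)
    for state, action, ret in samples:
        returns[(state, action)].append(ret)
    return {key: sum(vals) / len(vals) for key, vals in returns.items()}
```

With observed returns of -2 and -4 for the same state and action, the estimate for that pair is their mean, -3.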
8. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that solving for the optimal policy using the DQN deep reinforcement learning algorithm specifically comprises:
Initializing the replay memory unit with capacity N, for storing training samples;
Initializing the current value network with randomly initialized weight parameters ω;
Initializing the target value network, whose structure and initial weights are identical to those of the current value network;
Passing the picture showing the traffic conditions through the current value network, which computes Q(s, a) for any state s. Once the value function is obtained, action a is selected using the ε-greedy policy. Each state transition, i.e. each action taken, is counted as one time step t, and the data (s, a, r, s') obtained at each time step is stored in the replay memory unit;
Defining a loss function:
$L(\omega) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\right)^{2}\right]$,
Randomly sampling one (s, a, r, s') from the replay memory unit, passing (s, a), s', and r to the current value network, the target value network, and L(ω) respectively, and updating ω by applying stochastic gradient descent to L(ω), the update formula being:
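The following Python sketch illustrates the ε-greedy selection and a single stochastic gradient step on the loss L(ω) of claim 8. It is an illustration under stated assumptions, not the update formula of the claim: a linear function Q(s, a; ω) = ω[a] · φ(s) stands in for the convolutional value network, the target weights ω⁻ are held fixed during the step (the usual DQN convention), and the factor 2 from the squared error is absorbed into the learning rate. All function and parameter names are hypothetical.

```python
import random
from typing import List, Sequence

Weights = List[List[float]]  # one weight vector per action


def q_values(weights: Weights, features: Sequence[float]) -> List[float]:
    """Linear stand-in for the value network: Q(s, a; w) = w[a] . phi(s)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def epsilon_greedy(q: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda i: q[i])


def sgd_step(current: Weights, target: Weights, s: Sequence[float], a: int,
             r: float, s_next: Sequence[float], gamma: float, lr: float) -> float:
    """One gradient step on (y - Q(s, a; w))^2; returns the TD error."""
    # The target y is computed with the frozen target weights w^-.
    y = r + gamma * max(q_values(target, s_next))
    td_error = y - q_values(current, s)[a]
    # d/dw[a] of the squared error is -2 * td_error * phi(s); the 2 is folded into lr.
    current[a] = [w + lr * td_error * x for w, x in zip(current[a], s)]
    return td_error
```

Only the weights of the action actually taken change in this linear stand-in; the target weights are left untouched by the step and would be refreshed from the current network separately.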
9. A computer storage medium, characterized in that at least one executable instruction is stored in the storage medium, the executable instruction causing a processor to perform the operations corresponding to the intelligent traffic signal light control method based on deep reinforcement learning according to any one of claims 1 to 8.
10. An intelligent traffic signal light control system based on deep reinforcement learning, characterized by comprising:
An information acquisition unit, arranged at the center intersection and at the peripheral intersections connected to the center intersection, the information acquisition unit being used to obtain the traffic information and signal-light information of each intersection;
A signal-light control unit, for controlling the operation of the traffic lights;
A terminal processing unit, communicatively connected to the information acquisition unit and to the signal-light control unit respectively, the terminal processing unit being able to perform the following operations based on the information obtained by the information acquisition unit:
Establishing a congested/free-flowing state model of the intersections,
Modeling the traffic signal control problem as a Markov decision process, and defining the state, action, and immediate reward function therein,
Establishing a return value function model,
Solving for the optimal policy using the DQN deep reinforcement learning algorithm,
Controlling the traffic lights through the signal-light control unit at each intersection using the optimal policy.
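The data flow among the three units of claim 10 can be sketched as one control cycle in Python. The callables `acquire`, `policy`, and `apply_signals` are hypothetical stand-ins for the information acquisition unit, the terminal processing unit, and the signal-light control units; the `Observation` type and the rule of one action per intersection are assumptions for illustration.

```python
from typing import Callable, Dict, List, Sequence

Observation = Dict[str, float]  # hypothetical per-intersection measurements


def control_step(
    acquire: Callable[[], List[Observation]],
    policy: Callable[[List[Observation]], Sequence[int]],
    apply_signals: Callable[[Sequence[int]], None],
) -> Sequence[int]:
    """One cycle: acquisition unit -> terminal processing unit -> control units."""
    observations = acquire()           # traffic and signal-light information
    actions = policy(observations)     # action index chosen for each intersection
    if len(actions) != len(observations):
        raise ValueError("policy must return one action per intersection")
    apply_signals(actions)             # hand the decisions to the control units
    return actions
```

Passing the units in as callables keeps the sketch independent of any particular sensor, network, or signal hardware.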
CN201811616142.5A 2018-12-27 2018-12-27 Signalized control method, system and storage medium based on deeply study Pending CN109472984A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811616142.5A CN109472984A (en) 2018-12-27 2018-12-27 Signalized control method, system and storage medium based on deeply study

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811616142.5A CN109472984A (en) 2018-12-27 2018-12-27 Signalized control method, system and storage medium based on deeply study

Publications (1)

Publication Number Publication Date
CN109472984A true CN109472984A (en) 2019-03-15

Family

ID=65677259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811616142.5A Pending CN109472984A (en) 2018-12-27 2018-12-27 Signalized control method, system and storage medium based on deeply study

Country Status (1)

Country Link
CN (1) CN109472984A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150100530A1 (en) * 2013-10-08 2015-04-09 Google Inc. Methods and apparatus for reinforcement learning
WO2017004626A1 (en) * 2015-07-01 2017-01-05 The Board Of Trustees Of The Leland Stanford Junior University Systems and methods for providing reinforcement learning in a deep learning system
CN106842925A (en) * 2017-01-20 2017-06-13 清华大学 A kind of locomotive smart steering method and system based on deeply study
CN108831168A (en) * 2018-06-01 2018-11-16 江苏数翰信息科技有限公司 A kind of method for controlling traffic signal lights and system based on association crossing visual identity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JUNTAO GAO: "Adaptive Traffic Signal Control: Deep Reinforcement Learning Algorithm with Experience Replay and Target Network", 《ARXIV》, pages 1 - 10 *

Cited By (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110047278B (en) * 2019-03-30 2021-06-08 北京交通大学 Adaptive traffic signal control system and method based on deep reinforcement learning
CN110047278A (en) * 2019-03-30 2019-07-23 北京交通大学 A kind of self-adapting traffic signal control system and method based on deeply study
CN110060475A (en) * 2019-04-17 2019-07-26 清华大学 A kind of multi-intersection signal lamp cooperative control method based on deeply study
CN110021168A (en) * 2019-04-18 2019-07-16 上海科技大学 The stepped strategy method of real-time intelligent traffic management is realized under a kind of car networking
CN110021168B (en) * 2019-04-18 2021-08-27 上海科技大学 Grading decision method for realizing real-time intelligent traffic management under Internet of vehicles
CN110136456A (en) * 2019-05-12 2019-08-16 苏州科技大学 Traffic lights anti-clogging control method and system based on deeply study
CN110164151A (en) * 2019-06-21 2019-08-23 西安电子科技大学 Traffic lamp control method based on distributed deep-cycle Q network
CN110363295A (en) * 2019-06-28 2019-10-22 电子科技大学 A kind of intelligent vehicle multilane lane-change method based on DQN
CN110428615A (en) * 2019-07-12 2019-11-08 中国科学院自动化研究所 Learn isolated intersection traffic signal control method, system, device based on deeply
CN110428615B (en) * 2019-07-12 2021-06-22 中国科学院自动化研究所 Single intersection traffic signal control method, system and device based on deep reinforcement learning
CN110299008A (en) * 2019-07-19 2019-10-01 浙江工业大学 A kind of traffic flow multistep forecasting method based on intensified learning
CN110299008B (en) * 2019-07-19 2020-11-13 浙江工业大学 Traffic flow multi-step prediction method based on reinforcement learning
CN110491146B (en) * 2019-08-21 2020-08-21 浙江工业大学 Deep learning-based traffic signal control scheme real-time recommendation method
CN110491146A (en) * 2019-08-21 2019-11-22 浙江工业大学 A kind of traffic signal control scheme real-time recommendation method based on deep learning
CN110444028A (en) * 2019-09-06 2019-11-12 科大讯飞股份有限公司 Multiple Intersections Signalized control method, device and equipment
CN110531681A (en) * 2019-09-17 2019-12-03 山东建筑大学 Room lighting data acquisition control system and method based on deeply study
CN110503839A (en) * 2019-10-21 2019-11-26 江苏广宇科技产业发展有限公司 Method and system based on single device coordinated control Multiple Intersections traffic signals
CN113287156B (en) * 2019-10-28 2023-08-18 乐路股份有限公司 Signal control device and signal control method based on reinforcement learning
CN113287156A (en) * 2019-10-28 2021-08-20 乐人株式会社 Signal control device and signal control method based on reinforcement learning
US11823573B2 (en) 2019-10-28 2023-11-21 Laon Road Inc. Signal control apparatus and signal control method based on reinforcement learning
CN110718077B (en) * 2019-11-04 2020-08-07 武汉理工大学 Signal lamp optimization timing method under action-evaluation mechanism
CN110718077A (en) * 2019-11-04 2020-01-21 武汉理工大学 Signal lamp optimization timing method under action-evaluation mechanism
CN111028504A (en) * 2019-11-27 2020-04-17 天津易华录信息技术有限公司 Urban expressway intelligent traffic control method and system
CN110930734A (en) * 2019-11-30 2020-03-27 天津大学 Intelligent idle traffic indicator lamp control method based on reinforcement learning
CN111081035A (en) * 2019-12-17 2020-04-28 扬州市鑫通智能信息技术有限公司 Traffic signal control method based on Q learning
CN110969872A (en) * 2019-12-18 2020-04-07 上海天壤智能科技有限公司 Traffic signal control method and system based on reinforcement learning and graph attention network
CN110936954A (en) * 2020-01-02 2020-03-31 南京航空航天大学 Intelligent vehicle prediction decision fusion method considering vehicle bidirectional interaction
CN110936954B (en) * 2020-01-02 2020-12-01 南京航空航天大学 Intelligent vehicle prediction decision fusion method considering vehicle bidirectional interaction
CN111243299B (en) * 2020-01-20 2020-12-15 浙江工业大学 Single cross port signal control method based on 3 DQN-PSER algorithm
CN111243299A (en) * 2020-01-20 2020-06-05 浙江工业大学 Single cross port signal control method based on 3 DQN-PSER algorithm
CN112365724A (en) * 2020-04-13 2021-02-12 北方工业大学 Continuous intersection signal cooperative control method based on deep reinforcement learning
CN111564048A (en) * 2020-04-28 2020-08-21 郑州大学 Traffic signal lamp control method and device, electronic equipment and storage medium
CN111696348A (en) * 2020-06-05 2020-09-22 南京云创大数据科技股份有限公司 Multifunctional intelligent signal control system and method
CN111696370A (en) * 2020-06-16 2020-09-22 西安电子科技大学 Traffic light control method based on heuristic deep Q network
CN112216128A (en) * 2020-09-28 2021-01-12 航天科工广信智能技术有限公司 Large-scale road network traffic signal control method based on deep Q learning neural network
CN112380761B (en) * 2020-10-20 2024-01-26 珠海米枣智能科技有限公司 Building environment controller and control method based on reinforcement learning
CN112380761A (en) * 2020-10-20 2021-02-19 珠海米枣智能科技有限公司 Building environment controller based on reinforcement learning and control method
WO2022121510A1 (en) * 2020-12-11 2022-06-16 多伦科技股份有限公司 Stochastic policy gradient-based traffic signal control method and system, and electronic device
CN112614343A (en) * 2020-12-11 2021-04-06 多伦科技股份有限公司 Traffic signal control method and system based on random strategy gradient and electronic equipment
CN112863206B (en) * 2021-01-07 2022-08-09 北京大学 Traffic signal lamp control method and system based on reinforcement learning
CN112863206A (en) * 2021-01-07 2021-05-28 北京大学 Traffic signal lamp control method and system based on reinforcement learning
CN113380054A (en) * 2021-06-09 2021-09-10 湖南大学 Traffic signal lamp control method and system based on reinforcement learning
CN113487887A (en) * 2021-07-23 2021-10-08 京东城市(北京)数字科技有限公司 Signal lamp control method and device, electronic equipment and storage medium
CN113625561A (en) * 2021-07-29 2021-11-09 浙江大学 Domain coordination multi-agent system cooperation control method based on reinforcement learning
CN113625561B (en) * 2021-07-29 2023-09-26 浙江大学 Domain coordination multi-agent system cooperative control method based on reinforcement learning
CN113628458A (en) * 2021-08-10 2021-11-09 四川易方智慧科技有限公司 Traffic signal lamp optimization method based on group intelligent reinforcement learning
CN113763723A (en) * 2021-09-06 2021-12-07 武汉理工大学 Traffic signal lamp control system and method based on reinforcement learning and dynamic timing
CN113763723B (en) * 2021-09-06 2023-01-17 武汉理工大学 Traffic signal lamp control system and method based on reinforcement learning and dynamic timing
CN114120670A (en) * 2021-11-25 2022-03-01 支付宝(杭州)信息技术有限公司 Method and system for traffic signal control
CN114120670B (en) * 2021-11-25 2024-03-26 支付宝(杭州)信息技术有限公司 Method and system for traffic signal control
CN114038218A (en) * 2021-12-28 2022-02-11 江苏泰坦智慧科技有限公司 Chained feedback multi-intersection signal lamp decision system and method based on road condition information
CN117135655A (en) * 2023-08-15 2023-11-28 华中科技大学 Intelligent OFDMA resource scheduling method, system and terminal of delay-sensitive WiFi

Similar Documents

Publication Publication Date Title
CN109472984A (en) Signalized control method, system and storage medium based on deeply study
CN113409579B (en) Intelligent city traffic control system based on AI internet of things technology
CN110136456A (en) Traffic lights anti-clogging control method and system based on deeply study
CN108831168B (en) Traffic signal lamp control method and system based on visual identification of associated intersection
CN110390246A (en) A kind of video analysis method in side cloud environment
CN108804983A (en) Traffic signal light condition recognition methods, device, vehicle-mounted control terminal and motor vehicle
CN105654744B (en) A kind of improvement traffic signal control method based on Q study
CN106846837A (en) A kind of traffic light intelligent control system, traffic lights intelligent control method and device
CN107507430A (en) A kind of urban road crossing traffic control method and system
CN109087517A (en) Intelligent signal lamp control method and system based on big data
CN111710177B (en) Intelligent traffic signal lamp networking cooperative optimization control system and control method
CN110365787A (en) A kind of application container simultaneously optimizes layout method based on the edge calculations of micro services frame
CN105872075B (en) A method of internet of things equipment is mapped to smart city resource model
CN112419762A (en) Internet of things platform-based reinforcement learning intelligent traffic signal lamp control method and system
CN108156388A (en) Power consumption control method and photographic device
CN106448171A (en) Ponding road prediction method and device
CN108549952A (en) Optimization method and device for double-layer path of vehicle-mounted unmanned aerial vehicle
CN109003460A (en) Traffic lights Optimization Scheduling and system
CN112258865B (en) Intelligent red and green signal lamp control system based on Internet of vehicles V2X
CN108133604A (en) A kind of traffic lights dynamic realtime dispatching method based on traffic characteristic
CN111785043A (en) Intersection control method for intelligent internet connection
CN105281957B (en) A kind of method and server of the access device in Internet of Things
CN108205622A (en) The authority control method and device of a kind of application program for mobile terminal
CN112785162A (en) High-precision map crowdsourcing data quality assessment method and system based on intelligent vehicle semantics
CN109637123A (en) A kind of complexity traffic environment downlink people living things feature recognition and traffic control system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination