CN109472984A - Traffic signal light control method, system and storage medium based on deep reinforcement learning
- Publication number
- CN109472984A (application CN201811616142.5A)
- Authority
- CN
- China
- Prior art keywords
- crossing
- traffic
- information
- movement
- deeply
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/07—Controlling traffic signals
Abstract
The invention relates to an intelligent traffic signal light control method based on deep reinforcement learning, comprising: selecting a center intersection around which there are multiple peripheral intersections connected to it; obtaining the traffic information and signal light information of each intersection; establishing a congested/unobstructed state model of the intersections; modeling the traffic signal control problem as a Markov decision process and defining its states, actions and immediate reward function; establishing a return value function model; solving for the optimal policy using the DQN deep reinforcement learning algorithm; and controlling the traffic lights at each intersection using the optimal policy. The method can adaptively and dynamically adjust the traffic light control strategy according to real-time traffic information, and it adjusts multiple intersections synchronously so as to make the fullest use of the traffic capacity of each intersection.
Description
Technical field
The invention relates to the field of traffic signal light control, and in particular to a traffic signal light control method, system and storage medium based on deep reinforcement learning.
Background art
Electrically operated traffic lights first appeared in the United States in the early 20th century. Traffic light technology has developed continuously since then. It made effective traffic control possible and has had a positive effect on easing traffic flow, improving road capacity and reducing traffic accidents.
With rapid social development and economic growth, living conditions have improved and cars have become common in almost every household. This has undoubtedly increased the pressure on urban roads and made them crowded, which is especially evident at four-way intersections. Because traditional traffic light systems cannot adapt in time to complex and changing road conditions, they often cause congestion at intersections and waste part of the available transport resources.
As cities develop and traffic volumes keep growing, the traffic light control methods currently used in Chinese cities show several defects. First, when vehicles are released at an intersection, arterial roads with different traffic volumes are often given the same green time, which easily causes vehicles to accumulate and leads to traffic jams. Second, an arterial road may be given its green time when there are no vehicles on it, creating a control blind spot during that period. Third, when the traffic volume on an arterial road is very large, the signal timing cannot be changed to extend that road's green time, so its vehicles cannot pass and accumulate.
With the continuous development of traffic light technology, today's traffic lights are far more capable than their predecessors. A modern traffic light control system is a real-time networked regional traffic signal control system that integrates computer, communication and control technologies. It can provide real-time control of intersection traffic signals, regional coordinated control, central and local optimized control, real-time query and monitoring of intersection status, signal lamp fault location, real-time uploading and downloading of timing plans, recording and management of operation logs, and multi-user remote login control and permission management. This has greatly alleviated congestion at intersections, reduced traffic accidents there, and made daily travel much more convenient. However, in adapting to complex and changing road conditions, traditional systems still suffer from many deficiencies, such as insufficient intelligence, inconvenient use, low efficiency and dependence on manual operation, and they cannot truly meet the needs of practical applications.
Summary of the invention
In view of this, and to address the poor adaptive adjustment capability of traditional traffic signal light control methods, it is necessary to provide an intelligent traffic signal light control method based on deep reinforcement learning.
An intelligent traffic signal light control method based on deep reinforcement learning, comprising:
selecting a center intersection, around which there are multiple peripheral intersections connected to it,
obtaining the traffic information and signal light information of each intersection,
establishing a congested/unobstructed state model of the intersections,
modeling the traffic signal control problem as a Markov decision process, and defining its states, actions and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights at each intersection using the optimal policy.
The method can adaptively and dynamically adjust the traffic light control strategy according to real-time traffic information, and it adjusts multiple intersections synchronously so as to make the fullest use of the traffic capacity of each intersection.
In one embodiment, the number of peripheral intersections is 4, and the 4 peripheral intersections are evenly distributed around the center intersection.
In one embodiment, the center intersection and the peripheral intersections are all four-way intersections.
In one embodiment, the traffic information includes the vehicle queue length and the average speed of the vehicles.
In one embodiment, establishing the congested/unobstructed state model of the intersections specifically comprises:
The traffic signal control agent uses a deep reinforcement learning method. It constructs a convolutional neural network QV as the current value network and a network Q* of identical structure as the target value network. The constructed convolutional neural network comprises an input layer, two convolutional-layer networks, a fully connected layer and an output layer. The input layer receives pictures of the current traffic information and signal light information of each intersection. The traffic information picture and the signal light information picture are passed through separate convolutional-layer networks, and the resulting features are fully connected with all possible actions. The output layer gives the value estimate Q(s, a) of every action in the current state s. An experience replay memory pool records all samples <s, s', a, r>, where s is the current traffic state, a is the action executed in the current traffic state, s' is the next state reached after executing action a in state s, and r is the immediate reward obtained by executing action a in the current traffic state s.
In one embodiment, modeling the traffic signal control problem as a Markov decision process and defining its states, actions and immediate reward function specifically comprises:
The state is denoted by s. The current traffic state s is represented by the features that the convolutional neural network extracts from the input traffic information picture and signal light information picture;
The action is denoted by a. Let G denote a green signal light being on and R a red signal light being on. The straight-ahead and left-turn signal lights of a first direction and a second direction, which are perpendicular to each other, are defined separately, and the action a at time t is expressed as [first direction straight, first direction left turn, second direction straight, second direction left turn]. The set of actions available to a single intersection at time t is then:
A = {[G, R, R, R], [R, G, R, R], [R, R, G, R], [R, R, R, G]};
The immediate reward function is denoted by r. The total number of stationary vehicles at all intersections in state s is counted; each additional stationary vehicle earns a reward of -1, and each stationary vehicle removed earns a reward of +1.
In one embodiment, establishing the return value function model specifically comprises:
Let R(s, a) denote the return obtained by taking action a in state s. The value function Q(s, a) is the expectation of R(s, a), i.e. Q(s, a) = E[R(s, a)].
In one embodiment, solving for the optimal policy using the DQN deep reinforcement learning algorithm specifically comprises:
initializing the replay memory unit, with capacity N, for storing training samples;
initializing the current value network, with randomly initialized weight parameters ω;
initializing the target value network, whose structure and initial weights are identical to those of the current value network;
passing the picture showing the traffic conditions through the current value network to obtain Q(s, a) for any state s; after the current value network has computed the value function, selecting an action a using the ε-greedy policy; counting each state transition, i.e. each action taken, as one time step t; and storing the data (s, a, r, s') obtained at each time step in the replay memory unit;
defining a loss function:
L(ω) = E[(r + γ max_{a'} Q(s', a'; ω^-) - Q(s, a; ω))^2],
randomly drawing one sample (s, a, r, s') from the replay memory unit, passing (s, a), s' and r to the current value network, the target value network and L(ω) respectively, and updating ω by stochastic gradient descent on L(ω), with the update formula as follows:
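The update formula itself is not preserved in this text. Assuming plain stochastic gradient descent on the loss L(ω) defined above, with the target network parameters ω^- held fixed and a learning rate α (α is an assumption, not a symbol from the source), the update would take the standard form:

```latex
\nabla_{\omega} L(\omega)
  = -2\,\mathbb{E}\!\left[\left(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\right)
    \nabla_{\omega} Q(s, a; \omega)\right],
\qquad
\omega \leftarrow \omega - \alpha\,\nabla_{\omega} L(\omega)
```

This is the gradient that follows directly from differentiating the squared error in L(ω) with respect to ω; it is offered as a reconstruction under those assumptions rather than as the patent's own formula.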
A computer storage medium storing at least one executable instruction, the executable instruction causing a processor to perform the operations corresponding to the intelligent traffic signal light control method based on deep reinforcement learning.
An intelligent traffic signal light control system based on deep reinforcement learning, comprising:
information acquisition units, arranged at the center intersection and at the peripheral intersections connected to it, for obtaining the traffic information and signal light information of each intersection;
signal light control units, for controlling the operation of the traffic lights;
a terminal processing unit, communicatively connected to the information acquisition units and to the signal light control units, which can perform the following operations on the information obtained by the information acquisition units:
establishing a congested/unobstructed state model of the intersections,
modeling the traffic signal control problem as a Markov decision process, and defining its states, actions and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights through the signal light control unit of each intersection using the optimal policy.
Brief description of the drawings
Fig. 1 is a flowchart of the traffic signal light control method of an embodiment of the invention.
Fig. 2 is a schematic diagram of the 5 four-way intersections used in the traffic signal light control method of an embodiment of the invention.
Fig. 3 is a schematic diagram showing how the information acquisition unit and the signal light control unit of a single intersection are each connected to the terminal processing unit in the traffic signal light control system of an embodiment of the invention.
Fig. 4 is a schematic diagram of the DQN training process in the traffic signal light control method of an embodiment of the invention.
Detailed description of the embodiments
To make the above objects, features and advantages of the invention clearer and easier to understand, specific embodiments of the invention are described in detail below with reference to the accompanying drawings. Many specific details are set out in the following description to provide a full understanding of the invention. The invention can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar improvements without departing from its spirit; the invention is therefore not limited by the specific embodiments disclosed below.
It should be noted that when an element is referred to as being "fixed to" another element, it may be directly on the other element or intervening elements may be present. When an element is considered to be "connected to" another element, it may be directly connected to the other element or intervening elements may be present at the same time.
Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field of the invention. The terms used in this specification are intended only to describe specific embodiments and are not intended to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
As shown in Fig. 1 and Fig. 2, an embodiment of the invention provides an intelligent traffic signal light control method based on deep reinforcement learning, comprising:
S100, selecting a center intersection, around which there are multiple peripheral intersections connected to it;
S200, obtaining the traffic information and signal light information of each intersection;
S300, establishing a congested/unobstructed state model of the intersections;
S400, modeling the traffic signal control problem as a Markov decision process, and defining its states, actions and immediate reward function;
S500, establishing a return value function model;
S600, solving for the optimal policy using the DQN deep reinforcement learning algorithm;
S700, controlling the traffic lights at each intersection using the optimal policy.
The method can adaptively and dynamically adjust the traffic light control strategy according to real-time traffic information, and it adjusts multiple intersections synchronously so as to make the fullest use of the traffic capacity of each intersection.
Further, as training proceeds until the training process ends, the policy obtained by the method becomes progressively more effective at relieving intersection congestion. The method adapts to the road conditions at the intersections and does not depend on a specific environment model. In particular, it maximizes the traffic capacity of the region formed by the five intersections centered on one intersection, rather than maximizing only the capacity of a single intersection.
It will be appreciated that there may be multiple peripheral intersections, for example 4. The 4 peripheral intersections may be arranged in various ways; for example, they may be evenly distributed around the center intersection. Fig. 2 shows one embodiment, in which the terminal processing intersection is the center intersection and the 4 peripheral intersections are intersection 1, intersection 2, intersection 3 and intersection 4. The 4 intersections are located due east, due west, due south and due north of the center intersection.
Further, each intersection may take various forms. For example, as shown in Fig. 2, every intersection is a four-way intersection; that is, it has a first traffic direction and a second traffic direction that are perpendicular to each other. In Fig. 2, the first traffic direction is east-west and the second traffic direction is north-south.
In this embodiment, the traffic information includes the vehicle queue length and the average speed of the vehicles. The queue length is calculated separately for each lane, so queue lengths can be obtained for both left-turn and straight-ahead lanes. For example, the queue length of the east left-turn lane at intersection 1 is 25 m.
In this embodiment, establishing the congested/unobstructed state model of the intersections in step S300 specifically comprises:
The traffic signal control agent uses a deep reinforcement learning method. It constructs a convolutional neural network QV as the current value network and a network Q* of identical structure as the target value network. The constructed convolutional neural network comprises an input layer, two convolutional-layer networks, a fully connected layer and an output layer. The input layer receives pictures of the current traffic information and signal light information of each intersection. The traffic information picture and the signal light information picture are passed through separate convolutional-layer networks, and the resulting features are fully connected with all possible actions. The output layer gives the value estimate Q(s, a) of every action in the current state s. An experience replay memory pool records all samples <s, s', a, r>, where s is the current traffic state, a is the action executed in the current traffic state, s' is the next state reached after executing action a in state s, and r is the immediate reward obtained by executing action a in the current traffic state s.
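The experience replay memory pool described above can be sketched as a small fixed-capacity buffer. This is a minimal illustration, not the patent's implementation: the class name `ReplayMemory`, the `Transition` record and the choice to drop the oldest samples once the capacity is reached are all assumptions.

```python
import random
from collections import deque, namedtuple

# Hypothetical record type; the field order follows the <s, s', a, r> sample in the text.
Transition = namedtuple("Transition", ["s", "s_next", "a", "r"])


class ReplayMemory:
    """Fixed-capacity replay pool. Assumption: the oldest samples are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, s, s_next, a, r):
        # Store one transition observed at a time step.
        self.buffer.append(Transition(s, s_next, a, r))

    def sample(self, batch_size, rng=random):
        # Draw transitions uniformly at random, without replacement.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Random sampling from the pool, rather than replaying transitions in order, is the usual reason for keeping such a buffer in DQN-style training: it reduces the correlation between consecutive training samples.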
In this embodiment, modeling the traffic signal control problem as a Markov decision process and defining its states, actions and immediate reward function in step S400 specifically comprises:
The state is denoted by s. The current traffic state s is represented by the features that the convolutional neural network extracts from the input traffic information picture and signal light information picture. Specifically, the input traffic information picture is 227*227 pixels, and each 1*1 pixel is defined as follows: the cell is set to 1 if it contains a vehicle and to 0 if it does not. The traffic information picture is passed through three convolutional layers with kernels of 11*11, 5*5 and 3*3, and the final output feature has dimension 8192. Together with the features extracted from the signal light information picture, it represents the current state of the intersection. Grouping two time steps together not only captures the traffic state at a given moment but also better reflects how the traffic state changes over time.
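The binary encoding of the traffic information picture can be sketched as follows. The function names and the representation of vehicle positions as (row, column) cell indices are assumptions made for illustration; the text only states the 227*227 size, the 1/0 rule and the grouping of two time steps.

```python
GRID = 227  # picture side length in pixels, from the text


def occupancy_grid(vehicle_cells, size=GRID):
    """Return a size x size grid holding 1 where a vehicle is present and 0 elsewhere.

    vehicle_cells is a hypothetical list of (row, col) indices; cells that fall
    outside the picture are ignored.
    """
    grid = [[0] * size for _ in range(size)]
    for row, col in vehicle_cells:
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = 1
    return grid


def stack_two_steps(grid_prev, grid_curr):
    """Group the grids of two consecutive time steps into one observation."""
    return [grid_prev, grid_curr]
```

Feeding two consecutive grids together gives the network access to motion between frames, which a single snapshot cannot express.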
The action is denoted by a. Let G denote a green signal light being on and R a red signal light being on. The straight-ahead and left-turn signal lights of a first direction and a second direction, which are perpendicular to each other, are defined separately, and the action a at time t is expressed as [first direction straight, first direction left turn, second direction straight, second direction left turn]. The set of actions available to a single intersection at time t is then:
A = {[G, R, R, R], [R, G, R, R], [R, R, G, R], [R, R, R, G]};
With five intersections, the actions available in state s therefore have 4^5 = 1024 possible combinations.
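The count of 1024 can be checked by enumerating the joint actions of five intersections that each choose one of the four phases. The names `PHASES`, `JOINT_ACTIONS` and `decode` are illustrative and do not come from the source.

```python
from itertools import product

# Per-intersection actions, ordered as
# [first direction straight, first direction left, second direction straight, second direction left].
PHASES = [
    ("G", "R", "R", "R"),
    ("R", "G", "R", "R"),
    ("R", "R", "G", "R"),
    ("R", "R", "R", "G"),
]
NUM_INTERSECTIONS = 5  # one center intersection plus four peripheral ones

# A joint action assigns one phase index to each intersection.
JOINT_ACTIONS = list(product(range(len(PHASES)), repeat=NUM_INTERSECTIONS))


def decode(joint_action):
    """Translate a tuple of phase indices into the signal states of each intersection."""
    return [PHASES[i] for i in joint_action]
```

Here `len(JOINT_ACTIONS)` equals 4 to the power 5, i.e. 1024, which matches the figure in the text.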
The immediate reward function is denoted by r. The total number of stationary vehicles at all intersections in state s is counted; each additional stationary vehicle earns a reward of -1, and each stationary vehicle removed earns a reward of +1. The ultimate goal is to maximize the reward, i.e. to minimize the number of stationary vehicles at the five intersections.
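One way to read the reward rule is as the net change in the total number of stationary vehicles between two consecutive states. The sketch below makes that assumption explicit; the function name and the per-intersection list inputs are hypothetical.

```python
def immediate_reward(stopped_before, stopped_after):
    """Net reward for one transition.

    Each list holds the number of stationary vehicles at each intersection.
    Assumption: every vehicle added to the total counts -1 and every vehicle
    removed counts +1, so the reward is the decrease in the total.
    """
    return sum(stopped_before) - sum(stopped_after)
```

For example, if the five intersections hold 3, 2, 0, 1 and 4 stationary vehicles before an action and 2, 2, 1, 1 and 2 afterwards, the total falls from 10 to 8 and the reward is +2.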
In this embodiment, establishing the return value function model in step S500 specifically comprises:
Let R(s, a) denote the return obtained by taking action a in state s. The value function Q(s, a) is the expectation of R(s, a), i.e. Q(s, a) = E[R(s, a)].
In this embodiment, solving for the optimal policy using the DQN deep reinforcement learning algorithm in step S600 specifically comprises:
initializing the replay memory unit, with capacity N, for storing training samples;
initializing the current value network, with randomly initialized weight parameters ω;
initializing the target value network, whose structure and initial weights are identical to those of the current value network;
passing the picture showing the traffic conditions through the current value network to obtain Q(s, a) for any state s; after the current value network has computed the value function, selecting an action a using the ε-greedy policy; counting each state transition, i.e. each action taken, as one time step t; and storing the data (s, a, r, s') obtained at each time step in the replay memory unit;
defining a loss function:
L(ω) = E[(r + γ max_{a'} Q(s', a'; ω^-) - Q(s, a; ω))^2],
randomly drawing one sample (s, a, r, s') from the replay memory unit, passing (s, a), s' and r to the current value network, the target value network and L(ω) respectively, and updating ω by stochastic gradient descent on L(ω), with the update formula as follows:
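The update formula is not preserved in this text. The steps above can nevertheless be sketched in a few lines, under loud assumptions: a linear function Q(s, a; w) = w[a] . s stands in for the convolutional value network, and the discount factor `gamma` and learning rate `lr` are illustrative values that do not come from the source.

```python
import random


def q_values(weights, state):
    """Linear stand-in for the value network: Q(s, a; w) = w[a] . s for each action a."""
    return [sum(w_i * x_i for w_i, x_i in zip(w_a, state)) for w_a in weights]


def epsilon_greedy(weights, state, epsilon, rng=random):
    """With probability epsilon choose a random action, otherwise the greedy one."""
    q = q_values(weights, state)
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=q.__getitem__)


def sgd_step(weights, target_weights, sample, gamma=0.9, lr=0.01):
    """One stochastic gradient step on (r + gamma * max_a' Q(s', a'; w-) - Q(s, a; w))^2.

    Updates weights in place and returns the squared error before the update.
    """
    s, a, r, s_next = sample
    target = r + gamma * max(q_values(target_weights, s_next))
    td_error = target - q_values(weights, s)[a]
    # The gradient of the squared error w.r.t. w[a] is -2 * td_error * s;
    # the constant factor 2 is folded into the learning rate.
    weights[a] = [w + lr * td_error * x for w, x in zip(weights[a], s)]
    return td_error ** 2
```

Keeping `target_weights` separate from `weights` mirrors the split between the target value network and the current value network: the target stays fixed while the current parameters move toward it.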
An embodiment of the invention also provides a computer storage medium storing at least one executable instruction, the executable instruction causing a processor to perform the operations corresponding to the intelligent traffic signal light control method based on deep reinforcement learning.
An embodiment of the invention also provides an intelligent traffic signal light control system based on deep reinforcement learning, the system comprising:
information acquisition units, arranged at the center intersection and at the peripheral intersections connected to it, for obtaining the traffic information and signal light information of each intersection;
signal light control units, for controlling the operation of the traffic lights;
a terminal processing unit, communicatively connected to the information acquisition units and to the signal light control units, which can perform the following operations on the information obtained by the information acquisition units:
establishing a congested/unobstructed state model of the intersections,
modeling the traffic signal control problem as a Markov decision process, and defining its states, actions and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights through the signal light control unit of each intersection using the optimal policy.
The above system treats multiple adjacent intersections as one group, and the intersection at the center of each group may be set as the terminal processing intersection. The traffic information pictures of each intersection and the signal light information pictures of each intersection at the same moment are transmitted to the terminal processing unit in groups of two time steps. The time step in the system may be determined from the actual degree of congestion at the intersections. The degree of congestion, i.e. the traffic information, may be defined by the vehicle queue length at the intersection and the average speed of all vehicles, and the time step may be adjusted dynamically according to the actual situation. For example: if the queue length is greater than 25 m and the average speed is less than 10 km/h, the time step may be set to 5 s. If the queue length is less than 25 m and the average speed is less than 10 km/h, the time step may be set to 5 s. If the queue length is greater than 25 m and the average speed is greater than 10 km/h, the time step may be set to 10 s. If the queue length is less than 25 m and the average speed is greater than 10 km/h, the time step may be set to 10 s.
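The four example cases above can be written as a small lookup function. The function name is hypothetical, and the handling of a speed of exactly 10 km/h is an assumption, since the text does not say which side of the threshold it falls on.

```python
def choose_time_step(queue_length_m, avg_speed_kmh):
    """Return the time step in seconds for the four example cases in the text.

    In every listed case the result depends only on the average speed:
    below 10 km/h gives 5 s and above 10 km/h gives 10 s, whatever the queue
    length. queue_length_m is kept in the signature so that a rule using the
    25 m threshold can be added later.
    Assumption: a speed of exactly 10 km/h is treated like the faster case.
    """
    if avg_speed_kmh < 10:
        return 5
    return 10
```

Written this way, it becomes visible that the 25 m queue threshold does not change the outcome in any of the listed examples; a deployment that wants queue length to matter would need additional rules beyond those given.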
Further, the terminal processing unit can compute the optimal policy with the corresponding algorithm. For example, the traffic information and the signal light information are passed through two convolutional neural networks respectively, a Markov decision process is constructed by the reinforcement learning method and the optimal policy is solved, so that the signal light control system takes the most suitable action according to the optimal policy.
In this embodiment, the traffic information includes the vehicle queue length and the average speed of the vehicles. The queue length can be calculated separately for each lane, so queue lengths can be obtained for both left-turn and straight-ahead lanes. For example, the queue length of the east left-turn lane at intersection 1 is 25 m.
Further, the terminal processing unit is arranged at the center intersection. This facilitates large-scale use of the system, and placing the terminal processing unit at the center intersection also minimizes transmission loss during data transmission.
Specifically, taking Fig. 2 as an example, each intersection in the system may be provided with four information acquisition units and two signal light control units, and the terminal processing intersection is additionally provided with the terminal processing unit. Each information acquisition unit comprises an electronic camera supporting USB transmission and a first communication module connected to the camera, so that the traffic information of the intersection can be captured in real time. Each signal light control unit comprises a traffic signal controller and a second communication module connected to it. The second communication module and the first communication module are connected via a Wi-Fi network. The terminal processing unit comprises a data processing component and a third communication module connected to it. The third communication module and the second communication module are connected via a Wi-Fi network. The data processing component is connected to the third communication module through a USB interface. It will be appreciated that the connections between the above elements are not limited to those described; existing interfaces and connection methods may also be used to connect the respective elements.
In this embodiment, the first communication module is an SKW77-WIFI module, and the electronic camera is communicatively connected to the SKW77-WIFI module through a USB interface.
In this embodiment, the second communication module is an SKW77-WIFI module. The second communication module and the first communication module are connected via a Wi-Fi network.
In this embodiment, the traffic signal controller and the second communication module are communicatively connected through a USB interface.
In this embodiment, the third communication module is an SKW77-WIFI module. The third communication module and the second communication module are connected via a Wi-Fi network.
In this embodiment, the data processing component is an NVIDIA Jetson TK1 developer kit. The data processing component and the third communication module are communicatively connected through a USB interface.
The specific workflow of the above system is as follows:
The electronic cameras acquire the traffic information and signal light information of their intersections in real time.
The first communication module is connected to the second communication module via the Wi-Fi network and transmits the traffic information and signal light information to it.
The second communication module communicates with the third communication module via the Wi-Fi network and forwards the traffic information and signal light information to it.
The third communication module is communicatively connected to the data processing component through the USB interface and passes the traffic information and signal light information on to it.
After receiving the traffic information and signal light information, the data processing component establishes the congested/unobstructed state model of the intersections from the traffic information and signal light information of each intersection.
The traffic signal control problem is modeled as a Markov decision process, and its states, actions and immediate reward function are modeled.
The return value function model is established.
The optimal policy is solved using the DQN deep reinforcement learning algorithm.
The traffic lights are controlled through the signal light control unit of each intersection using the optimal policy.
The invention uses the terminal processing unit to build an environment model from the received data and obtains the optimal signal light regulation scheme with the DQN algorithm. The traffic lights are adjusted adaptively according to the current traffic flow at the intersections, with no need to provide learning samples manually. The optimal adjustment policy is learned online with the DQN algorithm, and the network parameters are updated by applying stochastic gradient descent to the loss function so that the parameters of the current value network gradually converge. Compared with existing fixed-timing traffic light control systems, the invention has the following notable advantages: 1) it can dynamically revise the optimal policy for random, complex road conditions; 2) as training proceeds until the training process ends, the policy obtained by the system becomes increasingly effective at relieving intersection congestion; 3) the system adapts to the road conditions at the intersections and does not depend on a specific environment model; 4) it maximizes the traffic capacity of the region formed by five intersections centered on one intersection, rather than that of a single intersection only.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, all of which fall within its scope of protection. The scope of protection of this patent shall therefore be determined by the appended claims.
Claims (10)
1. An intelligent traffic signal light control method based on deep reinforcement learning, characterized by comprising:
selecting a center crossing, around which there are multiple peripheral crossings connected to the center crossing,
obtaining the traffic information and signal light information of each crossing,
establishing a congested/unimpeded state model of the crossings,
modeling the traffic signal control problem as a Markov decision process, and defining its state, action and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights at each crossing using the optimal policy.
2. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that the number of peripheral crossings is 4, and the 4 peripheral crossings are evenly distributed circumferentially around the center crossing.
3. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 2, characterized in that the center crossing and the peripheral crossings are all crossroads.
4. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that the traffic information includes the queue length of vehicles and the average speed of each vehicle.
5. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that establishing the congested/unimpeded state model of the crossings specifically comprises:
the traffic signal control Agent uses a deep reinforcement learning method, constructing a convolutional neural network QV as the current value network and a network Q* of identical structure as the target value network;
the constructed convolutional neural network includes an input layer, two convolutional layer networks, one fully connected layer and an output layer;
the input to the input layer is the pictures of the current traffic information and signal light information of each crossing; the picture of the traffic information and the picture of the signal light information pass through different convolutional layer networks, and the resulting features are fully connected with all possible actions;
the output layer gives the value estimates Q(s, a) of all actions under the current state s;
an experience replay memory pool records all samples <s, s', a, r>, where s denotes the current road condition, a denotes the action executed under the current road condition, s' denotes the next state reached after executing action a in state s, and r denotes the immediate return obtained by executing action a under the current road condition s.
6. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that modeling the traffic signal control problem as a Markov decision process and defining its state, action and immediate reward function specifically comprises:
the state is denoted by s; the current traffic state s is represented by the features that the convolutional neural network extracts from the input traffic information picture and signal light information picture;
the action is denoted by a; let G denote a green signal light being on and R denote a red signal light being on; the straight-ahead and left-turn signal lights of a first direction and a second direction are defined separately, the first direction and the second direction being perpendicular to each other; the action a at time t is expressed as [first direction straight, first direction left turn, second direction straight, second direction left turn], so the set of actions available to a single crossing at time t is:
A={ [G, R, R, R], [R, G, R, R], [R, R, G, R], [R, R, R, G] };
the immediate reward function is denoted by r; the total number of stationary vehicles at each crossing under state s is counted, each additional stationary vehicle yields a reward of -1, and each stationary vehicle fewer yields a reward of +1.
7. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that establishing the return value function model specifically comprises:
letting R(s, a) denote the return value of taking action a in state s, the value function Q(s, a) is the expectation of R(s, a), that is, Q(s, a) = E[R(s, a)].
8. The intelligent traffic signal light control method based on deep reinforcement learning according to claim 1, characterized in that solving for the optimal policy using the DQN deep reinforcement learning algorithm specifically comprises:
initializing a replay memory unit with capacity N for storing training samples;
initializing the current value network, with randomly initialized weight parameters ω;
initializing the target value network, whose structure and initial weights are identical to those of the current value network;
passing the picture showing the road conditions through the current value network, which computes Q(s, a) for any state s; after the value function is obtained, selecting an action a using the ε-greedy policy, each state transition (that is, each action taken) being counted as one time step t, and storing the data (s, a, r, s') obtained at each time step in the replay memory unit;
defining a loss function:
$$L(\omega) = E\left[\left(r + \gamma \max_{a'} Q(s', a'; \omega^{-}) - Q(s, a; \omega)\right)^{2}\right],$$
randomly sampling one (s, a, r, s') from the replay memory unit, passing (s, a), s' and r to the current value network, the target value network and L(ω) respectively, and updating ω with respect to L(ω) using stochastic gradient descent, the update formula being:
9. A computer storage medium, characterized in that at least one executable instruction is stored in the storage medium, the executable instruction causing a processor to perform the operations corresponding to the intelligent traffic signal light control method based on deep reinforcement learning according to any one of claims 1 to 8.
10. An intelligent traffic signal light control system based on deep reinforcement learning, characterized by comprising:
an information acquisition unit, arranged at the center crossing and at the peripheral crossings connected to the center crossing, the information acquisition unit being used to obtain the traffic information and signal light information of each crossing;
a signal light control unit, for controlling the operation of the traffic lights;
a terminal processing unit, communicatively connected to the information acquisition unit and the signal light control unit respectively, the terminal processing unit being able to perform the following operations according to the information obtained by the information acquisition unit:
establishing a congested/unimpeded state model of the crossings,
modeling the traffic signal control problem as a Markov decision process, and defining its state, action and immediate reward function,
establishing a return value function model,
solving for the optimal policy using the DQN deep reinforcement learning algorithm,
controlling the traffic lights through the signal light control unit at each crossing using the optimal policy.
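The action set, immediate reward and experience replay memory recited in the claims can be illustrated with a short sketch. It is offered only as one possible reading, with several assumptions: the reward is computed as the change in the stationary-vehicle count between two consecutive observations, the memory discards its oldest sample once capacity N is reached, and the names `ACTIONS`, `immediate_reward` and `ReplayMemory` are hypothetical.

```python
import random
from collections import deque

# Action set for a single crossing, ordered as
# [first direction straight, first direction left turn,
#  second direction straight, second direction left turn].
ACTIONS = [
    ["G", "R", "R", "R"],
    ["R", "G", "R", "R"],
    ["R", "R", "G", "R"],
    ["R", "R", "R", "G"],
]


def immediate_reward(stopped_before, stopped_after):
    """-1 per additional stationary vehicle, +1 per stationary vehicle fewer.

    Assumption: the two counts come from consecutive observations.
    """
    return stopped_before - stopped_after


class ReplayMemory:
    """Fixed-capacity store of (s, a, r, s_next) samples."""

    def __init__(self, capacity):
        # Assumption: when full, the oldest sample is dropped first.
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, rng=random):
        # Draw one stored transition uniformly at random.
        return self.buffer[rng.randrange(len(self.buffer))]

    def __len__(self):
        return len(self.buffer)
```

Each action switches exactly one of the four signal groups to green, which matches the set A given in claim 6; the states stored in the memory can be any representation, such as the feature vectors extracted from the input pictures.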
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811616142.5A CN109472984A (en) | 2018-12-27 | 2018-12-27 | Signalized control method, system and storage medium based on deeply study |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109472984A (en) | 2019-03-15 |
Family
ID=65677259
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811616142.5A Pending CN109472984A (en) | 2018-12-27 | 2018-12-27 | Signalized control method, system and storage medium based on deeply study |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109472984A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150100530A1 (en) * | 2013-10-08 | 2015-04-09 | Google Inc. | Methods and apparatus for reinforcement learning |
WO2017004626A1 (en) * | 2015-07-01 | 2017-01-05 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and methods for providing reinforcement learning in a deep learning system |
CN106842925A (en) * | 2017-01-20 | 2017-06-13 | 清华大学 | A kind of locomotive smart steering method and system based on deeply study |
CN108831168A (en) * | 2018-06-01 | 2018-11-16 | 江苏数翰信息科技有限公司 | A kind of method for controlling traffic signal lights and system based on association crossing visual identity |
Non-Patent Citations (1)
Title |
---|
JUNTAO GAO: "Adaptive Traffic Signal Control: Deep Reinforcement Learning Algorithm with Experience Replay and Target Network", 《ARXIV》, pages 1 - 10 * |
Cited By (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110047278B (en) * | 2019-03-30 | 2021-06-08 | 北京交通大学 | Adaptive traffic signal control system and method based on deep reinforcement learning |
CN110047278A (en) * | 2019-03-30 | 2019-07-23 | 北京交通大学 | A kind of self-adapting traffic signal control system and method based on deeply study |
CN110060475A (en) * | 2019-04-17 | 2019-07-26 | 清华大学 | A kind of multi-intersection signal lamp cooperative control method based on deeply study |
CN110021168A (en) * | 2019-04-18 | 2019-07-16 | 上海科技大学 | The stepped strategy method of real-time intelligent traffic management is realized under a kind of car networking |
CN110021168B (en) * | 2019-04-18 | 2021-08-27 | 上海科技大学 | Grading decision method for realizing real-time intelligent traffic management under Internet of vehicles |
CN110136456A (en) * | 2019-05-12 | 2019-08-16 | 苏州科技大学 | Traffic lights anti-clogging control method and system based on deeply study |
CN110164151A (en) * | 2019-06-21 | 2019-08-23 | 西安电子科技大学 | Traffic lamp control method based on distributed deep-cycle Q network |
CN110363295A (en) * | 2019-06-28 | 2019-10-22 | 电子科技大学 | A kind of intelligent vehicle multilane lane-change method based on DQN |
CN110428615A (en) * | 2019-07-12 | 2019-11-08 | 中国科学院自动化研究所 | Learn isolated intersection traffic signal control method, system, device based on deeply |
CN110428615B (en) * | 2019-07-12 | 2021-06-22 | 中国科学院自动化研究所 | Single intersection traffic signal control method, system and device based on deep reinforcement learning |
CN110299008A (en) * | 2019-07-19 | 2019-10-01 | 浙江工业大学 | A kind of traffic flow multistep forecasting method based on intensified learning |
CN110299008B (en) * | 2019-07-19 | 2020-11-13 | 浙江工业大学 | Traffic flow multi-step prediction method based on reinforcement learning |
CN110491146B (en) * | 2019-08-21 | 2020-08-21 | 浙江工业大学 | Deep learning-based traffic signal control scheme real-time recommendation method |
CN110491146A (en) * | 2019-08-21 | 2019-11-22 | 浙江工业大学 | A kind of traffic signal control scheme real-time recommendation method based on deep learning |
CN110444028A (en) * | 2019-09-06 | 2019-11-12 | 科大讯飞股份有限公司 | Multiple Intersections Signalized control method, device and equipment |
CN110531681A (en) * | 2019-09-17 | 2019-12-03 | 山东建筑大学 | Room lighting data acquisition control system and method based on deeply study |
CN110503839A (en) * | 2019-10-21 | 2019-11-26 | 江苏广宇科技产业发展有限公司 | Method and system based on single device coordinated control Multiple Intersections traffic signals |
CN113287156B (en) * | 2019-10-28 | 2023-08-18 | 乐路股份有限公司 | Signal control device and signal control method based on reinforcement learning |
CN113287156A (en) * | 2019-10-28 | 2021-08-20 | 乐人株式会社 | Signal control device and signal control method based on reinforcement learning |
US11823573B2 (en) | 2019-10-28 | 2023-11-21 | Laon Road Inc. | Signal control apparatus and signal control method based on reinforcement learning |
CN110718077B (en) * | 2019-11-04 | 2020-08-07 | 武汉理工大学 | Signal lamp optimization timing method under action-evaluation mechanism |
CN110718077A (en) * | 2019-11-04 | 2020-01-21 | 武汉理工大学 | Signal lamp optimization timing method under action-evaluation mechanism |
CN111028504A (en) * | 2019-11-27 | 2020-04-17 | 天津易华录信息技术有限公司 | Urban expressway intelligent traffic control method and system |
CN110930734A (en) * | 2019-11-30 | 2020-03-27 | 天津大学 | Intelligent idle traffic indicator lamp control method based on reinforcement learning |
CN111081035A (en) * | 2019-12-17 | 2020-04-28 | 扬州市鑫通智能信息技术有限公司 | Traffic signal control method based on Q learning |
CN110969872A (en) * | 2019-12-18 | 2020-04-07 | 上海天壤智能科技有限公司 | Traffic signal control method and system based on reinforcement learning and graph attention network |
CN110936954A (en) * | 2020-01-02 | 2020-03-31 | 南京航空航天大学 | Intelligent vehicle prediction decision fusion method considering vehicle bidirectional interaction |
CN110936954B (en) * | 2020-01-02 | 2020-12-01 | 南京航空航天大学 | Intelligent vehicle prediction decision fusion method considering vehicle bidirectional interaction |
CN111243299B (en) * | 2020-01-20 | 2020-12-15 | 浙江工业大学 | Single cross port signal control method based on 3 DQN-PSER algorithm |
CN111243299A (en) * | 2020-01-20 | 2020-06-05 | 浙江工业大学 | Single cross port signal control method based on 3 DQN-PSER algorithm |
CN112365724A (en) * | 2020-04-13 | 2021-02-12 | 北方工业大学 | Continuous intersection signal cooperative control method based on deep reinforcement learning |
CN111564048A (en) * | 2020-04-28 | 2020-08-21 | 郑州大学 | Traffic signal lamp control method and device, electronic equipment and storage medium |
CN111696348A (en) * | 2020-06-05 | 2020-09-22 | 南京云创大数据科技股份有限公司 | Multifunctional intelligent signal control system and method |
CN111696370A (en) * | 2020-06-16 | 2020-09-22 | 西安电子科技大学 | Traffic light control method based on heuristic deep Q network |
CN112216128A (en) * | 2020-09-28 | 2021-01-12 | 航天科工广信智能技术有限公司 | Large-scale road network traffic signal control method based on deep Q learning neural network |
CN112380761B (en) * | 2020-10-20 | 2024-01-26 | 珠海米枣智能科技有限公司 | Building environment controller and control method based on reinforcement learning |
CN112380761A (en) * | 2020-10-20 | 2021-02-19 | 珠海米枣智能科技有限公司 | Building environment controller based on reinforcement learning and control method |
WO2022121510A1 (en) * | 2020-12-11 | 2022-06-16 | 多伦科技股份有限公司 | Stochastic policy gradient-based traffic signal control method and system, and electronic device |
CN112614343A (en) * | 2020-12-11 | 2021-04-06 | 多伦科技股份有限公司 | Traffic signal control method and system based on random strategy gradient and electronic equipment |
CN112863206B (en) * | 2021-01-07 | 2022-08-09 | 北京大学 | Traffic signal lamp control method and system based on reinforcement learning |
CN112863206A (en) * | 2021-01-07 | 2021-05-28 | 北京大学 | Traffic signal lamp control method and system based on reinforcement learning |
CN113380054A (en) * | 2021-06-09 | 2021-09-10 | 湖南大学 | Traffic signal lamp control method and system based on reinforcement learning |
CN113487887A (en) * | 2021-07-23 | 2021-10-08 | 京东城市(北京)数字科技有限公司 | Signal lamp control method and device, electronic equipment and storage medium |
CN113625561A (en) * | 2021-07-29 | 2021-11-09 | 浙江大学 | Domain coordination multi-agent system cooperation control method based on reinforcement learning |
CN113625561B (en) * | 2021-07-29 | 2023-09-26 | 浙江大学 | Domain coordination multi-agent system cooperative control method based on reinforcement learning |
CN113628458A (en) * | 2021-08-10 | 2021-11-09 | 四川易方智慧科技有限公司 | Traffic signal lamp optimization method based on group intelligent reinforcement learning |
CN113763723A (en) * | 2021-09-06 | 2021-12-07 | 武汉理工大学 | Traffic signal lamp control system and method based on reinforcement learning and dynamic timing |
CN113763723B (en) * | 2021-09-06 | 2023-01-17 | 武汉理工大学 | Traffic signal lamp control system and method based on reinforcement learning and dynamic timing |
CN114120670A (en) * | 2021-11-25 | 2022-03-01 | 支付宝(杭州)信息技术有限公司 | Method and system for traffic signal control |
CN114120670B (en) * | 2021-11-25 | 2024-03-26 | 支付宝(杭州)信息技术有限公司 | Method and system for traffic signal control |
CN114038218A (en) * | 2021-12-28 | 2022-02-11 | 江苏泰坦智慧科技有限公司 | Chained feedback multi-intersection signal lamp decision system and method based on road condition information |
CN117135655A (en) * | 2023-08-15 | 2023-11-28 | 华中科技大学 | Intelligent OFDMA resource scheduling method, system and terminal of delay-sensitive WiFi |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109472984A (en) | Signalized control method, system and storage medium based on deeply study | |
CN113409579B (en) | Intelligent city traffic control system based on AI internet of things technology | |
CN110136456A (en) | Traffic lights anti-clogging control method and system based on deeply study | |
CN108831168B (en) | Traffic signal lamp control method and system based on visual identification of associated intersection | |
CN110390246A (en) | A kind of video analysis method in side cloud environment | |
CN108804983A (en) | Traffic signal light condition recognition methods, device, vehicle-mounted control terminal and motor vehicle | |
CN105654744B (en) | A kind of improvement traffic signal control method based on Q study | |
CN106846837A (en) | A kind of traffic light intelligent control system, traffic lights intelligent control method and device | |
CN107507430A (en) | A kind of urban road crossing traffic control method and system | |
CN109087517A (en) | Intelligent signal lamp control method and system based on big data | |
CN111710177B (en) | Intelligent traffic signal lamp networking cooperative optimization control system and control method | |
CN110365787A (en) | A kind of application container simultaneously optimizes layout method based on the edge calculations of micro services frame | |
CN105872075B (en) | A method of internet of things equipment is mapped to smart city resource model | |
CN112419762A (en) | Internet of things platform-based reinforcement learning intelligent traffic signal lamp control method and system | |
CN108156388A (en) | Power consumption control method and photographic device | |
CN106448171A (en) | Ponding road prediction method and device | |
CN108549952A (en) | Optimization method and device for double-layer path of vehicle-mounted unmanned aerial vehicle | |
CN109003460A (en) | Traffic lights Optimization Scheduling and system | |
CN112258865B (en) | Intelligent red and green signal lamp control system based on Internet of vehicles V2X | |
CN108133604A (en) | A kind of traffic lights dynamic realtime dispatching method based on traffic characteristic | |
CN111785043A (en) | Intersection control method for intelligent internet connection | |
CN105281957B (en) | A kind of method and server of the access device in Internet of Things | |
CN108205622A (en) | The authority control method and device of a kind of application program for mobile terminal | |
CN112785162A (en) | High-precision map crowdsourcing data quality assessment method and system based on intelligent vehicle semantics | |
CN109637123A (en) | A kind of complexity traffic environment downlink people living things feature recognition and traffic control system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||