EP3425608A1 - Traffic signal control using multiple Q-learning categories - Google Patents

Info

Publication number
EP3425608A1
Authority
EP
European Patent Office
Prior art keywords
traffic
intersection
learning
traffic pattern
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP18179505.5A
Other languages
German (de)
English (en)
Other versions
EP3425608B1 (fr
Inventor
Ying Liu
Lei Liu
Wei-Peng Chen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/641,168 (US10002530B1)
Application filed by Fujitsu Ltd
Publication of EP3425608A1
Application granted
Publication of EP3425608B1
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/07 Controlling traffic signals
    • G08G1/08 Controlling traffic signals according to detected number or speed of vehicles
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0137 Measuring and analyzing of parameters relative to traffic conditions for specific applications
    • G08G1/0145 Measuring and analyzing of parameters relative to traffic conditions for specific applications for active traffic flow control
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0133 Traffic data processing for classifying traffic situation

Definitions

  • The described technology relates generally to control of traffic signals.
  • Traffic congestion is an increasing concern. Traffic congestion typically results from increased use of the roads by vehicles, and is characterized by slower vehicle speeds, longer trip times, and increased vehicular queuing. Traffic signals have been widely deployed in an attempt to help alleviate traffic congestion. Properly functioning traffic signals must ensure not only that traffic moves smoothly and safely, but also that pedestrians are protected when crossing the roads.
  • Various traffic signal control techniques have been proposed. These techniques can be generally categorized as fixed time control, dynamic control, coordinated control, and adaptive control. Fixed time control is rather simple in that traffic signals are changed after a fixed time period. The time period can be pre-configured to different values for different times in a day. Dynamic control incorporates the use of input from detectors, such as sensors, to adjust the traffic signal timing. These detectors can inform the traffic signal controller whether vehicles are present.
  • Coordinated control manages multiple traffic signals together, typically through a master controller, and accounts for changing traffic patterns in real time. Cameras and sensors are used to detect real-time traffic information, and the central controller uses this information to perform real-time optimization.
  • One optimization is a "green wave," which is a long string of green lights that allows vehicles to travel long distances without encountering a red light.
  • Adaptive control incorporates actual traffic demand in the control of traffic signals. Sensors and cameras are used to determine the number of vehicles at an intersection and how long the vehicles have been waiting. The traffic signal controller at this intersection uses this information to control the traffic signal at this intersection, while coordinating its decision with controllers at other intersections.
  • An example method may include clustering historical traffic data into multiple traffic pattern clusters, and generating multiple Q-learning categories, where each Q-learning category of the multiple Q-learning categories corresponds to a traffic pattern cluster of the multiple traffic pattern clusters.
  • The method may also include determining a first Q-learning category of the multiple Q-learning categories to use in controlling traffic signals at an intersection based at least in part on first traffic data of the intersection, where the first Q-learning category corresponds to a first traffic pattern cluster, and the first traffic data corresponds to the first traffic pattern cluster.
  • The method may additionally include generating a first control action for the traffic signals at the intersection based at least in part on the first Q-learning category.
  • This disclosure is generally drawn, inter alia, to technologies, including methods, apparatus, systems, devices, and/or computer program products related to traffic signal control that incorporates non-motorized traffic information.
  • Some embodiments of the technology may utilize multiple Q-learning categories in the control of traffic signals. Some embodiments of the technology may adjust a learning rate of Q-learning utilized to control traffic signals.
  • Non-motorized traffic is the presence on or use of roadways by non-motorized users, such as pedestrians, bicyclists, and equestrians. Unlike conventional traffic control systems that require manual activation by non-motorized users to signal their presence at intersections, the presence of such non-motorized users is autonomously detected and accounted for in the control of traffic signals at intersections. The autonomous detection of non-motorized users allows for the safe and efficient flow of both motorized and non-motorized traffic through intersections where traffic flow is being controlled by traffic signals.
  • Sensors, such as cameras, video cameras, etc., are deployed at intersections to autonomously acquire images (e.g., video images, video feed, etc.) from which the presence of motorized and non-motorized users at the intersections may be determined.
  • Images that include or show the presence of motorized users may be referred to or classified as motorized user presence data.
  • Images that include or show the presence of non-motorized users may be referred to or classified as non-motorized user presence data.
  • Images that include or show the presence of both motorized and non-motorized users may be referred to or classified as both motorized user presence data and non-motorized user presence data.
  • Other data from which presence of motorized users may be determined may also be referred to or classified as motorized user presence data.
  • Other data from which presence of non-motorized users may be determined may also be referred to or classified as non-motorized user presence data.
  • The autonomously acquired images may also be used to determine different queues of non-motorized users and the lengths of the different queues of non-motorized users, for example, from position and/or direction of travel of the non-motorized users.
  • The images may also be used to determine the presence of motorized users (e.g., queue of motorized users) at the intersection, and the different queues of motorized users and the lengths of the different queues of motorized users, for example, from position and/or direction of travel of the motorized users.
  • Traffic signals at an intersection may include traffic signals for motorized traffic, and traffic signals for non-motorized traffic.
  • The traffic signals at an intersection may be controlled by a control agent (an "agent").
  • An agent may be configured to generate control actions for both the traffic signals for motorized traffic and the traffic signals for non-motorized traffic at an intersection based on the presence of motorized users and non-motorized users at the intersection. For example, the agent may process the motorized user presence data and the non-motorized user presence data to determine the presence of motorized user queues and queue lengths, and the presence of non-motorized user queues and queue lengths present at the intersection. The agent may generate control actions for the traffic signals based at least in part on the presence of the non-motorized user queues and queue lengths.
  • The agent may apply Q-learning, which is a model-free reinforcement learning technique, to generate the control actions for the traffic signals.
  • Q-learning can be used to determine an optimal action-selection policy for any given (finite) Markov decision process (MDP).
  • Q-learning works by learning an action-value function that provides an expected utility of taking a given action (e.g., generating a given control action) in a given state (e.g., given state of the traffic signals) and following the optimal policy thereafter.
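As a rough illustration of the action-value learning described above, the following minimal sketch applies the standard tabular Q-learning update. The state labels, action names, reward value, and the `alpha` and `gamma` defaults are assumptions chosen for illustration; they are not taken from the described embodiments.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    old = q[(state, action)]
    # Move the old estimate toward the observed reward plus discounted future value.
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Pick the action with the highest learned value in the given state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical actions and states, for illustration only.
ACTIONS = ["ns_green", "ew_green"]
q_table = defaultdict(float)
q_update(q_table, "long_ns_queue", "ns_green", reward=4.0,
         next_state="short_queues", actions=ACTIONS)
```

With `alpha=0` the update leaves the table unchanged, and with `alpha=1` the old estimate is fully replaced by the newly observed target.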
  • Example policies include minimizing the length of all queues, both motorized and non-motorized user queues, at the intersection, optimizing motorized traffic flow through the intersection, optimizing non-motorized traffic flow through the intersection, prioritizing traffic flow in a specific direction through the intersection, prioritizing public transportation through the intersection, optimizing global traffic flow, optimizing emission utility, optimizing congestion utility, and the like.
  • The agent may apply one or more conditions on the operation of the traffic signals in generating the control actions for the traffic signals.
  • Certain control actions may not directly follow certain other control actions.
  • The traffic signal that is directing the pedestrians to cross the intersection should maintain its action (e.g., green light) for a sufficient period of time while the pedestrians are crossing the intersection.
  • A traffic signal that is controlling (e.g., stopping) the flow of motorized users across the flow of pedestrians should not turn green.
  • A flow of motorized traffic in one direction through the intersection may not be followed by a flow of motorized traffic in another direction through the intersection.
  • The conditions may be different depending on the region. For example, the conditions for a four-way intersection may be different than the conditions for a three-way intersection. As another example, conditions in the United States may be different than the conditions in Japan.
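One way to picture the conditions described above is as a filter applied to the action set before an action is chosen. The sketch below is a minimal example under stated assumptions: the action names, the forbidden-transition table, and the minimum hold times are all hypothetical and would in practice depend on the intersection layout and the regional traffic rules.

```python
# Hypothetical rules, for illustration only.
# Actions that may not directly follow a given action.
FORBIDDEN_NEXT = {
    "ped_crossing": {"cross_traffic_green"},
}
# Minimum number of time slots an action must be held before changing.
MIN_HOLD_SLOTS = {"ped_crossing": 3}


def allowed_actions(current, slots_held, actions):
    """Return the actions permitted after `current` has been held for `slots_held` slots."""
    # An action with a minimum hold time must be maintained until it elapses.
    if current is not None and slots_held < MIN_HOLD_SLOTS.get(current, 1):
        return [current]
    # Otherwise, drop any action that may not directly follow the current one.
    blocked = FORBIDDEN_NEXT.get(current, set())
    return [a for a in actions if a not in blocked]


ACTION_SET = ["ped_crossing", "cross_traffic_green", "parallel_traffic_green"]
```

An agent could then restrict its Q-learning action selection to the list returned by `allowed_actions`.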
  • The agent may incorporate historical traffic data in generating the control actions for the traffic signals at the intersection.
  • The historical traffic data may include traffic statistics at different time periods in a day (e.g., traffic between 7:00 AM and 9:00 AM is heavier than between 10:00 AM and 11:00 AM), traffic statistics in the same time period on different days (e.g., traffic between 10:00 AM and 12 Noon on weekdays or on weekends), etc.
  • The historical traffic data may be data of the same intersection (e.g., the intersection being controlled by the agent).
  • The historical traffic data may be data of another, different intersection.
  • The historical data may be data of multiple intersections.
  • The agent may apply an autoregressive integrated moving average (ARIMA) model to calculate estimated instantaneous rewards based on historical traffic data, and integrate the calculated instantaneous rewards into the generation of the control actions.
  • The agent may incorporate multiple Q-learning categories in the Q-learning technique to determine an optimal action-selection policy.
  • Q-learning determines an optimal action based on historical traffic data (e.g., motorized traffic information and non-motorized traffic information).
  • Traffic patterns may vary, sometimes significantly, at different times during a day. For example, the traffic pattern at an intersection may be different during commute and non-commute hours, during midday and midnight, etc.
  • The traffic patterns may also vary when an event (e.g., automobile accident, musical concert, sporting event, holiday, etc.) occurs.
  • When the traffic pattern changes, Q-learning may initially consider the old (prior) traffic pattern data as the historical traffic data.
  • The historical traffic data of the old traffic pattern may not be as useful in determining an optimal action for the new traffic pattern, which may result in Q-learning taking a longer time to determine an optimal action for the new traffic pattern.
  • Historical traffic data may be clustered into one or more traffic pattern clusters based on one or more criteria, such as motorized traffic arrival rate, non-motorized traffic arrival rate, queue lengths, occurrence of an accident, occurrence of a special event, average motorized traffic waiting time, non-motorized traffic waiting time, and the like.
  • Historical traffic data may be analyzed to identify characteristics of different traffic patterns.
  • Historical traffic data may be clustered into traffic pattern clusters based on the identified characteristics.
  • Each of the different traffic pattern clusters, and the historical traffic data associated with or corresponding to each of the different traffic pattern clusters, may be assigned to a corresponding Q-learning category.
  • A Q-learning category for a specific traffic pattern cluster may be associated with its own historical traffic data, including the state/action/reward history records (e.g., historical traffic data records including the observed states of the traffic signals, generated control actions for the traffic signals, and the calculated rewards) associated with the specific traffic pattern.
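The idea of one Q-learning category per traffic pattern cluster, each with its own history, can be sketched as a small data structure. This is only an illustrative layout: the class name, the pattern labels, and the sample record are assumptions, not part of the described embodiments.

```python
from collections import defaultdict


class QLearningCategory:
    """Holds the action-value table and history for one traffic pattern cluster."""

    def __init__(self, name):
        self.name = name
        self.q = defaultdict(float)  # action values learned for this pattern only
        self.history = []            # (state, action, reward) records

    def record(self, state, action, reward):
        """Append one state/action/reward record to this category's history."""
        self.history.append((state, action, reward))


# Hypothetical pattern labels; one independent category per cluster.
categories = {name: QLearningCategory(name) for name in ("low", "normal", "heavy")}
categories["heavy"].record("long_queues", "ns_green", -12.0)
```

Because each category keeps a separate table, experience gathered under heavy traffic does not overwrite what was learned for low traffic.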
  • A person, such as an operator knowledgeable in traffic analysis, may specify the number of Q-learning categories by specifying the properties that define or characterize the different traffic patterns.
  • Traffic patterns may include, without limitation, ultralow traffic, low traffic, normal traffic, heavy traffic, ultra heavy traffic, traffic jam, accident, and the like.
  • The different traffic patterns may be characterized by traffic rates, such as the number of motorized users (e.g., the number of vehicles) at or coming into an intersection at a particular time slot, the number of non-motorized users at or coming into the intersection at a particular time slot, and/or the like, including combinations thereof, or ranges of traffic rates.
  • The different traffic patterns may be characterized by queue lengths, average motorized traffic waiting time at the intersection, non-motorized traffic waiting time at the intersection, and/or the like.
  • Historical traffic data may then be analyzed to determine the different traffic pattern clusters (e.g., the different Q-learning categories) based on the thresholds.
  • The specified thresholds may determine the number of different traffic patterns, which in turn determines the number of traffic pattern clusters and corresponding Q-learning categories.
  • The operator may specify the number of Q-learning categories based on a policy or policies.
  • The thresholds may be automatically determined, for example, based on heuristic analysis of the historical traffic data.
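A threshold-based assignment of traffic data to patterns might look like the following sketch. The threshold values and the use of a single arrival-rate measure are assumptions for illustration; the text only states that thresholds on such properties determine how many patterns exist.

```python
import bisect

# Hypothetical lower bounds (vehicles arriving per time slot) for each pattern
# after the first. N thresholds yield N + 1 traffic patterns.
THRESHOLDS = [2, 8, 20, 40]
PATTERNS = ["ultralow", "low", "normal", "heavy", "ultra_heavy"]


def classify(arrival_rate):
    """Map an arrival rate to the traffic pattern whose range contains it."""
    return PATTERNS[bisect.bisect_right(THRESHOLDS, arrival_rate)]
```

Adding or removing a threshold changes the number of patterns, and therefore the number of Q-learning categories that would be maintained.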
  • Historical traffic data may be clustered into traffic pattern clusters based on queue lengths (e.g., motorized user queue lengths and/or non-motorized user queue lengths) at each intersection as the queue lengths vary over time (e.g., vary over each time slot).
  • Data in the same traffic pattern cluster may exhibit similar properties.
  • Queues in the same traffic pattern cluster may have the same or similar lengths.
  • Clustering may identify the change in queue lengths (e.g., potentially substantial change in queue lengths to differentiate one traffic pattern cluster from another traffic pattern cluster), and divide the queue lengths into different traffic pattern clusters (e.g., cluster groups) based on the queue lengths.
  • A clustering technique such as that described in "Clustering by passing messages between data points," Frey, Brendan J., and Dueck, Delbert, Science 315.5814 (16 February 2007): 972-6, may be used to cluster the historical traffic data.
  • Each traffic pattern cluster may be associated with an event.
  • An event may be any occurrence that may have an impact on traffic.
  • The traffic data when a sporting event, such as a football game, a soccer match, and/or the like, is occurring may be different from traffic data when the sporting event is not occurring.
  • Traffic data on weekday mornings that indicates a morning traffic jam event may be due to people going to work.
  • The traffic data when a concert is occurring may be different from traffic data when the concert is not occurring.
  • Historical traffic data of a traffic pattern cluster may be compared to traffic data that occurred during an event, and the traffic pattern cluster may be associated with the event based on the similarity between the historical traffic data of the traffic pattern cluster and the traffic data that occurred during the event.
  • Traffic data may be observed that falsely indicates a change to a second traffic pattern (e.g., a second traffic pattern cluster), quickly followed by observance of traffic data associated or consistent with the first traffic pattern (e.g., first traffic pattern cluster).
  • In such a case, the agent may unnecessarily switch from the current first Q-learning category to a second Q-learning category corresponding to the second traffic pattern (e.g., second traffic pattern cluster), and then quickly switch back to the first Q-learning category. Switching to a Q-learning category based on observed traffic data that are false positives may cause unnecessary and/or frequent Q-learning category changes.
  • The agent may check to determine that a new traffic pattern cluster or new event occurred in a specific number of consecutive time slots before changing Q-learning categories. In some embodiments, the agent may check to determine that a new traffic pattern cluster or new event occurred in at least a certain percentage or number of a specific number of consecutive time slots before changing Q-learning categories. In some embodiments, the specific number of consecutive time slots may vary or be determined based on the frequency of the occurrence of the traffic pattern cluster that corresponds to the new traffic data (e.g., based on how often the traffic pattern cluster that corresponds to the new traffic data occurred in the past).
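The consecutive-time-slot check can be sketched as a small debouncing helper. This is a minimal example of the first variant only (a fixed number of consecutive slots); the class name, pattern labels, and the default of three slots are assumptions for illustration.

```python
class CategorySwitcher:
    """Switch Q-learning categories only after a new pattern persists."""

    def __init__(self, initial, required_slots=3):
        self.current = initial
        self.required_slots = required_slots
        self._candidate = None  # pattern currently being considered
        self._streak = 0        # consecutive slots the candidate was observed

    def observe(self, observed_pattern):
        """Record the pattern seen in one time slot; return the active category."""
        if observed_pattern == self.current:
            # Back to the current pattern: discard any pending candidate.
            self._candidate, self._streak = None, 0
        elif observed_pattern == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = observed_pattern, 1
        if self._candidate is not None and self._streak >= self.required_slots:
            self.current = self._candidate
            self._candidate, self._streak = None, 0
        return self.current
```

A single "heavy" observation surrounded by "normal" observations therefore does not trigger a category change, while three consecutive "heavy" observations do.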
  • The agent may vary the learning rate of Q-learning.
  • The learning rate determines to what extent the newly observed traffic data overrides the older traffic data in learning the action-value function.
  • A learning rate of 0 causes Q-learning to not learn anything (e.g., newly observed traffic data is not considered), while a learning rate of 1 causes Q-learning to consider only the newly observed traffic data.
  • Traffic data may be categorized into different traffic patterns, and the agent may vary the learning rate based on the frequency of traffic pattern changes (e.g., frequency of change in the traffic pattern).
  • For example, if the traffic pattern is changing frequently, the agent may increase the learning rate (e.g., set the learning rate closer to 1) so that Q-learning considers more of the recent traffic data. Conversely, if the traffic pattern is not changing but remaining the same over a long period of time (e.g., many time slots), the agent may gradually decrease the learning rate so that Q-learning considers more of the historical traffic data.
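A simple way to adjust the learning rate from the frequency of pattern changes is sketched below. The window-based change ratio, the step sizes, and the floor and ceiling values are all assumptions for illustration; the text only indicates that the rate moves toward 1 when patterns change often and decreases gradually when they are stable.

```python
def adjust_learning_rate(alpha, recent_patterns, high_change_ratio=0.3,
                         step=0.1, floor=0.05, ceiling=0.95):
    """Return a new learning rate based on how often the pattern changed recently."""
    # Count transitions between consecutive time slots in the window.
    changes = sum(1 for a, b in zip(recent_patterns, recent_patterns[1:]) if a != b)
    ratio = changes / max(1, len(recent_patterns) - 1)
    if ratio >= high_change_ratio:
        # Frequent changes: weight recent observations more heavily.
        return min(ceiling, alpha + step)
    # Stable pattern: decay slowly so historical data counts for more.
    return max(floor, alpha - step / 4)
```

Keeping the rate strictly between 0 and 1 avoids the two extremes, where Q-learning either ignores new data or discards everything it has learned.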
  • The agent may transmit or provide its traffic data (e.g., traffic data of the intersection) to one or more neighbor agents (e.g., agents that control neighbor intersections). This allows the neighbor agents to incorporate traffic data of this intersection in generating control actions for the traffic signals at the neighbor intersections. Additionally or alternatively, the agent may transmit or provide its traffic data to a central controller for use by the central controller and/or dissemination by the central controller, for example, to other agents. This allows for the propagation and use of traffic data of one intersection to one or more agents at other intersections.
  • The agent may incorporate traffic data of one or more neighbor intersections in generating the control actions for the traffic signals at the intersection.
  • A coordinator system may transmit or provide traffic data, such as real-time traffic statistics, historical traffic statistics, etc., of one or more intersections for use by the agent. Integration of neighboring intersection traffic data, including traffic data of larger geographical areas, may allow the agent to coordinate the control with different agents to improve traffic signal control efficiency.
  • The coordinator system may transmit or provide motorized user route information and/or non-motorized user route information for use by the agent in generating the control actions. For example, people may be encouraged (e.g., provided certain benefits, such as reduced travel time due to traffic light control in their favor) to provide and share their route information to improve their travel experience.
  • The coordinator system may then collect this information from, for example, mobile applications, cell phones, global positioning system (GPS) units, vehicle navigation systems, etc., of these users.
  • The coordinator system may use traffic data of one or more intersections to determine improved routes for some or all of the people who have shared their route information. Additionally or alternatively, the coordinator system may provide some or all of the collected user information to the agents for use in generating the control actions.
  • The coordinator system may receive information regarding intended destinations from self-driving (autonomous) vehicles. Using this information, the coordinator system may recommend candidate routes to the intended destinations to the self-driving vehicles. Additionally or alternatively, the coordinator system may share the route information with the agents to optimize the traffic flow.
  • FIG. 1 illustrates selected components of an intersection 100 controlled by traffic signals, arranged in accordance with at least some embodiments described herein.
  • Intersection 100 is a four-way, +-shaped intersection, and includes traffic signals 102a, 102b, 102c, and 102d (collectively referred to herein as traffic signals 102), crosswalk signals 104a, 104b, 104c, 104d, 104e, 104f, 104g, and 104h (collectively referred to herein as crosswalk signals 104), and sensors 106a and 106b (collectively referred to herein as sensors 106).
  • The number of components depicted in intersection 100 is for illustration, and one skilled in the art will appreciate that there may be a different number of traffic signals 102, crosswalk signals 104, and sensors 106. As depicted, intersection 100 is coupled to an agent 108 whose task is to control the flow of traffic through intersection 100.
  • Traffic signals 102 are traffic signals that direct the flow of motorized traffic through intersection 100.
  • traffic signal 102a may direct the flow of motorized users in the east-west direction
  • traffic signal 102b may direct the flow of motorized users in the south-north direction
  • traffic signal 102c may direct the flow of motorized users in the west-east direction
  • traffic signal 102d may direct the flow of motorized users in the north-south direction.
  • Crosswalk signals 104 are traffic signals that direct the flow of non-motorized traffic through intersection 100.
  • crosswalk signals 104a and 104b may direct the flow of non-motorized users in the east/west direction on the north side of intersection 100
  • crosswalk signals 104c and 104d may direct the flow of non-motorized users in the north/south direction on the east side of intersection 100
  • crosswalk signals 104e and 104f may direct the flow of non-motorized users in the east/west direction on the south side of intersection 100
  • crosswalk signals 104g and 104h may direct the flow of non-motorized users in the north/south direction on the west side of intersection 100.
  • Sensors 106 may be configured to autonomously detect the presence of motorized and non-motorized users at or approaching intersection 100.
  • Sensors 106 may be video cameras that are configured to acquire images of intersection 100 from which motorized user presence and non-motorized user presence may be determined. The images may be classified as motorized user presence data, non-motorized user presence data, or both. The acquired images may be provided to agent 108 for processing. Agent 108 is further described below in conjunction with FIG. 3.
  • At least some of sensors 106 may be air quality monitors, metal detectors, infrared detectors, crosswalk buttons, etc.
  • FIG. 2 illustrates an overview of an environment 200 and selected devices in environment 200, arranged in accordance with at least some embodiments described herein.
  • Environment 200 may include one or more agents 108a-108n, further described below in conjunction with FIG. 3 .
  • Agents 108a-108n may be individually referred to herein as agent 108 or collectively referred to herein as agents 108.
  • The number of agents depicted in environment 200 is for illustration, and one skilled in the art will appreciate that there may be a different number of agents 108.
  • Agents 108a-108n are illustrated as operating in a networked environment using logical connections to each other and one or more remote computing systems, e.g., a coordinator system 202, through a network 204.
  • Network 204 can be a local area network, a wide area network, the Internet, and/or other wired or wireless networks.
  • FIG. 3 illustrates selected components of agent 108, arranged in accordance with at least some embodiments described herein.
  • Agent 108 includes a sensor module 302, a control action computation module 304, a signal control module 306, a communication module 308, and an information data store 310.
  • Additional components (not illustrated) or a subset of the illustrated components can be employed without deviating from the scope of the claimed technology.
  • Sensor module 302 may be configured to communicate with the sensors deployed at the intersection to receive (obtain) sensor data from the sensors. For example, in instances where the sensors are video cameras, sensor module 302 may receive the images and/or video feeds from the coupled sensors. In some embodiments, sensor module 302 may be configured to control the coupled sensors. For example, sensor module 302 may send the sensors instructions to operate the sensors (e.g., power on, power off, reboot, positioning and/or movement instructions, etc.).
  • Control action computation module 304 may be configured to control the traffic signals deployed at the intersection. For example, control action computation module 304 may generate a control action that directs the operation of the traffic signals at the intersection based on the sensor data obtained by sensor module 302. Accordingly, control action computation module 304 is able to generate control actions for the traffic signals (the traffic signals for motorized traffic and the traffic signals for non-motorized traffic) that account for the presence of motorized traffic and non-motorized traffic at the intersection. In some embodiments, control action computation module 304 may apply one or more conditions in generating the control actions for the traffic signals. Additionally or alternatively, control action computation module 304 may incorporate traffic data from one or more other agents (e.g., agents controlling other intersections) in generating the control actions for the traffic signals. Additionally or alternatively, control action computation module 304 may incorporate historical traffic data of the intersection and/or of one or more other intersections in generating the control actions for the traffic signals.
  • Control action computation module 304 may apply Q-learning to generate control actions for the traffic signals that consider both motorized users and non-motorized users at an intersection.
  • Q-learning can be used to determine an optimal action-selection policy for any given (finite) Markov decision process (MDP).
  • Q-learning works by learning an action-value function that provides an expected utility of taking a given action (e.g., generating a given control action) in a given state (e.g., given state of the traffic signals) and following the optimal policy thereafter.
  • A transportation network (e.g., network of roads including intersections) may be abstracted into a directed graph.
  • FIG. 4 illustrates an example directed graph corresponding to a segment of an example transportation network. Each intersection may be represented by a vertex in the directed graph, and a road may correspond to an edge in the directed graph.
  • The flows (e.g., directed connections) may represent traffic, such as vehicular traffic, where $q_{ij}$ is the queue length from intersection i to intersection j. For example, as depicted in the directed graph, $q_{41}$ is the queue length from intersection 4 to intersection 1.
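The directed-graph abstraction can be represented directly as a mapping from directed edges to queue lengths. In the sketch below, the edge set and the queue values are hypothetical and are not read from FIG. 4; only the convention that edge (i, j) carries the queue from intersection i to intersection j follows the text.

```python
# Hypothetical edges and queue lengths: key (i, j) holds the queue length
# from intersection i to intersection j.
queues = {(4, 1): 7, (2, 1): 3, (1, 2): 5, (1, 4): 0}


def incoming_queues(graph, intersection):
    """Return {source: queue length} for every edge that ends at `intersection`."""
    return {i: q for (i, j), q in sorted(graph.items()) if j == intersection}
```

For example, `incoming_queues(queues, 1)` collects the queues on the edges from intersections 2 and 4 into intersection 1.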
  • the various states of the traffic signals deployed at the intersection may be based on the number of vehicles and the number of pedestrians in the various queues in the incoming directions to the intersection.
  • $S^t_{i,d}$ is the state of the traffic signals at intersection i, at day d and time t
  • $q^t_{ji,d}$ is the queue length for vehicles from intersection j to i, at day d and time t
  • $m^t_{ji,d,L}$ is the queue length for pedestrians at the left side from intersection j to i
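As a minimal sketch of how such a state might be assembled, the snippet below stores the vehicle queues on the directed edges (j, i) and the pedestrian queues per side, then collects them for one intersection. The function name `build_state`, the dictionary layout, and the sample numbers are assumptions made for illustration and do not come from the source.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical containers (not from the source):
#   q[(j, i)]        vehicle queue length on the edge from intersection j to i
#   m[(j, i, side)]  pedestrian queue length, side is "L" or "R"
VehicleQueues = Dict[Tuple[int, int], int]
PedestrianQueues = Dict[Tuple[int, int, str], int]


def build_state(i: int, neighbors: Iterable[int],
                q: VehicleQueues, m: PedestrianQueues) -> tuple:
    """Return the state of intersection i as one tuple per incoming direction:
    (vehicle queue, left pedestrian queue, right pedestrian queue)."""
    state = []
    for j in sorted(neighbors):
        state.append((
            q.get((j, i), 0),        # missing entries are treated as empty queues
            m.get((j, i, "L"), 0),
            m.get((j, i, "R"), 0),
        ))
    return tuple(state)


# Illustrative values only: queues into intersection 1 from intersections 2 and 4.
q = {(4, 1): 7, (2, 1): 3}
m = {(4, 1, "L"): 2, (2, 1, "R"): 1}
state = build_state(1, [2, 4], q, m)
```

Using an immutable tuple keeps the state hashable, so it can serve directly as a key in a tabular Q function.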
  • the action set of possible actions, A, for the traffic signals at an intersection may be designed based on the traffic rules applicable to the location of the intersection.
  • the action set may include eight possible actions, $|A| = 8$, as dictated by the applicable traffic rules. Accordingly, only $\{a_i \in A \mid i = 1, 2, \ldots, 8\}$ actions may be chosen at any time slot, as illustrated by the example action set for the four-way, +-shape intersection ( FIG. 6 ). As depicted in FIG.
  • $L_{ij}$ is the traffic signal that controls the flow of traffic from region i to region j
  • $L'_{ij}$ is the left turn traffic signal that controls the flow of traffic from region i to region j. Note that the intersection itself is represented as region 1.
  • the action set may include three possible actions, $|A| = 3$, as dictated by the applicable traffic rules. Accordingly, only $\{a_i \in A \mid i = 1, 2, 3\}$ actions may be chosen at any time slot, as illustrated by the example action set for the three-way, T-shape intersection ( FIG. 8 ).
  • $L_{ij}$ is the traffic signal that controls the flow of traffic from region i to region j
  • $L'_{ij}$ is the left turn traffic signal that controls the flow of traffic from region i to region j. Note that the intersection itself is represented as region 1.
  • the time slot can be set based on operational policy. For example, the time slot can be set to a relatively longer time duration (e.g., 5 to 10 seconds) to avoid having to frequently change the traffic signals. Conversely, the time slot can be set to a relatively shorter time duration (e.g., 1 second) to obtain a faster Q-learning algorithm convergence.
  • Q-learning works by learning an action-value function that provides an expected utility of taking a given action (e.g., generating a given control action) in a given state (e.g., given state of the traffic signals) and following the optimal policy thereafter.
  • $Q(S^t_{i,d}, a^t_{i,d})$ is the Q value for a given state-action pair $(S^t_{i,d}, a^t_{i,d})$;
  • $R^t_{i,d}$ is the reward at intersection i, at day d and time t, as expressed by equation [1];
  • $a^t_{i,d}$ is the action at intersection i, at day d and time t;
  • $S^t_{i,d}$ is the state at intersection i, at day d and time t;
  • $\alpha$ is the learning rate ($0 < \alpha \le 1$),
  • $\sum_{j \in N_i} q^t_{ji,d}$ is the incoming vehicular queues from intersection j to intersection i;
  • (… $\sum_{k \in N_j} q^t_{kj,d}$) is the total vehicular queues at all neighbor intersection j's, including the outgoing vehicular traffic from intersection i to intersection j;
  • $\sum_{j \in N_i} (m^t_{ji,d,L} + m^t_{ji,d,R})$ is the total pedestrian queues at intersection i.
  • $|N_i|$ is 4 for a four-way, +-shaped intersection, and
  • the weights at the neighboring intersections at day d and time t, $W^t_d$, sum up to 1.
  • the reward, $R^t_{i,d}$, is the queue lengths at intersection i, at day d and time t. Accordingly, as the objective is to minimize the lengths of all queues at intersection i, an action that minimizes $R^t_{i,d}$ may be chosen.
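The three queue terms and the learning rate described above can be tied together in a small tabular sketch. Everything here is hedged: the exact form of equation [1] and of the update rule is not reproduced in this text, so the weighting of the neighbor term by per-neighbor weights `w`, the discount factor `gamma`, and the names `reward` and `q_update` are assumptions. Because the reward is a queue length to be minimized, the sketch bootstraps from the minimum next-state Q value rather than the maximum.

```python
from collections import defaultdict


def reward(i, neighbors, q, m, w):
    """Queue-length cost at intersection i (lower is better).

    neighbors[x]    list of intersections adjacent to x
    q[(j, i)]       vehicle queue from j to i
    m[(j, i, side)] pedestrian queue, side "L" or "R"
    w[j]            weight of neighbor j (assumed to sum to 1 over the neighbors of i)
    """
    incoming = sum(q.get((j, i), 0) for j in neighbors[i])
    neighbor_total = sum(
        w[j] * sum(q.get((k, j), 0) for k in neighbors[j])
        for j in neighbors[i]
    )
    pedestrians = sum(
        m.get((j, i, "L"), 0) + m.get((j, i, "R"), 0)
        for j in neighbors[i]
    )
    return incoming + neighbor_total + pedestrians


def q_update(Q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step for a cost-style reward.
    gamma (discount factor) is an assumption; it is not defined in the text."""
    best_next = min(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]


# Illustrative values only.
neighbors = {1: [2, 3], 2: [1], 3: [1]}
q = {(2, 1): 4, (3, 1): 2, (1, 2): 1, (1, 3): 3}
m = {(2, 1, "L"): 1, (3, 1, "R"): 2}
w = {2: 0.5, 3: 0.5}

r = reward(1, neighbors, q, m, w)
Q = defaultdict(float)
new_value = q_update(Q, "s0", 1, r, "s1", [1, 2, 3], alpha=0.5, gamma=0.9)
```

With these sample values the three terms are 6, 2.0, and 3, so the cost is 11.0, and the first update from an all-zero table moves the Q value halfway toward it.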
  • historical traffic data may be incorporated in the determination of the actions.
  • $R^t_{i,d}$ is the reward at intersection i, at day d and time t
  • n is the lag operator
  • p is the number of autoregressive terms (e.g., the number of days of historical traffic data to consider)
  • q is the number of days for the moving-average terms
  • ⁇ n are the parameters (e.g., weights) of the autoregressive part of the model
  • ⁇ n are the parameters (e.g., weights) of
  • traffic data of an intersection may be broadcast to neighbor intersections. Accordingly, traffic data from neighboring intersections may be incorporated to determine the actions at a particular intersection.
  • the traffic data is queue lengths
  • $T^t_j$ is the average vehicular queue length at time t from all neighbor intersections of intersection j
  • $q^t_{kj,d}$ is the queue length for vehicles from intersection k to j, at day d and time t
  • $|N_j|$ is the number of neighboring intersections of intersection j. It follows that the sum of all the neighbors' average queue lengths is $\sum T^t_j$, which can replace the middle additive term in equation [1] above.
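A short sketch of the neighbor-averaging step follows, assuming the average is the plain arithmetic mean of the vehicle queues on all edges into intersection j. The function names and the sample topology are hypothetical.

```python
def neighbor_average(j, neighbors, q):
    """Average vehicle queue length over all edges k -> j,
    where k ranges over the neighbors of intersection j."""
    ks = neighbors[j]
    if not ks:
        return 0.0  # assumption: an isolated intersection contributes nothing
    return sum(q.get((k, j), 0) for k in ks) / len(ks)


def middle_term(i, neighbors, q):
    """Sum of the neighbors' average queue lengths for intersection i,
    usable in place of the neighbor-queue term of the reward."""
    return sum(neighbor_average(j, neighbors, q) for j in neighbors[i])


# Illustrative topology: 1 is adjacent to 2 and 3; 2 is also adjacent to 4.
neighbors = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}
q = {(1, 2): 2, (4, 2): 6, (1, 3): 5}

t2 = neighbor_average(2, neighbors, q)
t3 = neighbor_average(3, neighbors, q)
total = middle_term(1, neighbors, q)
```

Each agent would only need to broadcast its own average, so the amount of data exchanged between neighbors stays constant regardless of how many roads meet at each intersection.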
  • one or more conditions may be applied in determining an action.
  • the conditions on the actions may specify that, if the action at day d and time t is 1 ($a^t_{i,d}$, $i = 1$), then the action at day d at time t+1 cannot be 5, 6, 7, or 8 ($a^{t+1}_{i,d}$, $j \notin \{5, 6, 7, 8\}$); if the action at day d and time t is 5 ($a^t_{i,d}$, $i = 5$), then the action at day d at time t+1 cannot be 1, 2, 3, or 4 ($a^{t+1}_{i,d}$, $j \notin \{1, 2, 3, 4\}$); if the action at day d and time t is 2 or 3 ($a^t_{i,d}$, $i \in \{2, 3\}$), then the action at day d at time t+1 cannot be 5.
  • one solution to account for the conditions may be to sort the Q values in ascending order (e.g., priority queue), then select the smallest one that satisfies the conditions.
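The sort-then-filter selection can be sketched as below. The transition table encodes only the example conditions stated above (after action 1, not 5 through 8; after action 5, not 1 through 4; after action 2 or 3, not 5); the names `FORBIDDEN_NEXT` and `select_action`, the fallback behavior, and the sample Q values are assumptions.

```python
# Actions that may NOT follow a given previous action (example conditions only).
FORBIDDEN_NEXT = {
    1: {5, 6, 7, 8},
    5: {1, 2, 3, 4},
    2: {5},
    3: {5},
}


def select_action(q_values, previous_action):
    """Return the action with the smallest Q value that satisfies the
    transition conditions. q_values maps action -> Q value (a cost)."""
    forbidden = FORBIDDEN_NEXT.get(previous_action, set())
    # Ascending sort: the first permitted entry is the best permitted action.
    for action, _ in sorted(q_values.items(), key=lambda kv: kv[1]):
        if action not in forbidden:
            return action
    # Assumption: if every action is forbidden, keep the previous action.
    return previous_action


# Illustrative Q values for the eight actions of a four-way intersection.
q_values = {1: 4.0, 2: 3.5, 3: 6.0, 4: 5.0, 5: 1.0, 6: 2.0, 7: 7.0, 8: 8.0}

after_1 = select_action(q_values, previous_action=1)
after_2 = select_action(q_values, previous_action=2)
after_4 = select_action(q_values, previous_action=4)
```

A heap would avoid the full sort, but with at most eight actions per intersection the difference is negligible.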
  • An example of another condition may be that a traffic signal that is directing pedestrians to cross an intersection should not turn red while the pedestrians are crossing the intersection.
  • One solution to account for this condition may be to set the time slot to a longer duration to provide sufficient time for pedestrians to cross the intersection.
  • Another solution may be to maintain the current time slot (e.g., the relatively short duration), but change actions only when no pedestrian is crossing the intersection. For example, sensors deployed at the intersections may be able to provide information that may be used to determine whether a pedestrian is crossing the intersection.
  • Another solution may be to not change the action for a specific number of time slots if a pedestrian is crossing the intersection.
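The last option, holding the current action for a number of time slots once a pedestrian is detected, can be sketched as a small stateful guard. The class name `PedestrianGuard`, the `hold_slots` parameter, and the boolean sensor flag are hypothetical; the source does not specify how the hold is counted.

```python
class PedestrianGuard:
    """Suppress action changes for hold_slots time slots after a
    pedestrian is detected crossing (hypothetical helper)."""

    def __init__(self, hold_slots=3):
        self.hold_slots = hold_slots
        self.remaining = 0  # time slots during which the action stays frozen

    def filter(self, current, proposed, pedestrian_crossing):
        """Return the action to apply in this time slot."""
        if pedestrian_crossing:
            # Restart the hold window whenever a crossing is observed.
            self.remaining = self.hold_slots
        if self.remaining > 0:
            self.remaining -= 1
            return current
        return proposed


guard = PedestrianGuard(hold_slots=2)
slot_1 = guard.filter(current=1, proposed=5, pedestrian_crossing=True)
slot_2 = guard.filter(current=1, proposed=5, pedestrian_crossing=False)
slot_3 = guard.filter(current=1, proposed=5, pedestrian_crossing=False)
```

Setting `hold_slots` to 1 reduces this to the simpler rule of changing actions only when no pedestrian is crossing.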
  • Signal control module 306 may be configured to communicate with the traffic signals at the intersection to control (direct) operation of the traffic signals based on the control action generated by control action computation module 304.
  • signal control module 306 may control operation of the traffic signals by transmitting instructions (e.g., electrical signals or other signals depending on the type of traffic signal, etc.) that direct the operation of the traffic signals.
  • Communication module 308 may be configured to couple to one or more remote computing devices or computing systems, such as, by way of example, other remote agents 108, coordinator system 202, etc. Accordingly, communication module 308 may facilitate communication by agent 108 with one or more external components.
  • control action computation module 304 may utilize communication module 308 to communicate with a neighboring agent, for example, to receive traffic data of the neighboring intersection.
  • sensor module 302 and/or signal control module 306 may utilize communication module 308 to communicate with the sensors and/or the traffic signals, respectively.
  • Information data store 310 may be configured to store data, such as, by way of example, traffic data, sensor data, or other data that may be used by agent 108.
  • Information data store 310 may be implemented using any computer-readable storage media suitable for carrying or having data or data structures stored thereon.
  • FIG. 9 illustrates selected components of coordinator system 202, arranged in accordance with at least some embodiments described herein.
  • coordinator system 202 includes a central intelligence module 902, a communication module 904, and an information data store 906.
  • additional components (not illustrated) or a subset of the illustrated components can be employed without deviating from the scope of the claimed technology.
  • Central intelligence module 902 may be configured to communicate with one or more agents 108 to receive (obtain) traffic data (e.g., current traffic data, historical traffic data, sensor data, operating data, etc.) from agents 108.
  • Central intelligence module 902 may also provide traffic data to one or more agents 108, for example, for use in generating control actions and/or otherwise controlling the respective intersections.
  • central intelligence module 902 may be configured to provide route information for use by one or more agents. For example, motorized users and/or non-motorized users may provide their travel route information. Central intelligence module 902 may process the travel route information to determine the travel route information relevant to a geographic area (e.g., one or more intersections, etc.). Central intelligence module 902 can then provide the agent or agents controlling the one or more intersections the relevant travel route information for use by the agent or agents, for example, to generate the control actions for the traffic signals.
  • Communication module 904 may be configured to couple to one or more remote computing devices or computing systems, such as, by way of example, one or more agents, one or more other coordinator systems, one or more traffic control systems, sources of remote data, etc. Similar to communication module 308 discussed above, communication module 904 facilitates communication by coordinator system 202 with one or more external components. For example, central intelligence module 902 may utilize communication module 904 to communicate with an agent, for example, to receive traffic data of the intersection being controlled by the agent.
  • Information data store 906 may be configured to store data, such as, by way of example, traffic data, motorized user data, non-motorized user data, or other data that may be used by coordinator system 202. Similar to information data store 310, information data store 906 may be implemented using any computer-readable storage media suitable for carrying or having data or data structures stored thereon.
  • FIG. 10 illustrates selected components of an example general purpose computing system 1000, which may be used to generate control actions for traffic signals at an intersection, arranged in accordance with at least some embodiments described herein.
  • Computing system 1000 may be configured to implement or direct one or more operations associated with some or all of the components and/or modules associated with agent 108 of FIG. 3 and/or coordinator system 202 of FIG. 9 .
  • Computing system 1000 may include a processor 1002, a memory 1004, and a data storage 1006.
  • Processor 1002, memory 1004, and data storage 1006 may be communicatively coupled.
  • processor 1002 may include any suitable special-purpose or general-purpose computer, computing entity, or computing or processing device including various computer hardware, firmware, or software modules, and may be configured to execute instructions, such as program instructions, stored on any applicable computer-readable storage media.
  • processor 1002 may include a microprocessor, a microcontroller, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a Field-Programmable Gate Array (FPGA), or any other digital or analog circuitry configured to interpret and/or to execute program instructions and/or to process data.
  • processor 1002 may include any number of processors and/or processor cores configured to, individually or collectively, perform or direct performance of any number of operations described in the present disclosure. Additionally, one or more of the processors may be present on one or more different electronic devices, such as different servers.
  • processor 1002 may be configured to interpret and/or execute program instructions and/or process data stored in memory 1004, data storage 1006, or memory 1004 and data storage 1006. In some embodiments, processor 1002 may fetch program instructions from data storage 1006 and load the program instructions in memory 1004. After the program instructions are loaded into memory 1004, processor 1002 may execute the program instructions.
  • any one or more of the components and/or modules of agent 108 and/or coordinator system 202 may be included in data storage 1006 as program instructions.
  • Processor 1002 may fetch some or all of the program instructions from the data storage 1006 and may load the fetched program instructions in memory 1004. Subsequent to loading the program instructions into memory 1004, processor 1002 may execute the program instructions such that the computing system may implement the operations as directed by the instructions.
  • Memory 1004 and data storage 1006 may include computer-readable storage media for carrying or having computer-executable instructions or data structures stored thereon.
  • Such computer-readable storage media may include any available media that may be accessed by a general-purpose or special-purpose computer, such as processor 1002.
  • Such computer-readable storage media may include tangible or non-transitory computer-readable storage media including Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices), or any other storage medium which may be used to carry or store particular program code in the form of computer-executable instructions or data structures and which may be accessed by a general-purpose or special-purpose computer. Combinations of the above may also be included within the scope of computer-readable storage media.
  • Computer-executable instructions may include, for example, instructions and data configured to cause processor 1002 to perform a certain operation or group of operations.
  • computing system 1000 may include any number of other components that may not be explicitly illustrated or described herein.
  • FIG. 11 is a flow diagram 1100 that illustrates an example process to generate control actions for traffic signals at an intersection based at least in part on a non-motorized user queue length that may be performed by an agent such as agent 108 of FIG. 3 , arranged in accordance with at least some embodiments described herein.
  • Example processes and methods may include one or more operations, functions or actions as illustrated by one or more of blocks 1102, 1104, 1106, 1108, and/or 1110, and may in some embodiments be performed by a computing system such as computing system 1000 of FIG. 10 .
  • the operations described in blocks 1102-1110 may also be stored as computer-executable instructions in a computer-readable medium such as memory 1004 and/or data storage 1006 of computing system 1000.
  • the example process to generate control actions for traffic signals at an intersection based at least in part on a non-motorized user queue length may begin with block 1102 ("Determine Motorized User Queue Length"), where an agent configured to control the traffic signals deployed at the intersection may determine the lengths of the motorized user queues at the intersection.
  • the agent may transmit the traffic data (e.g., the lengths of the motorized user queues), for example, to one or more neighboring agents and/or one or more traffic coordinator systems.
  • Block 1102 may be followed by block 1104 ("Determine Non-Motorized User Queue Length"), where the agent configured to control the traffic signals deployed at the intersection may determine the lengths of the non-motorized user queues at the intersection.
  • the agent may transmit the traffic data (e.g., the lengths of the non-motorized user queues), for example, to one or more neighboring agents and/or one or more traffic coordinator systems.
  • Block 1104 may be followed by block 1106 ("Determine a Control Action"), where the agent configured to control the traffic signals deployed at the intersection may determine an action (control action) for the traffic signals at the intersection based on the determined lengths of the motorized user queues and the non-motorized user queues.
  • the agent may incorporate historical traffic data of the intersection and/or one or more other intersections in determining the action.
  • the agent may incorporate traffic data of one or more neighboring intersections in determining the action.
  • Block 1106 may be followed by decision block 1108 ("Control Action Satisfies a Condition?"), where the agent configured to control the traffic signals deployed at the intersection may determine whether the action satisfies a condition (e.g., a condition placed on the operation of the traffic signals). If the agent determines that the action does not satisfy any one of the conditions, decision block 1108 may be followed by block 1106 where the agent may determine another action for the traffic signals at the intersection.
  • decision block 1108 may be followed by block 1110 ("Generate the Control Action"), where the agent configured to control the traffic signals deployed at the intersection may control the traffic signals at the intersection according to the action (e.g., cause signal control module 306 to control operation of the traffic signals in a manner consistent with the action).
  • FIG. 12 illustrates an example architecture for multiple Q-learning categories, arranged in accordance with at least some embodiments described herein.
  • the architecture may include a data collection layer 1202, a clustering layer 1204, and a traffic signal control layer 1206.
  • Data collection layer 1202 may include one or more computing systems operating in a networked environment 1208.
  • networked environment 1208 may include one or more server computing systems 1210a-1210f logically connected to each other and a central data store 1212, through a network including one or more network devices 1214a and 1214b.
  • Server computing systems 1210a-1210f may be individually referred to herein as server computing system 1210 or collectively referred to herein as server computing systems 1210.
  • Network devices 1214a and 1214b may be individually referred to herein as network device 1214 or collectively referred to herein as network devices 1214.
  • Central data store 1212 may be configured to store data, such as, by way of example, traffic data, sensor data, and other data that may be used by server computing systems 1210.
  • Central data store 1212 may be implemented using any computer-readable storage media suitable for carrying or having data or data structures stored thereon.
  • Network devices 1214 may be a computing system or device, such as a router or other networking device, which facilitates the sending and receiving of data and/or information (e.g., data packets) through and between one or more networks.
  • the network may be a local area network, a wide area network, the Internet, and/or other wired or wireless networks.
  • each server computing system 1210 may include an agent, such as agent 108, and may be configured to control the flow of traffic through an intersection, such as intersection 100 of FIG. 1 .
  • server computing system 1210 may be deployed at an intersection, and may send the traffic data of the intersection to central data store 1212.
  • the traffic data may include information regarding the control of traffic through the intersection, such as traffic signal state information, control actions, data from the sensors deployed at the intersection, and the like.
  • Central data store 1212 may store or be a repository of the historical traffic data provided by server computing systems 1210.
  • Server computing system 1210 deployed at each intersection may be configured to consult central data store 1212 to perform the traffic data analysis and control decisions (e.g., generate a control action for the traffic signals at the intersection) to control the flow of traffic through the intersection.
  • server computing system 1210 may store some or all of the traffic data in a local data store, such as information data store 310 of FIG. 3 .
  • Clustering layer 1204 may include the groups or clusters of the historical traffic data at each intersection.
  • the historical traffic data of each intersection may be grouped or clustered into one or more traffic pattern clusters.
  • Each traffic pattern cluster may include the state/action/reward history records associated with the respective traffic pattern cluster.
  • clustering layer 1204 may include historical traffic data clustered into multiple traffic pattern clusters at one or more intersections 1216a-1216c. Intersections 1216a-1216c may be individually referred to herein as intersection 1216 or collectively referred to herein as intersections 1216.
  • server computing system 1210 (e.g., server computing system 1210a) at intersection 1216 (e.g., intersection 1216a) may cluster the historical traffic data of intersection 1216 (e.g., intersection 1216a) into one or more traffic pattern clusters.
  • server computing system 1210 at intersection 1216 may consider historical traffic data of one or more neighboring intersections in generating the traffic pattern clusters for intersection 1216.
  • historical traffic data of neighboring intersections may be assigned weights in accordance with the relevancy of the historical traffic data. For example, the assigned weights may be based on the closeness or nearness of the neighbor intersection to intersection 1216.
  • Traffic signal control layer 1206 may include the Q-learning categories that may be applied (e.g., used) to generate the control actions for the traffic signals. As depicted, each Q-learning category may be associated with a corresponding traffic pattern cluster at each intersection 1216. Accordingly, each Q-learning category may be used to determine control actions for the traffic signals at an intersection based on the historical traffic data, including the state/action/reward history records, associated with the traffic pattern cluster that corresponds to the respective Q-learning category.
  • server computing system 1210 (e.g., server computing system 1210b) at intersection 1216 (e.g., intersection 1216b) may select a Q-learning category based on traffic data currently observed at intersection 1216 (e.g., intersection 1216b), and apply the selected Q-learning category to generate a control action for the traffic signals at intersection 1216 (e.g., intersection 1216b).
  • environment 1208 may include any number of server computing systems, central data stores, and/or network devices, such as hundreds or thousands of server computing systems, more than one central data store, and a different number of network devices.
  • environment 1208 may also include one or more other computing systems, such as coordinator system 202.
  • clustering layer 1204 may include a different number of intersections, and each intersection may include a different number of traffic pattern clusters.
  • although each intersection 1216 is illustrated as including the same number of traffic pattern clusters, one or more intersections 1216 may include a different number of traffic pattern clusters.
  • FIG. 13 illustrates an example traffic pattern cluster Q-learning category association at an intersection 1300, arranged in accordance with at least some embodiments described herein.
  • intersection 1300 includes four traffic pattern clusters, TP1 1302a, TP2 1302b, TP3 1302c, and TP4 1302d, and four Q-learning categories, Q 1 1304a, Q 2 1304b, Q 3 1304c, and Q 4 1304d.
  • Q-learning category Q 1 1304a corresponds to traffic pattern cluster TP1 1302a
  • Q-learning category Q 2 1304b corresponds to traffic pattern cluster TP2 1302b
  • Q-learning category Q 3 1304c corresponds to traffic pattern cluster TP3 1302c
  • Q-learning category Q 4 1304d corresponds to traffic pattern cluster TP4 1302d.
  • an operator with knowledge of traffic data analysis may determine the number of Q-learning categories and the corresponding thresholds for each of the Q-learning categories, for example, based on historical traffic statistics and/or operational policies.
  • Examples of historical traffic statistics may include motorized traffic arrival rate or the number of arriving motorized vehicles for any time window in a day, non-motorized traffic arrival rate or the number of arriving non-motorized users for any time window in a day, queue lengths at intersections for any time window in a day, average frequency of accident occurrence on a particular road, average motorized traffic waiting time for any time window in a day, average non-motorized traffic waiting time for any time window in a day, types of events held at a nearby location, and the like.
  • Examples of operational policies may include maximum green / red light time for motorized traffic, maximum green / red light time for non-motorized traffic, minimum green / red light time for motorized traffic, minimum green / red light time for non-motorized traffic, priority for different directions (e.g. if the intersection interconnects a very wide road with multiple lanes and a narrow road with only one lane), expected average queue length for motorized and non-motorized traffic, and the like.
  • the thresholds may include the properties that define or characterize the different traffic patterns corresponding to each of the Q-learning categories.
  • the operator may input or provide the number of Q-learning categories and the corresponding thresholds for each of the Q-learning categories to a computing system, such as server computing system 1210, at intersection 1300.
  • the computing system may then cluster (e.g., categorize) the historical traffic data of intersection 1300 into traffic pattern clusters in accordance with the specified (e.g., input) thresholds to generate the specified number of Q-learning categories.
  • any one or more of the numerous conventional clustering techniques may be utilized to cluster the historical traffic data of the intersection.
  • the operator may then determine the number of Q-learning categories and the corresponding thresholds for each of the Q-learning categories based on an analysis of the clustered historical traffic data.
  • the operator may determine that four Q-learning categories, one each of normal traffic, heavy traffic, low traffic, and traffic jam, may be appropriate at intersection 1300 based on an analysis of the historical traffic statistics (e.g., historical traffic data) of intersection 1300.
  • the operator may specify the thresholds, such as traffic arrival rates, queue lengths, average waiting time at the intersection, and/or the like, for the four Q-learning categories.
  • the thresholds may include a traffic rate Threshold1, a traffic rate Threshold2, and a traffic rate Threshold3 that delineate (e.g., define) the four Q-learning categories.
  • Observed traffic rate at an intersection below Threshold1 may correspond to (e.g., indicate) low traffic
  • observed traffic rate at an intersection above Threshold3 (e.g., observed traffic rate > Threshold3) may correspond to (e.g., indicate) traffic jam.
  • the traffic pattern cluster TP1 1302a may correspond to normal traffic
  • traffic pattern cluster TP2 1302b may correspond to heavy traffic
  • traffic pattern cluster TP3 1302c may correspond to low traffic
  • traffic pattern cluster TP4 1302d may correspond to traffic jam.
  • Q-learning category Q 1 1304a corresponding to traffic pattern cluster TP1 1302a may be applied to control actions for the traffic signals at intersection 1300 during normal traffic;
  • Q-learning category Q 2 1304b corresponding to traffic pattern cluster TP2 1302b may be applied to control actions for the traffic signals at intersection 1300 during heavy traffic;
  • Q-learning category Q 3 1304c corresponding to traffic pattern cluster TP3 1302c may be applied to control actions for the traffic signals at intersection 1300 during low traffic;
  • Q-learning category Q 4 1304d corresponding to traffic pattern cluster TP4 1302d may be applied to control actions for the traffic signals at intersection 1300 during traffic jams.
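The mapping from an observed traffic rate to a Q-learning category can be sketched as a simple threshold test. The text states only that a rate below Threshold1 indicates low traffic and a rate above Threshold3 indicates a traffic jam; treating Threshold2 as the boundary between normal and heavy traffic, and assigning a rate exactly equal to Threshold2 to heavy traffic, are assumptions, as are the function name and the sample threshold values.

```python
def select_category(rate, threshold1, threshold2, threshold3):
    """Map an observed traffic rate to a Q-learning category label,
    following the TP1..TP4 / Q1..Q4 association of the example."""
    if rate < threshold1:
        return "Q3"  # TP3: low traffic
    if rate > threshold3:
        return "Q4"  # TP4: traffic jam
    if rate < threshold2:
        return "Q1"  # TP1: normal traffic (boundary is an assumption)
    return "Q2"      # TP2: heavy traffic (boundary is an assumption)


# Illustrative thresholds, e.g. vehicles arriving per minute.
T1, T2, T3 = 10, 30, 60
labels = [select_category(rate, T1, T2, T3) for rate in (5, 20, 45, 60, 61)]
```

The selected label would then pick which Q table is consulted and updated for the current time slot.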
  • the number of Q-learning categories depicted in intersection 1300, and the thresholds that delineate the Q-learning categories, are for illustration, and one skilled in the art will appreciate that there may be a different number of Q-learning categories.
  • FIG. 14 is a flow diagram 1400 that illustrates an example process to cluster historical traffic data that may be performed by an agent such as agent 108 of FIG. 3 , arranged in accordance with at least some embodiments described herein.
  • Example processes and methods may include one or more operations, functions or actions as illustrated by one or more of blocks 1402, 1404, 1406, 1408, and/or 1410, and may in some embodiments be performed by a computing system such as computing system 1000 of FIG. 10 .
  • the operations described in blocks 1402-1410 may also be stored as computer-executable instructions in a computer-readable medium such as memory 1004 and/or data storage 1006 of computing system 1000.
  • the example process to cluster historical traffic data may begin with block 1402 ("Determine Number of Q-learning Categories and Corresponding Thresholds"), where an operator may decide the number of Q-learning categories and corresponding thresholds for controlling the traffic signals at an intersection.
  • the thresholds for a specific Q-learning category may characterize the Q-learning category.
  • the operator may analyze the historical traffic statistics (e.g., the historical traffic data) of the intersection, and decide the number of Q-learning categories and the corresponding thresholds based on the analysis of the historical traffic statistics and/or operational policies for controlling the traffic signals at the intersection.
  • the operator may utilize one or more conventional clustering techniques to cluster the historical traffic data of the intersection, and the operator may decide the number of Q-learning categories and the corresponding thresholds based on an analysis of the clustered historical traffic data and/or operational policies for controlling the traffic signals at the intersection.
  • Block 1402 may be followed by block 1404 ("Configure Computing System with the Thresholds"), where the operator may configure a computing system, such as an agent (e.g., agent 108 of FIG. 3 ) configured to control the traffic signals deployed at the intersection, with the thresholds that correspond to the Q-learning categories determined at block 1402.
  • Block 1404 may be followed by block 1406 ("Obtain Historical Traffic Data"), where the agent configured to control the traffic signals at the intersection may obtain the historical traffic data of the intersection.
  • the agent may obtain some or all of the historical traffic data of the intersection from a remote data store, such as central data store 1212. Additionally or alternatively, the agent may obtain some or all of the historical traffic data of the intersection from a local data store, such as information data store 310.
  • Block 1406 may be followed by block 1408 ("Cluster Historical Traffic Data into Traffic Pattern Clusters based on the Thresholds"), where the agent may cluster the historical traffic data of the intersection into traffic pattern clusters based on the specified thresholds.
  • the agent may cluster the historical traffic data using one or more conventional clustering techniques.
  • Block 1408 may be followed by block 1410 ("Associate a Q-learning Category with a Traffic Pattern Cluster"), where the agent may associate each Q-learning category determined at block 1402 with a corresponding traffic pattern cluster. Accordingly, a Q-learning category associated with a specific traffic pattern cluster is able to determine an optimal action for the traffic signals at the intersection based on its own historical traffic data (e.g., the historical traffic data associated with the specific traffic pattern cluster). In some embodiments, the operator may associate an event to one or more of the traffic pattern clusters, for example, based on traffic data observed during the occurrence of past events.
  • the operator may compare the characteristics (e.g., historical traffic data) of a traffic pattern cluster to recorded traffic data that actually occurred during an event, and associate the traffic pattern cluster to the event based on the similarity between the characteristics of the traffic pattern cluster and the recorded traffic data that actually occurred during the event.
  • the agent may associate an event to one or more of the traffic pattern clusters based on traffic data observed during the occurrence of past events.
  • the number of thresholds provided in the example above is for illustration, and one skilled in the art will appreciate that there may be a different number of thresholds.
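The threshold-based clustering of blocks 1408 and 1410 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `cluster_by_thresholds`, the sample traffic rates, and the threshold values 10 and 20 are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def cluster_by_thresholds(samples, thresholds):
    """Group traffic samples into clusters delimited by sorted thresholds.

    A sample lands in cluster i, where i is the number of thresholds that
    are less than or equal to the sample, so n thresholds give n + 1 clusters.
    """
    bounds = sorted(thresholds)
    clusters = defaultdict(list)
    for value in samples:
        clusters[bisect_right(bounds, value)].append(value)
    return dict(clusters)


# Hypothetical incoming traffic rates (vehicles per time slot).
history = [3, 12, 7, 25, 14, 2, 30]
clusters = cluster_by_thresholds(history, thresholds=[10, 20])

# Associate one Q-learning category label with each traffic pattern cluster.
categories = {cluster_id: f"Q{cluster_id + 1}" for cluster_id in clusters}
```

Each category then learns only from the samples in its own cluster, which matches the association described for block 1410.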
  • FIG. 15 is a flow diagram 1500 that illustrates an example process to generate control actions for traffic signals at an intersection based at least in part on a Q-learning category that may be performed by an agent such as agent 108 of FIG. 3 , arranged in accordance with at least some embodiments described herein.
  • Example processes and methods may include one or more operations, functions or actions as illustrated by one or more of blocks 1502, 1504, 1506, 1508, 1510, 1512, and/or 1514, and may in some embodiments be performed by a computing system such as computing system 1000 of FIG. 10 .
  • the operations described in blocks 1502-1514 may also be stored as computer-executable instructions in a computer-readable medium such as memory 1004 and/or data storage 1006 of computing system 1000.
  • the example process to generate control actions for traffic signals at an intersection based at least in part on a Q-learning category may begin with block 1502 ("Monitor Traffic Data"), where a computing system, such as an agent (e.g., agent 108 of FIG. 3 ) configured to control the traffic signals deployed at the intersection, may monitor the traffic data (e.g., the incoming traffic rate) at the intersection.
  • the monitored traffic data may be for a particular (e.g., current) time slot.
  • Block 1502 may be followed by block 1504 ("Determine Traffic Pattern Cluster that Corresponds to the Monitored Traffic Data"), where the agent may determine a traffic pattern cluster that corresponds to the monitored traffic data.
  • the agent may determine a traffic pattern cluster occurring at the intersection at the particular time slot based on the monitored traffic data. That is, the agent may identify the traffic pattern cluster to which the currently monitored traffic data belongs. For example, there may be multiple clusters of historical traffic data (e.g., multiple traffic pattern clusters) of the intersection, and the agent may identify one of the multiple traffic pattern clusters as being the traffic pattern cluster to which the currently monitored traffic data belongs.
  • the currently monitored traffic data may correspond to the current state of the intersection, for example, S t i,d , and the agent may identify the traffic pattern to which S t i,d belongs. The agent may then determine the traffic pattern cluster to which the identified traffic pattern belongs.
  • Block 1504 may be followed by decision block 1506 ("Traffic Pattern Cluster Change?"), where the agent may determine whether there is a change in the traffic pattern cluster.
  • the agent may determine whether the traffic pattern cluster that corresponds to the currently monitored traffic data is different from the traffic pattern cluster that is associated with the Q-learning category that the agent is currently using to generate the control actions for the traffic signals at the intersection. That is, the agent may determine whether the traffic pattern cluster at the particular time slot is different than the traffic pattern cluster that occurred in the preceding time slot.
  • decision block 1506 may be followed by block 1502, where the agent continues to monitor the traffic data at the intersection. That is, as there is no change in the traffic pattern cluster, the agent may continue to use the current Q-learning category to generate the control actions for the traffic signals at the intersection.
  • decision block 1506 may be followed by decision block 1508 ("False Positive Traffic Pattern Cluster Change?"), where the agent may determine whether the change in the traffic pattern cluster is a false positive change.
  • the agent may avoid unnecessarily and/or frequently changing traffic pattern clusters when the change in traffic pattern is temporary (e.g., the change to a different traffic pattern is for a small number of time slots).
  • FIG. 16 illustrates an example false positive traffic pattern cluster change, arranged in accordance with at least some embodiments described herein.
  • the historical traffic data at an intersection may be clustered based on queue lengths (e.g., motorized user queue lengths and non-motorized user queue lengths). Based on queue lengths, the historical traffic data of the intersection may be clustered into two traffic pattern clusters, TP1 and TP2, and the two traffic pattern clusters may be associated with respective Q-learning categories, Q1 and Q2.
  • from about time 0 to about time 50 (e.g., time slot 0 to time slot 50), traffic data corresponding to traffic pattern cluster TP2 may have been observed at the intersection.
  • Q-learning category Q2 may have been used to control the traffic signals at the intersection during this period.
  • at about time 50 (e.g., time slot 50), traffic data corresponding to traffic pattern cluster TP1 may have been observed at the intersection.
  • Q-learning category Q1 may have been used to control the traffic signals at the intersection from shortly after about time 50 (e.g., time slot 50).
  • at about time 90 (e.g., time slot 90), traffic data corresponding to traffic pattern cluster TP2 may have been observed at the intersection.
  • the traffic data corresponding to traffic pattern cluster TP2 may have been observed for a very short period of time (e.g., small number of time slots), and, at about time 93 (e.g., time slot 93), traffic data corresponding to traffic pattern cluster TP1 may again have been observed.
  • it may not be beneficial to change Q-learning categories from Q1 to Q2 for the short period of time from about time 90 to about time 93 (e.g., small number of time slots).
  • the change from traffic pattern cluster TP1 to traffic pattern cluster TP2 at about time 90, which lasts only until about time 93, may be considered (e.g., processed as) a false positive traffic pattern cluster change.
  • the number of Q-learning categories provided in the example above is for illustration, and one skilled in the art will appreciate that there may be a different number of Q-learning categories.
  • the agent may monitor a change in the traffic pattern cluster at the intersection to determine that the new (e.g., different) traffic pattern cluster has been monitored for a specific number of consecutive time slots K, where K >> 1, before determining that the change in traffic pattern cluster is not a false positive.
  • the agent may monitor a change in the traffic pattern cluster at the intersection to determine that the new (e.g., different) traffic pattern cluster has been monitored for at least a certain percentage or number of a specific number of consecutive time slots before determining that the change in traffic pattern cluster is not a false positive.
  • the agent may determine that the new traffic pattern cluster has been monitored for at least a specific number of time slots M in the past (e.g., preceding) specific number of consecutive time slots N, where N >> 1 and 1 ≤ M ≤ N, before determining that the change in traffic pattern cluster is not a false positive.
  • the thresholds K, M, and/or N may be specified based on an operational policy or policies for controlling the traffic signals at the intersection. Additionally or alternatively, the thresholds K, M, and/or N may vary for the different traffic patterns (e.g., traffic pattern clusters).
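The two false positive checks described above (K consecutive time slots, or at least M of the preceding N time slots) can be sketched as below. The function names and the sample sequence are hypothetical; the sequence loosely mirrors FIG. 16, where TP2 appears for only three time slots.

```python
def confirmed_by_consecutive(observed, new_cluster, k):
    """Return True if the last k observed time slots all show new_cluster."""
    return len(observed) >= k and all(c == new_cluster for c in observed[-k:])


def confirmed_by_m_of_n(observed, new_cluster, m, n):
    """Return True if new_cluster appears in at least m of the last n slots."""
    window = observed[-n:]
    return len(window) == n and window.count(new_cluster) >= m


# Hypothetical per-slot cluster labels: a long TP1 run, then a brief TP2 burst.
slots = ["TP1"] * 5 + ["TP2"] * 3

# With K = 5 the three-slot burst is treated as a false positive.
is_real_change = confirmed_by_consecutive(slots, "TP2", k=5)

# With M = 4 of N = 6 the burst is also rejected (only 3 of the last 6 slots).
is_real_change_window = confirmed_by_m_of_n(slots, "TP2", m=4, n=6)
```

The M-of-N form tolerates isolated slots of the old pattern inside the window, while the K-consecutive form resets on any interruption.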
  • the agent may determine whether the change in the traffic pattern cluster is a false positive change based on a transitional phase. For example, upon observing traffic data that corresponds to a traffic pattern cluster, e_k, and before changing to the traffic pattern cluster, e_k, the agent may start to count the number of consecutive time slots (e.g., measure of time) in which traffic data corresponding to the traffic pattern cluster, e_k, is observed to determine a transitional phase, L_k, for determining a change to the traffic pattern cluster, e_k. If L_k exceeds a specific threshold, R_k, the agent may determine that the change in traffic pattern cluster is not a false positive.
  • the agent may make a change to e_k upon determining that L_k > R_k.
  • the threshold R_k may be arbitrarily specified. Additionally or alternatively, the threshold R_k may be specified based on historical traffic data of the intersection. In some embodiments, the threshold R_k may vary based on the frequency of occurrence of e_k. That is, the threshold R_k may be based on a probability of the occurrence of e_k. For example, if traffic data corresponding to e_k was frequently observed at the intersection, threshold R_k may be lower (e.g., set to a smaller number).
  • conversely, if traffic data corresponding to e_k was infrequently observed at the intersection, threshold R_k may be higher (e.g., set to a higher number).
  • R_k may vary between a minimum value and a maximum value. The minimum value and/or the maximum value may vary based on the frequency of occurrence of e_k.
  • the agent may update (e.g., re-calculate) R_k upon determining a change in the traffic pattern cluster at the intersection. For example, the agent may update R_k to account for the currently observed traffic pattern cluster at the intersection.
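The transitional phase test can be sketched as a small tracker that counts consecutive time slots for a candidate cluster and switches only once the count exceeds that cluster's threshold. The class `TransitionTracker`, the function `transition_threshold`, the linear interpolation between a minimum and maximum, and the bounds 3 and 12 are assumptions for illustration; the description only states that the threshold may be lower for frequently observed clusters.

```python
def transition_threshold(count_k, total, r_min=3, r_max=12):
    """Hypothetical threshold rule: frequent clusters get a value near r_min."""
    frequency = count_k / total if total else 0.0
    return r_max - frequency * (r_max - r_min)


class TransitionTracker:
    """Track the active cluster and the run length of a candidate cluster."""

    def __init__(self, current):
        self.current = current
        self.candidate = None
        self.run_length = 0  # transitional phase of the candidate, in slots

    def observe(self, cluster, thresholds):
        """Process one time slot and return the cluster in use afterwards."""
        if cluster == self.current:
            # The old pattern is back: any pending change was a false positive.
            self.candidate, self.run_length = None, 0
        elif cluster == self.candidate:
            self.run_length += 1
        else:
            self.candidate, self.run_length = cluster, 1
        if self.candidate is not None and self.run_length > thresholds[self.candidate]:
            self.current = self.candidate
            self.candidate, self.run_length = None, 0
        return self.current


# Hypothetical thresholds and a sequence resembling FIG. 17.
thresholds = {"e1": 3, "e2": 3, "e3": 4}
tracker = TransitionTracker("e2")
in_use = [tracker.observe(c, thresholds) for c in ["e1"] * 4 + ["e3"] * 2 + ["e1"]]
```

In this run the change to e1 is accepted on the fourth slot, while the two-slot appearance of e3 never exceeds its threshold and is discarded.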
  • FIG. 17 is a time series diagram illustrating an example traffic pattern cluster change based on a transitional phase, arranged in accordance with at least some embodiments described herein.
  • while using a Q-learning category Q_2, which is associated with a traffic pattern cluster e_2, to control the traffic signals at an intersection, an agent at the intersection may observe traffic data corresponding to a traffic pattern cluster e_1.
  • although traffic pattern cluster e_1 is observed at the intersection, the agent may continue to use Q-learning category Q_2 to control the traffic signals.
  • rather than changing traffic pattern clusters and corresponding Q-learning categories, the agent may determine a transitional phase L_1 for traffic pattern cluster e_1.
  • upon determining that transitional phase L_1 exceeds a corresponding threshold R_1 (e.g., L_1 > R_1), the agent may determine that the change in traffic pattern cluster from e_2 to e_1 is not a false positive and, accordingly, use Q-learning category Q_1 to control the traffic signals. While using Q-learning category Q_1 to control the traffic signals at the intersection, the agent may observe traffic data corresponding to a traffic pattern cluster e_3. Although traffic pattern cluster e_3 is observed at the intersection, the agent may continue to use Q-learning category Q_1 to control the traffic signals. Rather than changing traffic pattern clusters and corresponding Q-learning categories, the agent may determine a transitional phase L_3 for traffic pattern cluster e_3.
  • at the end of transitional phase L_3 (e.g., when the traffic data corresponding to traffic pattern cluster e_1 is again observed), the agent may determine that transitional phase L_3 does not exceed a corresponding threshold R_3 (e.g., L_3 ≤ R_3), and, as a result, the agent may determine that the change in traffic pattern cluster from e_1 to e_3 is a false positive. Having determined that the change from traffic pattern cluster e_1 to e_3 is a false positive, the agent may continue to use Q-learning category Q_1 to control the traffic signals.
  • decision block 1508 may be followed by block 1502, where the agent continues to monitor the traffic data at the intersection. That is, as the change in the traffic pattern cluster is a false positive, the agent may continue to use the current Q-learning category to generate the control actions for the traffic signals at the intersection.
  • decision block 1508 may be followed by block 1510 ("Select Q-learning Category and Corresponding Historical Traffic Data that Correspond to the Traffic Pattern Cluster"), where the agent may select the Q-learning category that corresponds to the current traffic pattern cluster (e.g., the traffic pattern cluster associated with the traffic data currently monitored at the intersection).
  • the agent may retrieve the historical traffic data that corresponds to the current traffic pattern cluster, for example, from a local data store. Additionally or alternatively, the agent may retrieve some or all of the historical traffic data from a remote data store.
  • Block 1510 may be followed by block 1512 ("Generate Control Action Using Selected Q-learning Category, Corresponding Historical Traffic Data, and Monitored Traffic Data"), where the agent may use the Q-learning category, the retrieved historical traffic data, and the currently monitored (e.g., currently observed) traffic data at the intersection, to generate a control action for the traffic signals at the intersection.
  • the agent may input the current state of the intersection, S t i,d , as a new state to the Q-learning category, and use the Q-learning category to calculate the immediate reward based on S t i,d , to determine a control action for the traffic signals at the intersection.
  • Block 1512 may be followed by block 1514 ("Save Control Action Information"), where the agent may save the control action and the information related to the control action (e.g., state/action/reward information) in a data store. Accordingly, the control action and its related information may become part of the historical traffic data of the traffic pattern cluster associated with the Q-learning category.
  • the agent may maintain a record of the count of a number of times (e.g., frequency) the traffic pattern (e.g., traffic pattern cluster) was observed at the intersection.
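Blocks 1510 through 1514 can be sketched as an agent that keeps one Q-table and one history per traffic pattern cluster. Everything named here (`MultiCategoryAgent`, the state label `long_queue`, the action names) is hypothetical, and the greedy action choice is a simplification rather than the disclosed selection rule.

```python
from collections import defaultdict


class MultiCategoryAgent:
    """Keep a separate Q-table and history for each traffic pattern cluster."""

    def __init__(self, cluster_ids, actions):
        self.actions = list(actions)
        # One table per cluster, mapping (state, action) to a Q-value.
        self.q_tables = {c: defaultdict(float) for c in cluster_ids}
        self.history = {c: [] for c in cluster_ids}
        self.seen = defaultdict(int)  # how often each cluster was in use

    def control_action(self, cluster, state):
        """Pick the highest-valued action from the selected cluster's table."""
        q = self.q_tables[cluster]
        return max(self.actions, key=lambda a: q[(state, a)])

    def save(self, cluster, state, action, reward):
        """Store state/action/reward so it joins that cluster's history."""
        self.history[cluster].append((state, action, reward))
        self.seen[cluster] += 1


agent = MultiCategoryAgent(["TP1", "TP2"], ["extend_green", "switch_phase"])
# Pretend the TP2 category has already learned to prefer switching phases.
agent.q_tables["TP2"][("long_queue", "switch_phase")] = 1.0

action = agent.control_action("TP2", "long_queue")
agent.save("TP2", "long_queue", action, reward=-4.0)
```

Because the tables are independent, experience gathered under one traffic pattern cluster does not overwrite what was learned under another.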
  • FIG. 18 is a flow diagram 1800 that illustrates an example process to adjust a learning rate of Q-learning that may be performed by an agent such as agent 108 of FIG. 3 , arranged in accordance with at least some embodiments described herein.
  • Example processes and methods may include one or more operations, functions or actions as illustrated by one or more of blocks 1802, 1804, 1806, 1808, 1810, 1812, and/or 1814, and may in some embodiments be performed by a computing system such as computing system 1000 of FIG. 10 .
  • the operations described in blocks 1802-1814 may also be stored as computer-executable instructions in a computer-readable medium such as memory 1004 and/or data storage 1006 of computing system 1000.
  • the example process to adjust a learning rate of Q-learning may begin with block 1802 ("Monitor Traffic Data"), where a computing system, such as an agent (e.g., agent 108 of FIG. 3 ) configured to control the traffic signals deployed at the intersection, may monitor the traffic data at the intersection.
  • the monitored traffic data may be for a particular (e.g., current) time slot.
  • the monitoring of the traffic data may be for metrics such as traffic rate (e.g., motorized traffic and/or non-motorized traffic arriving at the intersection), queue length (e.g., motorized user queue lengths and/or non-motorized user queue lengths at the intersection), waiting time (e.g., motorized traffic and/or non-motorized traffic average waiting time at the intersection), accident, and/or the like.
  • the metrics may be determined based on an operational policy or policies for controlling the traffic signals at the intersection.
  • Q-learning works by learning an action-value function that provides an expected utility of taking a given action (e.g., generating a given control action) in a given state (e.g., given state of the traffic signals) and following the optimal policy thereafter.
  • Q-learning equation [2] above may assume a constant or non-varying learning rate α throughout the entire learning procedure.
  • the learning rate α in Q-learning equation [2] may depend on the status of the traffic pattern (e.g., whether the traffic pattern at the intersection has changed).
  • the learning rate of Q-learning determines to what extent the newly observed traffic data overrides the older traffic data in learning the action-value function.
  • a learning rate of 0 causes Q-learning to not learn anything (e.g., newly observed traffic data is not considered), while a learning rate of 1 causes Q-learning to consider only the newly observed traffic data.
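The effect of the learning rate can be seen in the textbook tabular Q-learning update, sketched below. Equation [2] is not reproduced in this passage, so the form used here is the standard one and is assumed, not confirmed, to match it; the discount factor `gamma` and the numeric values are illustrative only.

```python
def q_update(q_old, reward, best_next_q, alpha, gamma=0.9):
    """Standard Q-learning update for a single (state, action) pair.

    alpha weights the new estimate (reward plus discounted best next value)
    against the previously learned value q_old.
    """
    return (1 - alpha) * q_old + alpha * (reward + gamma * best_next_q)


# With alpha = 0 the old value is kept; nothing is learned.
unchanged = q_update(5.0, reward=1.0, best_next_q=10.0, alpha=0.0)

# With alpha = 1 only the newly observed information is used.
replaced = q_update(5.0, reward=1.0, best_next_q=10.0, alpha=1.0)

# Intermediate values blend old and new information.
blended = q_update(5.0, reward=1.0, best_next_q=10.0, alpha=0.5)
```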
  • Block 1802 may be followed by block 1804 ("Determine Traffic Pattern that Corresponds to the Monitored Traffic Data"), where the agent may determine a traffic pattern that corresponds to the monitored traffic data.
  • the agent may determine a traffic pattern occurring at the intersection at the particular time slot based on the monitored traffic data. For example, the traffic pattern may be based on the traffic rate at the intersection. Additionally or alternatively, the traffic pattern may be based on the queue lengths at the intersection.
  • an operator may specify the thresholds for the various traffic patterns.
  • the thresholds may be specified based on an operational policy or policies for controlling the traffic signals at the intersection. For example, similar to the discussion above with respect to traffic pattern clusters, the thresholds may delineate the traffic rates into corresponding traffic patterns.
  • Block 1804 may be followed by decision block 1806 ("Traffic Pattern Change?"), where the agent may determine whether there is a change in the traffic pattern at the intersection. The agent may determine whether the traffic pattern at the particular time slot is different than the traffic pattern that occurred in the preceding time slot. If the agent determines that there is no change in the traffic pattern at the intersection, decision block 1806 may be followed by decision block 1808 ("Satisfy Traffic Pattern Unchanged Criteria?"), where the agent may determine whether the traffic pattern at the intersection has been unchanged for a sufficient length of time (e.g., a traffic pattern unchanged criteria). The traffic pattern unchanged criteria may be specified based on an operational policy or policies for controlling the traffic signals at the intersection, and may indicate a length of time (e.g., number of time slots).
  • the agent may decrease or reduce the learning rate of Q-learning if the unchanged traffic pattern at the intersection satisfies the traffic pattern unchanged criteria (e.g., the traffic pattern at the intersection remains the same or unchanged for the requisite length of time (e.g., requisite number of time slots)). In some embodiments, the agent may gradually decrease or reduce the learning rate of Q-learning if the unchanged traffic pattern at the intersection satisfies the traffic pattern unchanged criteria.
  • the check against the traffic pattern unchanged criteria may be applied to avoid unnecessary and/or frequent adjustments of the learning rate of Q-learning. For example, if the traffic pattern at the intersection is temporarily constant (e.g., remains the same temporarily), there may be minimal to no benefit to adjusting the learning rate of Q-learning.
  • the traffic pattern unchanged criteria may be specified as a specific number of consecutive time slots K, where K >> 1. That is, the traffic pattern unchanged criteria may be determined or considered to be met if the traffic pattern at the intersection has been unchanged for K consecutive time slots. In some embodiments, the traffic pattern unchanged criteria may be specified as at least a certain percentage or number of a specific number of consecutive time slots. That is, the traffic pattern unchanged criteria may be determined or considered to be met if, in the past (e.g., preceding) N (N >> 1) consecutive time slots, at least M time slots have the same traffic pattern (1 ≤ M ≤ N).
  • the thresholds K, M, and/or N may be specified based on an operational policy or policies for controlling the traffic signals at the intersection. Additionally or alternatively, the thresholds K, M, and/or N may vary for the different traffic patterns.
  • decision block 1808 may be followed by block 1802, where the agent may monitor the traffic data at the intersection. For example, the agent may continue to monitor the traffic data at the intersection for the next or subsequent time slot.
  • decision block 1808 may be followed by block 1810 ("Decrease Learning Rate of Q-learning"), where the agent may decrease the learning rate of Q-learning.
  • the agent may decrease the learning rate of Q-learning so that Q-learning considers more of the historical traffic data of the intersection since the traffic pattern is not changing at the intersection.
  • once the learning rate is decreased, the traffic pattern at the intersection may no longer continue to satisfy the traffic pattern unchanged criteria. That is, the traffic pattern at the intersection may need to satisfy the traffic pattern unchanged criteria from scratch (e.g., from the beginning).
  • for example, if the traffic pattern unchanged criteria is 10 consecutive time slots, the traffic pattern may need to remain unchanged for another 10 consecutive time slots for the traffic pattern unchanged criteria to be satisfied again (e.g., satisfied for a second time).
  • the agent may gradually decrease the learning rate, for example, by varying the specific, predefined value. For example, the agent may decrease the learning rate to a first value upon the traffic pattern at the intersection satisfying the traffic pattern unchanged criteria for a first time. If the traffic pattern unchanged criteria is satisfied a second time, the agent may decrease the learning rate to a second value, which is smaller than the first value. In some embodiments, the agent may decrease the gap or difference between successive values over time. For example, the difference between the first value and the second value may be larger than the difference between the second value and a third value. In some embodiments, the agent may increase the gap or difference between successive values over time. For example, the difference between the first value and the second value may be smaller than the difference between the second value and a third value. The agent may decrease the learning rate until a minimum learning rate value is reached.
  • the agent may vary the decreasing rate. For example, the agent may decrease the learning rate by a first decreasing rate upon the traffic pattern at the intersection satisfying the traffic pattern unchanged criteria for a first time. If the traffic pattern unchanged criteria is satisfied a second time, the agent may decrease the learning rate by a second decreasing rate. In some embodiments, the second decreasing rate may be smaller than the first decreasing rate. In this instance, the learning rate reduction decreases over time. In some embodiments, the second decreasing rate may be larger than the first decreasing rate. In this instance, the learning rate reduction increases over time. The agent may decrease the learning rate until a minimum learning rate value is reached.
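A gradual decrease with a floor can be sketched as follows. The function name, the starting values, and the `shrink` factor are assumptions; a `shrink` below 1 makes successive reductions smaller, and a value above 1 would make them larger, covering both variants described above.

```python
def decrease_learning_rate(alpha, step, alpha_min=0.0625, shrink=0.5):
    """Lower alpha by step, never below alpha_min, and return the next step."""
    new_alpha = max(alpha_min, alpha - step)
    return new_alpha, step * shrink


# Apply the decrease each time the unchanged criteria is satisfied again.
alpha, step = 0.5, 0.25
trace = []
for _ in range(4):
    alpha, step = decrease_learning_rate(alpha, step)
    trace.append(alpha)
```

Here the reductions are 0.25, 0.125, and 0.0625, after which the learning rate stays at the assumed minimum of 0.0625.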
  • the agent may continue to monitor the traffic data at the intersection for the next or subsequent time slot (block 1802).
  • decision block 1806 may be followed by decision block 1812 ("False Positive Traffic Pattern Change?"), where the agent may determine whether the change in the traffic pattern is a false positive change.
  • the agent may avoid unnecessary and/or frequent adjustments of the learning rate of Q-learning when the change in traffic pattern is temporary (e.g., the change to a different traffic pattern is for a small number of time slots). For example, if the traffic pattern change at the intersection is temporary, there may be minimal to no benefit to adjusting the learning rate of Q-learning.
  • the agent may monitor a change in the traffic pattern at the intersection to determine that the new (e.g., different) traffic pattern has been monitored for a specific number of consecutive time slots K, where K >> 1, before determining that the change in traffic pattern is not a false positive. In some embodiments, the agent may monitor a change in the traffic pattern at the intersection to determine that the new (e.g., different) traffic pattern has been monitored for at least a certain percentage or number of a specific number of consecutive time slots before determining that the change in traffic pattern is not a false positive.
  • the agent may determine that the new traffic pattern has been monitored for at least a specific number of time slots M in the recent (e.g., preceding) specific number of consecutive time slots N, where N >> 1 and 1 ≤ M ≤ N, before determining that the change in traffic pattern is not a false positive.
  • the thresholds K, M, and/or N may be specified based on an operational policy or policies for controlling the traffic signals at the intersection. Additionally or alternatively, the thresholds K, M, and/or N may vary for the different traffic patterns.
  • decision block 1812 may be followed by block 1802, where the agent may monitor the traffic data at the intersection. For example, the agent may continue to monitor the traffic data at the intersection for the next or subsequent time slot.
  • decision block 1812 may be followed by block 1814 ("Increase Learning Rate of Q-learning"), where the agent may increase the learning rate of Q-learning.
  • the agent may increase the learning rate of Q-learning so that Q-learning considers more of the recent traffic data of the intersection since the traffic pattern may be frequently changing at the intersection. As the traffic pattern may be frequently changing at the intersection, the more recent traffic data may be more relevant than the more historical traffic data. Accordingly, it may be beneficial for Q-learning to consider more of the recent traffic data and less of the more historical traffic data.
  • the agent may vary the increase in the learning rate based on a degree or extent of change in the traffic pattern. If the change in the traffic pattern is great or significant, then the agent may increase the learning rate by a large amount. Conversely, if the change in the traffic pattern is small or not significant, then the agent may increase the learning rate by a small amount.
  • a change in traffic pattern may be determined based on a change in traffic rates corresponding to the traffic patterns.
  • an operator may specify the thresholds to designate the degree or extent of change in traffic patterns.
  • the thresholds may be specified based on an operational policy or policies for controlling the traffic signals at the intersection. For example, there may be three thresholds, ThresholdA, ThresholdB, and ThresholdC.
  • ThresholdA may be the highest threshold
  • ThresholdB may be the middle threshold
  • ThresholdC may be the lowest threshold (e.g., ThresholdA > ThresholdB > ThresholdC).
  • the agent may continue to monitor the traffic data at the intersection for the next or subsequent time slot (block 1802).
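Scaling the increase to the degree of change can be sketched with three descending thresholds. The passage does not state how each threshold maps to an increment, so the mapping below (larger change in traffic rate, larger increment), the numeric values, and the cap at 1.0 are all assumptions.

```python
def increase_learning_rate(alpha, rate_change,
                           thresholds=(30.0, 20.0, 10.0),
                           increments=(0.5, 0.25, 0.125),
                           alpha_max=1.0):
    """Raise alpha by the increment of the highest threshold that is exceeded.

    thresholds play the role of ThresholdA > ThresholdB > ThresholdC and are
    checked from highest to lowest; a change at or below the lowest threshold
    leaves alpha unchanged.
    """
    for threshold, increment in zip(thresholds, increments):
        if rate_change > threshold:
            return min(alpha_max, alpha + increment)
    return alpha


large = increase_learning_rate(0.25, rate_change=35.0)   # exceeds ThresholdA
medium = increase_learning_rate(0.25, rate_change=25.0)  # exceeds ThresholdB
small = increase_learning_rate(0.25, rate_change=15.0)   # exceeds ThresholdC
none = increase_learning_rate(0.25, rate_change=5.0)     # below all thresholds
```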
  • embodiments described in the present disclosure may include the use of a special purpose or general purpose computer (e.g., processor 1002 of FIG. 10 ) including various computer hardware or software modules, as discussed in greater detail herein. Further, as indicated above, embodiments described in the present disclosure may be implemented using computer-readable media (e.g., memory 1004 of FIG. 10 ) for carrying or having computer-executable instructions or data structures stored thereon.
  • An example system may include an information data store and an agent coupled to the information data store.
  • the information data store may be configured to store multiple traffic pattern clusters, where each traffic pattern cluster of the multiple traffic pattern clusters may be assigned a Q-learning category.
  • the agent may be configured to determine a first traffic data of an intersection; determine a first traffic pattern cluster of the multiple traffic pattern clusters that corresponds to the first traffic data; generate a first control action for traffic signals at the intersection based at least in part on a first Q-learning category that corresponds to the first traffic pattern cluster; determine a second traffic data of the intersection, the second traffic data being subsequent in time to the first traffic data; determine whether the second traffic data corresponds to the first traffic pattern cluster; responsive to a determination that the second traffic data corresponds to the first traffic pattern cluster, generate a second control action for the traffic signals at the intersection based at least in part on the first Q-learning category; and responsive to a determination that the second traffic data does not correspond to the first traffic pattern cluster, determine a second traffic pattern cluster of the plurality of traffic pattern clusters that corresponds to the second traffic data, and generate a third control action for the traffic signals at the intersection based at least in part on a second Q-learning category that corresponds to the second traffic pattern cluster.
  • the agent may also be configured to determine
  • determination of whether the second traffic data corresponds to the first traffic pattern cluster may be based at least in part on a determination of whether the second traffic data occurred in a specific number of consecutive time slots.
  • the specific number of consecutive time slots may be based at least in part on a frequency of occurrence of the second traffic pattern cluster that corresponds to the second traffic data.
  • determination of whether the second traffic data corresponds to the first traffic pattern cluster may be based at least in part on a determination of whether the second traffic data occurred in a certain percentage of a specific number of consecutive time slots.
  • at least one traffic pattern cluster of the multiple traffic pattern clusters may be associated with an event.
  • An example method may include clustering historical traffic data into multiple traffic pattern clusters; generating multiple Q-learning categories, each Q-learning category of the multiple Q-learning categories corresponding to a traffic pattern cluster of the multiple traffic pattern clusters; determining a first Q-learning category of the multiple Q-learning categories to use in controlling traffic signals at an intersection based at least in part on a first traffic data of the intersection, the first Q-learning category corresponding to a first traffic pattern cluster, the first traffic data corresponding to the first traffic pattern cluster; and generating a first control action for the traffic signals at the intersection based at least in part on the first Q-learning category.
  • clustering historical traffic data may be based at least in part on queue lengths.
  • clustering historical traffic data may be based at least in part on traffic pattern characteristics.
  • the traffic pattern characteristics may include one or more thresholds.
  • the method may also include determining a second traffic data of the intersection; determining whether to change to a second Q-learning category of the multiple Q-learning categories to use in controlling traffic signals at the intersection, the second Q-learning category corresponding to a second traffic pattern cluster, the second traffic data corresponding to the second traffic pattern cluster; and responsive to a determination to change to the second Q-learning category, generating a second control action for the traffic signals at the intersection based at least in part on the second Q-learning category.
  • determining whether to change to a second Q-learning category may include determining whether the second traffic pattern cluster occurred in a specific number of consecutive time slots.
  • the specific number of consecutive time slots may be based at least in part on a frequency of occurrence of the second traffic pattern cluster that corresponds to the second traffic data.
  • determining whether to change to a second Q-learning category may include determining whether the second traffic pattern cluster occurred in a certain percentage of a specific number of consecutive time slots.
  • the method may further include, responsive to a determination to change to the second Q-learning category, updating the first traffic pattern cluster with the first traffic data.
  • non-transitory computer-readable storage media storing thereon instructions for execution by a processor of a computing system.
  • the non-transitory computer-readable storage media may further store thereon instructions that, in response to execution by the processor, cause the processor to determine a second traffic data of the intersection; determine whether to change to a second Q-learning category of the multiple Q-learning categories to use to control the traffic signals at the intersection, the second Q-learning category corresponding to a second traffic pattern cluster, the second traffic data corresponding to the second traffic pattern cluster; and responsive to a determination to change to the second Q-learning category, generate a second control action for the traffic signals at the intersection based at least in part on the second Q-learning category.
  • the non-transitory computer-readable storage media may further store thereon instructions that, in response to execution by the processor, cause the processor to, responsive to a determination to change to the second Q-learning category, update the first traffic pattern cluster with the first traffic data.
  • An example method may include generating control actions for traffic signals at an intersection based on Q-learning, the Q-learning configured to determine the generated control actions based on at least a portion of historical traffic data of the intersection; determining a frequency of change in traffic pattern of the intersection, a change in traffic pattern being a change from a first traffic pattern of the intersection to a second traffic pattern of the intersection; and adjusting a learning rate of the Q-learning based on the determined frequency of change in traffic pattern of the intersection.
  • determining the frequency of change in traffic pattern may include determining whether the second traffic pattern occurred in a specific number of consecutive time slots.
  • determining the frequency of change in traffic pattern may include determining whether the second traffic pattern occurred in a certain percentage of a specific number of consecutive time slots.
  • adjusting the learning rate of the Q-learning may include increasing the learning rate in accordance with increasing frequency of change in traffic pattern of the intersection. In some examples, increasing the learning rate may include increasing the learning rate based on a degree of change from the first traffic pattern of the intersection to the second traffic pattern of the intersection. In other examples, increasing the learning rate may include increasing the learning rate to a specific predefined value. In still other examples, increasing the learning rate may include increasing the learning rate by a specific predefined increasing rate. In further examples, adjusting the learning rate of the Q-learning may include decreasing the learning rate responsive to a determination that the first traffic pattern of the intersection occurred for a specific number of consecutive time slots. In other examples, decreasing the learning rate may include decreasing the learning rate to a specific predefined value. In further examples, decreasing the learning rate may include decreasing the learning rate by a specific predefined decreasing rate.
  • An example system may include an information data store and an agent coupled to the information data store.
  • the information data store may be configured to store historical traffic data.
  • the agent may be configured to apply Q-learning to generate control actions for traffic signals at an intersection based on at least a portion of historical traffic data of the intersection; determine whether there is a change in traffic pattern of the intersection, the change in traffic pattern being a change from a first traffic pattern of the intersection to a second traffic pattern of the intersection; and responsive to a determination that there is a change in traffic pattern of the intersection, adjust a learning rate of the Q-learning.
  • the determination that there is a change in traffic pattern of the intersection may include a determination that the second traffic pattern occurred in a specific number of consecutive time slots. In other examples, the determination that there is a change in traffic pattern of the intersection may include a determination that the second traffic pattern occurred in a certain percentage of a specific number of consecutive time slots.
  • adjusting the learning rate of the Q-learning may include increasing the learning rate based on a degree of change in the traffic pattern of the intersection. In other examples, adjusting the learning rate of the Q-learning may include increasing the learning rate to a specific predefined value. In still other examples, adjusting the learning rate of the Q-learning may include increasing the learning rate by a specific predefined increasing rate.
  • the agent may be further configured to, responsive to a determination that there is not a change in traffic pattern of the intersection, decrease the learning rate of the Q-learning.
  • the determination that there is not a change in traffic pattern of the intersection may include a determination that the first traffic pattern occurred in a specific number of consecutive time slots.
  • the determination that there is not a change in traffic pattern of the intersection may include a determination that the first traffic pattern occurred in a certain percentage of a specific number of consecutive time slots.
  • non-transitory computer-readable storage media storing thereon instructions for execution by a processor of a computing system.
  • the terms "module" or "component" may refer to specific hardware implementations configured to perform the actions of the module or component and/or software objects or software routines that may be stored on and/or executed by general purpose hardware (e.g., computer-readable media, processing devices, etc.) of the computing system.
  • the different components, modules, engines, and services described in the present disclosure may be implemented as objects or processes that execute on the computing system (e.g., as separate threads). While some of the systems and methods described in the present disclosure are generally described as being implemented in software (stored on and/or executed by general purpose hardware), specific hardware implementations, firmware implementations, or any combination thereof are also possible and contemplated.
  • a "computing entity" may be any computing system as previously described in the present disclosure, or any module or combination of modules executing on a computing system.
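
The category-switching rule and the learning-rate adjustment described in the examples above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class and function names (`PatternSwitchDetector`, `CategoryQLearner`, `adjust_learning_rate`), the tabular Q-table layout, the epsilon-greedy policy, and every default value (window size, fraction, learning-rate bounds) are assumptions introduced here. The two ideas belong to separate example embodiments in the text and are shown side by side only for brevity.

```python
import random
from collections import defaultdict, deque


class PatternSwitchDetector:
    """Decide whether a newly observed traffic pattern cluster has persisted
    long enough to justify switching Q-learning categories (hypothetical helper)."""

    def __init__(self, window=4, min_fraction=1.0):
        # window: the "specific number of consecutive time slots" (assumed value).
        # min_fraction: 1.0 requires every slot to match; a lower value such as
        # 0.75 models the "certain percentage" variant.
        self.window = window
        self.min_fraction = min_fraction
        self.recent = deque(maxlen=window)

    def observe(self, current_cluster, observed_cluster):
        """Record one time slot and return True if a switch is warranted."""
        self.recent.append(observed_cluster)
        if observed_cluster == current_cluster or len(self.recent) < self.window:
            return False
        hits = sum(1 for c in self.recent if c == observed_cluster)
        return hits / self.window >= self.min_fraction


def adjust_learning_rate(alpha, pattern_changed, high=0.5, decay=0.9, floor=0.05):
    """Raise the learning rate to a predefined value when the pattern changes;
    otherwise shrink it by a predefined rate, never below a floor (assumed values)."""
    if pattern_changed:
        return max(alpha, high)
    return max(floor, alpha * decay)


class CategoryQLearner:
    """One tabular Q-table per traffic pattern cluster (one per Q-learning category)."""

    def __init__(self, actions, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.gamma = gamma
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # tables[cluster][(state, action)] -> estimated Q-value, default 0.0
        self.tables = defaultdict(lambda: defaultdict(float))

    def act(self, cluster, state):
        """Epsilon-greedy control action using the table of the active category."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        table = self.tables[cluster]
        return max(self.actions, key=lambda a: table[(state, a)])

    def update(self, cluster, state, action, reward, next_state, alpha):
        """Standard Q-learning update applied only to the active category's table."""
        table = self.tables[cluster]
        best_next = max(table[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        table[(state, action)] += alpha * (td_target - table[(state, action)])
```

In use, a controller would call `observe` once per time slot with the cluster that the latest traffic data maps to, switch the active cluster when it returns `True`, and pass the active cluster to `act` and `update`. Keeping a separate table per cluster means experience gathered under one traffic pattern does not overwrite values learned under another.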

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Traffic Control Systems (AREA)
EP18179505.5A 2017-07-03 2018-06-25 Traffic signal control using multiple Q-learning categories Active EP3425608B1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US15/641,168 US10002530B1 (en) 2017-03-08 2017-07-03 Traffic signal control using multiple Q-learning categories

Publications (2)

Publication Number Publication Date
EP3425608A1 (fr) 2019-01-09
EP3425608B1 (fr) 2020-03-25

Family

ID=62778758

Family Applications (1)

Application Number Title Priority Date Filing Date
EP18179505.5A Active EP3425608B1 (fr) 2017-07-03 2018-06-25 Traffic signal control using multiple Q-learning categories

Country Status (1)

Country Link
EP (1) EP3425608B1 (fr)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
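CN109215355A (zh) * 2018-08-09 2019-01-15 北京航空航天大学 Signal timing optimization method for an isolated intersection based on deep reinforcement learning
CN109712413A (zh) * 2019-01-29 2019-05-03 蚌埠学院 Crossroad traffic management and control method and system with automatic adjustment function
CN110164151A (zh) * 2019-06-21 2019-08-23 西安电子科技大学 Traffic light control method based on a distributed deep recurrent Q-network
CN110428615A (zh) * 2019-07-12 2019-11-08 中国科学院自动化研究所 Single-intersection traffic signal control method, system, and device based on deep reinforcement learning
CN110491144A (zh) * 2019-07-23 2019-11-22 平安国际智慧城市科技股份有限公司 Method for adjusting traffic light duration based on road condition prediction, and related device
CN113129614A (zh) * 2020-01-10 2021-07-16 阿里巴巴集团控股有限公司 Traffic control method, apparatus, and electronic device
EP3866135A1 (fr) * 2020-02-14 2021-08-18 Siemens Mobility GmbH Method for controlling a light signal system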

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
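DE19941854A1 (de) * 1999-09-02 2001-04-05 Siemens Ag Control device for a traffic-light intersection
CN105118308A (zh) * 2015-10-12 2015-12-02 青岛大学 Traffic signal optimization method for urban road intersections based on clustering reinforcement learning
CN106846836A (zh) * 2017-02-28 2017-06-13 许昌学院 Single-intersection signal light timing control method and system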

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FREY, BRENDAN J.; DUECK, DELBERT: "Clustering by passing messages between data points", SCIENCE, vol. 315, no. 5814, 16 February 2007 (2007-02-16), pages 972-976, XP002565072, DOI: 10.1126/science.1136800

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
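CN109215355A (zh) * 2018-08-09 2019-01-15 北京航空航天大学 Signal timing optimization method for an isolated intersection based on deep reinforcement learning
CN109712413A (zh) * 2019-01-29 2019-05-03 蚌埠学院 Crossroad traffic management and control method and system with automatic adjustment function
CN110164151A (zh) * 2019-06-21 2019-08-23 西安电子科技大学 Traffic light control method based on a distributed deep recurrent Q-network
CN110428615A (zh) * 2019-07-12 2019-11-08 中国科学院自动化研究所 Single-intersection traffic signal control method, system, and device based on deep reinforcement learning
CN110428615B (zh) * 2019-07-12 2021-06-22 中国科学院自动化研究所 Single-intersection traffic signal control method, system, and device based on deep reinforcement learning
CN110491144A (zh) * 2019-07-23 2019-11-22 平安国际智慧城市科技股份有限公司 Method for adjusting traffic light duration based on road condition prediction, and related device
CN113129614A (zh) * 2020-01-10 2021-07-16 阿里巴巴集团控股有限公司 Traffic control method, apparatus, and electronic device
EP3866135A1 (fr) * 2020-02-14 2021-08-18 Siemens Mobility GmbH Method for controlling a light signal system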

Also Published As

Publication number Publication date
EP3425608B1 (fr) 2020-03-25

Similar Documents

Publication Publication Date Title
US10395529B2 (en) Traffic signal control using multiple Q-learning categories
EP3425608B1 (fr) Traffic signal control using multiple Q-learning categories
US9972199B1 (en) Traffic signal control that incorporates non-motorized traffic information
CN108583578B (zh) Lane decision method based on a multi-objective decision matrix for autonomous vehicles
Chen et al. An improved adaptive signal control method for isolated signalized intersection based on dynamic programming
US20230180086A1 (en) Method and apparatus for accessing network cell
AU2018274980A1 (en) Systems and methods for analyzing and adjusting road conditions
CN108806283A (zh) Traffic signal light control method and Internet-of-Vehicles platform
WO2018141403A1 (fr) System, device and method for managing traffic in a geographical location
US20190051164A1 (en) System and method for retail revenue based traffic management
JP5895926B2 (ja) Travel guidance device and travel guidance method
AU2024201521A1 (en) Predictive traffic management system
CN106017496A (zh) Real-time navigation method based on road conditions
CN113276874B (zh) Vehicle travel trajectory processing method and related device
Gupte et al. Vehicular networking for intelligent and autonomous traffic management
WO2021073716A1 (fr) Traffic reasoner
Masutani A sensing coverage analysis of a route control method for vehicular crowd sensing
CN113379099B (zh) Adaptive expressway traffic flow prediction method based on machine learning and a copula model
Thunig et al. Adaptive traffic signal control for real-world scenarios in agent-based transport simulations
Chen et al. Dynamic traffic light optimization and Control System using model-predictive control method
CN113628446B (zh) Traffic information collection and analysis method and system based on the Internet of Things
Liu et al. Trade-offs between bus and private vehicle delays at signalized intersections: Case study of a multiobjective model
Gomides et al. Fire-nrd: A fully-distributed and vanets-based traffic management system for next road decision
Meuser et al. Relevance-aware information dissemination in vehicular networks
Antoine et al. Real-time traffic flow-based traffic signal scheduling: A queuing theory approach

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20190510

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20191010

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602018003229

Country of ref document: DE

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 1249466

Country of ref document: AT

Kind code of ref document: T

Effective date: 20200415

Ref country code: IE

Ref legal event code: FG4D

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20200324

Year of fee payment: 3

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200625

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20200303

Year of fee payment: 3

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200625

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200626

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20200325

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200818

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200725

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1249466

Country of ref document: AT

Kind code of ref document: T

Effective date: 20200325

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602018003229

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

26N No opposition filed

Effective date: 20210112

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200625

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20200630

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200625

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200630

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 602018003229

Country of ref document: DE

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210630

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20220101

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210630

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210630

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200325

GBPC GB: European patent ceased through non-payment of renewal fee

Effective date: 20220625

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20220625