CN111583675A - Regional road network traffic signal lamp coordination control system and method

Info

Publication number: CN111583675A
Application number: CN202010409600.9A
Authority: CN
Inventors: 吴钢; 李琳; 彭玉泉; 黄传明; 李劲松; 范翠红; 刘辉能
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-05-14
Filing date: 2020-05-14
Publication date: 2020-08-25
Anticipated expiration: 2040-05-14
Also published as: CN111583675B

Abstract

An embodiment of the invention provides a regional road network traffic signal lamp coordination control system and device. The system comprises a cloud center, together with edge nodes and multi-source traffic data acquisition equipment arranged at each intersection in a region. One end of each edge node is connected to the multi-source traffic data acquisition equipment of the corresponding intersection, the other end is connected to the cloud center, and adjacent edge computing nodes are connected to each other. Intelligent, independent edge nodes are provided, and their computing power is used to process the multi-source traffic data and sense the traffic running state. The edge nodes and the cloud center act jointly, and a multi-agent reinforcement learning method is adopted to coordinate and optimize the traffic signal lamp timing scheme of each intersection in the region. Urban traffic congestion can thereby be effectively relieved, and the passing efficiency of motor vehicles at intersections is improved.

Description

Regional road network traffic signal lamp coordination control system and method

Technical Field

The embodiment of the invention relates to the technical field of intelligent traffic, in particular to a regional road network traffic signal lamp coordination control system and method.

Background

As living standards rise and cities develop rapidly, urban traffic systems face increasingly severe tests. With the progress of urban modernization, vehicle ownership increases year by year, congestion becomes more and more severe, and traffic accidents are frequent. Social resources are wasted, environmental pollution is aggravated, and people's travel efficiency, quality of life, and physical and mental health are seriously affected. Relieving urban traffic congestion therefore has great economic and ecological significance.

In most cities in China, intersection signal control systems were installed at widely differing times, and the types of intersection signal controllers are not uniform. Signal control systems generally use multi-period fixed-time signal controllers, induction-type signal controllers and centrally coordinated signal controllers. Signal control schemes mostly adopt a fixed timing method or a self-adaptive timing method. However, as the number of intersections grows, a centralized control system cannot meet the requirements of transmitting a large number of traffic data streams and optimizing traffic control strategies in real time. Such a system also needs a complex traffic model and is difficult to maintain. Faced with excessively complex data, traditional traffic signal lamp control schemes and traffic data processing methods cannot meet the requirements of current traffic control optimization.

Disclosure of Invention

The embodiment of the invention provides a regional road network traffic signal lamp coordination control system and a regional road network traffic signal lamp coordination control method, which are used for solving the defects that a traditional traffic signal lamp timing system cannot meet the communication transmission of a large number of traffic data streams and the real-time optimization of traffic control strategies, a complex traffic model needs to be established, and the maintenance difficulty is high.

In a first aspect, an embodiment of the present invention provides a regional road network traffic signal lamp coordination control system, including:

the system comprises a cloud center, edge nodes and multi-source traffic data acquisition equipment, wherein the edge nodes and the multi-source traffic data acquisition equipment are arranged at each intersection in an area;

the multi-source traffic data acquisition equipment is used for acquiring multi-source traffic data of the current intersection and sending the multi-source traffic data to the corresponding edge node; the multi-source traffic data comprises geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data;

the edge node includes:

the traffic running state modeling module is used for acquiring multi-source traffic data of corresponding intersections and establishing a traffic running state model; the multi-source traffic data comprises geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data;

the single intersection signal lamp timing module is used for obtaining a traffic signal lamp timing optimization scheme corresponding to an intersection by adopting a reinforcement learning method based on the traffic running state model and a preset initial signal lamp timing scheme, and uploading the traffic signal lamp timing optimization scheme to the cloud center;

the coordination optimization module is used for coordinating and optimizing a traffic signal lamp timing scheme of each intersection in the area by adopting a multi-agent reinforcement learning method in combination with a cloud center and other edge nodes in the area;

the cloud center is used for combining each edge node in the area, and coordinating and optimizing a traffic signal lamp timing scheme of each intersection in the area by adopting a multi-agent reinforcement learning method.

Further, the traffic operation state modeling module specifically includes:

the acquisition unit is used for acquiring the multi-source traffic data acquired by the multi-source traffic data acquisition equipment;

the extraction unit is used for respectively extracting the traffic flow characteristics in the geomagnetic coil data, the radar microwave data and the road video monitoring data; the traffic flow characteristics are large-scale vehicle track data comprising time series position information and movement characteristics;

the data fusion unit is used for integrating and extracting traffic flow characteristics of three types of traffic data sources by adopting a multi-mode data fusion technology to obtain a road traffic state of the intersection;

and the floating vehicle track processing unit is used for processing the floating vehicle track data to obtain the time sequence characteristics and the state characteristics of the motor vehicle track information so as to obtain the vehicle passing state of the intersection.

Further, the single-intersection signal lamp timing module specifically includes:

the judging unit is used for judging whether the traffic flow passing condition of the intersection is a conventional traffic flow state or a dynamic traffic flow state;

the signal lamp timing optimization unit is used for, if the traffic flow passing condition at the intersection is the conventional traffic flow state, taking the traffic running state model as the state space of the agent and the initial signal lamp timing scheme as the action space of the agent based on a reinforcement learning algorithm, and constructing a reinforcement learning model with the edge node as the agent by using the evaluated traffic running state as the reward mechanism; and, if the traffic flow passing condition is the dynamic traffic flow state, modifying the vehicle following model g(Q) in the Q function on the basis of the reinforcement learning model to obtain the Q function value under the dynamic traffic flow state, so as to obtain the reinforcement learning model under the dynamic traffic flow state.

Further, the multi-agent reinforcement learning method comprises the following steps:

replacing the state and the action of a single agent in the reinforcement learning model with a joint state and a joint action in a dynamic stochastic environment, estimating the value function of the equilibrium strategy at each stage of the game, realizing simultaneous strategies among a plurality of agents, and iterating repeatedly to approach the optimal strategy, so as to find the unique equilibrium in the regional environment.

In a second aspect, an embodiment of the present invention provides an intersection traffic signal light timing optimization method, including:

acquiring multi-source traffic data of corresponding intersections, and establishing a traffic running state model; the multi-source traffic data comprises geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data;

based on the traffic running state model and a preset initial signal lamp timing scheme, acquiring a traffic signal lamp timing optimization scheme corresponding to the intersection by adopting a reinforcement learning method, and uploading the traffic signal lamp timing optimization scheme to a cloud center;

and coordinating and optimizing the traffic signal lamp timing scheme of each intersection in the area by combining the cloud center and other edge nodes in the area and adopting a multi-agent reinforcement learning method.

Further, the acquiring of the multi-source traffic data of the corresponding intersection and the establishing of the traffic running state model specifically include:

acquiring multi-source traffic data acquired by multi-source traffic data acquisition equipment;

traffic flow characteristics in the geomagnetic coil data, the radar microwave data and the road video monitoring data are respectively extracted; the traffic flow characteristics are large-scale vehicle track data comprising time series position information and movement characteristics;

by adopting a multi-mode data fusion technology, traffic flow characteristics of three types of traffic data sources are integrated and extracted, road traffic states of the intersection are obtained, floating vehicle track data are processed, and time sequence characteristics and state characteristics of motor vehicle track information are obtained, so that vehicle passing states of the intersection are obtained.

Further, the obtaining of the traffic signal lamp timing optimization scheme corresponding to the intersection by using a reinforcement learning method based on the traffic running state model and a preset initial signal lamp timing scheme specifically includes:

judging whether the traffic flow passing condition of the intersection is a conventional vehicle flow state or a dynamic vehicle flow state;

if the traffic flow passing condition at the intersection is the conventional traffic flow state, taking the traffic running state model as the state space of the agent and the initial signal lamp timing scheme as the action space of the agent based on a reinforcement learning algorithm; and constructing a reinforcement learning model with the edge node as the agent by using the evaluated traffic running state as the reward mechanism;

and if the traffic running state is the dynamic vehicle flow state, modifying the vehicle following mode g (Q) in the Q function on the basis of the reinforcement learning model to obtain a Q function value under the dynamic vehicle flow state so as to obtain the reinforcement learning model under the dynamic vehicle flow state.

In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method for coordinating and controlling regional road network traffic signal lights according to the second aspect of the present invention.

In a fourth aspect, the embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the method for coordinating and controlling regional road network traffic signal lights according to the embodiment of the second aspect of the present invention.

By adopting edge computing, the regional road network traffic signal lamp coordination control system and method provided by the embodiment of the invention can effectively reduce system processing delay, reduce data transmission bandwidth and improve usability. Because intelligent traffic signal lamp control requires real-time data acquisition and has strict latency requirements, deploying the calculation and control process in the cloud is difficult and disadvantageous. Completing the control process by edge computing, as in the embodiment of the invention, better solves this problem.

The intelligent independent edge node is set based on a cloud edge hybrid computing framework, multisource traffic data are processed by utilizing the computing power of the edge node, and the traffic running state is sensed. And the plurality of edge nodes and the cloud center jointly act, and a multi-agent reinforcement learning method is adopted to coordinate and optimize the traffic signal lamp timing scheme of each intersection in the area. The problem of urban traffic jam can be effectively solved, and the passing efficiency of motor vehicles at the intersection is improved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 is a schematic structural diagram of a regional road network traffic signal lamp coordination control system according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of an edge node according to an embodiment of the present invention;

Fig. 3 is a schematic view of a traffic operating state model according to an embodiment of the present invention;

Fig. 4 is a schematic flow chart of a regional road network traffic signal lamp coordination control method according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

Fig. 1 is a schematic structural diagram of a regional road network traffic signal lamp coordination control system provided by an embodiment of the present invention. Referring to Fig. 1, the system includes a cloud center, and an edge node and multi-source traffic data acquisition equipment arranged at each intersection in a region. One end of each edge node is connected to the multi-source traffic data acquisition equipment at the corresponding intersection, and the other end is connected to the cloud center. Adjacent edge computing nodes are connected to each other.

The multi-source traffic data acquisition equipment is used for acquiring multi-source traffic data of the current intersection and sending the multi-source traffic data to the corresponding edge node; the multi-source traffic data includes geomagnetic coil data, road video monitoring data, radar microwave data, and floating car track data.

Referring to fig. 1, an edge node and a set of multi-source traffic data acquisition equipment are correspondingly arranged at an intersection. According to the embodiment of the invention, each intersection in the area is provided with the edge node and the multi-source traffic data acquisition equipment. The multisource traffic data acquisition equipment comprises geomagnetic coil equipment, video monitoring equipment and radar microwave equipment.

Specifically, the road video monitoring equipment monitors and analyzes the multi-lane, multi-stream traffic in each of the four directions of an intersection, geomagnetic coil induction equipment is buried below each lane, and the radar microwave equipment is configured with corresponding sectors covering the lanes at the traffic inlets of the intersection to collect traffic flow information.

The embodiment of the invention uses edge computing nodes to assist traffic signal lamp control. Each edge node independently completes operations such as traffic flow information acquisition, traffic flow information processing and traffic signal lamp optimization, and can exchange information with neighboring nodes while processing its own information. In the edge computing scenario, each intersection is an independent edge node with its own data sensing and computation control capability. Edge computing pushes more control functions down to the edge side, and each edge node is a basic unit capable of performing traffic control. This traffic control scheme reduces the bandwidth pressure on the communication network and the workload of the cloud, and improves the real-time performance of control.
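The topology described above (one edge node per intersection, links between adjacent edge nodes, and a cloud center that receives each node's timing scheme) can be illustrated with a minimal sketch. The class names, field names and the example green times are hypothetical and are introduced only for illustration; they are not taken from the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    """One edge node per intersection (all names here are illustrative)."""
    intersection_id: str
    neighbors: set = field(default_factory=set)      # ids of adjacent edge nodes
    timing_plan: dict = field(default_factory=dict)  # phase name -> green seconds


@dataclass
class CloudCenter:
    """Keeps the latest timing plan uploaded by each edge node in the region."""
    plans: dict = field(default_factory=dict)

    def upload(self, node: EdgeNode) -> None:
        # Store a copy so later local changes do not alter the cloud record.
        self.plans[node.intersection_id] = dict(node.timing_plan)


def link(a: EdgeNode, b: EdgeNode) -> None:
    """Connect two adjacent edge nodes in both directions."""
    a.neighbors.add(b.intersection_id)
    b.neighbors.add(a.intersection_id)


cloud = CloudCenter()
n1 = EdgeNode("A", timing_plan={"NS": 30, "EW": 25})
n2 = EdgeNode("B", timing_plan={"NS": 40, "EW": 20})
link(n1, n2)
for node in (n1, n2):
    cloud.upload(node)
```

In this sketch each node keeps its plan locally and the cloud center holds a copy, which mirrors the local retention and upload behavior described for the agents.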

Fig. 2 is a schematic structural diagram of an edge node according to an embodiment of the present invention, and referring to fig. 2, the edge node includes:

the traffic running state modeling module 201 is used for acquiring multi-source traffic data of corresponding intersections and establishing a traffic running state model; wherein the multi-source traffic data comprises geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data.

Fig. 3 is a schematic view of a traffic operation state model according to an embodiment of the present invention. Referring to fig. 3, the traffic operation state modeling module of the edge node builds the traffic operation state model shown in fig. 3 based on multi-source traffic data. Here, the traffic running state includes a road traffic state and a vehicle passing state.

In this embodiment, the multi-source traffic data of a single intersection is processed at the corresponding edge node. Specifically, the edge node extracts the traffic flow characteristics from the geomagnetic coil data, radar microwave data and road video monitoring data of the current intersection, and integrates the traffic flow characteristics of these three types of traffic data sources by a multi-modal data fusion technology to obtain the road traffic state of the current intersection. The road traffic state describes, at the road level, the traffic running state in each direction of the intersection. Here, the traffic flow characteristics are large-scale vehicle trajectory data including time-series position information and movement characteristics. The time-series position information is the sequence of intersection positions or checkpoint (bayonet) positions that a vehicle passes, and the movement characteristics include speed, direction and the like.
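The embodiment does not specify the fusion algorithm. As a minimal sketch only, and assuming that each of the three sources yields a flow estimate in vehicles per minute, a fixed-weight average can stand in for the multi-modal data fusion step. The function name, the source keys and the weights are hypothetical.

```python
def fuse_flow_estimates(estimates, weights):
    """Weighted average of per-source flow estimates (vehicles per minute).

    Both arguments are dicts keyed by source name; the weights are assumed
    reliability scores, not values given in the source document.
    """
    total_weight = sum(weights[src] for src in estimates)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(estimates[src] * weights[src] for src in estimates) / total_weight


estimates = {"geomagnetic": 20.0, "radar": 22.0, "video": 18.0}
weights = {"geomagnetic": 0.5, "radar": 0.25, "video": 0.25}
fused = fuse_flow_estimates(estimates, weights)
```

A learned fusion model could replace the fixed weights; the weighted average is used here only because it is simple to verify.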

In addition, the embodiment processes the floating vehicle track data to obtain the time sequence characteristics and the state characteristics of the motor vehicle track information, so that the vehicle passing state of the intersection is obtained. The vehicle traffic state describes, on the vehicle level, the driving behavior of the motor vehicle and the traffic state in the intersection. According to the invention, the collected multi-source traffic data is fully utilized from two levels of vehicles and roads, and urban road traffic state evaluation and vehicle driving behavior analysis modeling are completed.
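As an illustration of the time sequence characteristics that can be derived from floating car track data, the following sketch computes per-segment speeds and the time spent stopped. It assumes that a track is a list of `(t_seconds, x_m, y_m)` samples in a local metric coordinate system; this format, the function names and the 0.1 m/s threshold are assumptions made for the example.

```python
import math


def segment_speeds(track):
    """Speed in m/s for each consecutive pair of (t_seconds, x_m, y_m) samples."""
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds


def stopped_time(track, speed_threshold=0.1):
    """Total seconds spent in segments whose speed is below the threshold."""
    total = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt > 0 and math.hypot(x1 - x0, y1 - y0) / dt < speed_threshold:
            total += dt
    return total


track = [(0, 0.0, 0.0), (2, 6.0, 8.0), (4, 6.0, 8.0), (5, 9.0, 12.0)]
speeds = segment_speeds(track)
stopped = stopped_time(track)
```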

It should be noted that the edge node includes a microcomputer platform with a certain computing power, the microcomputer platform includes but is not limited to a microcomputer motherboard equipped with a Linux system and a GPU device with a certain computing power, the collected multisource traffic data can be processed on the microcomputer platform, effective analysis of the multisource traffic data is realized by a multimodal data fusion technology, and a traffic running state model of a single intersection under the edge node is further established.

And the single intersection signal light timing module 202 is configured to obtain a traffic signal light timing optimization scheme corresponding to an intersection by using a reinforcement learning method based on the traffic running state model and a preset initial signal light timing scheme, and upload the traffic signal light timing optimization scheme to the cloud center.

And the coordination optimization module 203 is used for coordinating and optimizing the traffic signal lamp timing scheme of each intersection in the area by adopting a multi-agent reinforcement learning method in combination with the cloud center and other edge nodes in the area.

Specifically, information exchange and data communication take place among the edge computing agents and between the agents and the cloud computing center. After calculating the traffic signal lamp timing scheme of the current intersection, an agent retains the data locally and uploads it to the cloud computing center, thereby assisting the cloud computing center in coordinating and optimizing the group of agents in the area and further optimizing the traffic passing condition of the intersections in the area.

The invention adopts a multi-agent reinforcement learning method, the state and the action of a single agent in the reinforcement learning model are respectively replaced by the joint state and the joint action in the dynamic random environment, the value function of the equilibrium strategy is estimated in each game strategy stage, the simultaneous strategy among a plurality of agents is realized, and the optimal strategy is approached by repeated iteration in such a way, so that the unique equilibrium in the regional environment is searched. The Q value function of the multi-agent action linkage is obtained as follows:

where $S \in \mathcal{S}$ is the joint state $(s^1, \ldots, s^N)$; $A \in \mathcal{A}$, with $A = (a^1, \ldots, a^N)$, represents its joint action vector; $\mathcal{A}$ is the joint action space; $V_i(S)$ is the state value function of agent $i$ in the joint state $S$; NE is the Nash equilibrium strategy; and $\pi(a)$ is the mixed strategy with uncertainty, i.e., the probability that the $N$ agents select a joint action.

$Q_i^k$ represents the Q function value of agent $i$ at time $k$; $k$ represents the $k$-th time step and $k+1$ the $(k+1)$-th time step; $r_i^k$ represents the reward value of agent $i$ at time $k$; $\gamma$ represents the discount factor; $\pi$ represents the strategy learned by the reinforcement learning model; and $\pi^*$ represents the optimal strategy learned by the reinforcement learning model.

$Q_n^{k+1}$ represents the Q value of agent $n$ at time $k+1$, where $n$ represents the $n$-th agent.
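For orientation only, the standard Nash Q-learning update known from the multi-agent reinforcement learning literature is reproduced below. It is an assumed reference form that is consistent with the symbols defined above, not necessarily the exact formula of this embodiment. The learning rate $\alpha_k$ and the next joint state $S'$ are symbols introduced here for the illustration.

$$Q_i^{k+1}(S, A) = (1 - \alpha_k)\, Q_i^{k}(S, A) + \alpha_k \left[ r_i^{k} + \gamma\, \mathrm{Nash}Q_i^{k}(S') \right]$$

$$\mathrm{Nash}Q_i^{k}(S') = \pi^{1}(S') \cdots \pi^{N}(S') \cdot Q_i^{k}(S')$$

Here $\mathrm{Nash}Q_i^{k}(S')$ is the expected payoff of agent $i$ in state $S'$ when all $N$ agents follow a Nash equilibrium (NE) strategy of the stage game defined by their current Q functions.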

The invention constructs an edge node group, forms a multi-agent edge computing network, forms direct association between intersections, outputs a traffic running state model of the intersection and a traffic signal lamp timing scheme of the current intersection, constructs a distributed trusted computing network between the intersections and a cloud center, and completes the traffic signal lamp coordination optimization timing control of the intersections in the area.

The regional road network traffic signal lamp coordination control system provided by the embodiment of the invention adopts edge computing, so that system processing delay can be effectively reduced, data transmission bandwidth is reduced, and usability is improved. Because intelligent traffic signal lamp control requires real-time data acquisition and has strict latency requirements, deploying the calculation and control process in the cloud is difficult and disadvantageous. Completing the control process by edge computing, as in the embodiment of the invention, better solves this problem.

On the basis of the above embodiment, the traffic operation state modeling module 201 specifically includes:

On the basis of the above embodiment, the single-intersection signal lamp timing module 202 specifically includes:

and the judging unit is used for judging whether the traffic flow passing condition at the intersection is a conventional traffic flow state or a dynamic traffic flow state.

Firstly, the traffic flow passing conditions of the intersection are divided into a conventional traffic flow state and a dynamic traffic flow state caused by weather or traffic accidents. It should be noted that the conventional traffic flow state refers to the normal variation of traffic flow at the intersection, including peak and off-peak periods within a daily cycle and the difference between working days and non-working days within a weekly cycle. The dynamic traffic flow state refers to a sudden surge of traffic flow at the intersection caused by weather such as rain or snow, or by traffic accidents upstream or downstream.
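The embodiment does not state how the judging unit separates the two states. One possible rule, given here purely as an assumption, compares the observed flow with a historical baseline for the same time of day and day of week and flags a surge when their ratio reaches a threshold. The function name and the default ratio of 1.5 are hypothetical.

```python
def classify_flow(observed, baseline, surge_ratio=1.5):
    """Return 'dynamic' when observed flow reaches surge_ratio times the baseline.

    `baseline` is assumed to be the historical flow for the same time of day
    and day of week; the threshold is an illustrative choice.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return "dynamic" if observed / baseline >= surge_ratio else "conventional"
```

For example, `classify_flow(60, 28)` yields `"dynamic"`, while `classify_flow(30, 28)` yields `"conventional"`.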

Specifically, if the traffic flow passing condition at the intersection is the conventional traffic flow state, then, based on the reinforcement learning algorithm, the traffic running state model obtained in step S1 is used as the state space of the agent, and the initial signal lamp timing scheme is used as the action space of the agent. A reinforcement learning model with the edge node as the agent is constructed by using the evaluated traffic running state as the reward mechanism.

In this embodiment, under conventional traffic flow, the traffic signal lamp control of a single intersection is optimized by a reinforcement learning method with the edge node as the agent. The state space of the agent is obtained from the traffic running state model obtained in step S1 and includes, for each intersection lane $i$, the queuing length $L_i$, the number of vehicles $V_i$ in the lane and the waiting time $W_i$. In addition, a pattern matrix of traffic position and waiting information at the intersection may also be extracted from the traffic running state model of step S1, and the state space further includes the current traffic signal phase Pc and the next traffic signal phase Pn. Here, a traffic signal phase is defined as follows: the sequence of signal displays during which one or several traffic streams receive exactly the same signal light color at every moment of a signal cycle is called a signal phase.

An initial traffic signal lamp timing scheme is set and, on this basis, taken as the action space of the agent. The traffic passing status obtained in step S1 is used as the reward mechanism of the agent, including the number $l$ of all waiting vehicles in all lanes of the intersection and the total delay time $D$ of all lanes in the intersection. The delay time $D_i$ of each lane is calculated as follows:
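The state space just described can be sketched as a flat feature vector. This is a minimal illustration that assumes integer-coded phases and omits the pattern matrix; `LaneObservation` and `build_state` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneObservation:
    queue_length: int    # L_i, vehicles queued in lane i
    vehicle_count: int   # V_i, vehicles present in lane i
    waiting_time: float  # W_i, waiting time of lane i in seconds


def build_state(lanes, current_phase, next_phase):
    """Flatten the per-lane features and append the phases Pc and Pn."""
    features = []
    for lane in lanes:
        features.extend(
            [float(lane.queue_length), float(lane.vehicle_count), lane.waiting_time]
        )
    features.extend([float(current_phase), float(next_phase)])
    return tuple(features)


state = build_state(
    [LaneObservation(3, 5, 12.0), LaneObservation(0, 2, 0.0)],
    current_phase=0,
    next_phase=1,
)
```

A tuple is returned so that the state can be used directly as a dictionary key in a tabular learner.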

where ls represents the average speed of the vehicle on the lane; sm denotes the maximum speed limit of the lane.
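One common formulation that is consistent with these two definitions, given here as an assumption rather than as the formula of the embodiment, expresses the lane delay as the shortfall of the average lane speed relative to the speed limit:

$$D_i = 1 - \frac{\mathit{ls}}{\mathit{sm}}$$

Under this form $D_i = 0$ when vehicles travel at the speed limit and $D_i = 1$ when the lane is at a standstill.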

The total waiting time of all lanes is defined as $W_j$. Specifically, the waiting time is reset to zero each time a vehicle moves, and is calculated by the following formula:

In the formula, $W_j$ represents the total waiting time of all lanes; $t$ represents a time step; and $vs$ represents the speed of the vehicle.
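The reset-on-movement behavior described above can be sketched as a per-step update. The function name, the step length and the 0.1 m/s moving threshold are assumptions introduced for the example, since the exact formula is not reproduced in the text.

```python
def update_waiting_time(previous_wait, vehicle_speed, step=1.0, moving_threshold=0.1):
    """Accumulate waiting time while stopped; reset to zero once the vehicle moves."""
    if vehicle_speed >= moving_threshold:
        return 0.0
    return previous_wait + step


wait = 0.0
for speed in (0.0, 0.0, 0.05):  # three time steps below the threshold
    wait = update_waiting_time(wait, speed)
```

After the three steps `wait` equals 3.0, and a further call with a speed of 3.0 m/s resets it to 0.0.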

The calculation method of the reward mechanism is as follows:

where $R$ represents the reward mechanism; $L_i$ represents the queuing length of intersection lane $i$; $D_i$ represents the delay time of each lane; $W_i$ represents the waiting time of lane $i$; $C$ is the traffic light control scheme switching instruction; $N$ is the total number of vehicles passing through the intersection within a time interval; $T$ is the travel time of all vehicles passing through the intersection within the time interval; $l$ represents the number of all waiting vehicles on all lanes of the intersection; and $w_1$ to $w_4$ are parameters to be learned by the reinforcement learning model.
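Since the reward formula itself is not reproduced in the text, the following sketch assumes a weighted sum of the first four listed terms (queue length, delay, waiting time and the switching instruction), one per weight. The terms $N$, $T$ and $l$ are left out because their mapping to the four weights cannot be recovered. The weights are fixed, hypothetical values here, whereas the embodiment treats them as learned parameters.

```python
def reward(queue_lengths, delays, waiting_times, switched,
           w=(-0.25, -0.25, -0.25, -5.0)):
    """Assumed reward: weighted sum of congestion terms plus a switching penalty.

    Negative weights penalize queues, delay, waiting and phase switching.
    """
    w1, w2, w3, w4 = w
    return (
        w1 * sum(queue_lengths)
        + w2 * sum(delays)
        + w3 * sum(waiting_times)
        + w4 * (1.0 if switched else 0.0)
    )


r_switch = reward([4, 0], [0.5, 0.0], [8.0, 0.0], switched=True)
r_keep = reward([4, 0], [0.5, 0.0], [8.0, 0.0], switched=False)
```

With these inputs the two values differ only by the switching penalty, which makes the effect of the switching instruction easy to see in isolation.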

Further, the current traffic operation status at the intersection is obtained from the aforesaid step S1, and the goal of the agent is to find, starting from the initial traffic light timing scheme, a traffic light timing scheme that maximizes the reward. The reward mechanism represents the outcome after the agent selects a traffic light timing scheme, and serves to reward or penalize the traffic light timing scheme at the intersection.

A reinforcement learning method is adopted to construct a reinforcement learning model with the edge computing node as the agent, which takes the state space s and the traffic signal lamp control scheme a as input and outputs the action function value at time t.

The obtained reinforcement learning model is as follows:

where $a$ is the preset traffic signal lamp timing scheme, $Q$ is the function model updating formula, $\gamma$ is the discount factor, $\alpha$ is the learning rate, $R$ is the reward calculation method, and $\pi$ is the model strategy;

$Q_t^{\pi}(s, a)$ represents the Q function value at time $t$ under model strategy $\pi$, state space $s$ and action space $a$;

$Q_{t+1}^{\pi}(s, a)$ represents the Q function value at time $t+1$ under model strategy $\pi$, state space $s$ and action space $a$; and $f$ is the vehicle following model.

The reinforcement learning model includes the action reward and the maximum possible future reward at the next time $t+1$, where $f(Q)$ is the vehicle following model under conventional traffic flow.
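The combination of an immediate action reward with the maximum possible future reward is the defining feature of the standard tabular Q-learning update, which is sketched below as an illustration. The vehicle following term is omitted because its form is not given in the text, and the state labels, action names and hyperparameter values are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One standard Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (R + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)  # maximum future value
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
actions = ("keep_phase", "switch_phase")
q[("s1", "keep_phase")] = 2.0
value = q_update(q, "s0", "switch_phase", reward=-1.0,
                 next_state="s1", actions=actions)
```

Here the target is $-1.0 + 0.9 \times 2.0 = 0.8$, so the updated entry moves one tenth of the way from 0 toward 0.8.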

And if the traffic running state is the dynamic vehicle flow state, modifying the vehicle following mode g (Q) in the Q function on the basis of the reinforcement learning model to obtain a Q function value under the dynamic vehicle flow state so as to obtain the reinforcement learning model under the dynamic vehicle flow state. Here, the reinforcement learning model in the dynamic vehicle flow state is:

In this embodiment, the edge node uses a reinforcement learning method to obtain the state space of the agent from the traffic running state model, takes the initial signal lamp timing scheme as the action space of the agent, uses the evaluated traffic running state of the intersection as the reward mechanism, and optimizes the initial signal lamp timing scheme according to the algorithm to give the traffic signal lamp timing optimization result of the intersection.

Fig. 4 is a schematic flow chart of a regional road network traffic signal lamp coordination control method provided in an embodiment of the present invention. Referring to Fig. 4, the method includes:

Step 401, acquiring multi-source traffic data of a corresponding intersection, and establishing a traffic running state model; the multi-source traffic data comprises geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data;

Step 402, based on the traffic running state model and a preset initial signal lamp timing scheme, obtaining a traffic signal lamp timing optimization scheme corresponding to the intersection by adopting a reinforcement learning method, and uploading the traffic signal lamp timing optimization scheme to a cloud center;

Step 403, coordinating and optimizing the traffic signal lamp timing scheme of each intersection in the area by using a multi-agent reinforcement learning method in combination with the cloud center and other edge nodes in the area.
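The data flow of steps 401 to 403 can be sketched as follows. This is only a skeleton under stated assumptions: the field names, units, thresholds, and the functions `running_state`, `local_timing` and `coordinate` are hypothetical, and simple placeholder rules stand in for the reinforcement learning of step 402 and the multi-agent coordination of step 403, which are not implemented here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MultiSourceSample:
    """One reading per data source named in step 401 (units are assumptions)."""
    coil_count: int                # vehicles counted by the geomagnetic coil
    video_queue: int               # queue length estimated from road video
    radar_speed_kmh: float         # mean speed from microwave radar
    floating_car_speed_kmh: float  # mean speed from floating-car tracks


def running_state(sample):
    """Step 401 (sketch): fuse the sources into a coarse congestion label."""
    speed = mean([sample.radar_speed_kmh, sample.floating_car_speed_kmh])
    # Arbitrary illustrative thresholds, not values from the patent.
    if sample.video_queue > 10 or sample.coil_count > 50 or speed < 15:
        return "congested"
    return "free"


def local_timing(state, initial_green_s):
    """Step 402 (placeholder rule standing in for the learned optimiser)."""
    return initial_green_s + 10 if state == "congested" else initial_green_s


def coordinate(proposals, max_green_s=60):
    """Step 403 (placeholder): the cloud caps each node's proposed green time."""
    return {node: min(green, max_green_s) for node, green in proposals.items()}


sample = MultiSourceSample(coil_count=42, video_queue=14,
                           radar_speed_kmh=12.0, floating_car_speed_kmh=10.0)
state = running_state(sample)                     # edge node, step 401
proposals = {"node_A": local_timing(state, 55)}   # edge node, step 402
plan = coordinate(proposals)                      # cloud center, step 403
```

The split mirrors the text: the first two functions run on the edge node, and only the proposed timing scheme, not the raw sensor data, is sent to the cloud center.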

Referring to Figs. 1 and 4, the method may be executed by an edge node, specifically the edge node in the regional road network traffic signal lamp coordination control system. Because that system and its edge node are described in detail in the above embodiment, the regional road network traffic signal lamp coordination control method is not described in detail again here.

On the basis of the above embodiment, in step 401, the acquiring multi-source traffic data of the corresponding intersection and establishing a traffic running state model specifically include:

On the basis of the foregoing embodiment, in step 402, obtaining a traffic signal light timing optimization scheme corresponding to an intersection by using a reinforcement learning method based on the traffic running state model and a preset initial signal light timing scheme specifically includes:

By adopting edge computing, the regional road network traffic signal lamp coordination control method provided by the embodiment of the invention can effectively reduce system processing delay, reduce data transmission bandwidth and improve usability. Because intelligent control of traffic signal lamps requires real-time data acquisition and has strict latency requirements, deploying the calculation and control process in the cloud is difficult and disadvantageous; the embodiment of the invention addresses this by completing the control process through edge computing. Intelligent, independent edge nodes are set up based on a cloud-edge hybrid computing framework, and the computing power of the edge nodes is used to process multi-source traffic data and sense the traffic running state. The plurality of edge nodes and the cloud center act jointly, and a multi-agent reinforcement learning method is adopted to coordinate and optimize the traffic signal lamp timing scheme of each intersection in the area. The problem of urban traffic congestion can thereby be effectively addressed, and the passing efficiency of motor vehicles at the intersection is improved.

An embodiment of the present invention provides an electronic device. As shown in Fig. 5, the electronic device may include a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502 and the memory 503 communicate with each other via the communication bus 504. The processor 501 may call the logic instructions in the memory 503 to execute the regional road network traffic signal lamp coordination control method provided by the above embodiments, which includes, for example: acquiring multi-source traffic data of the corresponding intersection, and establishing a traffic running state model, the multi-source traffic data comprising geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data; based on the traffic running state model and a preset initial signal lamp timing scheme, obtaining a traffic signal lamp timing optimization scheme corresponding to the intersection by adopting a reinforcement learning method, and uploading the traffic signal lamp timing optimization scheme to a cloud center; and coordinating and optimizing the traffic signal lamp timing scheme of each intersection in the area by combining the cloud center and other edge nodes in the area and adopting a multi-agent reinforcement learning method.

An embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the regional road network traffic signal lamp coordination control method provided in the foregoing embodiments, which includes, for example: acquiring multi-source traffic data of the corresponding intersection, and establishing a traffic running state model, the multi-source traffic data comprising geomagnetic coil data, road video monitoring data, radar microwave data and floating car track data; based on the traffic running state model and a preset initial signal lamp timing scheme, obtaining a traffic signal lamp timing optimization scheme corresponding to the intersection by adopting a reinforcement learning method, and uploading the traffic signal lamp timing optimization scheme to a cloud center; and coordinating and optimizing the traffic signal lamp timing scheme of each intersection in the area by combining the cloud center and other edge nodes in the area and adopting a multi-agent reinforcement learning method.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. The regional road network traffic signal lamp coordination control system is characterized by comprising a cloud center, edge nodes and multi-source traffic data acquisition equipment, wherein the edge nodes and the multi-source traffic data acquisition equipment are arranged at each intersection in a region;

the edge node includes:

2. The regional road network traffic signal lamp coordination control system according to claim 1, wherein said traffic operation state modeling module specifically comprises:

3. The system of claim 2, wherein the single intersection signal lamp timing module specifically comprises:

4. The coordinated control system of traffic signal lights in regional road network as claimed in claim 3, wherein said multi-agent reinforcement learning method comprises:

5. The regional road network traffic signal lamp coordination control method of the regional road network traffic signal lamp coordination control system according to any one of claims 1 to 4, comprising:

6. The regional road network traffic signal lamp coordination control method according to claim 5, wherein the obtaining of the multi-source traffic data of the corresponding intersection and the establishment of the traffic running state model specifically comprise:

7. The regional road network traffic signal lamp coordination control method according to claim 5, wherein the obtaining of the traffic signal lamp timing optimization scheme corresponding to the intersection by adopting a reinforcement learning method based on the traffic running state model and a preset initial signal lamp timing scheme specifically comprises:

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor when executing said program performs the steps of the regional road network traffic signal light coordination control method according to any one of claims 6 to 7.

9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the regional road network traffic signal light coordination control method according to any of claims 6 to 7.