CN115512554B

CN115512554B - Parameter model training and traffic signal control method, device, equipment and medium

Info

Publication number: CN115512554B
Application number: CN202211071604.6A
Authority: CN
Inventors: 曾宏生; 周波; 王泽隆; 王凡; 陈永锋; 何径舟
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-09-02
Filing date: 2022-09-02
Publication date: 2023-07-28
Anticipated expiration: 2042-09-02
Also published as: CN115512554A

Abstract

The disclosure provides a parameter model training and traffic signal control method, device, equipment and medium, and relates to the fields of deep learning, intelligent traffic and the like. The specific implementation scheme is as follows: inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy to obtain first weights of at least two pieces of operation information in the traffic flow operation information; determining a first control parameter of a signal lamp on each road according to traffic flow operation information by adopting a signal lamp control strategy based on first weights of at least two pieces of operation information; responding to the first control parameters of the signal lamps to control the signal lamps, and acquiring first driving data of a plurality of vehicles on each road; determining a target index value according to the first driving data; and training the initial parameter model according to the target index value. Therefore, the control parameters of the signal lamps are predicted based on the trained initial parameter model, the reliability of the prediction result can be improved, each signal lamp is further controlled according to the reliable control parameters, and the traffic efficiency can be improved.

Description

Parameter model training and traffic signal control method, device, equipment and medium

Technical Field

The disclosure relates to the field of artificial intelligence, in particular to the technical fields of deep learning, intelligent traffic and the like, and particularly relates to a parameter model training and traffic signal control method, device, equipment and medium.

Background

With the continuous growth of urban populations and vehicle numbers, urban traffic networks are often congested, and traffic congestion causes serious pollution and economic cost. Controlling traffic signal lamps, for example the cycle duration of a signal lamp and the time allocation of its different signal phases, can improve traffic efficiency and relieve traffic congestion.

Disclosure of Invention

The present disclosure provides a parametric model training and traffic signal control method, apparatus, device and medium.

According to an aspect of the present disclosure, there is provided a parametric model training method of a signal lamp control strategy, including:

acquiring traffic flow operation information on each road in a first setting area, and inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy so as to determine first weights of at least two items of operation information in the traffic flow operation information according to output of the initial parameter model;

determining a first control parameter of a signal lamp on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on the first weights of the at least two pieces of operation information;

in response to controlling each signal lamp according to the first control parameter of each signal lamp, acquiring driving data of a plurality of first vehicles on each road to obtain first driving data of the plurality of first vehicles;

determining a target index value according to first driving data of the first vehicles, wherein the target index value is used for indicating the passing efficiency of the first vehicles on each road;

and training the initial parameter model according to the target index value to obtain a target parameter model.

According to another aspect of the present disclosure, there is provided a traffic signal control method including:

acquiring traffic flow operation information on each road in a second setting area;

inputting the traffic flow operation information into a target parameter model of a signal lamp control strategy to determine weights of at least two pieces of operation information in the traffic flow operation information according to the output of the target parameter model;

determining control parameters of signal lamps on each road by adopting the signal lamp control strategy according to the traffic flow operation information based on the weights of the at least two pieces of operation information;

and controlling each signal lamp according to the control parameters of each signal lamp.

According to still another aspect of the present disclosure, there is provided a parametric model training apparatus of a signal lamp control strategy, including:

the acquisition module is used for acquiring traffic flow operation information on each road in the first setting area;

the first determining module is used for inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy so as to determine first weights of at least two items of operation information in the traffic flow operation information according to the output of the initial parameter model;

the second determining module is used for determining first control parameters of the signal lamps on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on the first weights of the at least two pieces of operation information;

the collection module is used for, in response to controlling each signal lamp according to the first control parameter of each signal lamp, collecting driving data of a plurality of first vehicles on each road to obtain first driving data of the plurality of first vehicles;

a third determining module, configured to determine a target index value according to first driving data of the plurality of first vehicles, where the target index value is used to indicate traffic efficiency of the plurality of first vehicles on each road;

and the training module is used for training the initial parameter model according to the target index value to obtain a target parameter model.

According to still another aspect of the present disclosure, there is provided a traffic signal control apparatus including:

the acquisition module is used for acquiring traffic flow operation information on each road in the second setting area;

the first determining module is used for inputting the traffic flow operation information into a target parameter model of a signal lamp control strategy so as to determine the weight of at least two items of operation information in the traffic flow operation information according to the output of the target parameter model;

the second determining module is used for determining control parameters of the signal lamps on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on the weights of the at least two pieces of operation information;

and the control module is used for controlling each signal lamp according to the control parameters of each signal lamp.

According to still another aspect of the present disclosure, there is provided an electronic apparatus including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the parametric model training method of the signal lamp control strategy set forth in one aspect of the disclosure or to perform the traffic signal control method set forth in another aspect of the disclosure.

According to still another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the parametric model training method of the signal lamp control strategy set forth in the above aspect of the present disclosure, or to perform the traffic signal control method set forth in the above aspect of the present disclosure.

According to yet another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the parametric model training method of the signal lamp control strategy set forth in the above aspect of the present disclosure, or implements the traffic signal control method set forth in the above aspect of the present disclosure.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a schematic diagram of a configuration of an intelligent traffic signal control system;

FIG. 2 is a diagram of an intelligent traffic signal control system architecture employing a single point adaptive control algorithm;

FIG. 3 is a flowchart of a method for training a parametric model of a signal lamp control strategy according to an embodiment of the disclosure;

FIG. 4 is a flow chart of a parametric model training method of a signal lamp control strategy according to a second embodiment of the disclosure;

FIG. 5 is a flow chart of a parametric model training method of a signal lamp control strategy according to a third embodiment of the disclosure;

FIG. 6 is a flowchart of a parametric model training method of a signal lamp control strategy according to a fourth embodiment of the present disclosure;

FIG. 7 is a flow chart of a traffic signal control method according to a fifth embodiment of the disclosure;

FIG. 8 is a schematic architecture diagram of a signal lamp control system according to an embodiment of the disclosure;

FIG. 9 is a schematic diagram of an update flow of an evolutionary strategy algorithm provided in an embodiment of the disclosure;

FIG. 10 is a schematic view of a deployment flow of a traffic scene provided by an embodiment of the present disclosure;

FIG. 11 is a schematic structural diagram of a parametric model training device for a signal lamp control strategy according to a sixth embodiment of the present disclosure;

FIG. 12 is a schematic structural diagram of a traffic signal control device according to a seventh embodiment of the disclosure;

FIG. 13 illustrates a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

At present, an intelligent traffic signal lamp control system may be structured as shown in FIG. 1: traffic flow and queuing information in a traffic scene is acquired and input into an intelligent control model, which outputs control parameters of the signal lamp (including the cycle duration of the signal lamp and the time allocation of the different signal phases). The traffic signal lamp can then be dynamically controlled according to these control parameters, improving traffic efficiency.

In the related art, dynamic signal lamp control algorithms are mainly optimization schemes solved with operations research methods, for example the single-point adaptive control algorithm. The structure of an intelligent traffic signal lamp control system applying the single-point adaptive control algorithm may be as shown in FIG. 2. The single-point adaptive control algorithm is mainly based on a green-signal-ratio equalization optimization model (the prediction module in FIG. 2) and an optimal cycle optimization model (the optimization module in FIG. 2), and solves for the optimal control parameters of the signal lamp (including the cycle duration and the durations allocated to the different phases) according to real-time traffic flow information and queuing information.

The green-signal ratio refers to the proportion of one cycle of the traffic signal lamp that is available for vehicles to pass, namely the ratio of the effective green-light duration of one signal phase to the total cycle duration.
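This definition can be written as a formula. The symbols are introduced here for illustration only and do not appear in the disclosure: g_e denotes the effective green-light duration of one signal phase and C denotes the total cycle duration.

```latex
\lambda = \frac{g_e}{C}, \qquad 0 \le \lambda \le 1
```

For instance, a phase with 30 s of effective green in a 120 s cycle has a green-signal ratio of 30 / 120 = 0.25.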

In addition, methods based on reinforcement learning and neural networks have recently been applied to intelligent traffic signal lamp control systems. Generally, a simulation environment of city-level traffic is constructed (including modeling of the road network, signal lamps, traffic flow data, and the like); a policy model based on a neural network is then constructed to receive traffic observation states and output signal lamp control actions; the policy network is iteratively updated on the traffic simulator using a reinforcement learning algorithm (for example, DDPG (Deep Deterministic Policy Gradient) or SAC (Soft Actor-Critic)); and finally the policy model trained on the traffic simulator is deployed in a real traffic scene.

However, the single-point adaptive control algorithm (an operations research method) requires some hyperparameters to be set according to manual experience, such as the weight ratio between the traffic flow and the queuing value in the comprehensive flow value, and these hyperparameters must be adjusted according to the control effect in the traffic scene. Setting hyperparameters manually has the following two main problems:

First, when hyperparameters need to be set for a large number of intersections, manual tuning incurs a large labor cost, and the resulting hyperparameters are usually merely values that work well rather than the optimal solution of the model.

Second, when the traffic state varies widely within a short time, manually set hyperparameters cannot be changed quickly and flexibly, so the solution obtained from the model is not optimal.

Reinforcement-learning-based methods, in turn, suffer from the following drawbacks:

1. Because reinforcement learning algorithms have poor robustness and insufficient generalization capability, a model trained in a simulation environment is not easy to migrate directly to a real traffic scene.

2. They cannot be combined well with other policies and rule constraints that are already deployed online.

In view of at least one of the above problems, the present disclosure proposes a parametric model training and traffic signal control method, apparatus, device, and medium.

The parametric model training and traffic signal control methods, apparatuses, devices and media of embodiments of the present disclosure are described below with reference to the accompanying drawings.

Fig. 3 is a flowchart of a parametric model training method of a signal lamp control strategy according to an embodiment of the disclosure.

The embodiment of the disclosure is described by taking as an example the case in which the parametric model training method of the signal lamp control strategy is configured in a parametric model training apparatus of the signal lamp control strategy. The apparatus can be applied to any electronic device, so that the electronic device can perform the parametric model training function of the signal lamp control strategy.

The electronic device may be any device with computing capability, for example a personal computer, a mobile terminal, or a server. The mobile terminal may be, for example, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or another hardware device having an operating system, a touch screen, and/or a display screen.

As shown in fig. 3, the parametric model training method of the signal lamp control strategy may include the following steps:

Step 301, obtaining traffic flow operation information on each road in the first setting area, and inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy, so as to determine first weights of at least two items of operation information in the traffic flow operation information according to output of the initial parameter model.

In the embodiment of the present disclosure, the first setting area is a preset area, for example, a city or a province. That is, the present disclosure does not limit the granularity at which the first setting area is divided, which may be a township, town, county, district, city, province, country, or the like.

In the embodiment of the disclosure, information may be collected from vehicles on each road in the first setting area to obtain traffic flow operation information, where the traffic flow operation information may include traffic flow information, queuing information of the vehicles, running tracks of the vehicles, and other operation information.

In the disclosed embodiments, the signal control strategy may include, but is not limited to, a control strategy such as a dynamic signal control algorithm (e.g., a single point adaptive control algorithm).

In an embodiment of the present disclosure, the initial parameter model of the signal lamp control strategy is used to generate parameters (or hyperparameters) related to the signal lamp control strategy.

In the embodiment of the present disclosure, the traffic flow operation information may be input into the initial parameter model of the signal lamp control strategy to obtain the output of the initial parameter model, which indicates first weights of at least two items of operation information in the traffic flow operation information. The first weights, for example the weight of the traffic flow information and the weight of the queuing information, may therefore be determined according to the output of the initial parameter model.
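As a minimal sketch of what such a parameter model could look like, the following assumes a single linear layer followed by a softmax, so that the output weights are positive and sum to 1. The class name `LinearParameterModel`, the softmax normalisation, and the two normalised input features are all assumptions made for illustration; the disclosure does not specify the model architecture here.

```python
import math
from typing import Sequence


def softmax(scores: Sequence[float]) -> list[float]:
    """Map raw scores to positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LinearParameterModel:
    """Hypothetical parameter model: one linear score per operation-information item."""

    def __init__(self, coefficients: Sequence[Sequence[float]], biases: Sequence[float]):
        self.coefficients = [list(row) for row in coefficients]
        self.biases = list(biases)

    def predict_weights(self, features: Sequence[float]) -> list[float]:
        # One score per weight to be produced, then normalise with softmax.
        scores = [
            sum(c * x for c, x in zip(row, features)) + b
            for row, b in zip(self.coefficients, self.biases)
        ]
        return softmax(scores)


# Two assumed input features (normalised traffic flow, normalised queue length)
# are mapped to two weights (traffic flow weight, queuing weight).
model = LinearParameterModel(coefficients=[[1.0, 0.0], [0.0, 1.0]], biases=[0.0, 0.0])
flow_weight, queue_weight = model.predict_weights([0.5, 0.5])
```

With identical scores for both items, this sketch assigns each a weight of 0.5. A trained model would shift the balance as road conditions change.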

Step 302, determining a first control parameter of a signal lamp on each road according to the traffic flow operation information by adopting a signal lamp control strategy based on first weights of at least two items of operation information.

Wherein the first control parameter of each signal lamp includes, but is not limited to, control parameters such as period duration of the signal lamp, time allocation of different signal phases, and the like.

In the embodiment of the disclosure, a signal lamp control strategy can be adopted to determine a first control parameter of a signal lamp on each road according to traffic flow operation information based on first weights of at least two pieces of operation information.

Taking the single-point adaptive control algorithm as an example of the signal lamp control strategy, the initial parameter model can be used to generate the hyperparameters of the single-point adaptive control algorithm (such as the weight of the traffic flow information, i.e. the traffic flow weight, and the weight of the queuing information, i.e. the queuing value weight). Based on these hyperparameters, the single-point adaptive control algorithm can then determine the first control parameters of the signal lamps on each road according to the traffic flow operation information.
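To illustrate how predicted weights might feed a control strategy, the sketch below combines the traffic flow and queuing value of each phase into a weighted comprehensive flow value and splits a fixed cycle in proportion to it. This proportional split is a deliberately simplified stand-in and is not the single-point adaptive control algorithm itself, whose green-signal-ratio equalization and optimal cycle models are not detailed in this section. The function names, the phase names, and the fixed cycle length are hypothetical.

```python
from typing import Mapping


def composite_flow(flow: float, queue: float, flow_weight: float, queue_weight: float) -> float:
    """Weighted combination of traffic flow and queuing value for one phase."""
    return flow_weight * flow + queue_weight * queue


def allocate_green(
    phase_observations: Mapping[str, tuple[float, float]],
    flow_weight: float,
    queue_weight: float,
    cycle_seconds: float,
) -> dict[str, float]:
    """Split a fixed cycle across phases in proportion to their composite flow values."""
    if not phase_observations:
        return {}
    demand = {
        phase: composite_flow(flow, queue, flow_weight, queue_weight)
        for phase, (flow, queue) in phase_observations.items()
    }
    total = sum(demand.values())
    if total <= 0:
        # No demand observed: fall back to an equal split (an assumption of this sketch).
        return {phase: cycle_seconds / len(demand) for phase in demand}
    return {phase: cycle_seconds * value / total for phase, value in demand.items()}


# Each phase maps to (traffic flow, queuing value); the numbers are made up.
observations = {"north_south": (30.0, 10.0), "east_west": (10.0, 10.0)}
green = allocate_green(observations, flow_weight=0.5, queue_weight=0.5, cycle_seconds=90.0)
```

Here the north-south phase has a composite value of 20 and the east-west phase 10, so the 90 s cycle is split into 60 s and 30 s. Changing the weights changes the split, which is the lever the parameter model controls.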

Step 303, in response to controlling each signal lamp according to the first control parameter of each signal lamp, acquiring driving data of a plurality of first vehicles on each road, so as to obtain first driving data of the plurality of first vehicles.

In the embodiment of the disclosure, the corresponding signal lamp may be controlled according to the first control parameter of each signal lamp, and the driving data of the plurality of first vehicles on each road may be collected, so as to obtain the first driving data of the plurality of first vehicles. The first travel data may include a track point at which the first vehicle travels, position information of each track point, a travel time stamp of each track point, and the like.

Step 304, determining a target index value according to the first driving data of the plurality of first vehicles, wherein the target index value is used for indicating the traffic efficiency of the plurality of first vehicles on each road.

In the embodiment of the disclosure, a target index value may be determined according to first driving data of the plurality of first vehicles, where the target index value is used to indicate traffic efficiency of the plurality of first vehicles on each road. For example, the higher the passing efficiency of the plurality of first vehicles on each road, the larger the target index value, whereas the lower the passing efficiency of the plurality of first vehicles on each road, the smaller the target index value.

Step 305, training the initial parameter model according to the target index value to obtain a target parameter model.

In the embodiment of the disclosure, the initial parameter model may be trained according to the target index value to obtain the target parameter model.
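Steps 301 to 304 form one evaluation pass whose result drives the training in step 305. The sketch below wires the four stages together as plain callables. The function `evaluate_once` and the toy lambdas passed to it are hypothetical placeholders, not components described in the disclosure.

```python
from typing import Callable, Sequence

Observation = Sequence[float]
Weights = Sequence[float]


def evaluate_once(
    observation: Observation,
    parameter_model: Callable[[Observation], Weights],
    control_strategy: Callable[[Observation, Weights], dict],
    run_traffic: Callable[[dict], list],
    target_index: Callable[[list], float],
) -> float:
    """Run steps 301-304 once and return the target index value used in step 305."""
    weights = parameter_model(observation)            # step 301: predict weights
    control = control_strategy(observation, weights)  # step 302: control parameters
    driving_data = run_traffic(control)               # step 303: collect driving data
    return target_index(driving_data)                 # step 304: score the outcome


# Toy stand-ins, for illustration only: fixed weights, a green time derived from
# the weighted observation, one simulated delay, and the negated mean delay.
score = evaluate_once(
    observation=[0.4, 0.6],
    parameter_model=lambda obs: [0.5, 0.5],
    control_strategy=lambda obs, w: {"green_seconds": 60.0 * sum(o * x for o, x in zip(obs, w))},
    run_traffic=lambda control: [90.0 - control["green_seconds"]],
    target_index=lambda delays: -sum(delays) / len(delays),
)
```

Keeping the stages as separate callables mirrors the way the description treats the control strategy as an existing component whose weights alone are learned.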

As an application scenario, road information of each road and signal lamp information (which is subsequently recorded as road network topology) on each road in a first setting area (for example, a city to be optimized) may be obtained, traffic flow operation data (including a vehicle driving track and the like) in a set period of time may be obtained, a traffic simulator may be constructed according to the road network topology, and the traffic flow operation data may be loaded on the traffic simulator, so as to restore real traffic operation.

In addition, the online signal lamp control strategy (such as the single-point adaptive control algorithm) can be connected to the traffic simulator, so that the signal lamps in the traffic simulator are controlled according to the control parameters output by the signal lamp control strategy. The related parameters of the single-point adaptive control algorithm that were originally set by manual experience or expert domain knowledge (such as the weights used to combine the traffic flow and queuing features) can instead be predicted by the initial parameter model, whose input may be real-time road condition information (such as the traffic flow and queuing information in the traffic flow operation data). A target index value can be determined according to indicators fed back by the traffic simulator, such as the average vehicle delay, and the initial parameter model is trained according to the target index value so as to maximize the average traffic efficiency throughout the day.

According to the parameter model training method of the signal lamp control strategy, the traffic flow operation information on each road in the first setting area is input into the initial parameter model of the signal lamp control strategy, so that the first weight of at least two pieces of operation information in the traffic flow operation information is determined according to the output of the initial parameter model; determining a first control parameter of a signal lamp on each road according to traffic flow operation information by adopting a signal lamp control strategy based on first weights of at least two pieces of operation information; responding to the first control parameters of the signal lamps to control the signal lamps, and acquiring driving data of a plurality of first vehicles on each road to obtain first driving data of the first vehicles; determining a target index value according to first driving data of the first vehicles, wherein the target index value is used for indicating the passing efficiency of the first vehicles on each road; and training the initial parameter model according to the target index value to obtain a target parameter model. Therefore, the initial parameter model of the signal lamp control strategy can be trained based on the deep learning technology, so that the control parameters of the signal lamps are predicted based on the trained initial parameter model, the accuracy and reliability of a prediction result can be improved, each signal lamp is controlled according to the reliable control parameters, and the traffic efficiency can be improved.

It should be noted that, in the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information are all carried out with the users' consent, comply with relevant laws and regulations, and do not violate public order and good customs.

In order to clearly explain how the target index value is determined according to the first driving data of the plurality of first vehicles in the above embodiments of the present disclosure, the present disclosure also proposes a parametric model training method of a signal lamp control strategy.

Fig. 4 is a flowchart of a parametric model training method of a signal lamp control strategy according to a second embodiment of the disclosure.

As shown in fig. 4, the parametric model training method of the signal lamp control strategy may include the following steps:

Step 401, obtaining traffic flow operation information on each road in the first setting area, and inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy, so as to determine first weights of at least two items of operation information in the traffic flow operation information according to output of the initial parameter model.

Step 402, determining a first control parameter of a signal lamp on each road according to the traffic flow operation information by adopting a signal lamp control strategy based on first weights of at least two items of operation information.

Step 403, in response to controlling each signal lamp according to the first control parameter of each signal lamp, acquiring driving data of a plurality of first vehicles on each road, so as to obtain first driving data of the plurality of first vehicles.

The explanation of steps 401 to 403 may be referred to the relevant description in any embodiment of the present disclosure, and will not be repeated here.

Step 404, for any one of the plurality of first vehicles, determining a driving start point and a driving end point of the first vehicle, and an actual driving duration and an actual driving distance of the first vehicle from the driving start point to the driving end point according to first driving data of the first vehicle.

In the embodiment of the disclosure, for any one of the plurality of first vehicles, the driving start point and the driving end point of the first vehicle, and the actual driving duration and the actual driving distance of the first vehicle from the driving start point to the driving end point may be determined according to the first driving data of the first vehicle.

For example, the first travel data may include position information of each track point of the first vehicle travel and a travel time stamp of each track point, a travel start point and a travel end point may be determined from each track point, an actual travel time period may be determined according to a difference between the first travel time stamp corresponding to the travel start point and the second travel time stamp corresponding to the travel end point, and an actual travel distance may be determined according to position information of each track point between the travel start point and the travel end point.
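A minimal sketch of this computation follows. It assumes that positions are planar coordinates in metres, that the earliest and latest track points are taken as the driving start point and end point, and that distance is the sum of straight-line segments between consecutive points. The `TrackPoint` type is hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TrackPoint:
    x: float          # planar position in metres (assumed coordinate system)
    y: float
    timestamp: float  # travel time stamp in seconds


def actual_duration_and_distance(track: Sequence[TrackPoint]) -> tuple[float, float]:
    """Return (actual driving duration in s, actual driving distance in m) for one vehicle."""
    if len(track) < 2:
        raise ValueError("need at least two track points")
    ordered = sorted(track, key=lambda p: p.timestamp)
    # Duration: time stamp of the end point minus time stamp of the start point.
    duration = ordered[-1].timestamp - ordered[0].timestamp
    # Distance: sum of segment lengths between consecutive track points.
    distance = sum(
        math.hypot(b.x - a.x, b.y - a.y) for a, b in zip(ordered, ordered[1:])
    )
    return duration, distance


track = [TrackPoint(0, 0, 0), TrackPoint(30, 40, 5), TrackPoint(30, 100, 12)]
duration, distance = actual_duration_and_distance(track)
```

For this track the duration is 12 s and the distance is 50 m + 60 m = 110 m. With latitude and longitude positions, the segment length would need a geodesic distance instead of `math.hypot`.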

Step 405, determining a reference travel duration of the first vehicle according to an actual travel distance between the travel start point and the travel end point.

In the embodiment of the present disclosure, the reference travel duration of the first vehicle may be determined according to an actual travel distance between the travel start point and the travel end point. The reference driving time length and the actual driving distance are in positive correlation, namely, the longer the actual driving distance is, the larger the reference driving time length is, otherwise, the shorter the actual driving distance is, and the smaller the reference driving time length is.

As an example, a reference vehicle speed may be preset, and the reference travel time period may be determined based on a ratio between the actual travel distance and the reference vehicle speed.

Step 406, determining a traffic delay time of the first vehicle according to the first difference between the actual running time and the reference running time.

In the embodiment of the present disclosure, the passage delay period of the first vehicle may be determined from a difference (noted as a first difference in the present disclosure, such as a difference value, an absolute value of the difference value, a square of the difference value, or the like) between the actual travel period and the reference travel period. The passing delay time length and the first difference are in positive correlation.
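The reference duration and the delay can be sketched as two small functions. The `mode` argument selects among the three forms of the first difference named above (difference value, absolute value, square). The function names, the `mode` labels, and the example numbers are assumptions for illustration.

```python
def reference_duration(distance_m: float, reference_speed_mps: float) -> float:
    """Reference travel duration: actual distance divided by a preset reference speed."""
    if reference_speed_mps <= 0:
        raise ValueError("reference speed must be positive")
    return distance_m / reference_speed_mps


def traffic_delay(actual_s: float, reference_s: float, mode: str = "signed") -> float:
    """Traffic delay derived from the first difference between actual and reference durations."""
    difference = actual_s - reference_s
    if mode == "signed":
        return difference
    if mode == "absolute":
        return abs(difference)
    if mode == "squared":
        return difference ** 2
    raise ValueError(f"unknown mode: {mode}")


# A 600 m trip at an assumed reference speed of 10 m/s that actually took 90 s.
reference_s = reference_duration(distance_m=600.0, reference_speed_mps=10.0)
delay_s = traffic_delay(actual_s=90.0, reference_s=reference_s)
```

In this example the reference duration is 60 s and the delay is 30 s.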

Step 407, determining a target index value according to the traffic delay time lengths of the plurality of first vehicles.

In the embodiment of the disclosure, the target index value may be determined according to the traffic delay time periods of the plurality of first vehicles.

As a possible implementation manner, the traffic delay durations of the plurality of first vehicles may be accumulated to obtain a first sum value, and the target index value is determined according to the first sum value, where the target index value and the first sum value have a negative correlation.

As another possible implementation manner, the traffic delay durations of the plurality of first vehicles may be weighted and summed to obtain a second sum value, and the target index value is determined according to the second sum value, where the target index value and the second sum value have a negative correlation.

As yet another possible implementation manner, a mean value of the traffic delay durations of the plurality of first vehicles may be determined, and a target index value is determined according to the mean value, where the target index value and the mean value have a negative correlation.

It should be understood that the longer the traffic delay duration of a vehicle, the lower its traffic efficiency, and the shorter the traffic delay duration, the higher its traffic efficiency. Determining the target index value, which indicates the traffic efficiency of the plurality of first vehicles on each road, from the traffic delay durations of the vehicles (for example, their mean value) can therefore improve the accuracy and reliability of the determination result.

Of course, the target index value may also be determined based on other algorithms according to the traffic delay durations of the plurality of first vehicles, which is not limited by the present disclosure.

Therefore, the target index value can be determined based on different modes, and the flexibility and applicability of the method can be improved.
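As a non-limiting illustration, the three ways of determining the target index value described above can be sketched as follows. The function names `traffic_delay` and `target_index`, the use of the plain difference as the first difference, and the use of simple negation to realize the negative correlation are all assumptions made for this sketch and are not specified by the disclosure.

```python
from statistics import mean


def traffic_delay(actual_s, reference_s):
    """Traffic delay duration of one vehicle, in seconds.

    Assumption: the first difference is the plain difference between the
    actual and reference travel durations (the disclosure also allows the
    absolute value or the square of the difference).
    """
    return actual_s - reference_s


def target_index(delays, mode="mean", weights=None):
    """Map per-vehicle delays to a target index value.

    Assumption: the negative correlation is realized by negation, so a
    smaller aggregate delay yields a larger index value.
    """
    if mode == "sum":            # first sum value
        aggregate = sum(delays)
    elif mode == "weighted":     # second sum value
        aggregate = sum(w * d for w, d in zip(weights, delays))
    elif mode == "mean":         # mean value
        aggregate = mean(delays)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return -aggregate


# Hypothetical (actual, reference) travel durations of three first vehicles.
delays = [traffic_delay(120.0, 90.0),
          traffic_delay(60.0, 60.0),
          traffic_delay(200.0, 140.0)]
index_by_mean = target_index(delays, mode="mean")
```

Any other monotonically decreasing mapping from the aggregate delay to the index value would satisfy the negative correlation equally well.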

Step 408, training the initial parameter model according to the target index value to obtain a target parameter model.

In any embodiment of the present disclosure, the initial parameter model may be trained according to the target index value so as to maximize the target index value.

It should be noted that the foregoing example only takes maximization of the target index value as the termination condition of model training. Other termination conditions may be set in practical applications, for example, the number of training iterations reaching a set number, the training duration reaching a set duration, or the target index value converging, which is not limited by the disclosure.

Therefore, by taking maximization of the target index value as the termination condition of model training, the trained model can learn the optimal parameters, and controlling the signal lamps according to these parameters can improve traffic efficiency.

According to the parameter model training method of the signal lamp control strategy, a running starting point and a running end point of a first vehicle are determined according to first running data of the first vehicle for any one of a plurality of first vehicles, and an actual running duration and an actual running distance from the running starting point to the running end point of the first vehicle are determined; determining a reference driving duration of the first vehicle according to the actual driving distance between the driving starting point and the driving end point; determining a traffic delay time of the first vehicle according to a first difference between the actual running time and the reference running time; and determining the target index value according to the traffic delay time of the plurality of first vehicles. In sum, the target index value can indicate the passing delay time and the passing efficiency of the vehicle, so that the initial parameter model is trained according to the target index value, the trained model can optimize the passing efficiency and the passing delay time of the vehicle, and the passing efficiency of traffic is improved.

In order to clearly illustrate how the initial parameter model is trained according to the target index value in any embodiment of the disclosure, the disclosure further provides a parameter model training method of the signal lamp control strategy.

Fig. 5 is a flowchart of a parametric model training method of a signal lamp control strategy according to a third embodiment of the present disclosure.

As shown in fig. 5, the parametric model training method of the signal lamp control strategy may include the following steps:

Step 501, obtaining traffic flow operation information on each road in a first setting area, and inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy, so as to determine first weights of at least two items of operation information in the traffic flow operation information according to output of the initial parameter model.

Step 502, determining a first control parameter of a signal lamp on each road according to the traffic flow operation information by adopting a signal lamp control strategy based on first weights of at least two pieces of operation information.

Step 503, in response to controlling each signal lamp according to the first control parameter of each signal lamp, collecting driving data of a plurality of first vehicles on each road, so as to obtain first driving data of the plurality of first vehicles.

Step 504, determining a target index value according to the first driving data of the plurality of first vehicles, wherein the target index value is used for indicating the passing efficiency of the plurality of first vehicles on each road.

The explanation of steps 501 to 504 may be referred to the relevant description in any embodiment of the disclosure, and will not be repeated here.

Step 505, a reference parameter model of the signal lamp control strategy is obtained, wherein the reference parameter model is obtained by adding noise to the initial parameter model.

In the embodiment of the disclosure, noise may be added to the initial parameter model to obtain the reference parameter model, where the noise is a vector whose dimension matches, or is the same as, the dimension of the model parameters in the initial parameter model.

The number of reference parameter models may be one or more, which is not limited in this disclosure. For example, when there are multiple reference parameter models, different noise may be added to the initial parameter model to obtain each of them; that is, the noise added for each reference parameter model is different.
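As a non-limiting illustration, obtaining one or more reference parameter models by adding noise can be sketched as follows. The model parameters are represented as a flat list of floats, and the function name `make_reference_params`, the Gaussian noise distribution, and the standard deviation `sigma` are assumptions made for this sketch; the disclosure only requires that the noise be a vector matching the dimension of the model parameters.

```python
import random


def make_reference_params(params, num_refs, sigma=0.1, seed=0):
    """Return (noises, reference_params) for `num_refs` reference models.

    Each noise vector has the same dimension as `params`, and a different
    noise vector is drawn for each reference model.
    Assumption: the noise is zero-mean Gaussian with standard deviation sigma.
    """
    rng = random.Random(seed)
    noises, references = [], []
    for _ in range(num_refs):
        noise = [rng.gauss(0.0, sigma) for _ in params]
        noises.append(noise)
        references.append([p + e for p, e in zip(params, noise)])
    return noises, references


# Hypothetical initial model parameters and three reference models.
initial_params = [0.2, -0.5, 1.0, 0.0]
noises, reference_params = make_reference_params(initial_params, num_refs=3)
```

The noise vectors are returned alongside the perturbed parameters because they are needed again when the second difference is fused with the noise.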

Step 506, inputting the traffic flow operation information into the reference parameter model to determine a second weight of at least two items of operation information in the traffic flow operation information according to the output of the reference parameter model.

In the embodiment of the disclosure, the traffic flow operation information may be input into the reference parameter model to obtain an output of the reference parameter model, where the output of the reference parameter model is used to indicate a second weight of at least two pieces of operation information in the traffic flow operation information, so in the disclosure, the second weight of at least two pieces of operation information in the traffic flow operation information, for example, a weight of traffic flow information, a weight of queuing information, and the like, may be determined according to the output of the reference parameter model.

Step 507, determining a second control parameter of the signal lamp on each road according to the traffic flow operation information by adopting a signal lamp control strategy based on the second weights of at least two pieces of operation information.

Wherein the second control parameter of each signal lamp comprises, but is not limited to, control parameters such as period duration of the signal lamp, time allocation of different signal phases, and the like.

In the embodiment of the disclosure, a signal lamp control strategy can be adopted to determine the second control parameters of the signal lamps on each road according to the traffic flow operation information based on the second weights of at least two items of operation information. The specific implementation principle is similar to that of step 302, and will not be described here again.

Step 508, in response to controlling each signal lamp according to the second control parameter of each signal lamp, collecting driving data of a plurality of second vehicles on each road to obtain second driving data of the plurality of second vehicles.

Wherein the second vehicle may be the same as the first vehicle, or the second vehicle may be different from the first vehicle, as this disclosure is not limited in this regard.

In the embodiment of the disclosure, the corresponding signal lamp may be controlled according to the second control parameter of each signal lamp, and the driving data of the plurality of second vehicles on each road may be collected, so as to obtain the second driving data of the plurality of second vehicles. The second driving data may include a track point where the second vehicle is driving, position information of each track point, a driving time stamp of each track point, and the like.

Step 509, determining a reference index value according to second driving data of the plurality of second vehicles; the reference index value is used for indicating the passing efficiency of the plurality of second vehicles on each road.

In the embodiment of the disclosure, a reference index value may be determined according to the second driving data of the plurality of second vehicles, wherein the reference index value is used to indicate the passing efficiency of the plurality of second vehicles on each road. For example, the higher the passing efficiency of the plurality of second vehicles on each road, the larger the reference index value; conversely, the lower the passing efficiency, the smaller the reference index value. The specific implementation is similar to the manner of determining the target index value and is not repeated here.

Step 510, training the initial parameter model according to the target index value and the reference index value.

In the embodiment of the disclosure, the initial parameter model may be trained according to the target index value and the reference index value.

According to the parameter model training method of the signal lamp control strategy, training of the initial parameter model is guided by the interaction between the noise-added reference parameter model and the initial parameter model, which can improve the training effect of the initial parameter model and thereby the prediction accuracy of the model.

In order to clearly explain how to train the initial parameter model according to the target index value and the reference index value in the above embodiment, the present disclosure also provides a parameter model training method of the signal lamp control strategy.

Fig. 6 is a flowchart of a parametric model training method of a signal lamp control strategy according to a fourth embodiment of the present disclosure.

As shown in fig. 6, the parametric model training method of the signal lamp control strategy may include the following steps:

Step 601, obtaining traffic flow operation information on each road in a first setting area, and inputting the traffic flow operation information into an initial parameter model of a signal lamp control strategy, so as to determine first weights of at least two items of operation information in the traffic flow operation information according to output of the initial parameter model.

Step 602, determining a first control parameter of a signal lamp on each road according to the traffic flow operation information by adopting a signal lamp control strategy based on first weights of at least two items of operation information.

Step 603, in response to controlling each signal lamp according to the first control parameter of each signal lamp, collecting driving data of a plurality of first vehicles on each road, so as to obtain first driving data of the plurality of first vehicles.

Step 604, determining a target index value according to the first driving data of the plurality of first vehicles, wherein the target index value is used for indicating the passing efficiency of the plurality of first vehicles on each road.

Step 605, a reference parameter model of the signal lamp control strategy is obtained, wherein the reference parameter model is obtained by adding noise to the initial parameter model.

Step 606, inputting the traffic flow operation information into the reference parameter model to determine a second weight of at least two items of operation information in the traffic flow operation information according to the output of the reference parameter model.

Step 607, determining a second control parameter of the signal lights on each road according to the traffic flow operation information by adopting the signal light control strategy based on the second weights of the at least two pieces of operation information.

Step 608, in response to controlling each signal lamp according to the second control parameter of each signal lamp, collecting driving data of a plurality of second vehicles on each road to obtain second driving data of the plurality of second vehicles.

Step 609, determining a reference index value according to second driving data of the plurality of second vehicles; the reference index value is used for indicating the passing efficiency of the plurality of second vehicles on each road.

The explanation of steps 601 to 609 may be referred to the relevant description in any embodiment of the present disclosure, and will not be repeated here.

Step 610, a second difference between the target index value and the reference index value is determined.

In embodiments of the present disclosure, a second difference (e.g., a difference value, an absolute value of the difference value, a square of the difference value, etc.) between the target index value and the reference index value may be determined.

Step 611, fusing the second difference and the noise to obtain fusion data.

In the embodiment of the disclosure, the second difference and the noise may be fused to obtain fusion data.

As a possible implementation, when the number of reference parameter models is one, since the noise is a vector and the second difference is a scalar, the second difference may be multiplied by the noise to obtain the fusion data.

As another possible implementation, when the number of reference parameter models is more than one, for each reference parameter model, the noise corresponding to that reference parameter model may be multiplied by the corresponding second difference to obtain intermediate data, and the intermediate data of the multiple reference parameter models may be added to obtain the fusion data.

Therefore, training of the initial parameter model can be guided by the interaction between the initial parameter model and multiple reference parameter models with different added noise, which can improve the training effect of the initial parameter model and thereby the prediction accuracy of the model.

Step 612, updating the first weights of the at least two pieces of operation information according to the fusion data to obtain the third weights of the at least two pieces of operation information.

In the embodiment of the disclosure, the first weights of at least two pieces of operation information may be updated according to the fusion data, so as to obtain the third weights of at least two pieces of operation information.

For example, the output of the initial parametric model (which indicates the first weights of the at least two items of operation information) may be represented as a vector, and this output may be added to the fusion data (also a vector) to obtain target data indicating the third weights of the at least two items of operation information.
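As a non-limiting illustration, steps 610 to 612 can be sketched as follows. The function names `fuse` and `third_weights` and the optional `step` scaling factor are assumptions made for this sketch; with `step=1.0` the update reduces to the plain addition described above. The second difference is taken here as the reference index value minus the target index value, which is one of the forms the disclosure permits.

```python
def fuse(second_diffs, noises):
    """Fusion data: sum over reference models of (second difference * noise).

    `second_diffs` holds one scalar per reference parameter model and
    `noises` holds the matching noise vectors. With a single reference
    model this reduces to multiplying the scalar by the noise vector.
    """
    dim = len(noises[0])
    fused = [0.0] * dim
    for diff, noise in zip(second_diffs, noises):
        for j in range(dim):
            fused[j] += diff * noise[j]
    return fused


def third_weights(first_weights, fused, step=1.0):
    """Add the fusion data to the model output to obtain the third weights.

    Assumption: `step` is a hypothetical scaling factor that is not part of
    the disclosure; step=1.0 gives the plain addition.
    """
    return [w + step * f for w, f in zip(first_weights, fused)]


# Two hypothetical reference models with two-dimensional noise vectors.
noises = [[1.0, -1.0], [0.5, 0.5]]
second_diffs = [2.0, -4.0]          # reference index value minus target index value
fused = fuse(second_diffs, noises)
updated = third_weights([1.0, 2.0], fused)
```

A reference model that scored better than the initial model pulls the weights toward its noise direction, while one that scored worse pushes them away.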

Step 613, training the initial parametric model based on the third weights of the at least two pieces of operation information to obtain a target parametric model.

In an embodiment of the present disclosure, the initial parametric model may be trained based on a third weight of at least two items of operational information to obtain the target parametric model.

In one possible implementation of the embodiment of the present disclosure, a signal lamp control strategy may be adopted to determine a third control parameter of the signal lamp on each road according to the traffic flow operation information based on the third weights of the at least two items of operation information; the specific implementation is the same as step 302 and is not repeated here. Each signal lamp may then be controlled according to its third control parameter, and driving data of a plurality of third vehicles on each road may be collected to obtain third driving data of the plurality of third vehicles. The third vehicles may be the same as the first vehicles and the second vehicles, or may be different from them, which is not limited by the disclosure.

Then, an updated index value may be determined according to the third driving data of the plurality of third vehicles, where the updated index value is used to indicate the passing efficiency of the plurality of third vehicles on each road; the specific implementation is the same as the manner of determining the target index value and is not repeated here. The initial parameter model can then be trained according to the updated index value to obtain the target parameter model. In this way, training of the initial parameter model is guided by the interaction between the noise-added reference parameter model and the initial parameter model, which can improve the training effect of the initial parameter model and thereby the prediction accuracy of the model.

As a possible implementation, the initial parametric model may be trained according to the updated index value so as to maximize the updated index value.

It should be noted that the foregoing example only takes maximization of the updated index value as the termination condition of model training. Other termination conditions may be set in practical applications, for example, the number of training iterations reaching a set number, the training duration reaching a set duration, or the updated index value converging, which is not limited in this disclosure.

Therefore, by taking maximization of the updated index value as the termination condition of model training, the trained model can learn the optimal parameters, and controlling the signal lamps according to these parameters can improve the traffic efficiency of vehicles.

The above embodiments correspond to the parameter model training method of the signal lamp control strategy. The disclosure further provides a method of applying the trained parameter model, namely a traffic signal control method.

Fig. 7 is a flow chart of a traffic signal control method according to a fifth embodiment of the disclosure.

As shown in fig. 7, the traffic signal control method may include the steps of:

Step 701, acquiring traffic flow operation information on each road in the second setting area.

In the embodiment of the present disclosure, the second setting area is a preset area, for example, a city, a province, or the like. That is, in the present disclosure, the division granularity of the second setting area is not limited, and may be a town, county, district, city, province, country, or the like.

The second setting area may be the same as the first setting area, or may be different from the first setting area, which is not limited in the present disclosure.

In the embodiment of the disclosure, information collection may be performed on the vehicles on each road in the second setting area to obtain traffic flow operation information, where the traffic flow operation information may include traffic flow information, queuing information of the vehicles, traffic flow information of the vehicles, running track of the vehicles, and other operation information.

Step 702, inputting the traffic flow operation information into a target parameter model of the signal lamp control strategy, so as to determine weights of at least two pieces of operation information in the traffic flow operation information according to output of the target parameter model.

The signal lamp control strategy may include, but is not limited to, control strategies such as dynamic signal control algorithms (e.g., a single-point adaptive control algorithm).

The target parameter model of the signal lamp control strategy can be trained by adopting any method embodiment.

In the embodiment of the disclosure, the traffic flow operation information may be input into the target parameter model of the signal lamp control strategy to obtain an output of the target parameter model, where the output indicates weights of at least two items of operation information in the traffic flow operation information. Accordingly, in the disclosure, the weights of at least two items of operation information in the traffic flow operation information, for example, the weight of traffic flow information, the weight of queuing information, and the like, may be determined according to the output of the target parameter model.

Step 703, determining control parameters of signal lamps on each road according to the traffic flow operation information by adopting a signal lamp control strategy based on the weights of at least two pieces of operation information.

The control parameters of each signal lamp include, but are not limited to, control parameters such as period duration of the signal lamp, time distribution of different signal phases, and the like.

In the embodiment of the disclosure, the signal lamp control strategy can be adopted to determine the control parameters of the signal lamps on each road according to the traffic flow operation information based on the weights of at least two pieces of operation information.

Taking the case where the signal lamp control strategy is a single-point adaptive control algorithm as an example, the target parameter model can be used to generate the hyperparameters of the single-point adaptive control algorithm, such as the weight of traffic flow information (i.e., the traffic flow weight) and the weight of queuing information (i.e., the queuing weight). Based on these hyperparameters, the single-point adaptive control algorithm can then be used to determine the control parameters of the signal lamps on each road according to the traffic flow operation information.
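As a non-limiting illustration of how such weights might enter a control strategy, the following sketch allocates green time among signal phases in proportion to a weighted combination of traffic flow and queue length. The disclosure does not specify the internals of the single-point adaptive control algorithm, so the function `allocate_green`, the proportional allocation rule, the fixed cycle duration, and the minimum green time are all hypothetical assumptions of this sketch.

```python
def allocate_green(phases, w_flow, w_queue, cycle_s=120.0, min_green_s=10.0):
    """Split a fixed cycle among phases by weighted demand (hypothetical rule).

    `phases` is a list of dicts with "flow" (vehicles per interval) and
    "queue" (queued vehicles). `w_flow` and `w_queue` stand in for the
    weights output by the parameter model. Yellow and all-red intervals
    are ignored for simplicity.
    """
    scores = [w_flow * p["flow"] + w_queue * p["queue"] for p in phases]
    total = sum(scores)
    if total <= 0:
        # No measured demand: fall back to an even split of the cycle.
        return [cycle_s / len(phases)] * len(phases)
    spare = cycle_s - min_green_s * len(phases)
    return [min_green_s + spare * s / total for s in scores]


# Two hypothetical phases; the first carries more traffic flow.
phases = [{"flow": 30.0, "queue": 10.0}, {"flow": 10.0, "queue": 10.0}]
greens = allocate_green(phases, w_flow=0.5, w_queue=0.5)
```

Changing the two weights shifts green time between phases without changing the allocation rule itself, which is the role the hyperparameters play in the description above.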

Step 704, controlling each signal lamp according to the control parameters of each signal lamp.

In the embodiment of the disclosure, for each signal lamp, the signal lamp can be controlled according to the control parameter of the signal lamp.

According to the traffic signal control method, the traffic flow operation information on each road in the second setting area is input into a target parameter model of a signal lamp control strategy, so that the weights of at least two pieces of operation information in the traffic flow operation information are determined according to the output of the target parameter model; based on the weights of the at least two pieces of operation information, control parameters of signal lamps on each road are determined according to the traffic flow operation information by adopting the signal lamp control strategy; and each signal lamp is controlled according to its control parameters. Because the relevant parameters (namely the weights) of the signal lamp control strategy are predicted by the model, they do not need to be set manually based on engineering experience or expert domain knowledge, which can reduce labor cost and improve the prediction precision of the parameters. Generating the control parameters of the signal lamps from these more precise parameters can in turn improve traffic efficiency.

In any embodiment of the disclosure, taking the case where the signal lamp control strategy is a single-point adaptive control algorithm as an example, the architecture of the signal lamp control system may be as shown in fig. 8. Specifically, the road network topology (road information, signal lamp information, and the like) of the city to be optimized and the traffic flow operation data (including vehicle driving tracks and the like) for a set period of time may first be collected. The road network topology and the traffic flow operation data may then be loaded into an existing traffic simulator to reproduce real traffic operation. In addition, the online single-point adaptive control algorithm may be connected to the traffic simulator to control the signal lamps in the traffic simulator. The relevant parameters of the single-point adaptive control algorithm (such as the weight parameters for combining vehicle and queuing features) that were originally set by human experience or expert domain knowledge are instead predicted by a neural network model (denoted as the initial parameter model in the disclosure), whose input may be real-time road condition features (such as traffic flow and queuing information). The neural network model is then optimized based on an evolution strategy algorithm, with the optimization objective of maximizing the average all-day traffic efficiency (which may be described, for example, by the average vehicle delay). The optimized neural network model (denoted as the target parameter model in the disclosure) can be deployed in a real traffic scene for verification.

A target index value (also called a reward) may be determined according to parameters fed back by the traffic simulator, such as the average vehicle delay, and the model parameters in the initial parameter model may be updated according to the target index value in combination with the evolution strategy algorithm, so as to maximize the average all-day traffic efficiency, that is, to maximize the reward.

The update flow of the evolution strategy algorithm may be as shown in fig. 9. A plurality of (i.e., n+1) traffic simulators may be run in parallel, and traffic flow operation data for a fixed traffic operation period (for example, 4 hours per round) is sampled in each traffic simulator. Simulation is then run separately with the model of the current iteration and with the model after noise is added, and the difference between the two resulting rewards is used as the final reward of that noise sample. After the noise and the corresponding rewards of the traffic simulators are collected, the model parameters are updated according to the evolution strategy algorithm, and the next iteration is performed until the model converges.

Specifically, for each traffic simulator, the following steps may be performed:

Traffic flow operation data is input into the neural network model without added noise; a first control parameter of the signal lamps on each road is determined according to the traffic flow operation information by the single-point adaptive control algorithm using the parameters output by the neural network model; in response to controlling each signal lamp in the traffic simulator according to the first control parameter of each signal lamp, driving data of a plurality of first vehicles on each road in the traffic simulator is collected to obtain first driving data of the plurality of first vehicles; and a target index value (i.e., the reward without added noise, such as EP_LEN_originPolicy in fig. 9) is determined according to the first driving data of the plurality of first vehicles.

Traffic flow operation data is input into the neural network model with added noise; a second control parameter of the signal lamps on each road is determined according to the traffic flow operation information by the single-point adaptive control algorithm using the parameters output by that neural network model; in response to controlling each signal lamp in the traffic simulator according to the second control parameter of each signal lamp, driving data of a plurality of second vehicles on each road in the traffic simulator is collected to obtain second driving data of the plurality of second vehicles; and a reference index value (i.e., the reward with added noise, such as EP_LEN_NosisyPolicy in fig. 9) is determined according to the second driving data of the plurality of second vehicles.

The difference between the reference index value and the target index value (i.e., EP_LEN_NosisyPolicy - EP_LEN_originPolicy in fig. 9) is used as the final reward of that noise sample (i.e., R_0, R_1, …, R_n in fig. 8). The sum R_0 × noise_0 + R_1 × noise_1 + … + R_n × noise_n is taken as the fusion data, and the fusion data is added to the output of the neural network model without added noise to obtain updated parameters. A third control parameter of the signal lamps on each road is then determined according to the traffic flow operation information by the single-point adaptive control algorithm using the updated parameters; in response to controlling each signal lamp in the traffic simulator according to the third control parameter of each signal lamp, driving data of a plurality of third vehicles on each road in the traffic simulator is collected to obtain third driving data of the plurality of third vehicles; an updated index value is determined according to the third driving data of the plurality of third vehicles; and the neural network model without added noise is trained according to the updated index value until the model converges.
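As a non-limiting illustration, the iterative flow described above can be sketched with a toy reward function standing in for the traffic simulator. The functions `simulate_reward`, `es_iteration`, and `train`, the quadratic toy reward, the Gaussian noise, and the step scaling `alpha / (num_sims * sigma ** 2)` are all assumptions made for this sketch. For simplicity the noise is applied to, and the update is added to, a flat parameter vector directly, rather than to the parameters and output of a neural network.

```python
import random


def simulate_reward(params, optimum=(0.7, 0.3)):
    """Toy stand-in for one simulator run; a higher reward is better.

    Assumption: the reward peaks when params equal a hypothetical optimum.
    """
    return -sum((p - o) ** 2 for p, o in zip(params, optimum))


def es_iteration(params, rng, num_sims=8, sigma=0.1, alpha=0.1):
    """One update: sample noise per simulator, fuse rewards with noise, add."""
    fused = [0.0] * len(params)
    for _ in range(num_sims):
        noise = [rng.gauss(0.0, sigma) for _ in params]
        reward_origin = simulate_reward(params)             # without noise
        reward_noisy = simulate_reward(
            [p + e for p, e in zip(params, noise)])         # with noise
        final_reward = reward_noisy - reward_origin         # R_i
        for j, e in enumerate(noise):
            fused[j] += final_reward * e                    # sum of R_i * noise_i
    # Assumption: scale the fused data so that the update size stays stable.
    step = alpha / (num_sims * sigma ** 2)
    return [p + step * f for p, f in zip(params, fused)]


def train(params, iterations=200, seed=0):
    """Repeat the update for a fixed number of iterations."""
    rng = random.Random(seed)
    for _ in range(iterations):
        params = es_iteration(params, rng)
    return params


initial = [0.0, 0.0]
trained = train(initial)
```

In this toy setting the trained parameters drift toward the hypothetical optimum. In the described system each call to `simulate_reward` would correspond to a full simulation round in one of the parallel traffic simulators.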

After the neural network model is trained with the traffic simulator, the trained neural network model can be deployed to a traffic site. When road condition data of the traffic site is input to the neural network model, the model outputs the hyperparameters of the single-point adaptive control algorithm, thereby adjusting the control effect of the single-point adaptive control algorithm. For example, a deployment flow for a traffic scene may be as shown in fig. 10.

In summary, compared with a traffic signal lamp control system based on operations research, the present disclosure does not require human experience or expert knowledge, and the hyperparameters of the single-point adaptive control algorithm can be adjusted in real time according to different traffic conditions, further improving the control effect of the algorithm. In addition, compared with a traffic signal lamp control system based purely on reinforcement learning in a traffic simulator, the present disclosure performs control in combination with existing operations research methods rather than relying entirely on a neural network, and therefore generalizes better.

Corresponding to the parametric model training method of the signal lamp control strategy provided by the embodiments of fig. 3 to 6, the present disclosure further provides a parametric model training device of the signal lamp control strategy. Since the parametric model training device provided by the embodiments of the present disclosure corresponds to the parametric model training method provided by the embodiments of fig. 2 to 7, the implementations of the parametric model training method are also applicable to the parametric model training device and are not described in detail again in the embodiments of the present disclosure.

Fig. 11 is a schematic structural diagram of a parametric model training device for a signal lamp control strategy according to a sixth embodiment of the present disclosure.

As shown in fig. 11, the parametric model training apparatus 1100 of the signal lamp control strategy may include: an obtaining module 1101, a first determination module 1102, a second determination module 1103, an acquisition module 1104, a third determination module 1105, and a training module 1106.

The obtaining module 1101 is configured to obtain traffic flow operation information on each road in the first setting area.

The first determining module 1102 is configured to input the traffic flow operation information into an initial parameter model of the signal lamp control policy, so as to determine a first weight of at least two pieces of operation information in the traffic flow operation information according to an output of the initial parameter model.

The second determining module 1103 is configured to determine, according to the traffic flow operation information, a first control parameter of a signal lamp on each road by using a signal lamp control policy based on the first weights of the at least two pieces of operation information.

The acquisition module 1104 is configured to acquire, in response to each signal lamp being controlled according to the first control parameter of that signal lamp, driving data of a plurality of first vehicles on each road, so as to obtain first driving data of the plurality of first vehicles.

The third determining module 1105 is configured to determine a target index value according to first driving data of the plurality of first vehicles, where the target index value is used to indicate traffic efficiency of the plurality of first vehicles on each road.

The training module 1106 is configured to train the initial parameter model according to the target index value to obtain a target parameter model.

In a possible implementation of an embodiment of the present disclosure, the third determining module 1105 is configured to: determine, for any one of the plurality of first vehicles, a travel start point and a travel end point of the first vehicle, and an actual travel duration and an actual travel distance of the first vehicle from the travel start point to the travel end point, according to the first driving data of the first vehicle; determine a reference travel duration of the first vehicle according to the actual travel distance between the travel start point and the travel end point; determine a traffic delay duration of the first vehicle according to a first difference between the actual travel duration and the reference travel duration; and determine the target index value according to the traffic delay durations of the plurality of first vehicles.

In a possible implementation of an embodiment of the present disclosure, the third determining module 1105 is configured to: determine an average value of the traffic delay durations of the plurality of first vehicles; and determine the target index value according to the average value, wherein the target index value is negatively correlated with the average value.
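The delay-based index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trip` fields, the free-flow speed used to derive the reference travel duration from the distance, and the use of simple negation to obtain a negative correlation are all hypothetical choices that the disclosure does not fix.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trip:
    """Driving data of one vehicle (field names are hypothetical)."""
    actual_duration_s: float  # actual travel duration from start point to end point
    actual_distance_m: float  # actual travel distance from start point to end point


def reference_duration_s(distance_m: float, free_flow_speed_mps: float = 15.0) -> float:
    # Assumption: the reference duration is the distance divided by a
    # free-flow speed; the source only says it depends on the distance.
    return distance_m / free_flow_speed_mps


def traffic_delay_s(trip: Trip, free_flow_speed_mps: float = 15.0) -> float:
    # First difference: actual travel duration minus reference travel duration.
    return trip.actual_duration_s - reference_duration_s(
        trip.actual_distance_m, free_flow_speed_mps
    )


def target_index_value(trips: list[Trip], free_flow_speed_mps: float = 15.0) -> float:
    # Negation is one simple way to make the index negatively
    # correlated with the average delay (larger index = less delay).
    return -mean(traffic_delay_s(t, free_flow_speed_mps) for t in trips)


trips = [Trip(actual_duration_s=100.0, actual_distance_m=900.0),
         Trip(actual_duration_s=80.0, actual_distance_m=600.0)]
index = target_index_value(trips)
```

With the two example trips, both vehicles are delayed by 40 seconds relative to the assumed free-flow reference, so the index is the negated average delay.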

In one possible implementation of the embodiments of the present disclosure, the training module 1106 is configured to: obtain a reference parameter model of the signal lamp control strategy, wherein the reference parameter model is obtained by adding noise to the initial parameter model; input the traffic flow operation information into the reference parameter model to determine second weights of at least two pieces of operation information in the traffic flow operation information according to the output of the reference parameter model; determine a second control parameter of the signal lamp on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on the second weights of the at least two pieces of operation information; in response to each signal lamp being controlled according to its second control parameter, acquire driving data of a plurality of second vehicles on each road to obtain second driving data of the plurality of second vehicles; determine a reference index value according to the second driving data of the plurality of second vehicles, wherein the reference index value indicates the traffic efficiency of the plurality of second vehicles on each road; and train the initial parameter model according to the target index value and the reference index value.
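As a rough illustration of how reference parameter models might be derived from the initial parameter model, the sketch below perturbs a flat list of model parameters with noise. The Gaussian noise distribution, the scale `sigma`, the flat-list representation, and the function name `make_reference_models` are assumptions made for this example; the disclosure only states that noise is added to the initial parameter model.

```python
import random


def make_reference_models(initial_params, num_models, sigma=0.1, seed=0):
    """Return (reference parameter lists, noise lists).

    Each reference model is the initial parameter list plus scaled noise.
    The noise is returned as well, because the later fusion step needs it.
    """
    rng = random.Random(seed)  # seeded for reproducibility of the sketch
    noises = [
        [rng.gauss(0.0, 1.0) for _ in initial_params]  # assumed Gaussian noise
        for _ in range(num_models)
    ]
    references = [
        [p + sigma * e for p, e in zip(initial_params, noise)]
        for noise in noises
    ]
    return references, noises


initial_params = [0.5, -0.2, 1.0]
references, noises = make_reference_models(initial_params, num_models=4)
```

Each reference model would then be run through the same control-and-measurement loop as the initial model to obtain its own reference index value.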

In one possible implementation of the embodiments of the present disclosure, the training module 1106 is configured to: determine a second difference between the target index value and the reference index value; fuse the second difference with the noise to obtain fused data; update the first weights of the at least two pieces of operation information according to the fused data to obtain third weights of the at least two pieces of operation information; and train the initial parameter model based on the third weights of the at least two pieces of operation information to obtain the target parameter model.

In one possible implementation of the embodiments of the present disclosure, the training module 1106 is configured to: determine a third control parameter of the signal lamp on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on the third weights of the at least two pieces of operation information; in response to each signal lamp being controlled according to its third control parameter, acquire driving data of a plurality of third vehicles on each road to obtain third driving data of the plurality of third vehicles; determine an updated index value according to the third driving data of the plurality of third vehicles, wherein the updated index value indicates the traffic efficiency of the plurality of third vehicles on each road; and train the initial parameter model according to the updated index value to obtain the target parameter model.

In one possible implementation of the embodiments of the present disclosure, there are a plurality of reference parameter models, which are obtained by adding different noise to the initial parameter model; the training module 1106 is configured to: for any reference parameter model, multiply the corresponding second difference by the corresponding noise to obtain intermediate data; and add the intermediate data of the plurality of reference parameter models to obtain the fused data.
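The multiply-and-add fusion described above resembles an evolution-strategy style update, and a minimal Python sketch of it follows. Several points are assumptions rather than statements of the disclosure: the sign convention of the difference (reference index minus target index, so that the update moves toward better-scoring perturbations), the learning rate and `sigma` scaling in `apply_update`, and the premise that the updated vector has the same dimension as each noise vector. The function names are hypothetical.

```python
def fuse(target_index, reference_indices, noises):
    """Sum over reference models of (index difference) * (noise vector)."""
    dim = len(noises[0])
    fused = [0.0] * dim
    for reference_index, noise in zip(reference_indices, noises):
        # Assumed sign convention: a reference model that scores better
        # than the unperturbed model pulls the update toward its noise.
        difference = reference_index - target_index
        for j in range(dim):
            fused[j] += difference * noise[j]  # intermediate data, accumulated
    return fused


def apply_update(values, fused, learning_rate, num_models, sigma):
    # Assumed scaling, borrowed from common evolution-strategy practice.
    scale = learning_rate / (num_models * sigma)
    return [v + scale * f for v, f in zip(values, fused)]


fused = fuse(target_index=1.0,
             reference_indices=[2.0, 0.0],
             noises=[[1.0, 0.0], [0.0, 1.0]])
updated = apply_update([0.5, 0.5], fused, learning_rate=0.1, num_models=2, sigma=0.5)
```

In this toy case the first perturbation improved the index and the second worsened it, so the first component is increased and the second decreased.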

In one possible implementation of the embodiments of the present disclosure, the training module 1106 is configured to train the initial parameter model according to the updated index value so as to maximize the updated index value.

In one possible implementation of the embodiments of the present disclosure, the training module 1106 is configured to train the initial parameter model according to the target index value so as to maximize the target index value.

According to the parametric model training device of the signal lamp control strategy of the embodiments of the present disclosure, the traffic flow operation information on each road in the first setting area is input into the initial parameter model of the signal lamp control strategy, so that first weights of at least two pieces of operation information in the traffic flow operation information are determined according to the output of the initial parameter model; a first control parameter of the signal lamp on each road is determined according to the traffic flow operation information by adopting the signal lamp control strategy based on the first weights of the at least two pieces of operation information; in response to each signal lamp being controlled according to its first control parameter, driving data of a plurality of first vehicles on each road is acquired to obtain first driving data of the plurality of first vehicles; a target index value is determined according to the first driving data of the plurality of first vehicles, wherein the target index value indicates the traffic efficiency of the plurality of first vehicles on each road; and the initial parameter model is trained according to the target index value to obtain a target parameter model. Therefore, the initial parameter model of the signal lamp control strategy can be trained based on deep learning technology, and the control parameters of the signal lamps can be predicted based on the trained parameter model, which can improve the accuracy and reliability of the prediction result; controlling each signal lamp according to such reliable control parameters can in turn improve traffic efficiency.

Corresponding to the traffic signal control method provided by the embodiment of fig. 7, the present disclosure also provides a traffic signal control device, and since the traffic signal control device provided by the embodiment of the present disclosure corresponds to the traffic signal control method provided by the embodiment of fig. 7, the implementation of the traffic signal control method is also applicable to the traffic signal control device provided by the embodiment of the present disclosure, which is not described in detail in the embodiment of the present disclosure.

Fig. 12 is a schematic structural diagram of a traffic signal control device according to a seventh embodiment of the disclosure.

As shown in fig. 12, the traffic signal control apparatus 1200 may include: an acquisition module 1201, a first determination module 1202, a second determination module 1203, and a control module 1204.

The acquisition module 1201 is configured to acquire traffic flow operation information on each road in the second setting area.

The first determining module 1202 is configured to input the traffic flow operation information into a target parameter model of the signal lamp control policy, so as to determine weights of at least two pieces of operation information in the traffic flow operation information according to output of the target parameter model.

The second determining module 1203 is configured to determine, based on the weights of the at least two pieces of operation information, control parameters of the signal lights on each road according to the traffic flow operation information by using the signal light control policy.

The control module 1204 is configured to control each signal lamp according to the control parameter of that signal lamp.
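The flow through the four modules can be illustrated with a small Python sketch. Everything concrete here is hypothetical and chosen only to make the example runnable: the operation information items (`queue_length`, `flow`), the stub standing in for the trained target parameter model, and a control strategy that splits a fixed cycle into green times in proportion to a weighted score. The disclosure does not specify the strategy or the form of the control parameters in this passage.

```python
from typing import Callable, Dict, List

OperationInfo = Dict[str, float]  # one value per item of operation information


def signal_lamp_control_strategy(phase_infos: List[OperationInfo],
                                 weights: Dict[str, float],
                                 cycle_s: float = 90.0,
                                 min_green_s: float = 10.0) -> List[float]:
    """Hypothetical strategy: green time proportional to a weighted score."""
    scores = [max(sum(weights[name] * info[name] for name in weights), 0.0)
              for info in phase_infos]
    spare_s = cycle_s - min_green_s * len(phase_infos)
    total = sum(scores)
    if total == 0.0:
        # No demand signal at all: fall back to an equal split.
        return [cycle_s / len(phase_infos)] * len(phase_infos)
    return [min_green_s + spare_s * score / total for score in scores]


def control_step(phase_infos: List[OperationInfo],
                 target_parameter_model: Callable[[List[OperationInfo]], Dict[str, float]]
                 ) -> List[float]:
    weights = target_parameter_model(phase_infos)              # first determining module
    return signal_lamp_control_strategy(phase_infos, weights)  # second determining module


def stub_model(phase_infos: List[OperationInfo]) -> Dict[str, float]:
    # Stand-in for the trained target parameter model.
    return {"queue_length": 1.0, "flow": 0.0}


green_times = control_step(
    [{"queue_length": 30.0, "flow": 0.2}, {"queue_length": 10.0, "flow": 0.6}],
    stub_model,
)
```

The point of the design is that the model outputs only the weights, while a conventional, interpretable control strategy turns those weights and the measured operation information into the actual control parameters.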

According to the traffic signal control device of the embodiments of the present disclosure, the traffic flow operation information on each road in the second setting area is input into the target parameter model of the signal lamp control strategy, so that weights of at least two pieces of operation information in the traffic flow operation information are determined according to the output of the target parameter model; based on the weights of the at least two pieces of operation information, control parameters of the signal lamps on each road are determined according to the traffic flow operation information by adopting the signal lamp control strategy; and each signal lamp is controlled according to its control parameter. Therefore, the relevant parameters of the signal lamp control strategy (namely the weights) are predicted by the model and need not be set manually from engineering experience or expert domain knowledge, which can reduce labor cost and improve the prediction precision of the parameters; generating the control parameters of the signal lamps from these more precise parameters can in turn improve traffic efficiency.

To implement the above embodiments, the present disclosure also provides an electronic device that may include at least one processor and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the parametric model training method of the signal lamp control strategy or the traffic signal control method according to any of the above embodiments of the present disclosure.

To implement the above embodiments, the present disclosure also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the parametric model training method of the signal lamp control strategy or the traffic signal control method set forth in any one of the above embodiments of the present disclosure.

To implement the above embodiments, the present disclosure further provides a computer program product comprising a computer program which, when executed by a processor, implements the parametric model training method of the signal lamp control strategy or the traffic signal control method set forth in any of the above embodiments of the present disclosure.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.

FIG. 13 illustrates a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 13, the electronic device 1300 includes a computing unit 1301 that can perform various appropriate actions and processes according to a computer program stored in a ROM (Read-Only Memory) 1302 or a computer program loaded from a storage unit 1308 into a RAM (Random Access Memory) 1303. In the RAM 1303, various programs and data required for the operation of the electronic device 1300 can also be stored. The computing unit 1301, the ROM 1302, and the RAM 1303 are connected to each other through a bus 1304. An I/O (Input/Output) interface 1305 is also connected to the bus 1304.

Various components in electronic device 1300 are connected to I/O interface 1305, including: an input unit 1306 such as a keyboard, a mouse, or the like; an output unit 1307 such as various types of displays, speakers, and the like; storage unit 1308, such as a magnetic disk, optical disk, etc.; and a communication unit 1309 such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 1309 allows the electronic device 1300 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

The computing unit 1301 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 1301 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, a DSP (Digital Signal Processor), and any suitable processor, controller, microcontroller, etc. The computing unit 1301 performs the methods and processes described above, such as the parametric model training method of the signal lamp control strategy or the traffic signal control method. For example, in some embodiments, the parametric model training method of the signal lamp control strategy or the traffic signal control method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1308. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1300 via the ROM 1302 and/or the communication unit 1309. When the computer program is loaded into the RAM 1303 and executed by the computing unit 1301, one or more steps of the parametric model training method of the signal lamp control strategy or the traffic signal control method described above may be performed. Alternatively, in other embodiments, the computing unit 1301 may be configured in any other suitable manner (e.g., by means of firmware) to perform the parametric model training method of the signal lamp control strategy or the traffic signal control method described above.

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, an FPGA (Field Programmable Gate Array), an ASIC (Application-Specific Integrated Circuit), an ASSP (Application-Specific Standard Product), an SOC (System on Chip), a CPLD (Complex Programmable Logic Device), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, RAM, ROM, EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (Cathode-Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, and blockchain networks.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system that overcomes the high management difficulty and weak service scalability of traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.

It should be noted that artificial intelligence is the discipline of studying how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.

According to the technical solution of the embodiments of the present disclosure, the traffic flow operation information on each road in the first setting area is input into the initial parameter model of the signal lamp control strategy, so that first weights of at least two pieces of operation information in the traffic flow operation information are determined according to the output of the initial parameter model; a first control parameter of the signal lamp on each road is determined according to the traffic flow operation information by adopting the signal lamp control strategy based on the first weights of the at least two pieces of operation information; in response to each signal lamp being controlled according to its first control parameter, driving data of a plurality of first vehicles on each road is acquired to obtain first driving data of the plurality of first vehicles; a target index value is determined according to the first driving data of the plurality of first vehicles, wherein the target index value indicates the traffic efficiency of the plurality of first vehicles on each road; and the initial parameter model is trained according to the target index value to obtain a target parameter model. Therefore, the initial parameter model of the signal lamp control strategy can be trained based on deep learning technology, and the control parameters of the signal lamps can be predicted based on the trained parameter model, which can improve the accuracy and reliability of the prediction result; controlling each signal lamp according to such reliable control parameters can in turn improve traffic efficiency.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed herein are achieved; no limitation is imposed herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A parametric model training method of a signal lamp control strategy, the method comprising:

2. The method of claim 1, wherein the determining a target index value from the first travel data of the plurality of first vehicles comprises:

determining, for any one of the plurality of first vehicles, a travel start point and a travel end point of the first vehicle, and an actual travel duration and an actual travel distance of the first vehicle from the travel start point to the travel end point, according to first travel data of the first vehicle;

determining a reference driving duration of the first vehicle according to the actual driving distance between the driving starting point and the driving ending point;

determining a traffic delay time of the first vehicle according to a first difference between the actual running time and the reference running time;

and determining the target index value according to the traffic delay time length of the plurality of first vehicles.

3. The method of claim 2, wherein the determining the target index value according to the traffic delay time periods of the plurality of first vehicles comprises:

determining a mean value of the traffic delay time lengths of the plurality of first vehicles;

and determining the target index value according to the average value, wherein the target index value and the average value are in a negative correlation.

4. The method of claim 1, wherein the training the initial parametric model according to the target index values to obtain a target parametric model comprises:

obtaining a reference parameter model of the signal lamp control strategy, wherein the reference parameter model is obtained by adding noise to the initial parameter model;

inputting the traffic flow operation information into the reference parameter model to determine second weights of at least two items of operation information in the traffic flow operation information according to the output of the reference parameter model;

determining second control parameters of signal lamps on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on second weights of the at least two pieces of operation information;

in response to controlling each signal lamp according to the second control parameter of each signal lamp, acquiring driving data of a plurality of second vehicles on each road to obtain second driving data of the plurality of second vehicles;

determining a reference index value according to second driving data of the second vehicles; wherein the reference index value is used for indicating the passing efficiency of the plurality of second vehicles on each road;

and training the initial parameter model according to the target index value and the reference index value.

5. The method of claim 4, wherein the training the initial parametric model based on the target index value and the reference index value comprises:

determining a second difference between the target index value and the reference index value;

fusing the second difference and the noise to obtain fused data;

updating the first weights of the at least two pieces of operation information according to the fusion data to obtain the third weights of the at least two pieces of operation information;

and training the initial parameter model based on the third weight of the at least two pieces of operation information to obtain the target parameter model.

6. The method of claim 5, wherein the training the initial parametric model based on the third control parameter for each signal lamp comprises:

determining a third control parameter of the signal lamp on each road according to the traffic flow operation information by adopting the signal lamp control strategy based on the third weights of the at least two pieces of operation information;

in response to controlling each signal lamp according to the third control parameter of each signal lamp, acquiring driving data of a plurality of third vehicles on each road to obtain third driving data of the plurality of third vehicles;

determining an update index value according to third traveling data of the plurality of third vehicles; wherein the updated index value is used for indicating the passing efficiency of the plurality of third vehicles on each road;

training the initial parameter model according to the updated index value to obtain a target parameter model.

7. The method of claim 5, wherein the reference parametric model is a plurality of reference parametric models, the plurality of reference parametric models being obtained by adding different noise to an initial parametric model;

the fusing the second difference and the noise to obtain fused data includes:

multiplying the corresponding first difference and noise for any reference parameter model to obtain intermediate data;

and adding the intermediate data of the multiple reference parameter models to obtain the fusion data.

8. The method of any of claims 5-7, wherein the training the initial parametric model according to the updated index values comprises:

and training the initial parameter model according to the updated index value so as to maximize the updated index value.

9. The method of claim 1, wherein the training the initial parametric model according to the target index values comprises:

and training the initial parameter model according to the target index value so as to maximize the target index value.

10. A traffic signal control method, the method comprising:

controlling each signal lamp according to the control parameters of each signal lamp;

the training method of the target parameter model comprises the following steps:

11. A parametric model training apparatus for a signal lamp control strategy, the apparatus comprising:

12. The apparatus of claim 11, wherein the third determination module is configured to:

13. The apparatus of claim 12, wherein the third determination module is configured to:

14. The apparatus of claim 11, wherein the training module is to:

15. The apparatus of claim 14, wherein the training module is to:

fusing the second difference and the noise to obtain fused data;

16. The apparatus of claim 15, wherein the training module is to:

17. The apparatus of claim 15, wherein the reference parametric model is a plurality of the reference parametric models obtained by adding different noise to an initial parametric model;

the training module is used for:

18. The apparatus of any of claims 15-17, wherein the training module is to:

19. The apparatus of claim 11, wherein the training module is to:

20. A traffic signal control apparatus, the apparatus comprising:

the control module is used for controlling each signal lamp according to the control parameters of the signal lamp;

the device is further configured to obtain the target parameter model, and specifically includes:

21. An electronic device, comprising:

at least one processor; and

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9 or to perform the method of claim 10.

22. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-9 or to perform the method of claim 10.