CN111932871B

CN111932871B - Regional real-time traffic control strategy recommendation system and method

Info

Publication number: CN111932871B
Application number: CN202010597532.3A
Authority: CN
Inventors: 庞钰琪; 金峻臣; 温晓岳; 郭海锋; 王辉; 秦俊峰; 陈才君
Original assignee: Enjoyor Co Ltd
Current assignee: Yinjiang Technology Co.,Ltd.
Priority date: 2020-06-28
Filing date: 2020-06-28
Publication date: 2021-06-29
Anticipated expiration: 2040-06-28
Also published as: CN111932871A

Abstract

The invention relates to a regional real-time traffic control strategy recommendation system and method. The invention greatly reduces the amount of calculation and enables real-time computation. When the number of intersections is large, the screened intersections give a better approximation of the feedback of the selectable traffic control strategies, and the recommended strategies are more accurate.

Description

Regional real-time traffic control strategy recommendation system and method

Technical Field

The invention relates to the field of intelligent traffic, in particular to a regional real-time traffic control strategy recommendation system and method.

Background

Traffic congestion often occurs at road intersections. Reasonably allocating the right of way among the traffic flows at an intersection effectively reduces vehicle delay and queuing, and thus helps prevent and relieve congestion. At present, decisions such as whether an intersection participates in coordination, or whether it is controlled by a self-adaptive scheme or a curing (fixed) scheme, are generally made from manual experience and entered into the signal control system as manually input instructions. This is inefficient, the judgment varies from person to person, and a scientific methodology is lacking. A common approach to traffic control strategy recommendation is to use the various traffic data of an intersection as its features, search for similar intersections by feature similarity, and recommend to the target intersection the traffic control strategies adopted by those similar intersections. When the number of intersections is very large, this approach must compare the high-dimensional features of the intersections one by one, and the amount of calculation is huge.

Disclosure of Invention

The invention aims to overcome the above defects and provides a regional real-time traffic control strategy recommendation system and method. The invention greatly reduces the amount of calculation and enables real-time computation. When the number of intersections is large, the screened intersections give a better approximation of the feedback of the selectable traffic control strategies, and the recommended strategies are more accurate.

The invention achieves this aim through the following technical scheme: a regional real-time traffic control strategy recommendation system comprising a data module, a dynamic intersection state grading module, a dynamic recommendation scene building module and a traffic control strategy recommendation module, wherein:

the data module contains a multi-source database and is used for providing real-time and historical actual traffic state data, actual intersection traffic control strategy data and an intersection traffic state data set;

the dynamic intersection state grading module comprises a grading algorithm unit, a grading standardization unit and a dynamic grading unit, wherein the grading algorithm unit integrates intersection data by using data resources which are continuously updated iteratively to obtain a comprehensive intersection grading index; and a grade standard of the intersection which accords with the recent traffic characteristics is formulated through a grade standardization unit; then, determining the grade of the marked intersection at the appointed time according to the real-time and historical actual traffic state data for the specific intersection at the specific time interval through a dynamic grading unit;

the dynamic recommendation scene building module comprises a dynamic recommendation scene building unit and a dynamic recommendation experience base, wherein the dynamic recommendation scene building unit acquires intersection and actual traffic control strategy data from the data module, acquires actual intersection state grading data from the dynamic intersection state grading module, constructs a three-metadata data table of [ intersection, actual traffic control strategy and actual intersection state grade ] under different intersection state grading conditions, and the dynamic recommendation experience base stores a three-metadata data table of [ intersection, actual traffic control strategy and actual intersection state grade ] as a historical experience base;

the traffic control strategy recommendation module comprises a recommendation algorithm optimizing unit and a control strategy recommendation unit. The recommendation algorithm optimizing unit acquires the three-metadata table of [intersection, actual traffic control strategy, actual intersection state score] from the historical experience base as a training data set, and searches for an optimal recommendation model combination among a plurality of initial recommendation algorithm models, the selectable hyper-parameters of each model, and the selectable parameters of each model under determined hyper-parameters. The control strategy recommendation unit acquires the optimal recommendation model combination from the recommendation algorithm optimizing unit and inputs the intersection and its unexecuted selectable traffic control strategies into it; the optimal recommendation model combination outputs the corresponding predicted intersection state scores, forming a three-metadata table of [intersection, unexecuted selectable traffic control strategy, predicted intersection state score]. This table is spliced with the actual three-metadata table of [intersection, actual traffic control strategy, actual intersection state score] to obtain a three-metadata table of [intersection, all selectable traffic control strategies, mixed intersection state score], which is sorted by the mixed intersection state score so that a traffic control strategy is automatically recommended for the intersection at the next moment.

A regional real-time traffic control strategy recommendation method comprises the following steps:

(1) the data module provides real-time and historical actual traffic state data, actual intersection traffic control strategy data and an intersection traffic state data set based on a multi-source database;

(2) the dynamic intersection state grading module grades and marks real-time and historical actual traffic state data to obtain actual intersection state grading data;

(3) a dynamic recommendation scene construction unit of the dynamic recommendation scene construction module constructs a three-metadata table of [ intersection, actual traffic control strategy and actual intersection state score ] based on actual intersection traffic control strategy data and actual intersection state grading data; the dynamic recommendation experience base stores a three-metadata table of [ intersection, actual traffic control strategy and actual intersection state score ];

(4) a recommendation algorithm optimizing unit of the traffic control strategy recommendation module acquires a three-metadata table of the intersection, the actual traffic control strategy and the actual intersection state score, the three-metadata table is used as a training data set, and an optimal recommendation model combination is searched from a plurality of initial recommendation algorithm models, the super-parameters selectable by each model and the selectable parameters of each model for determining the super-parameters;

(5) a control strategy recommending unit of the traffic control strategy recommending module acquires an optimal recommendation model combination from a recommending algorithm optimizing unit, inputs the intersection and the unexecuted optional traffic control strategy into the optimal recommendation model combination, outputs a predicted intersection state score corresponding to the intersection and the unexecuted optional traffic control strategy by the optimal recommendation model combination, forms a three-metadata table of the intersection, the unexecuted optional traffic control strategy and the predicted intersection state score, is spliced with the three-metadata table of the actual intersection, the actual traffic control strategy and the actual intersection state score to obtain the three-metadata table of the intersection, all the optional traffic control strategies and the mixed intersection state score, sorts according to the recommended mixed intersection state score, and automatically recommends the traffic control strategy for the intersection.
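By way of a non-limiting illustration, the splicing and sorting in step (5) can be sketched as follows. The function name `recommend`, the tuple layout and the `higher_is_better` flag are assumptions made for this sketch; in particular, the text does not state whether a larger mixed score is preferable, so the sort direction is left as a parameter. Letting an actual score override a predicted score for the same strategy is likewise an assumption.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, float]  # (intersection, strategy, score)


def recommend(actual: List[Row], predicted: List[Row],
              higher_is_better: bool = True) -> Dict[str, List[Tuple[str, float]]]:
    """Splice actual and predicted tables, then rank strategies per intersection."""
    mixed: Dict[Tuple[str, str], float] = {}
    # Predicted scores cover strategies that have not been executed.
    for lk, cl, score in predicted:
        mixed[(lk, cl)] = score
    # ASSUMPTION: an actual score overrides a predicted one for the same strategy.
    for lk, cl, score in actual:
        mixed[(lk, cl)] = score
    ranked: Dict[str, List[Tuple[str, float]]] = {}
    for (lk, cl), score in mixed.items():
        ranked.setdefault(lk, []).append((cl, score))
    for lk in ranked:
        ranked[lk].sort(key=lambda item: item[1], reverse=higher_is_better)
    return ranked


# Hypothetical data for one intersection.
actual = [("LK_1", "CL_0_1_2", 1.0), ("LK_1", "CL_1_1_0", -1.0)]
predicted = [("LK_1", "CL_0_0_3", 0.4), ("LK_1", "CL_1_0_1", 1.6)]
ranking = recommend(actual, predicted)
print(ranking["LK_1"][0][0])  # strategy ranked first for LK_1
```

The first entry of each ranked list is the strategy that would be recommended for the next moment under the chosen sort direction.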

Preferably, the step (2) is specifically: a grading algorithm unit of the dynamic intersection state grading module integrates intersection data by using continuously iteratively updated data resources to obtain a comprehensive intersection grading index; and a grade standard of the intersection which accords with the recent traffic characteristics is formulated through a grade standardization unit; then, determining the grade of the marked intersection at the appointed time according to the real-time and historical actual traffic state data for the specific intersection at the specific time interval through a dynamic grading unit; the method comprises the following specific steps:

(i) the grading algorithm unit obtains the characteristic index S_t of the intersection in the time period t, calculated as follows:

S_t = α₁·α₂·[l_{turn_1}·q_{turn_1,t}, …, l_{turn_i}·q_{turn_i,t}, …, l_{turn_j}·q_{turn_j,t}]·[v_{turn_1,t}/d_{turn_1,t}, …, v_{turn_i,t}/d_{turn_i,t}, …, v_{turn_j,t}/d_{turn_j,t}]^T

wherein α₁ is the intersection type parameter; α₂ is the intersection signal control property parameter; turn_i ∈ [turn_1, turn_j] is a turning direction of an entrance lane; l_{turn_i} is the number of lanes in direction turn_i; q_{turn_i,t} is the flow in direction turn_i; v_{turn_i,t} is the average speed; d_{turn_i,t} is the delay time of the lane in the time period t;

(ii) the grading standardization unit takes historical data of a plurality of days and calculates S_t for each intersection in the time period t; it takes the maximum value S_{t_max} and the minimum value S_{t_min} of all S_t; if the intersection is set to be divided into m levels, the intersection state level node_level_t of the time period t is:

(iii) the dynamic grading unit judges the intersection state of the required time period: it acquires the data set of the time period t from historical data or real-time data, has the grading algorithm unit calculate S_t, compares S_t against the grading standard generated by the grading standardization unit, and outputs the corresponding level node_level_t according to the result.
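A minimal sketch of steps (i) to (iii) is given below for illustration only. `feature_index` follows the product of the two vectors in the expression for S_t. The level formula itself is not reproduced in this text, so `state_level` assumes equal-width bins between S_{t_min} and S_{t_max} with levels numbered 1 to m; this is one plausible reading, not necessarily the claimed formula. Both function names are hypothetical.

```python
from typing import Sequence


def feature_index(alpha1: float, alpha2: float, lanes: Sequence[float],
                  flow: Sequence[float], speed: Sequence[float],
                  delay: Sequence[float]) -> float:
    """S_t = alpha1 * alpha2 * sum over turns of (l * q) * (v / d)."""
    # Delays are assumed to be non-zero; a zero delay would need special handling.
    return alpha1 * alpha2 * sum(
        l * q * v / d for l, q, v, d in zip(lanes, flow, speed, delay))


def state_level(s_t: float, s_min: float, s_max: float, m: int) -> int:
    """ASSUMPTION: equal-width bins over [s_min, s_max], levels 1..m."""
    if s_max <= s_min:
        return 1
    width = (s_max - s_min) / m
    level = int((s_t - s_min) / width) + 1
    return max(1, min(m, level))  # clamp values outside the historical range


# Hypothetical intersection with two turning directions.
s_t = feature_index(1.0, 1.0, lanes=[2, 1], flow=[100, 50],
                    speed=[10, 5], delay=[20, 10])
print(s_t, state_level(s_t, s_min=0.0, s_max=500.0, m=5))
```

Clamping keeps a real-time S_t that falls outside the historical minimum and maximum within the m defined levels.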

Preferably, the three-metadata data table of [ intersection, actual traffic control strategy, actual intersection state score ] constructed in the step (3) is as follows:

(a) the intersection is marked as LK, wherein LK_ii denotes the ii-th intersection; LK_ii is the intersection ID, the unique identification of the intersection;

(b) the actual traffic control strategy includes: a coordination control strategy and a single-point control strategy; a self-adaptive control strategy and a curing scheme control strategy; and a value range control strategy of the scheme adopted by the signal control period; wherein the symbol of the traffic control strategy is represented as follows:

wherein: coor indicates whether the intersection participates in coordination,

n^{policy} is the strategy adopted,

the number of specific schemes is j, set according to the number of schemes configured at the intersection;

(c) the intersection state score is marked as rt^{CL} = S(t) − S(t−1), where S(t) represents the intersection state level of the time period t.

Preferably, the dynamic recommendation experience base is established in real time and dynamically changed, and the establishing method is as follows:

(I) receiving the next time period t to be recommended, its date (day gg), and the intersection LK_ii;

(II) acquiring the real-time intersection level S(t) of the intersection LK_ii from the dynamic intersection state grading module;

(III) querying the historical data for the dates and time periods, denoted tt_jj, in which the historical intersection level equals the real-time intersection level S(t) of the intersection LK_ii;

(IV) selecting the actual traffic control strategy CL and the actual intersection state score rt that were in operation at the times tt_jj for the intersection LK_ii;

(V) storing them in the experience base in the form [LK, CL, rt]: for the intersection LK_ii, the records selected at the times tt_jj are grouped by identical actual traffic control strategy, the average actual intersection state score of each group is calculated, and a three-metadata table of [intersection, actual traffic control strategy, average actual intersection state score] is constructed as the historical experience base.
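Steps (I) to (V) amount to a filter followed by a group-and-average. The sketch below illustrates this under assumed data structures: the record layout and the name `build_experience_base` are hypothetical and are not taken from the source.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One historical record: (intersection, date, period, level, strategy, score rt)
Record = Tuple[str, str, int, int, str, float]


def build_experience_base(history: List[Record], lk: str,
                          current_level: int) -> List[Tuple[str, str, float]]:
    """Return [LK, CL, mean rt] rows for periods whose level equals S(t)."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for rec_lk, _date, _period, level, cl, rt in history:
        # Steps (III)-(IV): keep records of this intersection at the same level.
        if rec_lk == lk and level == current_level:
            scores[cl].append(rt)
    # Step (V): average the scores of identical strategies.
    return [(lk, cl, sum(v) / len(v)) for cl, v in sorted(scores.items())]


# Hypothetical history.
history = [
    ("LK_1", "d1", 8, 3, "CL_a", 1.0),
    ("LK_1", "d2", 8, 3, "CL_a", -1.0),
    ("LK_1", "d2", 9, 3, "CL_b", 2.0),
    ("LK_1", "d3", 8, 2, "CL_a", 5.0),   # different level, ignored
    ("LK_2", "d1", 8, 3, "CL_a", 9.0),   # different intersection, ignored
]
print(build_experience_base(history, "LK_1", current_level=3))
```

Because the filter depends on the current level S(t), the table is rebuilt for each recommendation request, which matches the statement that the experience base is established in real time and changes dynamically.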

Preferably, in the step (4), the recommendation algorithm optimizing unit realizes automatic selection of the initial recommendation algorithm model and the hyper-parameters and parameter combinations; the recommendation algorithm optimizing unit takes a historical experience base as training data, takes the minimum average error LOSS between the predicted intersection state score and the actual intersection state score as a target, and selects an optimal initial recommendation algorithm model meeting the training requirements and corresponding hyperparameters or parameter combinations from a plurality of selectable initial recommendation algorithm models and selectable hyperparameters or parameter values corresponding to each initial recommendation algorithm model to form an optimal recommendation model combination; the specific process is as follows:

1) coding the selectable initial recommendation algorithm models and the selectable hyper-parameter or parameter values corresponding to each initial recommendation algorithm model, wherein one coded value BH_i corresponds to one initial recommendation algorithm model together with one group of specific hyper-parameter values (alg_p, {para_p_1_r, para_p_2_r, …, para_p_q_r, …, para_p_q(p)_r}); the coded values form an initial code set containing a plurality of code values;

2) selecting X coded values from an initial coded set to form a coded subset;

3) respectively calculating LOSS values generated by a recommendation algorithm corresponding to the code values in the code subsets; the LOSS value represents the average error between the predicted intersection state score and the actual intersection state score, and the Mean Square Error (MSE) or the Root Mean Square Error (RMSE) can be adopted;

wherein the X code values correspond to K LOSS values, and BH_i corresponds to LOSS_i;

4) Performing data fitting on the coding values in the coding subset and the corresponding LOSS values to obtain a recommendation algorithm optimizing function;

5) searching a key code value in the recommendation algorithm optimizing function, wherein the key code value can adopt a minimum point of the function or a point of which the function change meets a threshold value;

6) adding the key coding value into the coding subset to form a new coding subset;

7) repeating the steps 3) to 6) until the training requirement is met;

8) taking the recommendation algorithm whose code value corresponds to the minimum LOSS value of the final recommendation algorithm optimizing function as the optimal recommendation algorithm.
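The loop formed by steps 1) to 8) can be sketched as below. This is an illustration under stated assumptions, not the claimed implementation: the data fitting of step 4) is replaced by a simple inverse-distance-weighted estimate to keep the example short, the stopping rule is a fixed number of rounds, and the catalogue, the toy loss and the name `optimise` are hypothetical.

```python
import random
from typing import Callable, Dict, List


def optimise(codes: List[int], loss_of: Callable[[int], float],
             x: int, n_rounds: int, seed: int = 0) -> int:
    """Surrogate-guided search over coded (model, hyper-parameter) combinations."""
    rng = random.Random(seed)
    # Steps 2-3: evaluate LOSS for X randomly chosen code values.
    evaluated: Dict[int, float] = {bh: loss_of(bh) for bh in rng.sample(codes, x)}
    for _ in range(n_rounds):  # step 7: fixed-count stopping rule (assumed)
        remaining = [bh for bh in codes if bh not in evaluated]
        if not remaining:
            break

        def surrogate(bh: int) -> float:
            # Step 4 stand-in: inverse-distance-weighted estimate of LOSS.
            weights = {k: 1.0 / abs(bh - k) for k in evaluated}
            return (sum(w * evaluated[k] for k, w in weights.items())
                    / sum(weights.values()))

        key_code = min(remaining, key=surrogate)   # step 5: key code value
        evaluated[key_code] = loss_of(key_code)    # step 6, then step 3 again
    return min(evaluated, key=evaluated.get)       # step 8: lowest observed LOSS


# Step 1: hypothetical catalogue; the list index is the code value BH_i.
catalogue = [(alg, k) for alg in ("alg_1", "alg_2") for k in (5, 10, 20, 40)]
codes = list(range(len(catalogue)))


def toy_loss(bh: int) -> float:
    """Placeholder for training the coded model and measuring its MSE or RMSE."""
    alg, k = catalogue[bh]
    return abs(k - 20) / 10 + (0.5 if alg == "alg_1" else 0.0)


best = optimise(codes, toy_loss, x=3, n_rounds=3)
print(catalogue[best])
```

The point of the design is that only X plus the number of rounds combinations are ever trained, rather than the whole code set; any fitted function can be substituted for the stand-in surrogate.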

Preferably, the encoding method in step 1) adopts random numbering, in one of the following ways:

1.1) an initial recommendation algorithm model and a group of specific hyper-parameter values form a recommendation algorithm, and the recommendation algorithm corresponds to one coding value;

1.2) an initial recommendation algorithm model and a group of specific hyper-parameter values form a recommendation algorithm, and the recommendation algorithm corresponds to a plurality of coding values.

Preferably, in the step 2), the selection method is one of the following:

2.1) random selection;

2.2) ensuring that at least one code value is selected for each initial recommendation algorithm model, and selecting the rest randomly;

2.3) allocating the X code values proportionally among the initial recommendation algorithm models, and selecting randomly within each model's allocated quantity.
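The three selection methods can be illustrated as follows. The mapping `codes_by_model` (model name to its code values) and the function names are assumptions for this sketch, and code values are assumed to be unique across models as in coding option 1.1).

```python
import random
from typing import Dict, List


def select_random(codes_by_model: Dict[str, List[int]], x: int,
                  rng: random.Random) -> List[int]:
    """2.1) draw X code values uniformly from the whole code set."""
    pool = [bh for codes in codes_by_model.values() for bh in codes]
    return rng.sample(pool, x)


def select_one_per_model(codes_by_model: Dict[str, List[int]], x: int,
                         rng: random.Random) -> List[int]:
    """2.2) at least one code value per model, the rest drawn at random."""
    if x < len(codes_by_model):
        raise ValueError("x must be at least the number of models")
    picked = [rng.choice(codes) for codes in codes_by_model.values()]
    pool = [bh for codes in codes_by_model.values()
            for bh in codes if bh not in picked]
    return picked + rng.sample(pool, x - len(picked))


def select_proportional(codes_by_model: Dict[str, List[int]], x: int,
                        rng: random.Random) -> List[int]:
    """2.3) quota per model proportional to its number of code values."""
    total = sum(len(codes) for codes in codes_by_model.values())
    picked: List[int] = []
    for codes in codes_by_model.values():
        # Rounding means the quotas may not sum to exactly x.
        quota = min(len(codes), round(x * len(codes) / total))
        picked += rng.sample(codes, quota)
    return picked


# Hypothetical code set: three models with 6, 3 and 1 code values.
codes_by_model = {"alg_1": [0, 1, 2, 3, 4, 5], "alg_2": [6, 7, 8], "alg_3": [9]}
rng = random.Random(0)
print(select_one_per_model(codes_by_model, 5, rng))
```

Method 2.2) guards against a model being absent from the initial subset, while 2.3) keeps the subset representative of how many combinations each model contributes.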

Preferably, in the step 4), a normal distribution probability density function may be adopted for the fitting, as follows:

LOSS = f(BH), with f(x) = 1/(√(2π)·σ) · exp(−(x − μ)² / (2σ²))

wherein π is the circumference ratio, a constant; σ is the standard deviation of the data as a whole; μ is the expected (average) value of the data as a whole; x is the variable, here corresponding to the code value BH;

the fitting method is as follows: let ln LOSS_i = z_i and solve

abbreviated as Z = XB; according to the least squares principle, the generalized least squares solution of the constructed matrix B is B = (X^T X)^{-1} X^T Z; after B is calculated, according to

the variance and expectation are solved, and the data are thus fitted to a normal distribution probability density function.
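The matrix expressions for this fit are not reproduced in this text. The sketch below assumes the standard log-quadratic form: taking the logarithm of the normal density gives z = b0 + b1·x + b2·x², so each row of X is [1, x_i, x_i²], B = [b0, b1, b2]^T, σ² = −1/(2·b2) and μ = b1·σ². The helper names `solve` and `fit_gaussian` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def solve(a: List[List[float]], y: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(y)
    m = [row[:] + [rhs] for row, rhs in zip(a, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    b = [0.0] * n
    for r in reversed(range(n)):
        b[r] = (m[r][n] - sum(m[r][c] * b[c] for c in range(r + 1, n))) / m[r][r]
    return b


def fit_gaussian(bh: Sequence[float], loss: Sequence[float]) -> Tuple[float, float]:
    """Fit ln(LOSS) = b0 + b1*x + b2*x^2 and return (mu, sigma)."""
    rows = [[1.0, x, x * x] for x in bh]            # assumed design matrix X
    z = [math.log(v) for v in loss]                 # z_i = ln LOSS_i
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xtz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(3)]
    _b0, b1, b2 = solve(xtx, xtz)                   # B = (X^T X)^-1 X^T Z
    if b2 >= 0:
        raise ValueError("data are not bell-shaped; no real variance")
    sigma2 = -1.0 / (2.0 * b2)
    return b1 * sigma2, math.sqrt(sigma2)


# Synthetic check: samples of a normal density with mu = 5 and sigma = 2.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [math.exp(-(x - 5.0) ** 2 / 8.0) / (2.0 * math.sqrt(2.0 * math.pi)) for x in xs]
print(fit_gaussian(xs, ys))
```

Solving the normal equations directly is adequate for three unknowns; all LOSS values must be positive for the logarithm to be defined.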

Preferably, the training requirements include the following:

7.1) repeating steps 3) to 6) N times, wherein N is a fixed value selected according to actual needs;

7.2) LOSS value in the range of a% -b% of the average LOSS value.

The beneficial effects of the invention are as follows: the invention greatly reduces the amount of calculation and enables real-time computation. When the number of intersections is large, the screened intersections give a better approximation of the feedback of the selectable traffic control strategies, and the recommended strategies are more accurate.

Drawings

FIG. 1 is a schematic diagram of the system architecture of the present invention;

FIG. 2 is a schematic flow diagram of the method of the present invention;

FIG. 3 is a schematic diagram of the present invention in traffic control;

FIG. 4 is a schematic flow chart of a recommendation algorithm optimizing unit of the present invention.

Detailed Description

The invention will be further described with reference to specific examples, but the scope of the invention is not limited thereto:

Example: the invention takes the intersection as its object and constructs a traffic control strategy recommendation system that recommends a traffic control strategy for the intersection in real time based on its real-time traffic state. The traffic control strategies described herein include: a coordination control strategy and a single-point control strategy; a self-adaptive control strategy and a curing scheme control strategy; and a value range control strategy of the scheme adopted by the signal control period.

As shown in fig. 1, the regional real-time traffic control strategy recommendation system is composed of a data module, a dynamic intersection state classification module, a dynamic recommendation scene construction module, and a traffic control strategy recommendation module. The data module is internally provided with a multi-source database and provides real-time and historical actual traffic state data, actual intersection traffic control strategy data and an intersection traffic state data set for other modules.

The dynamic intersection state grading module comprises a grading algorithm unit, a grading standardization unit and a dynamic grading unit and is used for grading and marking real-time and/or historical actual traffic state data acquired from the data module;

the dynamic recommendation scene building module is internally provided with a dynamic recommendation scene building unit and a dynamic recommendation experience base, the dynamic recommendation scene building unit acquires intersection and actual traffic control strategy data from the data module, acquires actual intersection state grading data from the dynamic intersection state grading module, constructs a three-metadata data table of [ intersection, actual traffic control strategy and actual intersection state grade ] under different intersection state grading conditions, and the dynamic recommendation experience base stores a three-metadata data table of [ intersection, actual traffic control strategy and actual intersection state grade ] as a historical experience base;

the traffic control strategy recommendation module comprises a recommendation algorithm optimizing unit and a control strategy recommendation unit. The recommendation algorithm optimizing unit acquires the three-metadata table of [intersection, actual traffic control strategy, actual intersection state score] from the historical experience base as a training data set, and searches for an optimal recommendation model combination among a plurality of initial recommendation algorithm models, the selectable hyper-parameters of each model, and the selectable parameters of each model under determined hyper-parameters. The control strategy recommendation unit acquires the optimal recommendation model combination from the recommendation algorithm optimizing unit and inputs the intersection and its unexecuted selectable traffic control strategies into it; the optimal recommendation model combination outputs the corresponding predicted intersection state scores, forming a three-metadata table of [intersection, unexecuted selectable traffic control strategy, predicted intersection state score]. From this, a three-metadata table of [intersection, all selectable traffic control strategies, mixed intersection state score] is obtained and sorted by the mixed intersection state score, and a traffic control strategy is recommended for the intersection at the next moment. Once the strategy recommended for the next moment has been executed, it becomes an actual traffic control strategy and produces a corresponding actual intersection state score, which can be fed back into the data module as an update.

As shown in fig. 2, a method for recommending an area-level real-time traffic control strategy specifically includes:

Step one: the data module provides real-time and historical actual traffic state data, actual intersection traffic control strategy data and an intersection traffic state data set based on a multi-source database;

Step two: the dynamic intersection state grading module grades and marks the real-time and historical actual traffic state data to obtain actual intersection state grading data;

The grading algorithm unit of the dynamic intersection state grading module first integrates intersection data, using data resources that are continuously and iteratively updated, to obtain a comprehensive intersection grading index. The grading standardization unit then formulates an intersection grading standard that matches recent traffic characteristics. Finally, for a specific intersection in a specific time period, the dynamic grading unit determines the grade of the intersection at the specified time from the historical data and the real-time data respectively.

The method treats the intersection as a whole and determines its grade by integrating intersection information; road section information can be combined with intersection information, which makes the evaluation more objective. The algorithm jointly considers the objective properties of the intersection (such as its basic form, e.g. T-shaped or cross-shaped, the number of lanes of each entrance lane, the number of lanes for each turning direction, and the intersection signal control properties) and its traffic demand properties (the flow, speed and delay of the entrance lanes). In the time period t, the grade index S_t of the intersection is calculated as follows:

S_t = α₁·α₂·[l_{turn_1}·q_{turn_1,t}, …, l_{turn_i}·q_{turn_i,t}, …, l_{turn_j}·q_{turn_j,t}]·[v_{turn_1,t}/d_{turn_1,t}, …, v_{turn_i,t}/d_{turn_i,t}, …, v_{turn_j,t}/d_{turn_j,t}]^T

wherein α₁ is the intersection type parameter; α₂ is the intersection signal control property parameter; turn_i ∈ [turn_1, turn_j] is a turning direction of an entrance lane, such as the left-turn direction of the west entrance lane or the straight-through direction of the north entrance lane; l_{turn_i} is the number of lanes in direction turn_i; q_{turn_i,t} is the flow in direction turn_i; v_{turn_i,t} is the average speed; d_{turn_i,t} is the delay time of the lane in the time period t.

The grading standardization unit formulates an intersection grading standard that matches recent (tentatively, the last month's) traffic characteristics. Compared with traditional grading methods, this method grades intersections dynamically from historical data: the grading standards differ between intersections in the same area and between time periods of the same intersection, so intersections are differentiated in finer detail. When an intersection's cross-section changes because of construction, temporary traffic control, local traffic design changes or other common situations, its grade can be adjusted flexibly, which improves the accuracy of intersection grading and traffic control.

Taking historical data of one month, S_t is calculated for each day in the time period t; the maximum value S_{t_max} and the minimum value S_{t_min} of all S_t are taken; if the intersection is set to be divided into m levels, the intersection state level node_level_t of the time period t is:

The dynamic grading unit judges the intersection state in the required time period: it acquires the data set of the time period t from historical data or real-time data, has the grading algorithm unit calculate S_t, compares S_t against the grading standard generated by the grading standardization unit, and outputs the corresponding grade according to the result.

Assuming that the grading is applied to a region of 10000 intersections, that one day is divided into 24 periods, and that the number of intersection grades is 5, Table 1 below shows the state level of each intersection in each period of a given day.

Intersection   Period 0   Period 1   Period 2   ···   Period 24
1              1          4          3          ···   3
2              5          2          0          ···   3
3              3          0          4          ···   3
4              1          4          4          ···   3
···            ···        ···        ···        ···   ···
10000          2          4          2          ···   4

TABLE 1

Step three: the dynamic recommendation scene building unit of the dynamic recommendation scene building module constructs a three-metadata table of [intersection, actual traffic control strategy, actual intersection state score] based on the actual intersection traffic control strategy data and the actual intersection state grading data; the dynamic recommendation experience base stores this table. The table is constructed as follows:

The intersection is identified as LK, wherein LK_ii denotes the ii-th intersection; LK_ii is the intersection ID, the unique identification of the intersection.

Assuming that the method is applied to an area with 100 intersections, there are 100 LK, and the identifier of each intersection is the unique number of the intersection in the control system.

The traffic control strategy comprises: a coordination control strategy and a single-point control strategy; self-adaptive control strategy and curing scheme control strategy; and a value range control strategy of a scheme adopted by the signal control period.

The method does not directly recommend an intersection timing scheme. Its aim is to recommend the coordination/single-point control strategy, the self-adaptive/curing scheme control strategy, and the value range control strategy of the adopted scheme that suit the current intersection, rather than to obtain a final traffic control instruction set. Under the recommended traffic control strategy, a traffic control agent method (such as a traffic engineering algorithm, a reinforcement learning algorithm or a deep reinforcement learning algorithm) can be brought into the calculation to generate a more intelligent control instruction set (the role of the method within overall traffic control is shown in FIG. 3). That process is outside the scope of the present invention.

Specifically, an agent is an entity with the basic characteristics of autonomy, sociality, reactivity, progressiveness and premonition; it is embedded in an environment, senses the environment through an observer, acts on the environment autonomously through an actuator, and satisfies the design requirements. An agent has intelligence, with a knowledge base, a learning machine and a control machine, and can decide autonomously whether to respond to information from other agents. In the invention, the control strategy agent is one application of the agent: it is a tool for making the traffic control scheme of a single intersection, which senses the state of the traffic environment, can exchange information with other control strategy agents, outputs the control scheme of a single-point traffic signal controller through its own learning and knowledge base, and applies that control scheme to the environment.

The traffic control strategy is represented symbolically as:

wherein: coor indicates whether the intersection participates in coordination,

n^{policy} is the strategy adopted.

In the present embodiment there are 10000 intersections in the area, and LK_ii denotes the ii-th intersection. Among the traffic control strategies CL selectable at each intersection:

coor_jj denotes the jj-th coordination control strategy; there are 2 cases in total: coor_1 indicates that the intersection is a single-point intersection and does not participate in coordination, and coor_0 indicates that the intersection has a coordination relationship with other intersections;

n^{policy}_kk denotes the kk-th period control strategy; there are 2 cases in total: n^{policy}_0 indicates that the intersection adopts a self-adaptive control strategy, i.e. within the period the control scheme is calculated in real time from the vehicle speed and flow of the road section and changes as they change; n^{policy}_1 indicates a curing period scheme control strategy, i.e. within the period the same control scheme is adopted and does not change;

The remaining component, indexed by mm, represents the scheme-making parameters under the different strategy states. For the adaptive control strategy (n^policy_0), the mm-th value denotes an adjustable range under adaptive control; for the cycle length, for example, one value may represent the range [60, 70), another the range [70, 80), and so on. For the fixed time-period scheme control strategy (n^policy_1), the mm-th value denotes a fixed time-period scheme, each control strategy agent representing one control scheme; for example, one value indicates a cycle of 90 s, another a cycle of 120 s, and so on;

CL_jj_kk_mm denotes the control strategy that adopts the jj-th coordination control strategy, the kk-th time-period control strategy, and the mm-th scheme-making parameter under that strategy. Following the foregoing example, there are 2 × (5 + 15) = 40 CL in total.
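As an illustrative sketch only, the strategy labels can be enumerated as follows. The counts of 5 parameter choices under the adaptive strategy and 15 under the fixed scheme strategy are an assumption read off the total of 40 and the label CL_1_1_15 used in the embodiment; the names `N_PARAMS` and `enumerate_strategies` are hypothetical.

```python
from itertools import product

# Assumed counts: 5 parameter choices for adaptive control (kk = 0),
# 15 for fixed schemes (kk = 1). These are not stated explicitly in the text.
N_PARAMS = {0: 5, 1: 15}

def enumerate_strategies():
    """Return every selectable strategy label CL_jj_kk_mm."""
    labels = []
    for jj, kk in product((0, 1), (0, 1)):
        for mm in range(1, N_PARAMS[kk] + 1):
            labels.append(f"CL_{jj}_{kk}_{mm}")
    return labels

strategies = enumerate_strategies()
print(len(strategies))  # 40
```

Each of the 2 coordination cases contributes 5 + 15 labels, which gives the total of 40.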

Marking the intersection state score: rt^{CL} = S(t) - S(t-1), where S(t) represents the intersection state level in time period t.

For each intersection there are multiple selectable traffic control strategies. In actual operation, one traffic control strategy runs in each time period and receives a score according to the actual intersection state. For intersection LK_1, given 10 days of data with 24 time periods per day and one actual traffic control strategy run per time period, 10 × 24 = 240 actual traffic control strategy runs are recorded, each drawn from the 40 selectable strategies. Because each actual traffic control strategy is produced by system recommendation rather than chosen from the 40 selectable strategies in sequence or with equal probability, the same strategy may be run many times, while some selectable strategies may never be run.

The dynamic recommendation experience base is built in real time and changes dynamically. It is established as follows:

1. Receive the next time period t to be recommended, its date gg, and the intersection LK_ii;

2. Acquire the real-time intersection level s(t) of intersection LK_ii from the dynamic intersection grading module;

In this embodiment, a traffic control strategy must be recommended for the area of 10000 intersections for time period t = 3 of day gg (denoted gg_3). The intersection level s(t) of each intersection in the area for that time period is calculated from the real-time actual traffic state data transmitted by the data module, as shown in Table 2:

Intersection	Level in time period 3
LK_1	s(t) = 1
LK_2	s(t) = 5
LK_3	s(t) = 3
LK_4	s(t) = 1
···	···
LK_10000	s(t) = 2

TABLE 2

3. From the historical data, query the dates and time periods whose historical intersection level equals the real-time intersection level s(t) of intersection LK_ii; these are recorded as tt_jj;

In this embodiment, 1 month of historical data, i.e., days (gg-31) to (gg-1), is selected, and the intersection level of every intersection in the area is calculated for every time period of every day. Within this month, the dates and time periods whose intersection level equals s(t) are screened out. Suppose that for intersection 1 (ii = 1) the level s(t) in time period t = 3 of day gg is 1, and that within the month intersection 1 has level 1 in 50 time periods (e.g., (gg-30)_1, (gg-25)_5, and so on). These 50 time periods are extracted and recorded as the list tt_1. Other intersections are treated in the same way. The lists tt_ii are shown in Table 3 below.

LK_1	(gg-30)_1	(gg-25)_2	(gg-25)_7	···	(gg-1)_9
LK_2	(gg-30)_3	(gg-27)_4	(gg-24)_3	···	(gg-2)_12
LK_3	(gg-30)_5	(gg-30)_2	(gg-29)_16	···	(gg-2)_13
LK_4	(gg-29)_10	(gg-25)_2	(gg-22)_3	···	(gg-2)_1
···	···	···	···	···	···
LK_10000	(gg-30)_4	(gg-24)_2	(gg-30)_12	···	(gg-2)_12

TABLE 3
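The screening in step 3 can be sketched as a simple filter. The history below is hypothetical sample data for one intersection, keyed by (day offset from gg, time period); the function name `matching_periods` is likewise an illustrative assumption.

```python
# Hypothetical history for one intersection:
# (day offset from gg, time period) -> intersection level.
history_lk1 = {
    (-30, 1): 1,
    (-30, 2): 3,
    (-25, 2): 1,
    (-25, 7): 1,
    (-2, 12): 4,
    (-1, 9): 1,
}

def matching_periods(history, current_level):
    """Return the sorted (day offset, period) keys whose level equals current_level."""
    return sorted(key for key, level in history.items() if level == current_level)

# Real-time level s(t) = 1 for intersection LK_1, as in the embodiment.
tt_1 = matching_periods(history_lk1, current_level=1)
print(tt_1)  # [(-30, 1), (-25, 2), (-25, 7), (-1, 9)]
```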

4. For intersection LK_ii, select the actual traffic control strategy CL that was run at each time tt_jj and the corresponding actual intersection state score rt.

For example, for intersection 1, as shown in table 4 below:

Time tt_jj	CL	rt
(gg-30)_1	CL_0_1_3	4
(gg-25)_2	CL_1_0_2	0
(gg-25)_7	CL_0_1_3	2
···	···	···
(gg-1)_9	CL_0_1_1	1

TABLE 4

5. Store the result in the experience base in the form [LK, CL, rt]: for intersection LK_ii, group the records selected at times tt_jj by actual traffic control strategy CL, calculate the average actual intersection state score for each strategy, and construct the three-metadata table of [intersection, actual traffic control strategy, average actual intersection state score] as the historical experience base.

For example, for intersection 1, among the actual traffic control strategies CL and actual intersection state scores rt selected at times tt_jj, the same actual traffic control strategy CL_0_1_3 was used at both (gg-30)_1 and (gg-25)_7, with rt equal to 4 and 2 respectively, so the average actual intersection state score is (4 + 2)/2 = 3. Because an intersection may not have applied every CL in the historical data, a selectable traffic control strategy that was never run has no corresponding rt and leaves a gap, as shown in the historical experience base of Table 5 below:

Intersection	CL_0_0_1	CL_0_0_2	···	CL_1_1_15
LK_1	3	-	···	2
LK_2	-	1	···	-
LK_3	2	-	···	1
LK_4	4	1	···	-
LK_5	1	-	···	3
···	···	···	···	···
LK_10000	-	1	···	-

TABLE 5 ("-" marks a selectable strategy with no recorded score)

When the recommendation time period t changes, recalculation is needed from step 1, and the experience base also changes.
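The grouping and averaging of step 5 can be sketched as follows, using the four rows given for intersection 1 in Table 4. The function name `build_experience` and the tuple layout are illustrative assumptions, not part of the described system.

```python
from collections import defaultdict

# Records for intersection LK_1, copied from Table 4: (time, strategy, score).
records_lk1 = [
    ("(gg-30)_1", "CL_0_1_3", 4),
    ("(gg-25)_2", "CL_1_0_2", 0),
    ("(gg-25)_7", "CL_0_1_3", 2),
    ("(gg-1)_9", "CL_0_1_1", 1),
]

def build_experience(intersection, records):
    """Average rt per strategy and return [LK, CL, mean rt] triples."""
    scores = defaultdict(list)
    for _time, strategy, rt in records:
        scores[strategy].append(rt)
    return [(intersection, cl, sum(v) / len(v)) for cl, v in sorted(scores.items())]

experience = build_experience("LK_1", records_lk1)
for row in experience:
    print(row)
```

Strategies that never ran simply produce no triple, which corresponds to the gaps in Table 5.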

Fourthly, the recommendation algorithm optimizing unit of the traffic control strategy recommendation module acquires the three-metadata table of [intersection, actual traffic control strategy, actual intersection state score], uses it as the training data set, and searches for the optimal recommendation model combination among a plurality of initial recommendation algorithm models, the selectable hyper-parameters of each model, and the selectable parameters of each model once its hyper-parameters are determined.

The recommendation algorithm optimizing unit automatically selects the combination of initial recommendation algorithm model, hyper-parameters, and parameters. After training on the existing historical experience base, an initial recommendation algorithm model can predict, for an intersection LK, the intersection state score under selectable traffic control strategies that have not been executed. Several kinds of initial recommendation algorithm model are available, such as the KNN algorithm, the SVD algorithm, and the NMF algorithm, as well as the pLSA algorithm and the LDA algorithm.

Different initial recommendation algorithm models have their own hyper-parameters or parameters. Those of the KNN algorithm are the sample number k and the distance measure d of the model vector space; that of the SVD algorithm is the sample number k. Each hyper-parameter or parameter has its own value range; for example, the sample number k ranges from 1 to K. Discrete values are taken within the value range of each hyper-parameter or parameter to form its value set.

With the historical experience base as training data and the minimum average error LOSS between the predicted and actual intersection state scores as the objective, the optimal initial recommendation algorithm model that meets the training requirement, together with its hyper-parameter or parameter combination, is selected from the selectable initial recommendation algorithm models and the selectable hyper-parameter or parameter values of each, forming the optimal recommendation model combination. As shown in fig. 4, the specific process is as follows:

1. Encode the selectable initial recommendation algorithm models and the selectable hyper-parameter or parameter values of each model, where one code value BH_i corresponds to one initial recommendation algorithm model and one specific set of hyper-parameter values (alg_p, {para_p_1_r, para_p_2_r, …, para_p_q_r, …, para_p_q(p)_r}); the code values together form the initial code set.

Specifically, alg_p denotes the p-th initial algorithm model, with p in the range [1, P], where P is the number of initial algorithm models. para_p_q_r denotes that the q-th hyper-parameter or parameter of the p-th initial algorithm model takes the value r, with q in the range [1, q(p)], where q(p) is the number of hyper-parameters or parameters of the p-th initial algorithm model. r lies in the range [r(p, q)min, r(p, q)max], where r(p, q)min and r(p, q)max are the minimum and maximum values of the q-th hyper-parameter or parameter of the p-th initial algorithm model.

In this embodiment, there are 2 initial recommendation algorithm models. The 1st is the KNN algorithm, which has 2 hyper-parameters or parameters: the 1st is the sample number k, with k taking values in [3,4,5]; the 2nd is the distance measure d of the model vector space, with d taking values in [1,2], where 1 denotes the Manhattan distance and 2 the Euclidean distance. The 2nd initial recommendation algorithm model is the SVD algorithm, which has 1 hyper-parameter or parameter, the sample number k, with k taking values in [3,4].

The encoding uses random numbering, in one of the following two ways:

(1) Each recommendation algorithm, i.e., one initial recommendation algorithm model together with one specific set of hyper-parameter values, corresponds to one code value, as shown in Table 6 below: the KNN algorithm has 6 combinations and the SVD algorithm has 3, so 6 code values BH are assigned to the KNN algorithm and 3 to the SVD algorithm;

TABLE 6

(2) Each recommendation algorithm, i.e., one initial recommendation algorithm model together with one specific set of hyper-parameter values, corresponds to several code values, so that every initial recommendation algorithm model has a balanced probability of being sampled, as shown in Table 7 below: the KNN algorithm has 6 combinations and the SVD algorithm has 3; for balance, 6 code values BH are assigned to the KNN algorithm and 6 to the SVD algorithm, so that each SVD combination receives 2 code values BH.

TABLE 7
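The two numbering schemes can be sketched as below. This is an illustration under stated assumptions: the SVD grid is taken as k in [3, 4, 5] so that SVD yields the 3 combinations counted for Tables 6 and 7, and the names `GRIDS`, `combinations`, and `encode` are hypothetical.

```python
import random
from itertools import product

# Hyper-parameter grids. KNN follows the embodiment; the SVD grid [3, 4, 5] is an
# assumption chosen so that SVD yields 3 combinations.
GRIDS = {
    "KNN": {"k": [3, 4, 5], "d": [1, 2]},  # d: 1 = Manhattan, 2 = Euclidean
    "SVD": {"k": [3, 4, 5]},
}

def combinations(grids):
    """List every (model, hyper-parameter dict) pair."""
    combos = []
    for model, grid in grids.items():
        names = list(grid)
        for values in product(*(grid[n] for n in names)):
            combos.append((model, dict(zip(names, values))))
    return combos

def encode(combos, balanced=False, seed=0):
    """Randomly number the combinations; optionally repeat them to balance models."""
    counts = {}
    for model, _ in combos:
        counts[model] = counts.get(model, 0) + 1
    target = max(counts.values())
    pool = []
    for model, params in combos:
        repeats = target // counts[model] if balanced else 1
        pool.extend([(model, params)] * repeats)
    random.Random(seed).shuffle(pool)
    return {bh: combo for bh, combo in enumerate(pool, start=1)}

codes = encode(combinations(GRIDS))                          # scheme (1): 9 code values
balanced_codes = encode(combinations(GRIDS), balanced=True)  # scheme (2): 12 code values
```

In the balanced scheme each SVD combination appears twice, so KNN and SVD each hold 6 code values.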

2. Select X code values from the initial code set to form a code subset. For example, for the random code set of Table 6, which has 9 initial code values, 3 are selected as the initial subset by one of the following methods:

(2.1) Random selection: randomly select 3 code values from the 9 initial code values;

(2.2) Ensure at least one code value for each initial recommendation algorithm model, and select the rest at random;

In Table 6, BH1 is a KNN algorithm and BH4 is an SVD algorithm; at least 1 of each is guaranteed, and the remaining 1 is selected at random.

(2.3) Allocate the selection to the initial recommendation algorithm models in proportion, and select at random within each allocation.

In Table 6, there are 6 KNN combinations and 3 SVD combinations, so the initial subset should contain 2 KNN algorithms and 1 SVD algorithm.
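Selection method (2.3) can be sketched as follows. Because the contents of Table 6 are not reproduced here, the code table is hypothetical; it only respects the statements that BH1 is a KNN algorithm, BH4 is an SVD algorithm, and that there are 6 KNN and 3 SVD code values.

```python
import random

# Hypothetical code table in the spirit of Table 6: code value -> model name.
code_models = {1: "KNN", 2: "KNN", 3: "KNN", 4: "SVD", 5: "KNN",
               6: "SVD", 7: "KNN", 8: "SVD", 9: "KNN"}

def proportional_subset(code_models, x, seed=0):
    """Pick x code values, allocating picks to each model by its share of codes.

    Quotas are rounded, so for other inputs they may not sum exactly to x.
    """
    rng = random.Random(seed)
    by_model = {}
    for bh, model in code_models.items():
        by_model.setdefault(model, []).append(bh)
    total = len(code_models)
    subset = []
    for model, bhs in by_model.items():
        quota = round(x * len(bhs) / total)
        subset.extend(rng.sample(bhs, quota))
    return sorted(subset)

subset = proportional_subset(code_models, x=3)
print(subset)
```

With 6 of 9 code values belonging to KNN, the quota is 2 KNN and 1 SVD, matching the example in the text.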

3. Calculate the LOSS value produced by the recommendation algorithm corresponding to each code value in the code subset. The LOSS value can be:

(3.1) mean square error MSE

(3.2) root mean square error RMSE

The X code values correspond to X LOSS values; BH_i corresponds to LOSS_i.
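For reference, the two error measures can be computed as in the sketch below. The score lists are hypothetical; in the described process they would be the actual and predicted intersection state scores for one code value.

```python
import math

def mse(predicted, actual):
    """Mean square error between two equal-length score lists."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(predicted, actual))

# Hypothetical actual and predicted intersection state scores.
actual = [4, 0, 2, 1]
predicted = [3, 1, 2, 3]

print(mse(predicted, actual))   # 1.5
print(rmse(predicted, actual))
```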

4. Fit the code values in the code subset and their corresponding LOSS values to obtain the recommendation algorithm optimizing function. The data fitting can use:

(4.1) Normal distribution probability Density function

LOSS＝f(BH)

(4.2) polynomial fitting

f(x)＝a₁x+a₂x²+a₃x³+...+a_mx^m

LOSS＝f(BH)

For example, three points are used as the initial subset: BH1 = 01 with LOSS1 = 0.1, BH3 = 03 with LOSS3 = 0.8, and BH5 = 05 with LOSS5 = 0.2.

(4.1) Normal distribution probability density function

Abbreviated as Z = XB. According to the least squares principle, the generalized least squares solution of the constructed matrix B is B = (X^T X)^{-1} X^T Z. After B is found, according to

New_BH = μ + σ is calculated and rounded; the code corresponding to New_BH is the code found to minimize the LOSS value, i.e., the new recommendation algorithm.
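The intermediate formulas of this fit are not reproduced in the text, so the sketch below assumes the usual log-linearization of a Gaussian: if LOSS ≈ A·exp(-(BH - μ)²/(2σ²)), then ln(LOSS) = b0 + b1·BH + b2·BH², with σ² = -1/(2·b2) and μ = -b1/(2·b2). Under that assumption, Z holds ln(LOSS), each row of X is [1, BH, BH²], and B = (X^T X)^{-1} X^T Z. The helper names `solve` and `fit_gaussian` are hypothetical.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_gaussian(codes, losses):
    """Least-squares fit of ln(LOSS) = b0 + b1*BH + b2*BH^2; return (mu, sigma)."""
    X = [[1.0, bh, bh * bh] for bh in codes]
    Z = [math.log(loss) for loss in losses]
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    xtz = [sum(row[i] * z for row, z in zip(X, Z)) for i in range(3)]
    b0, b1, b2 = solve(xtx, xtz)          # normal equations: (X^T X) B = X^T Z
    sigma = math.sqrt(-1.0 / (2.0 * b2))  # requires b2 < 0 (a peaked curve)
    mu = -b1 / (2.0 * b2)
    return mu, sigma

# The three points of the example: BH = 1, 3, 5 with LOSS = 0.1, 0.8, 0.2.
mu, sigma = fit_gaussian([1, 3, 5], [0.1, 0.8, 0.2])
new_bh = round(mu + sigma)
print(mu, sigma, new_bh)
```

For the three example points this assumed linearization gives μ ≈ 3.2 and σ ≈ 1.07, so New_BH rounds to 4.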

5. Search for the key code value in the recommendation algorithm optimizing function. The key code value can be:

(5.1) the minimum point of the function, solved by the gradient descent method;

(5.2) a point at which the change of the function meets a threshold, e.g., the function value changes within 10%.

6. Add the key code value to the code subset to form a new code subset.

7. Repeat steps 3 to 6 until the training requirement is met. The training requirement may be:

(7.1) steps 3 to 6 have been repeated N times, where N is a fixed value selected according to actual needs. For a simpler recommendation score prediction algorithm, the calculation time is short, so N may be larger, for example 15; for a recommendation score prediction algorithm that uses neural networks, reinforcement learning, or similar methods, the calculation time is long and the algorithm has its own optimization process, so N may be smaller, for example 7.

(7.2) the LOSS value falls within a% to b% of the average LOSS value, e.g., 5% to 10%.

8. Take the recommendation algorithm corresponding to the code value at the minimum LOSS of the final recommendation algorithm optimizing function as the optimal recommendation algorithm.
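Steps 2 to 8 can be sketched as a small loop. Several simplifications are assumptions made for illustration: the optimizing function is a quadratic polynomial with a constant term, the key code value is the not-yet-evaluated code where the fitted polynomial is smallest (instead of gradient descent), the stop rule is a fixed number of rounds as in (7.1), and the final answer is the evaluated code with the smallest measured LOSS. The table `TOY_LOSS` and all function names are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*x + c2*x^2."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)

def search_best_code(all_codes, loss_of, initial_subset, n_rounds):
    """Evaluate, fit, add the surrogate's best unevaluated code, and repeat."""
    evaluated = {bh: loss_of(bh) for bh in initial_subset}             # step 3
    for _ in range(n_rounds):                                          # step 7.1
        candidates = [bh for bh in all_codes if bh not in evaluated]
        if not candidates:
            break
        c0, c1, c2 = fit_quadratic(list(evaluated), list(evaluated.values()))  # step 4
        key = min(candidates, key=lambda bh: c0 + c1 * bh + c2 * bh * bh)      # step 5
        evaluated[key] = loss_of(key)                                  # steps 6 and 3
    return min(evaluated, key=evaluated.get)                           # step 8

# Hypothetical LOSS per code value; a real run would train and score each algorithm.
TOY_LOSS = {1: 0.9, 2: 0.7, 3: 0.4, 4: 0.3, 5: 0.35, 6: 0.5, 7: 0.6, 8: 0.8, 9: 0.95}
best = search_best_code(list(TOY_LOSS), TOY_LOSS.get, initial_subset=[1, 5, 9], n_rounds=3)
print(best)  # 4
```

The initial subset needs at least 3 distinct code values so that the quadratic fit is well defined.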

Fifthly, the control strategy recommending unit of the traffic control strategy recommendation module acquires the optimal recommendation model combination from the recommendation algorithm optimizing unit. It inputs each intersection and its unexecuted selectable traffic control strategies into the optimal recommendation model combination, which outputs the corresponding predicted intersection state scores, forming a three-metadata table of [intersection, unexecuted selectable traffic control strategy, predicted intersection state score]. This table is spliced with the actual three-metadata table of [intersection, actual traffic control strategy, actual intersection state score] to obtain a three-metadata table of [intersection, all selectable traffic control strategies, mixed intersection state score]. The strategies are then sorted by mixed intersection state score, and a traffic control strategy is automatically recommended for the intersection for the next time period. Here, "all selectable traffic control strategies" is the union of the unexecuted selectable traffic control strategies and the actual traffic control strategies, and the mixed intersection state score is the actual intersection state score for an actual traffic control strategy and the predicted intersection state score for an unexecuted selectable traffic control strategy.
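The splicing and sorting can be sketched as below. The actual scores are the averages for intersection 1 derived from Table 4; the predicted scores are hypothetical, and sorting in descending order is an assumption about which direction of the score counts as better.

```python
def rank_strategies(actual, predicted):
    """Merge actual and predicted scores (actual scores win) and sort best-first."""
    mixed = dict(predicted)
    mixed.update(actual)
    # Assumption: a higher mixed intersection state score ranks first.
    return sorted(mixed.items(), key=lambda item: item[1], reverse=True)

# Average actual scores for intersection LK_1, from the Table 4 records.
actual_lk1 = {"CL_0_1_3": 3.0, "CL_1_0_2": 0.0, "CL_0_1_1": 1.0}
# Hypothetical predicted scores for strategies LK_1 never ran.
predicted_lk1 = {"CL_0_0_1": 2.5, "CL_0_0_2": 1.5}

ranking = rank_strategies(actual_lk1, predicted_lk1)
print(ranking[0])  # ('CL_0_1_3', 3.0)
```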

In this embodiment, for the historical experience library in table 5, the optimal recommended model combination is determined as the KNN algorithm, the hyperparameter k is 5, and the distance metric d is the euclidean distance.

In the first step, the distance measure d between intersections is calculated. For example, the score vector of LK_1 is {4, 0, ..., rt_{1_jj_kk_mm}, ..., 2} (the score of an unexecuted traffic control strategy is 0), the score vector of LK_2 is {0, 1, ..., rt_{2_jj_kk_mm}, ..., 0}, and the distance measure d_{1_2} between LK_1 and LK_2 is:

d_{1_2} = \sqrt{\sum_{k=1}^{N_{CL}} (rt_{1_k} - rt_{2_k})^2}

wherein N_{CL} is the number of elements in the score vector, rt_{1_k} is the k-th element of the score vector of LK_1, and rt_{2_k} is the k-th element of the score vector of LK_2.

Secondly, for each intersection LK_ii, the set Unit_{LK_ii} of the k intersections with the smallest d values is screened out. With k = 5, Unit_{LK_ii} contains 5 intersections.

Thirdly, the scores of the intersections in Unit_{LK_ii} are averaged for every traffic control strategy; the resulting score is the predicted score of LK_ii for that traffic control strategy.

Fourthly, for a traffic control strategy whose predicted score is 0 (i.e., no intersection in Unit_{LK_ii} has scored that traffic control strategy), the set must be enlarged; for example, the 2k intersections with the smallest d values are selected as a new set Unit'_{LK_ii} for score prediction. Note that Unit'_{LK_ii} is used only to predict the traffic control strategies whose prediction from Unit_{LK_ii} is 0; the predicted scores of the other traffic control strategies still follow Unit_{LK_ii}. The mixed intersection state score table is shown in Table 8, where upright numbers are actual scores and italic numbers are predicted scores.

TABLE 8

Then the top-ranked traffic control strategy is recommended for each intersection, realizing automatic recommendation.
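The four prediction steps can be sketched as follows, using the Euclidean distance of this embodiment. The score vectors are hypothetical and cover only 4 strategies; taking the plain mean over all neighbours, with 0 counted for strategies a neighbour never ran, is an assumption about how the averaging is done.

```python
import math

# Hypothetical score vectors over 4 strategies; 0 marks a strategy never run.
scores = {
    "LK_1": [3, 0, 0, 2],
    "LK_2": [0, 1, 0, 0],
    "LK_3": [2, 0, 0, 1],
    "LK_4": [4, 1, 0, 0],
    "LK_5": [1, 0, 5, 3],
}

def euclidean(a, b):
    """Euclidean distance between two score vectors (step 1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(scores, target, k):
    """Names of the k intersections closest to target, nearest first (step 2)."""
    others = [name for name in scores if name != target]
    others.sort(key=lambda name: euclidean(scores[target], scores[name]))
    return others[:k]

def mean_scores(scores, names):
    """Per-strategy mean over the named intersections (step 3)."""
    n_cl = len(next(iter(scores.values())))
    return [sum(scores[n][j] for n in names) / len(names) for j in range(n_cl)]

def predict(scores, target, k):
    """Average the k nearest neighbours; retry zero predictions with 2k (step 4)."""
    base = mean_scores(scores, nearest(scores, target, k))
    wide = mean_scores(scores, nearest(scores, target, 2 * k))
    return [b if b != 0 else w for b, w in zip(base, wide)]

prediction = predict(scores, "LK_1", k=2)
print(prediction)  # [3.0, 0.5, 1.25, 0.5]
```

Here the third strategy is scored by none of the 2 nearest neighbours, so its prediction comes from the enlarged set of 4 neighbours, while the other entries keep the 2-neighbour averages.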

In summary, based on the distribution of intersection state level changes after selectable traffic control strategies are executed, the invention finds intersections whose feedback to the selectable traffic control strategies is similar, and recommends traffic control strategies for the target intersection from the strategies those intersections adopted. The amount of calculation is greatly reduced, and real-time calculation becomes possible. When the number of intersections is large, the screened intersections approximate the feedback to the selectable traffic control strategies more closely, and the recommended strategies are more accurate.

While the invention has been described in connection with specific embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A regional real-time traffic control strategy recommendation method is characterized by comprising the following steps:

(3) a dynamic recommendation scene construction unit of the dynamic recommendation scene construction module constructs a three-metadata table of [ intersection, actual traffic control strategy and actual intersection state score ] based on actual intersection traffic control strategy data and actual intersection state grading data; the dynamic recommendation experience base stores a three-metadata table of [ intersection, actual traffic control strategy and actual intersection state score ]; the three-metadata data table of the [ intersection, actual traffic control strategy and actual intersection state score ] is constructed as follows:

wherein coor indicates whether the intersection participates in coordination, and n^policy is the control strategy adopted;

(c) marking the intersection state score: rt^{CL} = S(t) - S(t-1), where S(t) represents the intersection state level in time period t;

2. The method of claim 1, wherein the method comprises: the step (2) is specifically as follows: a grading algorithm unit of the dynamic intersection state grading module integrates intersection data by using continuously iteratively updated data resources to obtain a comprehensive intersection grading index; and a grade standard of the intersection which accords with the recent traffic characteristics is formulated through a grade standardization unit; then, determining the grade of the marked intersection at the appointed time according to the real-time and historical actual traffic state data for the specific intersection at the specific time interval through a dynamic grading unit; the method comprises the following specific steps:

S_t = α₁·α₂·[l_{turn_1}·q_{turn_1,t}, …, l_{turn_i}·q_{turn_i,t}, …, l_{turn_j}·q_{turn_j,t}]·[v_{turn_1,t}/d_{turn_1,t}, …, v_{turn_i,t}/d_{turn_i,t}, …, v_{turn_j,t}/d_{turn_j,t}]^T

3. The method of claim 1, wherein the method comprises: the dynamic recommendation experience base is established in real time and dynamically changed, and the establishment method comprises the following steps:

4. The method of claim 1, wherein the method comprises: in the step (4), the recommendation algorithm optimizing unit realizes automatic selection of the initial recommendation algorithm model and the hyper-parameter and parameter combination; the recommendation algorithm optimizing unit takes a historical experience base as training data, takes the minimum average error LOSS between the predicted intersection state score and the actual intersection state score as a target, and selects an optimal initial recommendation algorithm model meeting the training requirements and corresponding hyperparameters or parameter combinations from a plurality of selectable initial recommendation algorithm models and selectable hyperparameters or parameter values corresponding to each initial recommendation algorithm model to form an optimal recommendation model combination; the specific process is as follows:

2) selecting X coded values from an initial coded set to form a coded subset;

3) respectively calculating LOSS values generated by a recommendation algorithm corresponding to the code values in the code subsets; wherein, LOSS value adopts mean square error MSE or root mean square error RMSE;

5) searching a key code value in the recommendation algorithm optimizing function, wherein the key code value adopts a minimum point of the function or a point of which the function change meets a threshold value;

7) repeating the steps 3) to 6) until the training requirement is met;

5. The method of claim 4, wherein the method comprises: the encoding method in the step 1) adopts a random numbering mode, and specifically comprises the following steps:

6. The method of claim 4, wherein the method comprises: in the step 2), the selection method is as follows:

2.1) random selection;

7. The method of claim 4, wherein the method comprises: in the step 4), a normal distribution probability density function is adopted, and the method comprises the following steps:

normal distribution probability density function

LOSS＝f(BH)

abbreviated as Z = XB; according to the least squares principle, the generalized least squares solution of the constructed matrix B is B = (X^T X)^{-1} X^T Z; after B is obtained, according to

8. The method of claim 4, wherein the method comprises: the training requirements include the following:

7.2) LOSS value in the range of a% -b% of the average LOSS value.

9. A regional real-time traffic control strategy recommendation system applying the method of claim 1, comprising: a data module, a dynamic intersection state grading module, a dynamic recommendation scene building module, and a traffic control strategy recommendation module; wherein: