CN110491146B

CN110491146B - Deep learning-based traffic signal control scheme real-time recommendation method

Info

Publication number: CN110491146B
Application number: CN201910772945.8A
Authority: CN
Inventors: 郭海锋; 李瑶; 何德峰; 金峻臣; 孔桦桦; 周浩敏; 丁楚吟; 谢竞成; 杨宪赞; 温晓岳
Original assignee: Zhejiang University of Technology ZJUT; Enjoyor Co Ltd
Current assignee: Yinjiang Technology Co Ltd; Zhejiang University of Technology ZJUT
Priority date: 2019-08-21
Filing date: 2019-08-21
Publication date: 2020-08-21
Anticipated expiration: 2039-08-21
Also published as: CN110491146A

Abstract

A deep learning-based method for recommending traffic signal control schemes in real time comprises the following steps: preprocessing the acquired traffic data, including cleaning erroneous data, correcting abnormal data and repairing missing data; constructing a time-series training data set; training a real-time traffic signal control scheme recommendation model for the intersection using a deep learning method based on the CNN-DA-RNN framework; and recommending the traffic signal control scheme for the next moment, thereby providing real-time signal control scheme recommendation for problem intersections. The invention reduces the time needed to optimize an intersection, improves the working efficiency of personnel, feeds back the recommended scheme in real time, and increases the reliability and reproducibility of the recommended scheme.

Description

Deep learning-based traffic signal control scheme real-time recommendation method

Technical Field

The invention relates to the field of intelligent traffic and urban traffic control, in particular to the field of traffic signal control scheme recommendation.

Background

As vehicle ownership in Chinese cities keeps growing, the level of road motorization rises continuously, and urban traffic congestion has become a pain point and a difficulty for urban management, restricting economic development to a certain extent. Congestion also seriously degrades citizens' travel experience. The most urgent task in current urban management is therefore to relieve urban traffic congestion, improve the operating efficiency of urban traffic, and enable the signal systems already built to deliver their maximum benefit.

The operation of urban traffic systems is complex and changeable, and road traffic systems vary continuously over time and space. Because a road traffic system is open, random and dynamic, it can become severely congested or even paralyzed during the morning and evening rush hours or when traffic accidents, rainstorms, snowstorms or other emergencies occur. The factors affecting road traffic systems are uncertain and sudden, so many factors must be considered together in traffic optimization control. Faced with traffic problems caused by such complex factors, traditional traffic signal control methods cannot meet the requirements of current traffic control optimization, and how to establish optimal control of traffic signals is the key problem at present.

To make full use of urban traffic control systems, traffic signal optimization services have developed over the past two years. This development is driven on the one hand by policies from administrative departments such as the Ministry of Public Security and provincial and municipal authorities, and on the other hand by a genuine demand from urban traffic control operations, especially in first-tier cities. Traffic signal optimization services aim to enable the signal systems already built to deliver their maximum benefit and to improve urban traffic operating efficiency as far as possible.

In traffic signal optimization services, when traffic flow patterns change substantially, for example during the morning and evening peaks or on holidays, the intersection monitoring parameters of the signal control system must be checked manually and the signal timing adjusted in real time by hand. This mode of regulation is non-reproducible, inefficient and unreliable, and a new technology is urgently needed as an auxiliary means to alleviate these problems.

Disclosure of Invention

To overcome the low regulation efficiency, low reliability and non-reproducibility of traffic signal control in the prior art, the invention provides a signal control scheme recommendation method that conforms to urban traffic control rules.

Based on traffic state data, the invention trains a real-time traffic signal control scheme recommendation model using a deep learning neural network algorithm, and outputs a signal control scheme suited to the flow and saturation of the current signalized intersection. The output scheme can be recommended to front-line traffic control personnel in real time and issued in real time once it has been judged reasonable. This reduces the time needed to optimize the intersection to a certain extent, improves the working efficiency of personnel, feeds back the recommended scheme in real time, and improves the reliability and reproducibility of the recommended scheme.

The invention achieves the aim through the following technical scheme: a traffic signal control scheme real-time recommendation method based on deep learning comprises the following steps:

1.1 collecting traffic data, including traffic control data and traffic state data, wherein the traffic control data includes but is not limited to: signal system cycle end time, cycle duration and split data; and the traffic state data includes but is not limited to: flow, saturation and speed;

1.2 preprocessing traffic data, including cleaning error data, correcting abnormal data and repairing missing data;

1.3 constructing a time series of data sets, the steps comprising:

1.3.1 extracting sample points of the data set, a sample point being a cycle end time at which the control scheme data meets the requirement, the control scheme data being the change in green split;

1.3.2 constructing sample point data: extracting, for each sample point, T groups of traffic data $x_i$, T groups of control scheme data $y_{target,i}$ and 1 group of split data $y_{his,i}$, where i denotes the i-th sample point;

1.3.3 constructing a time-series data set that satisfies the training requirements, including a traffic data set $x_t$, a control scheme data set $y_{target}$ and a split data set $y_{his}$; the training requirements include but are not limited to: the number of sample points and the length of time over which traffic data is extracted for each sample point;

1.4 constructing a deep learning algorithm model based on CNN-DA-RNN, wherein the first layer, the CNN, is a convolutional neural network layer without pooling that performs convolution on the input data to obtain output data of unchanged dimensionality, and is used to extract short-term dependencies in the time dimension of the input data and dependencies among variables; the second layer, the DA-RNN, is a recurrent neural network based on a two-stage attention mechanism that allocates attention over the input data in the spatial and temporal dimensions and performs encoding and decoding, specifically including:

1.4.1 performing spatial attention distribution on the traffic data set data output from the convolutional layer;

1.4.2, encoding the data after the space attention distribution;

1.4.3 time attention distribution is carried out on the coded data;

1.4.4 computing a weighted sum of the data after temporal attention allocation, and decoding the weighted data together with the control scheme data set $y_{target}$;

1.5 training the real-time traffic signal control scheme recommendation model: taking the data set of step 1.3 as the training data set and training the deep learning algorithm model of step 1.4, wherein the training methods include but are not limited to: a stochastic gradient descent optimizer, an Adam optimizer and automatic parameter tuning; training ends when the convergence of the loss function meets the requirement, the loss function being the mean square error between the predicted control scheme data and the actual control scheme data;

1.6 recommending the traffic signal control scheme for the next moment: collecting real-time traffic state data, inputting it into the real-time traffic signal control scheme recommendation model, and obtaining the predicted control scheme data output by the model.

Further, the cycle duration in the traffic control data of step 1.1 is the time required for the signal lamps to display each lamp colour once in turn, i.e. one full cycle, and its data dimension is 1; the split in the traffic control data is the split data of each phase at the intersection, and its data dimension is the number of phases; the traffic state data is the traffic state data of all lanes at the intersection, and its data dimension is the number of traffic state data types × the number of lanes.

Further, in step 1.2, cleaning erroneous data means deleting default values and duplicate values; correcting abnormal data means judging whether a data point is an outlier using the statistical t-test, and replacing outliers by spline interpolation using historical data; repairing missing data uses a multiple linear regression model and comprises the following steps: (1) drawing a scatter plot of the existing data and performing multiple regression; (2) solving for the multiple linear regression polynomial and its confidence interval; (3) drawing a residual analysis plot to verify the fit, where a smaller residual indicates better agreement between the regression polynomial and the source data; and (4) filling in the missing data using the polynomial equation with the smallest residual.
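The three preprocessing operations can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `(timestamp, value)` record layout, the function names and the threshold of 3.0 are hypothetical, the outlier test is a simplified leave-one-out t-type statistic, and linear interpolation between neighbours stands in for the spline and regression models described above.

```python
import math

def drop_defaults_and_duplicates(records, default=None):
    """Remove records holding the default marker and repeated timestamps."""
    seen, cleaned = set(), []
    for ts, value in records:
        if value == default or ts in seen:
            continue
        seen.add(ts)
        cleaned.append((ts, value))
    return cleaned

def flag_outliers(values, threshold=3.0):
    """Flag a point lying more than `threshold` sample standard deviations
    from the mean of the remaining points (needs at least 3 values)."""
    flags = []
    for i, v in enumerate(values):
        rest = values[:i] + values[i + 1:]
        mean = sum(rest) / len(rest)
        var = sum((r - mean) ** 2 for r in rest) / (len(rest) - 1)
        std = math.sqrt(var)
        flags.append(std > 0 and abs(v - mean) / std > threshold)
    return flags

def repair(values, bad):
    """Replace flagged points by linear interpolation between the nearest
    unflagged neighbours (a stand-in for spline interpolation)."""
    good = [i for i, b in enumerate(bad) if not b]
    out = list(values)
    for i, b in enumerate(bad):
        if not b:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None and right is None:
            continue  # nothing to interpolate from
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out
```

A typical call chain would be `repair(values, flag_outliers(values))` on each cleaned series.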

Further, the sample point data constructed in step 1.3.2 is specified as follows:

T groups of traffic data: the traffic state data of the sample point and of the T-1 groups preceding it, sorted by cycle end time, are taken and stored as an array as the $x_i$ part of the sample point data, specifically:

$$x_i = \{Cycle_i, A_i, B_i, \ldots, G_i, VO_{1,i}, VO_{2,i}, \ldots, VO_{k,i}, DS_{1,i}, DS_{2,i}, \ldots, DS_{k,i}\} \tag{1}$$

where $Cycle_i$ denotes the cycle duration, $Cycle_i = [C_{i-T}, \ldots, C_{i-1}, C_i]$; $A_i, B_i, \ldots, G_i$ denote the split data of signal control phases A, B, …, G, with $A_i = [a_{i-T}, \ldots, a_{i-1}, a_i]$, $B_i = [b_{i-T}, \ldots, b_{i-1}, b_i]$, …, $G_i = [g_{i-T}, \ldots, g_{i-1}, g_i]$; $VO_{1,i}, VO_{2,i}, \ldots, VO_{k,i}$ denote the lane traffic flow data; and $DS_{1,i}, DS_{2,i}, \ldots, DS_{k,i}$ denote the lane saturation data;

the number of signal control phases depends on the intersection, and different intersections run different numbers of phases and different phase sequences;

T groups of control scheme data: the control scheme data of the sample point and of the T-1 groups preceding it are taken and stored as an array as the $y_{target}$ part of the sample point data, specifically:

$$y_{target,i} = \{\Delta A_i, \Delta B_i, \Delta C_i, \Delta D_i, \Delta E_i, \Delta F_i, \Delta G_i\} \tag{2}$$

where $\Delta A_i, \Delta B_i, \ldots, \Delta G_i$ denote the changes between adjacent splits, $\Delta A_i = [\Delta a_{i-T}, \ldots, \Delta a_{i-1}, \Delta a_i]$, $\Delta B_i = [\Delta b_{i-T}, \ldots, \Delta b_{i-1}, \Delta b_i]$, …, $\Delta G_i = [\Delta g_{i-T}, \ldots, \Delta g_{i-1}, \Delta g_i]$; the dimension of $y_{target,i}$ is determined by the number of phases actually run at the intersection;

One group of split plan data: the split data at the moment following the sample point is taken and stored as an array as the $y_{his,i}$ part, specifically:

$$y_{his,i} = \{A'_{i+1}, B'_{i+1}, C'_{i+1}, D'_{i+1}, E'_{i+1}, F'_{i+1}, G'_{i+1}\} \tag{3}$$

where $A'_{i+1}, B'_{i+1}, \ldots, G'_{i+1}$ denote the split values of the phases, $A'_{i+1} = [a_{i+1}]$, $B'_{i+1} = [b_{i+1}]$, …, $G'_{i+1} = [g_{i+1}]$; the dimension of $y_{his,i}$ is determined by the number of phases actually run at the intersection.
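The construction of one sample from a cycle-ordered table can be sketched as below. The dictionary layout (`cycle`, `splits`, `flow`, `saturation`) and the function name are assumptions made for illustration; the sketch stores the window row by row (one row per cycle) rather than one array per variable, and takes exactly T rows ending at the sample point.

```python
def build_sample(cycles, i, T):
    """Build (x_i, y_target_i, y_his_i) for the sample point at index i.

    `cycles` is a list of dicts sorted by cycle end time, each holding
    'cycle' (seconds), 'splits' (one value per phase), and 'flow' and
    'saturation' (one value per lane). This layout is hypothetical.
    """
    if i - T < 0 or i + 1 >= len(cycles):
        raise ValueError("not enough history or no following cycle")
    # Sample point and the T-1 groups before it.
    window = cycles[i - T + 1 : i + 1]
    x_i = [[c["cycle"], *c["splits"], *c["flow"], *c["saturation"]] for c in window]
    # Change of each phase split relative to the preceding cycle.
    y_target_i = []
    for t in range(i - T + 1, i + 1):
        prev, cur = cycles[t - 1]["splits"], cycles[t]["splits"]
        y_target_i.append([b - a for a, b in zip(prev, cur)])
    # Split data of the moment following the sample point.
    y_his_i = list(cycles[i + 1]["splits"])
    return x_i, y_target_i, y_his_i
```

Collecting the three parts over all sample points yields the data sets $x_t$, $y_{target}$ and $y_{his}$ of step 1.3.3.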

Further, the convolutional neural network layer without pooling described in step 1.4 extracts the temporal dependencies of the input data and the dependencies among variables as follows:

the input to the convolutional layer is the time-series traffic data set $x_t$; the convolutional layer consists of multiple filters of width $\omega$ and height $n$, where the width $\omega$ is set to match the green split of the input data and the height $n$ is set to match the column dimension of the input variables; the k-th filter scans the input matrix $x_i$ and produces:

$$h_{cnn,k} = \mathrm{ReLU}(W_{cnn,k} * x_i + b_{cnn,k}) \tag{4}$$

where $*$ denotes the convolution operation; $h_{cnn,k}$ is the output vector; ReLU is the rectified linear unit activation function, which speeds up gradient descent and backpropagation and avoids exploding and vanishing gradients; $W_{cnn,k}$ and $b_{cnn,k}$ are the convolution matrix and bias to be learned, which are corrected continuously during training; and the range of k is the ratio of the length of the input data to the filter size ($\omega \times n$);

to keep the dimension of the convolution output $h_{cnn,k}$ consistent with that of the input data, the dimension of the input matrix $x_i$ is increased, with the added entries set to 0. The procedure is as follows: if $x_i$ has dimension $i \times j$ and the convolution matrix $W_k$ has dimension $3 \times 3$, then to obtain $h_{cnn,k}$ of dimension $i \times j$, the dimension of $x_i$ is changed to $(i+1) \times (j+1)$, with the added entries set to 0.
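A shape-preserving convolution followed by ReLU can be sketched in plain Python as below. The sketch uses the usual "same" padding, which for a 3 × 3 kernel adds one zero on every side of the input; how that maps onto the padded dimension stated above is an assumption, and the function name and the scalar bias are illustrative only.

```python
def conv2d_same_relu(x, w, b=0.0):
    """Apply a square, odd-sized kernel `w` to matrix `x` with zero padding
    so the output keeps the shape of `x`, then apply ReLU.

    The kernel is applied without flipping (cross-correlation), as is
    conventional in deep learning libraries.
    """
    rows, cols = len(x), len(x[0])
    k = len(w)          # kernel size, e.g. 3
    p = k // 2          # zeros added on each side
    padded = [[0.0] * (cols + 2 * p) for _ in range(rows + 2 * p)]
    for r in range(rows):
        for c in range(cols):
            padded[r + p][c + p] = x[r][c]
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            s = b + sum(w[a][d] * padded[r + a][c + d]
                        for a in range(k) for d in range(k))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out
```

With a kernel whose only non-zero entry is a 1 in the centre, the output reproduces a non-negative input unchanged, which is a quick way to check that the padding keeps the dimensions aligned.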

Further, step 1.4.1 specifically includes: spatial attention allocation is the first stage of the two-stage attention mechanism. Spatial attention is introduced as the input attention mechanism: correlations are extracted automatically from the input data at each moment, and the input attention weights are computed from the previous hidden state of the encoder, as follows:

for the input $x_t$ at each moment of the input data X, the attention mechanism is applied, where $[h_{t-1}; s_{t-1}]$ is the concatenation of the previous hidden state $h_{t-1}$ and the previous cell state $s_{t-1}$, and $v_e$, $W_e$ and $U_e$ are the high-dimensional matrix parameters to be learned; the mechanism yields the spatial attention weight assigned to the k-th input feature at time t, from which the output after spatial attention allocation is obtained.
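For orientation, the input attention of the standard DA-RNN architecture, written with the symbols defined in this step, has the form below. It is offered as an assumed reading of the formulas this step refers to: the symbols $e_t^k$, $\alpha_t^k$ and $\tilde{x}_t$, the feature index $k = 1, \ldots, n$ and the softmax normalisation follow the DA-RNN literature rather than this text.

```latex
e_t^k = v_e^{\top} \tanh\left(W_e [h_{t-1}; s_{t-1}] + U_e x^k\right)

\alpha_t^k = \frac{\exp(e_t^k)}{\sum_{i=1}^{n} \exp(e_t^i)}

\tilde{x}_t = \left(\alpha_t^1 x_t^1, \alpha_t^2 x_t^2, \ldots, \alpha_t^n x_t^n\right)^{\top}
```

Under this reading, $\alpha_t^k$ is the spatial attention weight of the k-th input feature at time t and $\tilde{x}_t$ is the re-weighted input passed to the encoder.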

The specific process of step 1.4.2, encoding the data after spatial attention allocation, is as follows: the cell state of the encoder LSTM unit is summed dynamically over time, so it memorizes long-term dependencies and readily overcomes the vanishing gradient problem, which makes it effective for time-series problems. The LSTM encodes the input data as follows:

first, the encoder learns the mapping from $x_t$ to $h_t$:

$$h_t = f_1(h_{t-1}, x_t)$$

where $h_t$ is the hidden state of the encoder at time t, $h_{t-1}$ is the previous hidden state, and $f_1$ is a nonlinear activation function;

second, the encoding unit updates the state using the LSTM network as the activation function: the LSTM recurrent neural network comprises a forget gate $f_t$, an input gate $i_t$ and an output gate $o_t$; at time t each LSTM unit has a memory cell with state $s_t$ and a hidden state $h_t$, which are updated as follows:

$$f_t = \sigma(W_f [h_{t-1}; x_t] + b_f) \tag{9}$$

$$i_t = \sigma(W_i [h_{t-1}; x_t] + b_i) \tag{10}$$

$$o_t = \sigma(W_o [h_{t-1}; x_t] + b_o) \tag{11}$$

$$s_t = f_t \odot s_{t-1} + i_t \odot \tanh(W_s [h_{t-1}; x_t] + b_s) \tag{12}$$

$$h_t = o_t \odot \tanh(s_t) \tag{13}$$

where $[h_{t-1}; x_t]$ is the concatenation of the previous hidden state $h_{t-1}$ and the current input $x_t$; $W_f$, $W_i$, $W_o$, $W_s$, $b_f$, $b_i$, $b_o$ and $b_s$ are the parameters to be learned; and $\sigma$ and $\odot$ denote the logistic sigmoid function and element-wise multiplication, respectively;
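Equations (9) to (13) translate directly into a single update step. The sketch below uses plain Python lists; the `params` dictionary layout and the helper names are assumptions chosen for readability, not part of the described method.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W v + b for a matrix W given as a list of rows."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, s_prev, params):
    """One encoder update following equations (9)-(13).

    `params` maps 'f', 'i', 'o', 's' to (W, b) pairs that act on the
    concatenation [h_prev; x_t] (hypothetical layout).
    """
    z = list(h_prev) + list(x_t)                                          # [h_{t-1}; x_t]
    f = [sigmoid(a) for a in affine(params["f"][0], z, params["f"][1])]   # forget gate (9)
    i = [sigmoid(a) for a in affine(params["i"][0], z, params["i"][1])]   # input gate (10)
    o = [sigmoid(a) for a in affine(params["o"][0], z, params["o"][1])]   # output gate (11)
    g = [math.tanh(a) for a in affine(params["s"][0], z, params["s"][1])] # candidate state
    s_t = [fk * sk + ik * gk for fk, sk, ik, gk in zip(f, s_prev, i, g)]  # cell state (12)
    h_t = [ok * math.tanh(sk) for ok, sk in zip(o, s_t)]                  # hidden state (13)
    return h_t, s_t
```

Running `lstm_step` over the time steps of a sample, feeding each returned pair back in as `h_prev` and `s_prev`, produces the sequence of encoder hidden states used by the temporal attention stage.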

The specific process of step 1.4.3, temporal attention allocation over the encoded data, is as follows: temporal attention allocation is the second stage of the two-stage attention mechanism. A temporal attention mechanism is introduced to capture the long-term temporal dependencies of the encoder, and the temporal attention weights for the encoder states $h_t$ are computed from the previous hidden state of the decoder:

based on the previous decoder hidden state $d_{t-1}$ and the state $s'_{t-1}$ of the previous LSTM unit, an attention mechanism computes the temporal attention weight of each encoder hidden state at time t, where $[d_{t-1}; s'_{t-1}]$ is the concatenation of the previous decoder hidden state and the state of the previous LSTM unit, and $v_d$, $W_d$ and $U_d$ are the high-dimensional matrix parameters to be learned; the resulting weight is the temporal attention weight assigned to the i-th group of features at time t;
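The sketch below shows how temporal attention weights can be obtained from raw scores and then used to form a weighted sum of the encoder hidden states. It assumes the scores have already been computed from $[d_{t-1}; s'_{t-1}]$ and each encoder state with the learned parameters $v_d$, $W_d$ and $U_d$; the softmax normalisation follows the standard DA-RNN formulation and is an assumption about the formula referred to above, and the function names are illustrative.

```python
import math

def softmax(scores):
    """Normalise scores into weights that are positive and sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(scores, hidden_states):
    """Return the attention weights and the weighted sum of hidden states.

    scores: one attention score per encoder time step
    hidden_states: one hidden-state vector per encoder time step
    """
    beta = softmax(scores)
    dim = len(hidden_states[0])
    c_t = [sum(b * h[j] for b, h in zip(beta, hidden_states)) for j in range(dim)]
    return beta, c_t
```

Equal scores give every encoder state the same weight; a larger score shifts the weighted sum towards the corresponding hidden state.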

Step 1.4.4, computing a weighted sum of the data after temporal attention allocation and decoding the weighted data together with $y_{target}$, specifically comprises:

1.4.4.1 computing the weighted sum vector $c_t$ of all hidden states $h_i$, where $c_t$ is the input to the decoder LSTM unit.

1.4.4.2 computing the updated target output, where $[y_{t-1}; c_{t-1}]$ is the concatenation of the previous decoder output $y_{t-1}$ and the previous weighted sum of all hidden states $c_{t-1}$, and the remaining quantities in the formula are parameters to be learned.

1.4.4.3 updating the hidden state $d_t$ of the decoder at time t, using the new target output and the previous hidden state, where $f_2$ is a nonlinear activation function. To model the long-term dependencies of the time series, an LSTM unit is used as the function $f_2$ that updates the hidden state, and the hidden state $d_t$ is computed as:

$$d_t = o'_t \odot \tanh(s'_t) \tag{23}$$

where the LSTM gates act on the concatenation of the previous hidden state $d_{t-1}$ and the preceding target output; $W'_f$, $W'_i$, $W'_o$, $W'_s$, $b'_f$, $b'_i$, $b'_o$ and $b'_s$ are the parameters to be learned; and $\sigma$ and $\odot$ denote the logistic sigmoid function and element-wise multiplication, respectively.

1.4.4.4 estimating the output at the current moment:

the output of the decoder LSTM unit is $y_{DT}$. The DA-RNN structure constructs an approximating function F which, given the observed inputs and the previous outputs, estimates the output at the current moment, where $[d_T; c_T]$ is the concatenation of the decoder hidden state $d_T$ and the weighted sum vector $c_T$; $W_y$ and $b_w$ are parameters to be learned; and the weight of the linear function and the bias $b_v$ are parameters to be learned that determine the final prediction result.
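For reference, the output layer of the standard DA-RNN architecture, written with the symbols named in this step, has the form below. It is an assumed reading of the formula this step refers to; the symbol $v_y$ for the weight of the final linear function is taken from the DA-RNN literature and does not appear in this text.

```latex
y_{DT} = F(y_1, \ldots, y_{T-1}, x_1, \ldots, x_T)
       = v_y^{\top} \left( W_y [d_T; c_T] + b_w \right) + b_v
```

Under this reading, $W_y$ and $b_w$ first map the concatenation $[d_T; c_T]$ to the size of the decoder hidden state, and the linear function with weight $v_y$ and bias $b_v$ then produces the final prediction.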

Further, the loss function calculation and judgment in step 1.5 are as follows: during training, all data are grouped into mini-batches and the model is trained using a stochastic gradient descent (SGD) optimizer and an Adam optimizer; the output is designed to be smooth and differentiable so that the parameters can be learned by standard backpropagation; and the objective function is the loss between the predicted control scheme data and the actual control scheme data, taken as their mean square error:

$$L = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$$

where N is the number of training samples, $\hat{y}_i$ is the predicted scheme and $y_i$ is the actual scheme. Training makes the loss function converge rapidly to a very small value β, with β < 0.2% by convention.
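The loss and the stopping rule can be sketched in a few lines. The function names are illustrative, and expressing the threshold β < 0.2% as the fraction 0.002 is an assumption about how the percentage is meant to be applied.

```python
def mse_loss(predicted, actual):
    """Mean square error between predicted and actual control scheme data."""
    n = len(predicted)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n

def has_converged(loss, beta=0.002):
    """Stopping rule: the loss has fallen below the agreed threshold beta."""
    return loss < beta
```

In a training loop, `has_converged(mse_loss(predictions, targets))` would be evaluated after each epoch or batch to decide whether to stop.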

Further, the predicted control scheme data output by the real-time traffic signal control scheme recommendation model in step 1.5 is corrected as follows:

the corrected result $y_T$ consists of two parts, the predicted control scheme data $y_{DT}$ and the result $y_{AT}$ of the linear regression computed by the mixed regression model, and is their vector sum:

$$y_T = y_{DT} + y_{AT} \tag{26}$$

where, in the model used for the linear regression calculation, q is the input matrix $y_{t-k}$, k denotes the k-th filter, and $W_{auk}$ and $b_{auk}$ are parameters to be learned.
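A linear correction of this kind can be sketched as an autoregressive term added to the network output. The sketch assumes that the linear part is a weighted sum over the q most recent values of a series plus a bias, in the manner of the autoregressive component used in hybrid neural forecasting models; the function names and this reading of q and k are assumptions, not taken from the text.

```python
def ar_component(history, weights, bias):
    """Linear autoregression over the last q values of one series,
    with q = len(weights) and weights[0] applied to the most recent value."""
    q = len(weights)
    recent = history[-q:][::-1]          # most recent value first
    return sum(w * y for w, y in zip(weights, recent)) + bias

def corrected_output(y_dt, y_at):
    """Equation (26): element-wise sum of the network output and the linear part."""
    return [a + b for a, b in zip(y_dt, y_at)]
```

One call to `ar_component` per phase yields the vector $y_{AT}$, which `corrected_output` adds to the model prediction $y_{DT}$.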

Further, the recommended control scheme data of step 1.6 is checked against constraints, the constraints including:

(1) whether it matches the actual flow and saturation conditions of the intersection; (2) whether the recommended cycle is less than the maximum cycle time of the intersection; (3) whether the recommended green split of each phase exceeds the minimum green time; (4) whether the pedestrian safety time of the pedestrian phases is fully satisfied; (5) whether the timing of special phases has been set.
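Constraints (2) to (4) are simple numeric checks and can be sketched as below. The function signature, the units (seconds for times, fractions of the cycle for splits) and the message strings are assumptions; constraints (1) and (5) depend on site-specific rules and are not modelled here.

```python
def check_scheme(cycle, splits, max_cycle, min_green, pedestrian_min=None):
    """Return the list of violated constraints for a recommended scheme.

    cycle: recommended cycle length in seconds
    splits: phase name -> green split as a fraction of the cycle
    min_green: phase name -> minimum green time in seconds
    pedestrian_min: optional phase name -> minimum pedestrian time in seconds
    """
    violations = []
    if cycle >= max_cycle:
        violations.append("cycle not below maximum")
    for phase, split in splits.items():
        green = split * cycle                      # green time of this phase
        if green <= min_green.get(phase, 0.0):
            violations.append(f"{phase}: green not above minimum")
        if pedestrian_min and green < pedestrian_min.get(phase, 0.0):
            violations.append(f"{phase}: pedestrian safety time not met")
    return violations
```

An empty list means the scheme passed the modelled checks and can be shown to the traffic control personnel for review.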

Further, a sample point in step 1.3 is a cycle end time at which the control scheme data meets the requirement, the control scheme data being the change in green split, specifically:

the change in the green split of any phase between two adjacent groups of data is evaluated, and when the change exceeds 5% of the total cycle time, the control scheme data meets the requirement.
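The 5% rule can be sketched as a scan over consecutive cycles. The record layout (`cycle` in seconds, `greens` as green time per phase in seconds) and the choice to compare against the current cycle's length are assumptions made for illustration.

```python
def find_sample_points(cycles, ratio=0.05):
    """Return the indices of cycles whose green time changed, for at least
    one phase, by more than `ratio` of the cycle time relative to the
    preceding cycle."""
    points = []
    for i in range(1, len(cycles)):
        prev, cur = cycles[i - 1], cycles[i]
        limit = ratio * cur["cycle"]
        if any(abs(c - p) > limit for p, c in zip(prev["greens"], cur["greens"])):
            points.append(i)
    return points
```

The cycle end times at the returned indices are the sample points from which the training windows are built.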

The beneficial effects of the invention are as follows: starting from traffic state data alone, a signal control scheme suited to the flow and saturation of the current signalized intersection is computed using a deep learning neural network algorithm, and the traffic signal control scheme is recommended in real time. The output scheme can be recommended to front-line traffic control personnel in real time and issued in real time once it has been judged reasonable. This reduces the time needed to optimize the intersection to a certain extent, improves the working efficiency of personnel, feeds back the recommended scheme in real time, and improves the reliability and reproducibility of the recommended scheme.

Drawings

FIG. 1 is a sample point extraction flow diagram of the present invention;

FIG. 2 is a flow chart of the training data set construction of the present invention;

FIG. 3 is a diagram of a CNN-DA-RNN framework model of the present invention;

FIG. 4 is a diagram of a DA-RNN neural network model of the present invention;

FIG. 5 is a CNN-DA-RNN framework training loss function convergence curve of the present invention;

FIG. 6a is a graph comparing the average speed on Yan'an Road on working days in January 2019 and December 2018;

FIG. 6b is a graph comparing the average speed on Yan'an Road on non-working days in January 2019 and December 2018.

Detailed Description

The invention will be further described with reference to specific examples, but the scope of the invention is not limited thereto.

Example: an important intersection on Yan'an Road, an arterial road in Hangzhou, was selected, namely the intersection of Qingchun Road and Yan'an Road. The real-time recommendation scheme generation method was tested there to verify the effectiveness of the signal control scheme recommendation method of the invention. The steps are as follows:

First, collecting traffic data: six months of traffic state data (flow, saturation) and traffic control data (cycle and split scheme data) are collected for the intersection, and the control data and state data are stored in the same database table according to cycle end time; the data also includes the cycle end time.

Second, preprocessing the data: statistics are computed on the data table al_input_scheme_match_state_data, including the amount of data, the number of null values, abnormal null values and abnormal values (for example, the sum of the split plan is not 0), and the results are saved in the data quality table. Data preprocessing is then carried out in sequence:

2.1 deleting default values and removing duplicate values;

2.2 correcting abnormal data: data at abnormal positions is interpolated using historical data;

2.3 repairing missing data: residual analysis is performed using a multiple linear regression model, and the missing data is filled in using the polynomial equation with the smallest residual.

After preprocessing, abnormal data is counted again and the data quality is assessed; the data set is constructed once the data quality meets the requirements. The data quality requirements include a null value rate below 0.01, an abnormal value rate below 0.05, and so on.
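The quality gate can be sketched as a function that computes both rates and compares them with the limits. Treating the abnormal-value limit as a rate over all values, the `is_abnormal` predicate and the returned dictionary are assumptions made for illustration.

```python
def data_quality(values, is_abnormal, max_null_rate=0.01, max_abnormal_rate=0.05):
    """Compute the null rate and abnormal rate of a column and report
    whether both are below their limits."""
    total = len(values)
    nulls = sum(1 for v in values if v is None)
    abnormal = sum(1 for v in values if v is not None and is_abnormal(v))
    null_rate = nulls / total
    abnormal_rate = abnormal / total
    ok = null_rate < max_null_rate and abnormal_rate < max_abnormal_rate
    return {"null_rate": null_rate, "abnormal_rate": abnormal_rate, "ok": ok}
```

For a saturation column, for instance, `is_abnormal` could be `lambda v: v < 0`; the data set is built only when every column reports `ok`.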

Third, constructing the time-series data set: following the data set construction steps, sample points are extracted, sample point data is extracted, and the training data set is constructed, in that order.

3.1 extracting sample points, as shown in FIG. 1: the raw training data is segmented, and the sample points within each time span are determined in sequence;

3.2 extracting sample point data, as shown in FIG. 2.

T groups of traffic data $x_i$ corresponding to the sample point are extracted, i.e. the traffic control data of the sample point and of the T-1 groups of moments before it.

3.2.1 T groups of traffic data: the traffic state data of the sample point and of the T-1 groups preceding it, sorted by cycle end time, are taken and stored as an array as the $x_i$ part of the sample point data, specifically:

$$x_i = \{Cycle_i, A_i, B_i, \ldots, G_i, VO_{1,i}, VO_{2,i}, \ldots, VO_{k,i}, DS_{1,i}, DS_{2,i}, \ldots, DS_{k,i}\} \tag{1}$$

where $Cycle_i$ denotes the cycle duration, $Cycle_i = [C_{i-T}, \ldots, C_{i-1}, C_i]$; $A_i, B_i, \ldots, G_i$ denote the split data of signal control phases A, B, …, G, with $A_i = [a_{i-T}, \ldots, a_{i-1}, a_i]$, $B_i = [b_{i-T}, \ldots, b_{i-1}, b_i]$, …, $G_i = [g_{i-T}, \ldots, g_{i-1}, g_i]$; $VO_{1,i}, VO_{2,i}, \ldots, VO_{k,i}$ denote the lane traffic flow data, with $VO_{1,i} = [vo_{1,i-T}, \ldots, vo_{1,i-1}, vo_{1,i}]$; and $DS_{1,i}, DS_{2,i}, \ldots, DS_{k,i}$ denote the lane saturation data.

3.2.2 T groups of control scheme data $y_{target,i}$, i.e. the control scheme data of the sample point and of the T-1 groups of moments before it: these are taken and stored as an array as the $y_{target}$ part of the sample point data, specifically:

$$y_{target,i} = \{\Delta A_i, \Delta B_i, \Delta C_i, \Delta D_i, \Delta E_i, \Delta F_i, \Delta G_i\} \tag{2}$$

where $\Delta A_i, \Delta B_i, \ldots, \Delta G_i$ denote the changes between adjacent splits, $\Delta A_i = [\Delta a_{i-T}, \ldots, \Delta a_{i-1}, \Delta a_i]$, $\Delta B_i = [\Delta b_{i-T}, \ldots, \Delta b_{i-1}, \Delta b_i]$, …, $\Delta G_i = [\Delta g_{i-T}, \ldots, \Delta g_{i-1}, \Delta g_i]$; the dimension of $y_{target,i}$ is determined by the number of phases actually run at the intersection;

3.2.3 1 group of split data $y_{his,i}$, i.e. the group of split data following the sample point: the split data at the moment following the sample point is taken and stored as an array as the $y_{his,i}$ part, specifically:

$$y_{his,i} = \{A'_{i+1}, B'_{i+1}, C'_{i+1}, D'_{i+1}, E'_{i+1}, F'_{i+1}, G'_{i+1}\} \tag{3}$$

where $A'_{i+1}, B'_{i+1}, \ldots, G'_{i+1}$ denote the split values of the phases, $A'_{i+1} = [a_{i+1}]$, $B'_{i+1} = [b_{i+1}]$, …, $G'_{i+1} = [g_{i+1}]$; the dimension of $y_{his,i}$ is determined by the number of phases actually run at the intersection.

3.3 constructing the training data set, as shown in FIG. 2. The training data set is the set of sample point data and comprises three parts: the traffic data set $x_t$, the control scheme data set $y_{target}$ and the split data set $y_{his}$.

Fourth, a deep learning algorithm model based on CNN-DA-RNN is constructed; the model architecture is shown in FIG. 3. The first layer, the CNN, is a convolutional neural network layer without pooling, and the second layer, the DA-RNN, is a recurrent neural network based on a two-stage attention mechanism. The first layer mainly extracts short-term dependencies in the time dimension of the input data and dependencies among variables. The second layer mainly allocates attention over the input data in the spatial and temporal dimensions and performs encoding and decoding, specifically including:

4.1 extracting the temporal dependencies of the input data and the dependencies among variables:

the convolutional layer consists of multiple filters of width $\omega$ and height $n$, where the width $\omega$ is set to match the green split of the input data and the height $n$ is set to match the column dimension of the input variables; the k-th filter scans the input matrix $x_i$ and produces:

$$h_{cnn,k} = \mathrm{ReLU}(W_{cnn,k} * x_i + b_{cnn,k}) \tag{4}$$

4.2 performing spatial attention allocation on the traffic data set output by the convolutional layer: for the input $x_t$ at each moment of the input data X, the attention mechanism is applied to obtain the output after spatial attention allocation.

4.3 encoding the data after spatial attention allocation:

first, the encoder learns the mapping from $x_t$ to $h_t$;

second, the encoding unit updates the state $h_t$ using the LSTM network as the activation function:

$$f_t = \sigma(W_f [h_{t-1}; x_t] + b_f) \tag{9}$$

$$i_t = \sigma(W_i [h_{t-1}; x_t] + b_i) \tag{10}$$

$$o_t = \sigma(W_o [h_{t-1}; x_t] + b_o) \tag{11}$$

$$s_t = f_t \odot s_{t-1} + i_t \odot \tanh(W_s [h_{t-1}; x_t] + b_s) \tag{12}$$

$$h_t = o_t \odot \tanh(s_t) \tag{13}$$

4.4 performing temporal attention allocation on the encoded data, using the attention mechanism of step 1.4.3;

4.5 computing a weighted sum of the data after temporal attention allocation, and decoding the weighted data together with the control scheme data set $y_{target}$:

4.5.1 computing the weighted sum vector $c_t$ of all hidden states $h_i$;

4.5.2 computing the updated target output;

4.5.3 updating the hidden state $d_t$ of the decoder at time t, using the new target output and the previous hidden state, where $f_2$ is a nonlinear activation function; to model the long-term dependencies of the time series, an LSTM unit is used as the function $f_2$ that updates the hidden state, and the hidden state $d_t$ is computed as:

$$d_t = o'_t \odot \tanh(s'_t) \tag{23}$$

4.5.4 estimating the output $y_{DT}$ at the current moment: the output of the decoder LSTM unit is $y_{DT}$; the DA-RNN structure constructs an approximating function F which, given the observed inputs and the previous outputs, estimates the output at the current moment, and this output determines the final prediction result.

Fifth, training the real-time traffic signal control scheme recommendation model: the data set of step 1.3 is taken as the training data set and the deep learning algorithm model of step 1.4 is trained using a stochastic gradient descent optimizer, an Adam optimizer and automatic parameter tuning; training ends when the loss function converges to 0.02%.

The training set is used to train the model. It is divided into mini-batches: the data set is divided into N batches of size s = 128 and the model is trained on the batched data, which speeds up training. The training set is drawn from the training data set, taking 80% of it.

The test set is used to evaluate the model: test data is input into the model and the loss function of the model is computed. If the loss function converges rapidly to 0.02%, the model with these parameters is judged usable. The test set is drawn from the training data set, taking 80% of the training data set as the test set.

In the experiment, the mini-batch size is S = 128, the learning rate is α = 0.01%, and the number of iterations is M = 5000; the loss function converges rapidly, as shown in FIG. 5. The real-time traffic signal control scheme recommendation model with these parameters is saved;
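Splitting the sample set and cutting the training part into mini-batches can be sketched as below. Taking the first 80% of the samples in their stored order for training and returning the remainder separately are assumptions; the function name is illustrative.

```python
def split_and_batch(samples, train_fraction=0.8, batch_size=128):
    """Take the first `train_fraction` of the samples for training and cut
    them into mini-batches of at most `batch_size` samples.

    Returns (batches, remainder), where remainder holds the samples that
    were not used for training.
    """
    n_train = int(len(samples) * train_fraction)
    train, rest = samples[:n_train], samples[n_train:]
    batches = [train[k:k + batch_size] for k in range(0, n_train, batch_size)]
    return batches, rest
```

With 1000 samples and the settings above, this yields seven batches, six of 128 samples and a final one of 32, with 200 samples left over.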

Step six, recommending the traffic signal control scheme for the next moment: real-time traffic state data are collected and input into the real-time traffic signal control scheme recommendation model, the predicted control scheme data output by the model are obtained, and the prediction is corrected by the linear regression part of the hybrid regression model:

y_T＝y_DT+y_AT(26)

A sample comparison of the recommended split plan and the actual split plan for the intersection is as follows:

The real-time scheme recommendation model was deployed and applied at the intersection in January 2019. Figs. 6a and 6b compare the average speed on Yan'an Road in January 2019 with that in December 2018, on working days and non-working days respectively: the average speed in January is clearly higher than in December, which indicates that the recommended scheme helps optimize the traffic state at the intersection and raise the average speed on the road.

The embodiments described in this specification merely illustrate implementations of the inventive concept. The scope of protection of the present invention should not be regarded as limited to the specific forms set forth in the embodiments; it also extends to equivalent technical means that those skilled in the art can conceive on the basis of the inventive concept.

Claims

1. A traffic signal control scheme real-time recommendation method based on deep learning comprises the following steps:

1.1 collecting traffic data, comprising traffic control data and traffic state data, wherein the traffic control data include but are not limited to: signal system cycle start time, cycle end time, cycle duration, and split data; and the traffic state data include but are not limited to: flow, saturation, and speed;

1.3 constructing a time series of data sets, the steps comprising:

1.3.2 constructing sample point data: for each sample point, extracting the corresponding T groups of traffic data, T groups of control scheme data, and 1 group of split data, where i denotes the i-th sample point; specifically:

T groups of traffic data: the traffic state data of the sample point and of the T-1 sample points preceding it, sorted by cycle end time, are extracted and stored as an array, forming the traffic data part of the sample point data, as follows:

(1)

wherein the array contains the cycle duration; the green ratio data of the signal control phases A, B, …, G; the traffic flow data of each lane; and the saturation data of each lane. The number of signal control phases depends on the intersection; the number of phases and the phase sequence differ between intersections;

T groups of control scheme data: the control scheme data of the sample point and of the T-1 sample points preceding it are extracted and stored as an array, forming the control scheme part of the sample point data, as follows:

(2)

wherein each element denotes the amount of change between adjacent green ratios of one phase (for example, phase A); the dimension of the array is determined by the number of phases actually running at the intersection;

1 group of split plan data: the green ratio data at the moment following the sample point are extracted and stored as an array, as follows:

(3)

wherein each element denotes the green ratio of one phase;

1.3.3 constructing time series data sets that meet the training requirements, including a traffic data set, a control scheme data set, and a split data set; the training requirements include, but are not limited to: the number of sample points and the length of time over which traffic data are extracted for each sample point;

1.4.2 encoding the data after spatial attention allocation;

1.4.3 performing temporal attention allocation on the encoded data;

1.4.4 computing a weighted sum of the data after temporal attention allocation, and decoding the weighted data together with the control scheme data set;

2. The method according to claim 1, wherein the meaning and data dimension of each item of the traffic control data in step 1.1 are as follows:

the cycle duration is the time required for the signal lamps to display each lamp color once in turn, and its data dimension is 1;

the split is the green ratio data of each phase at the intersection, and its data dimension is the number of phases;

the traffic state data are the traffic state data of all lanes at the intersection, and their data dimension is the number of traffic state data types and the number of lanes.

3. The method of claim 1, wherein in step 1.2 cleaning the error data comprises deleting default values and duplicate values;

correcting abnormal data comprises judging whether a value is abnormal by the statistical t-test and replacing abnormal values by spline interpolation over historical data; repairing missing data uses a multiple linear regression model and comprises the following steps: (1) drawing a scatter plot of the existing data and performing multiple regression; (2) solving for the multiple linear regression polynomial and its confidence interval; (3) drawing a residual analysis plot to verify the fit, where a smaller residual indicates better agreement between the regression polynomial and the source data; and (4) filling in the missing data using the polynomial equation with the smallest residual.
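The missing-data repair in steps (1) to (4) can be illustrated with a small least-squares sketch. It is a simplified illustration under stated assumptions: the helper names `fit_linear_regression` and `fill_missing`, the NaN encoding of missing values, and the toy numbers are hypothetical, and the scatter plot, confidence interval, and model comparison steps are omitted.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares fit of y on [1, X]; returns coefficients and residuals."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, y - A @ coef

def fill_missing(X, y):
    """Fill NaN entries of y from a regression fitted on the complete rows.

    Returns the repaired series and the root-mean-square residual of the fit,
    which can be used to compare candidate regressions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    known = ~np.isnan(y)
    coef, residuals = fit_linear_regression(X[known], y[known])
    A_missing = np.column_stack([np.ones(int((~known).sum())), X[~known]])
    y[~known] = A_missing @ coef
    return y, float(np.sqrt(np.mean(residuals ** 2)))

# Hypothetical example: the target depends linearly on two explanatory series.
X = np.array([[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]], dtype=float)
y = 2 + 3 * X[:, 0] - X[:, 1]
y[2] = np.nan  # simulate a missing record
filled, rmse = fill_missing(X, y)
```

In practice several candidate regressions would be fitted and the one with the smallest residual kept, as step (4) requires; the sketch shows a single fit only.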

4. The method for recommending traffic signal control schemes based on deep learning of claim 1, wherein in step 1.4 a convolutional neural network without pooling extracts the time dependencies and the dependencies between variables of the input data, as follows:

the input data of the convolutional layer is the time-series traffic data set;

the convolutional layer consists of a plurality of filters of width ω and height n, wherein the width ω is set equal to the green ratio of the input data and the height n is set equal to the column dimension of the input data variables; the k-th filter scans the input matrix and generates:

(4)

in the formula, the operator denotes convolution and the result is the output vector; the activation function is the rectified linear unit (ReLU), which accelerates gradient descent and backpropagation and avoids the exploding-gradient and vanishing-gradient problems; the convolution matrix and the bias are parameters to be learned and are continually corrected during training; and the range of k is the ratio of the length of the input data to the filter size (ω × n);

to keep the dimension of the output after convolution consistent with that of the input data, the dimension of the input matrix is increased, with the values in the added dimensions set to 0. The implementation is as follows: the input matrix has dimension i × j and the convolution matrix has dimension 3 × 3; to obtain an output of dimension i × j, the input matrix is changed to dimension (i+1) × (j+1), and the values in the added dimensions are 0.
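The idea of zero-padding the input so that a 3 × 3 convolution followed by ReLU returns an output of the same i × j size can be sketched as below. The sketch assumes one row or column of zeros is added on every side of the input, and it uses the sliding-window product without kernel flipping that deep learning frameworks call convolution; the function name `conv2d_same_relu` and the averaging kernel are hypothetical.

```python
import numpy as np

def conv2d_same_relu(x, kernel, bias=0.0):
    """Zero-pad x, slide an odd-sized kernel over it, and apply ReLU.

    The output has the same shape as x. Padding width is half the kernel
    size on each side (an assumption for this sketch).
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)), mode="constant", constant_values=0.0)
    out = np.empty(x.shape, dtype=float)
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            window = padded[r:r + kh, c:c + kw]
            out[r, c] = np.sum(window * kernel) + bias
    return np.maximum(out, 0.0)  # ReLU

# Hypothetical 3 x 4 input and a 3 x 3 averaging kernel.
x = np.arange(12, dtype=float).reshape(3, 4)
kernel = np.ones((3, 3)) / 9.0
y = conv2d_same_relu(x, kernel)
```

The zeros contributed by the padding lower the averages at the border cells, while interior cells see only real data.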

5. The deep learning-based traffic signal control scheme real-time recommendation method according to claim 1, wherein

the specific process of step 1.4.1 is as follows: spatial attention allocation is the first stage of the two-stage attention mechanism. Spatial attention is introduced as the input attention mechanism, which automatically extracts the correlations in the input data at each moment; the input attention weights are calculated from the previous hidden state of the encoder, as follows:

for the input data at each time t, the attention mechanism is applied according to the following formulas:

(5)

(6)

wherein the first term denotes the concatenation of the previous hidden state and the previous state, and the remaining terms are high-dimensional matrix parameters to be learned, which gives:

(7)
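A possible reading of the input attention stage is sketched below. Because formulas (5) to (7) are not legible in this text, the scoring function follows the additive form used in the published DA-RNN architecture, which is an assumption here; the names `W_e`, `U_e`, `v_e`, and `input_attention` are likewise hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def input_attention(X, h_prev, s_prev, W_e, U_e, v_e, t):
    """Weight the n input series at time t.

    X is a (T, n) window; h_prev and s_prev are the previous encoder hidden
    state and LSTM unit state. Each series is scored from its length-T
    history and the concatenated previous states, then the scores are
    normalized into attention weights.
    """
    state = np.concatenate([h_prev, s_prev])
    scores = np.array([
        v_e @ np.tanh(W_e @ state + U_e @ X[:, k]) for k in range(X.shape[1])
    ])
    alpha = softmax(scores)
    return alpha, alpha * X[t]  # weights and the reweighted input at time t

# Hypothetical sizes: window T = 5, n = 3 series, hidden size m = 4.
rng = np.random.default_rng(0)
T, n, m = 5, 3, 4
X = rng.normal(size=(T, n))
W_e = rng.normal(size=(T, 2 * m))
U_e = rng.normal(size=(T, T))
v_e = rng.normal(size=T)
alpha, x_weighted = input_attention(X, np.zeros(m), np.zeros(m), W_e, U_e, v_e, t=0)
```

The weights sum to one across the input series, so the encoder receives an input in which the more relevant series carry more weight.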

The encoding method is as follows:

first, the encoder learns the following mapping:

(8)

wherein the terms denote, in order, the hidden state of the encoder at time t, the previous hidden state, and a nonlinear activation function;

second, the encoding unit uses an LSTM network as the activation function to update the state: the LSTM recurrent neural network comprises a forget gate and an output gate, and each LSTM unit has a memory cell with a state at time t; the state is updated as follows:

(9)

(10)

(11)

(12)

(13)

wherein the first term denotes the concatenation of the previous hidden state and the current input; the following terms are parameters to be learned in training; and the last two denote the logistic (sigmoid) function and element-wise multiplication, respectively;
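One LSTM update of the kind described above can be sketched as follows. The gate equations are the standard LSTM formulation, assumed here because formulas (9) to (13) are not legible in this text; the input gate, the parameter layout, and the name `lstm_step` are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, params):
    """One LSTM update on the concatenation [h_prev; x_t].

    params maps a gate name ("f", "i", "o", "s") to a (W, b) pair.
    """
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(params["f"][0] @ z + params["f"][1])  # forget gate
    i = sigmoid(params["i"][0] @ z + params["i"][1])  # input gate
    o = sigmoid(params["o"][0] @ z + params["o"][1])  # output gate
    g = np.tanh(params["s"][0] @ z + params["s"][1])  # candidate state
    s_t = f * s_prev + i * g       # element-wise update of the memory cell
    h_t = o * np.tanh(s_t)         # new hidden state
    return h_t, s_t

# Hypothetical sizes: 4 input features, 3 hidden units, 5 time steps.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
params = {
    k: (rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)), np.zeros(n_hid))
    for k in "fios"
}
h, s = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):
    h, s = lstm_step(x_t, h, s, params)
```

The last line of `lstm_step` has the same form as the decoder update d_t = o′_t ⊙ tanh(s′_t) given in formula (23) of the description.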

the specific process of step 1.4.3, performing temporal attention allocation on the encoded data, is as follows: temporal attention allocation is the second stage of the two-stage attention mechanism. A temporal attention mechanism is introduced to capture the long-term temporal dependencies of the encoder, applying the previous hidden state of the decoder to the input state data. The temporal attention weights are calculated as follows:

based on the previous decoder hidden state and the state of the previous LSTM unit, the attention mechanism calculates the temporal attention weight of each encoder hidden state at time t, according to the following formulas:

(14)

(15)

wherein the first term denotes the concatenation of the previous decoder hidden state and the state of the previous LSTM unit; the following terms are high-dimensional matrix parameters to be learned; and the resulting weight is the temporal attention weight assigned at time t to the corresponding group of features;
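The temporal attention stage can be sketched in the same spirit. As with the input attention, the additive scoring form is borrowed from the published DA-RNN architecture and is an assumption, since formulas (14) and (15) are not legible here; the names `W_d`, `U_d`, `v_d`, and `temporal_attention` are hypothetical. The sketch also forms the attention-weighted sum of the encoder hidden states, which is how such weights are typically consumed by a decoder.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def temporal_attention(H, d_prev, s_prev, W_d, U_d, v_d):
    """Weight the T encoder hidden states.

    H is (T, m); d_prev and s_prev are the previous decoder hidden state and
    LSTM unit state. Returns the weights (T,) and the weighted sum (m,).
    """
    state = np.concatenate([d_prev, s_prev])
    scores = np.array([v_d @ np.tanh(W_d @ state + U_d @ h_i) for h_i in H])
    beta = softmax(scores)
    return beta, beta @ H

# Hypothetical sizes: T = 6 encoder states of size m = 4, decoder size p = 3.
rng = np.random.default_rng(1)
T, m, p = 6, 4, 3
H = rng.normal(size=(T, m))
W_d = rng.normal(size=(m, 2 * p))
U_d = rng.normal(size=(m, m))
v_d = rng.normal(size=m)
beta, context = temporal_attention(H, np.zeros(p), np.zeros(p), W_d, U_d, v_d)
```

When all scores are equal, the weights are uniform and the weighted sum reduces to the plain average of the encoder hidden states.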

the specific process of step 1.4.4, computing a weighted sum of the data after temporal attention allocation and decoding the weighted data together with the control scheme data set, comprises the following steps:

1.4.4.1. Calculating the weighted sum vector of all hidden states:

(16)

wherein the indicated quantity is the input of the decoder LSTM unit.

1.4.4.2. Calculating the updated target output:

(17)

wherein the first term denotes the concatenation of the previous decoder output and the previous weighted sum of all hidden states, and the remaining terms are parameters to be learned in training.

1.4.4.3. Updating the hidden state d_t of the decoder at time t, using the new target output and the previous hidden state:

(18)

wherein f₂ is a nonlinear activation function. To capture the long-term dependencies of the time series, an LSTM unit is used as the function f₂ that updates the hidden state; the hidden state d_t is then calculated as follows:

(19)

(20)

(21)

(22)

d_t = o′_t ⊙ tanh(s′_t) (23)

wherein the first term denotes the concatenation of the previous hidden state and the previous target output; the following terms are parameters to be learned in training; and the last two denote the logistic (sigmoid) function and element-wise multiplication, respectively.

1.4.4.4. Estimating the output y_DT at the current time:

The decoder LSTM unit outputs y_DT. The DA-RNN structure constructs an approximating function F that, given the observed inputs and the previous outputs, estimates the output at the current time:

(24)

wherein the first term denotes the concatenation of the hidden state of the decoding layer and the weighted sum vector; the parameters applied to it are learned in training; and the weight and bias of the linear function, also learned, determine the final prediction result.

6. The method for recommending traffic signal control scheme based on deep learning of claim 1, wherein the loss function of step 1.5 is calculated and evaluated as follows:

the training process of the model comprises grouping all data into mini-batches and training the model with a stochastic gradient descent (SGD) optimizer and an Adam optimizer; the output is designed to be smooth and differentiable so that the parameters can be learned by standard backpropagation; the objective function is a loss function between the predicted control scheme data and the actual control scheme data:

(25)

where N is the number of training samples, and the two compared terms are the predicted scheme and the actual scheme; training results in the loss function converging rapidly to a very small value.
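The loss evaluation and the convergence check can be sketched as below. Formula (25) is not legible in this text, so a mean squared error over the N samples is assumed; the names `mse_loss` and `has_converged` are hypothetical, and the 0.02% threshold is taken from the description of step five.

```python
import numpy as np

def mse_loss(predicted, actual):
    """Mean over N samples of the squared distance between two schemes.

    Both inputs have shape (N, phases). The squared-error form is an
    assumption for this sketch.
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.sum((predicted - actual) ** 2, axis=1)))

def has_converged(loss_value, threshold=0.0002):
    """True once the loss is at or below the threshold (0.02% = 0.0002)."""
    return loss_value <= threshold

# Hypothetical split plans for two samples at a three-phase intersection.
actual = np.array([[0.30, 0.45, 0.25], [0.35, 0.40, 0.25]])
predicted = actual + 0.005
loss = mse_loss(predicted, actual)
done = has_converged(loss)
```

In a training loop the check would be applied to the loss after each iteration, stopping early once it holds.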

7. The method for recommending traffic signal control scheme based on deep learning of claim 1, wherein step 1.6 further comprises correcting the predicted control scheme data output by the real-time traffic signal control scheme recommendation model, specifically as follows:

the corrected result y_T consists of two parts, the predicted control scheme data y_DT and the linear regression result y_AT of the hybrid regression model, and is the vector sum of the two:

y_T = y_DT + y_AT (26)

wherein the model for the linear regression calculation is:

(27)

where q is the input matrix, k refers to the k-th filter, and the remaining terms are parameters to be learned.

8. The method for recommending traffic signal control scheme based on deep learning of claim 1, wherein step 1.6 further comprises: judging whether the output control scheme data meets constraint conditions, wherein the constraint conditions comprise:

9. The deep learning-based traffic signal control scheme real-time recommendation method according to claim 1, wherein: the sample point described in step 1.3 refers to the cycle end time when the control scheme data meets the requirement, and the control scheme data is the variation of the split ratio, specifically: