Summary of the invention
In view of the problems described in the background above, the object of the present invention is to provide a rolling-optimization-based, city-level, network-wide traffic signal recommendation method and system. The road network is dynamically divided into multiple optimization sub-regions within each rolling optimization period, and the traffic signal control scheme of each optimization sub-region for the coming period is computed dynamically. This achieves the traffic signal optimization control targets at both the optimization sub-region level and the network-wide level, and thereby improves the overall traffic service capacity of the city.
The technical solution adopted by the present invention is as follows:
A smart city traffic signal control recommendation method, comprising:
obtaining real-time traffic data;
selecting the optimization scheme generation model corresponding to the current optimization period and optimization sub-region, the optimization scheme generation model being obtained by training a deep reinforcement learning algorithm on a traffic data set;
outputting the signal optimization scheme recommended by the optimization scheme generation model.
Further, the method for determining the optimization period comprises:
extracting temporally continuous traffic state data of the optimization region;
forming the periodic feature sequences used to divide the total optimization periods and the optimization periods;
dividing the data into multiple total optimization periods according to the similarity measure between periodic feature sequences;
dividing each total optimization period into multiple optimization periods according to the optimization period division condition.
Further, the method for determining the optimization sub-regions comprises:
extracting the traffic state data of each intersection in the optimization region during the optimization period;
extracting the key characteristics of each intersection and determining the key intersections;
determining the optimization sub-regions and the optimization type of each sub-region according to the connection relationships between key intersections, wherein the connection relationships include but are not limited to one or a combination of: physical connection, arterial coordination and regional coordination; and the optimization types include single-point, arterial and sub-area.
Further, the method for obtaining the optimization scheme generation model comprises:
constructing a hierarchical optimization scheme generation model, in which the bottom layer treats each single intersection as one agent and models each optimization sub-region as a multi-agent system, selecting the input, output and target that correspond to the optimization type of the sub-region; the top layer uses a consultation-based adjustment method, adjusting the optimization region by consultation on the premise that the target of each optimization sub-region within the region is achieved; and the model is trained with a reinforcement learning algorithm;
the reinforcement learning algorithm includes but is not limited to the Nash Q-Learning algorithm, QMIX algorithm, FFQ algorithm, WoLF-PHC algorithm and MFMARL algorithm;
the input includes traffic state data and traffic control data; the traffic state data includes but is not limited to actual or predicted speed, actual or predicted flow, and actual or predicted saturation; the traffic control data includes but is not limited to green split, signal cycle length and coordination offset;
the output includes but is not limited to green split, signal cycle length and coordination offset;
the target includes but is not limited to an increase in average speed, a reduction in average saturation, and a reduction in the average vehicle queue length per cycle.
Further, the method also comprises:
according to the optimization interval corresponding to the current optimization period, using the current real-time traffic data and the output signal optimization scheme data as data-set data for training the model, so that signal optimization schemes are output on a rolling basis; different optimization periods correspond to different optimization intervals.
Further, the traffic data may include predicted traffic state data, and the method for obtaining the predicted traffic state data comprises:
setting data classification thresholds;
labelling the traffic state data used to train the traffic state prediction model with class labels according to the classification thresholds;
training a deep learning algorithm on the labelled data to obtain the traffic state prediction model, whose input is the actual traffic state data of the previous period and whose output is the traffic state data after the signal optimization scheme is executed, the target being that the gap between the actual traffic state data after execution of the signal optimization scheme and the corresponding traffic state data output by the model meets the requirement;
collecting traffic state data at multiple time granularities, inputting it into the traffic state prediction model, and outputting the predicted traffic state data.
A rolling-optimization-based, city-level, network-wide traffic signal recommendation system, comprising: a rolling optimization module, an optimization scheme generation model module, an optimization period module, and an optimization sub-region division module, wherein:
the rolling optimization module obtains real-time traffic data, selects the optimization scheme generation model corresponding to the current optimization period and optimization sub-region, and outputs the signal optimization scheme recommended by that model; according to the optimization interval corresponding to the current optimization period, it uses the current real-time traffic data and the output signal optimization scheme data as data-set data for training the model, and outputs signal optimization schemes on a rolling basis;
the optimization scheme generation model module obtains the traffic data set, and constructs and trains the hierarchical optimization scheme generation model, in which the bottom layer treats each single intersection as one agent and models each optimization sub-region as a multi-agent system, selecting the input, output and target that correspond to the optimization type of the sub-region; the top layer uses a consultation-based adjustment method, adjusting the optimization region by consultation on the premise that the target of each optimization sub-region within the region is achieved; the model is trained with a reinforcement learning algorithm;
the optimization period module obtains traffic data and, according to the periodic characteristics of the traffic state data of the optimization region, divides multiple different optimization periods and their corresponding optimization intervals;
the optimization sub-region division module obtains traffic data, determines the key intersections according to the key characteristics of each intersection in the optimization period, and divides multiple different optimization sub-regions and their corresponding optimization types according to the connection relationships between key intersections.
Further, the system includes a data processing module for preprocessing all types of data, the preprocessing including but not limited to:
completion: when the data is periodic and the duration of the missing data satisfies the short-gap threshold, the gap is filled with data from other complete periods;
repair: when the data has hierarchical characteristics and is correlated with other data of the same type, and the duration of the missing data satisfies the long-gap threshold, repair rules for the missing data are set using the hierarchical characteristics of the data and its correlation with other data of the same type, and the data is repaired according to these rules;
matching: when data items are related in time, space or logic, they are matched according to that relationship and a correspondence is established;
fusion: when multiple data processing steps are involved, the data within the same processing step is integrated;
processing of non-standard data, including but not limited to: cleaning of abnormal data, unification of data formats, and data normalization;
logic analysis: when a data processing step involves business logic relationships, the reasonableness of the data is analysed according to those relationships.
Further, the system also includes a traffic state prediction module, which obtains traffic state data, inputs it into the traffic state prediction model, and outputs the traffic state data of the target intersection at the target time; the traffic state prediction model identifies the class labels of the traffic state data and classifies it.
Further, the system also includes an interactive visualization module comprising a simulation deduction unit, a data display unit and a manual intervention unit, wherein:
the simulation deduction unit obtains the data of the traffic state prediction module and displays the real-time traffic state data, the traffic state data within the optimization period, and the predicted traffic state data;
the data display unit displays all types of data in chart form;
the manual intervention unit provides an interactive mode for setting parameters and issuing commands to execute the signal optimization scheme.
Further, the system also includes a database module for storing and providing all types of data. When training the model, the historical data required for training is obtained by database query; in real-time computation, the data point for a single moment is obtained through an interface.
Compared with the prior art, the notable advantages of the present invention include: (1) it provides hierarchical recommendation of optimized control schemes, which is better suited to city-level traffic signal control; (2) it applies rolling optimization to the optimization scheme generation models of different optimization periods and different optimization sub-regions, which is closer to actual traffic signal recommendation needs; (3) it uses predicted traffic state data as one of the inputs, which improves the performance of the optimization scheme generation model.
Detailed description of the embodiments
The present invention is further described below with reference to specific embodiments, but is not limited to these embodiments. Those skilled in the art will recognize that the present invention covers all alternatives, improvements and equivalents that may fall within the scope of the claims.
Referring to Figs. 1, 3, 4 and 5, one embodiment provides a rolling-optimization-based, city-level, network-wide traffic signal recommendation method, comprising: obtaining real-time traffic data; selecting the optimization scheme generation model corresponding to the current optimization period and optimization sub-region, the model being obtained by training a deep reinforcement learning algorithm on a traffic data set; and outputting the signal optimization scheme recommended by the model.
The current real-time traffic data and the output signal optimization scheme can serve as data-set data for subsequent model training. The model carries a time attribute, namely an optimization-interval label; the labels are the sequence obtained by dividing the optimization period into equal parts, and rolling optimization is carried out at each optimization interval.
In one embodiment, the optimization periods may be the off-peak and peak periods of the day. The morning peak period is 7:00-9:00; within this period the optimization interval is 30 minutes, i.e. an optimization is performed every 30 minutes.
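The rolling schedule in this embodiment can be sketched as follows. This is a minimal illustration only: the function name `rolling_ticks` and the calendar date are assumptions introduced here, and the sketch simply enumerates the moments at which a new optimization would be triggered within one optimization period.

```python
from datetime import datetime, timedelta

def rolling_ticks(start, end, interval_minutes):
    """Return the times in [start, end) at which an optimization run is triggered."""
    ticks = []
    step = timedelta(minutes=interval_minutes)
    t = start
    while t < end:
        ticks.append(t)
        t += step
    return ticks

# Morning peak from the embodiment: 7:00-9:00, one optimization every 30 minutes.
day = datetime(2024, 1, 1)  # arbitrary illustrative date (assumption)
morning_peak = rolling_ticks(day.replace(hour=7), day.replace(hour=9), 30)
labels = [t.strftime("%H:%M") for t in morning_peak]
print(labels)  # ['07:00', '07:30', '08:00', '08:30']
```

Each tick corresponds to one optimization-interval label; a different optimization period would use its own interval length.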
1) Determination of the optimization period
Traffic, as a particular kind of human activity, exhibits clear travel patterns, especially in time. One example is the day-of-week pattern: working days differ from holidays; traffic on a Monday follows the same trend as on the previous Monday; traffic on a Tuesday follows the same trend as on the previous Tuesday; and flows on Monday and Friday differ markedly. Another example is the daily flow pattern, such as the morning and evening peaks on working days, the transition from working day to non-working day on Fridays, and holiday flows toward scenic spots or commercial complexes.
According to the periodic characteristics of the traffic state data of the optimization region, the traffic state data of multiple temporally continuous total optimization periods is extracted for the region, and each total optimization period is divided into multiple optimization periods.
The temporally continuous traffic state data is $\langle Q_1,t_1\rangle, \langle Q_2,t_2\rangle, \ldots, \langle Q_n,t_n\rangle$.
Periodic features X are extracted, including but not limited to:
(1) the traffic state data itself, e.g. $\langle Q_i,t_i\rangle$;
(2) the trend of the traffic state data, e.g. $(Q_{i+1}-Q_i)/(t_{i+1}-t_i)$, $(Q_{i+2}-Q_i)/(t_{i+2}-t_i)$;
(3) statistics of the traffic state data within a characteristic time window, e.g. the maximum $\max\{Q_i,Q_{i+1},\ldots,Q_{i+tt}\}$, the minimum $\min\{Q_i,Q_{i+1},\ldots,Q_{i+tt}\}$, the mean $(Q_i+Q_{i+1}+\cdots+Q_{i+tt})/tt$, the maximum difference $\max\{Q_i,Q_{i+1},\ldots,Q_{i+tt}\}-\min\{Q_i,Q_{i+1},\ldots,Q_{i+tt}\}$, and the standard deviation $\mathrm{stdev}\{Q_i,Q_{i+1},\ldots,Q_{i+tt}\}$, where tt is the characteristic time window, which can be set empirically or selected by comparing how strongly periodic the resulting features are.
These form the periodic feature sequence $\langle X_1,t_1\rangle, \langle X_2,t_2\rangle, \ldots, \langle X_n,t_n\rangle$.
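The feature extraction above can be sketched in a few lines. This is an illustration under stated assumptions, not the patented implementation: the function name `periodic_features`, the sample values, the choice of a window of `tt` consecutive samples, and the use of the population standard deviation are all assumptions made here.

```python
import statistics

def periodic_features(q, t, i, tt):
    """Feature vector X_i for sample i, using a window of tt consecutive samples.

    q: traffic state values Q_1..Q_n, t: their timestamps (same length).
    """
    window = q[i:i + tt]
    return {
        "value": q[i],                                    # feature (1)
        "trend": (q[i + 1] - q[i]) / (t[i + 1] - t[i]),   # feature (2)
        "max": max(window),                               # feature (3) statistics
        "min": min(window),
        "mean": statistics.mean(window),
        "range": max(window) - min(window),
        "stdev": statistics.pstdev(window),               # population form (assumption)
    }

# Hypothetical flow samples taken every 30 minutes.
q = [120, 150, 210, 260, 240, 180]
t = [0, 30, 60, 90, 120, 150]
x0 = periodic_features(q, t, 0, 4)
```

Evaluating the function at every index i yields the sequence of feature vectors X_1, X_2, … that the later similarity comparison operates on.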
The data is divided into multiple total optimization periods of equal length $YL_{sum}$, with $YL_{sum} \cdot NN \le t_n - t_1$, such that the similarity measure between the periodic feature sequences of the total optimization periods satisfies threshold condition I, e.g. $sim(\langle X_1,X_2,\ldots,X_{YL_{sum}}\rangle, \langle X_{YL_{sum}+1},X_{YL_{sum}+2},\ldots,X_{YL_{sum}+YL_{sum}}\rangle) < sim_0$. Methods for computing the similarity measure include but are not limited to cosine similarity, Euclidean distance and Pearson similarity.
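As a rough sketch of threshold condition I, the snippet below cuts a feature sequence into equal total periods and checks adjacent periods against a threshold. Euclidean distance is used as the measure so that "smaller than sim0" means "more similar"; the function names, the scalar features and the threshold values are assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long feature sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity, one of the alternative measures named in the text."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def split_total_periods(features, yl_sum):
    """Cut the sequence into NN consecutive total periods of yl_sum samples each.

    A trailing partial period is dropped, so yl_sum * NN <= len(features).
    """
    nn = len(features) // yl_sum
    return [features[k * yl_sum:(k + 1) * yl_sum] for k in range(nn)]

def satisfies_condition_one(periods, sim0):
    """True if every pair of adjacent total periods is closer than sim0."""
    return all(euclidean(p, q) < sim0 for p, q in zip(periods, periods[1:]))

features = [10, 50, 20, 11, 52, 19, 9, 49, 21, 30]  # hypothetical scalar features
periods = split_total_periods(features, 3)
ok = satisfies_condition_one(periods, sim0=5.0)
```

If cosine or Pearson similarity were used instead, larger values would indicate closer agreement, so the direction of the threshold comparison would need to be chosen accordingly.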
Each total optimization period is divided into multiple optimization periods whose lengths need not be equal, $YL_{sum} = YL_1 + YL_2 + \cdots + YL_S$, such that the periodic features within the total optimization period satisfy the optimization period division condition, e.g. the periodic features within $YL_1$ all satisfy threshold condition II. The periodic features may be re-extracted at this point and need not be the same as those used when dividing the total optimization periods.
The temporally continuous traffic state data thus forms S optimization-period data sets. Each data set contains the traffic state data of NN instances of one optimization period. Because the data of these NN instances is highly similar and strongly periodic, an optimization scheme generation model trained on the data set is better adapted to the traffic characteristics of that optimization period.
A simple example: one year of flow data is divided into 52 weeks. Judged against the similarity threshold, the flow trend is roughly the same in every week, and the flow changes on the same day of the week are also similar. According to the optimization period division condition, within one week the Monday morning peak (06:00-09:00) serves as the 1st optimization period and 09:00-17:00 as the 2nd optimization period; the flow data of the Monday morning peak (06:00-09:00) across the 52 weeks forms the data set of the 1st optimization period for Mondays.
In one embodiment, the optimization periods are divided according to flow variation and magnitude, by comparing the flow pattern of the historical (15 weeks) flow of the 30-50 signal-controlled intersections in the optimization region, aggregated every 30 minutes, with the real-time flow trend and the trend of the predicted flow:
(1) if the morning and evening peaks of the optimization region are pronounced, the optimization periods of the region are the morning peak (06:00-09:00), the evening peak (17:00-19:00), and the off-peak periods (19:00-06:00 and 09:00-17:00), i.e. the optimization periods are divided into peak and off-peak, and different optimization strategies are executed for each;
(2) if the morning and evening peaks of the optimization region are not pronounced and the flow is high all day, the optimization period is the whole day, i.e. a single optimization strategy is executed for the whole day;
(3) if the morning and evening peaks are not pronounced and the flow is low all day, the optimization period is the whole day, but executing a single optimization strategy may have little effect in this case, and the adaptive control scheme of an existing SCATS signal control system can be used instead.
2) Determination of the optimization sub-regions
The traffic state data of each intersection in the optimization region during the optimization period is extracted, the key characteristics of each intersection are extracted, and the key intersections are determined according to the key-intersection determination condition. The key intersections may differ between optimization periods.
The key characteristics extracted for each intersection include but are not limited to:
(1) basic key characteristics of the intersection, e.g. its position, the number of connected road sections, and the road grade of the connected sections;
(2) key characteristics of the intersection within the optimization period, e.g. its traffic state data, its traffic state level, the duration of its traffic state level, and the rank of its traffic state level among all intersections;
(3) key characteristics of the intersection over the total optimization period, e.g. the sequence of its key characteristics across multiple optimization periods, the duration of its oversaturated state, and the difference between its traffic state data in the corresponding peak and off-peak optimization periods.
If the key characteristics of an intersection in the optimization region during the optimization period satisfy threshold condition III, the intersection is determined to be a key intersection.
In one embodiment, key intersections satisfying threshold condition III are determined as follows: (1) if the daytime (05:00-10:00) flow varies markedly, i.e. shows clear peaks and valleys, the night flow exceeds the night threshold of 800, the daily flow exceeds the threshold of 8000, and the flow is far greater than that of the surrounding intersections, the intersection is determined to be a key intersection; (2) if the intersection is oversaturated for more than 4 hours, it is determined to be a key intersection; (3) when the flows of two or more adjacent intersections do not differ greatly, the traffic situation may shift in space over time, so the key intersection may move, i.e. different key intersections appear in different optimization periods.
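Rules (1) and (2) of this embodiment can be sketched as a simple predicate. The thresholds 800, 8000 and 4 hours come from the text; everything else is an assumption made for illustration, in particular the function name, the boolean `has_clear_peak` input, and the `dominance_ratio` used to stand in for "far greater than the surrounding intersections", which the text does not quantify.

```python
NIGHT_FLOW_THRESHOLD = 800       # from the embodiment
DAILY_FLOW_THRESHOLD = 8000      # from the embodiment
OVERSATURATION_HOURS = 4         # from the embodiment

def is_key_intersection(daily_flow, night_flow, has_clear_peak,
                        neighbour_daily_flows, oversaturated_hours,
                        dominance_ratio=2.0):
    """Return True if rule (1) or rule (2) of threshold condition III holds.

    dominance_ratio is a hypothetical stand-in for "far greater than the
    surrounding intersections".
    """
    rule_flow = (
        has_clear_peak
        and night_flow > NIGHT_FLOW_THRESHOLD
        and daily_flow > DAILY_FLOW_THRESHOLD
        and all(daily_flow > dominance_ratio * f for f in neighbour_daily_flows)
    )
    rule_saturation = oversaturated_hours > OVERSATURATION_HOURS
    return rule_flow or rule_saturation

busy = is_key_intersection(9000, 900, True, [3000, 4000], 0)
ordinary = is_key_intersection(9000, 900, True, [5000], 0)
saturated = is_key_intersection(1000, 100, False, [], 5)
```

Because the inputs are computed per optimization period, re-running the predicate for each period naturally allows the key intersection to move, as described in rule (3).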
In one embodiment, an optimization region generally contains 30 to 50 signal-controlled intersections, from which 3 to 5 key intersections are to be determined. Threshold condition III is applied as follows: the key characteristics $Y = \langle lo_x, lo_y, Q \rangle$ are extracted, where $lo_x$ is the position of the intersection along the x-axis, $lo_y$ is its position along the y-axis, and Q is the mean of its traffic state data within the optimization period. A density-based clustering method then groups the 30-50 signal-controlled intersections into 3-5 clusters, and the intersection closest to each cluster centre is taken as a key intersection.
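The final selection step can be illustrated as below. The sketch assumes the cluster labels have already been produced by some density-based clustering method (not implemented here), takes the cluster centre to be the mean of the member features, and assumes the three feature components have been scaled to comparable ranges; the function name and sample data are hypothetical.

```python
import math
from collections import defaultdict

def key_intersections(features, labels):
    """Pick, for each cluster, the intersection closest to the cluster centre.

    features: {intersection_id: (lo_x, lo_y, q)}, assumed pre-scaled.
    labels:   {intersection_id: cluster_id}, assumed given by a clustering step.
    """
    clusters = defaultdict(list)
    for node, label in labels.items():
        clusters[label].append(node)

    result = {}
    for label, nodes in clusters.items():
        # Centre taken as the component-wise mean of the cluster members.
        centre = [sum(features[n][d] for n in nodes) / len(nodes) for d in range(3)]
        result[label] = min(nodes, key=lambda n: math.dist(features[n], centre))
    return result

features = {
    "A1": (0.0, 0.0, 0.20), "A2": (0.1, 0.0, 0.30), "A3": (0.2, 0.0, 0.25),
    "A7": (1.0, 1.0, 0.90), "A8": (1.0, 0.9, 0.80), "A9": (1.0, 0.6, 0.70),
}
labels = {"A1": 0, "A2": 0, "A3": 0, "A7": 1, "A8": 1, "A9": 1}
keys = key_intersections(features, labels)
```

With real data the number of clusters (3 to 5 in the embodiment) would follow from the clustering parameters rather than being fixed in advance.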
According to the connection relationships between key intersections, the optimization sub-regions and the optimization type of each sub-region are determined. Connection relationships include but are not limited to: physical connection, arterial coordination and regional coordination. Optimization types include single-point, arterial and sub-area.
For each key intersection, it is judged whether a physical connection exists, whether arterial coordination exists, and whether regional coordination exists. If none of the intersections adjacent to a key intersection is itself a key intersection, and the key intersection is not coordinated, i.e. it has no green-wave control scheme with its adjacent intersections, then the key intersection and its adjacent intersections form one optimization sub-region, called a single-point optimization sub-region. If key intersections are directly connected or their adjacent intersections overlap, and arterial-coordinated intersections exist, then the interconnected key intersections and their respective adjacent intersections form one optimization sub-region, called an arterial optimization sub-region. If key intersections are directly connected or their adjacent intersections overlap, and regionally coordinated intersections exist, then the interconnected key intersections and their respective adjacent intersections form one optimization sub-region, called a sub-area optimization sub-region.
In one embodiment, if the adjacent intersections A4, A2, A8 and A6 of key intersection A5 are not key intersections, and the control scheme of A5 is single-point control with no green-wave control with its adjacent intersections, then A2, A4, A5, A6 and A8 form a single-point optimization sub-region. If A5 and A11 are key intersections with the common adjacent intersection A8, and arterial coordination exists among A2, A5, A8, A11 and A14, then A2, A5, A8, A11, A14, A4, A6, A10 and A12 form an arterial optimization sub-region. If A5 and A10 are key intersections and regional coordination exists among their adjacent intersections A4, A7, A8 and A11, then A2, A4, A5, A6, A8, A7, A11 and A13 form a sub-area optimization sub-region.
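The membership rule (key intersections plus their respective adjacent intersections) and the type decision can be sketched as follows. The 5 x 3 grid numbered row by row is an assumption inferred from the adjacency stated in the example (A5 adjacent to A2, A4, A6, A8); the function names, the precedence of regional over arterial coordination, and the single-point default are likewise assumptions.

```python
def grid_neighbours(index, rows=5, cols=3):
    """Neighbours of intersection A<index> in a rows x cols grid numbered from 1.

    The grid layout is a hypothetical reading of the example, not a given.
    """
    r, c = divmod(index - 1, cols)
    out = []
    if r > 0:
        out.append(index - cols)   # intersection above
    if c > 0:
        out.append(index - 1)      # intersection to the left
    if c < cols - 1:
        out.append(index + 1)      # intersection to the right
    if r < rows - 1:
        out.append(index + cols)   # intersection below
    return out

def subregion_members(key_indices):
    """Key intersections together with their respective adjacent intersections."""
    members = set(key_indices)
    for k in key_indices:
        members.update(grid_neighbours(k))
    return sorted(members)

def subregion_type(linked_to_other_key, arterial_coordination, regional_coordination):
    """Classify a sub-region; precedence and default are assumptions."""
    if linked_to_other_key and regional_coordination:
        return "sub-area"
    if linked_to_other_key and arterial_coordination:
        return "arterial"
    return "single-point"

single = subregion_members([5])
arterial = subregion_members([5, 11])
```

Under this assumed layout, the single-point and arterial cases reproduce the member lists given in the embodiment.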
3) Determination of the optimization scheme generation model
A hierarchical optimization scheme generation model is constructed. The bottom layer treats each single intersection as one agent and models each optimization sub-region as a multi-agent system, selecting the input, output and target that correspond to the optimization type of the sub-region. The top layer uses a consultation-based adjustment method, adjusting the optimization region by consultation on the premise that the target of each optimization sub-region within the region is achieved. The model is trained with a reinforcement learning algorithm.
An optimization sub-region contains M intersections. Using multi-agent reinforcement learning, the M intersections are modelled as M agents, with the following model:
$T = \{r,\ policy,\ v_\pi(s),\ model,\ [obs\_sp_1,\ldots,obs\_sp_M],\ [act\_sp_1,\ldots,act\_sp_M]\}$
The model parameters are the reward value $r$, the greedy policy $policy$, the value function $v_\pi(s)$, the prediction model $model$, and the observation spaces and action spaces of the M intersections.
The input, output and target of the training model are determined according to the optimization type of the optimization sub-region.
Table 1 Model training
An optimization region contains K optimization sub-regions. On the premise that the targets of the K optimization sub-regions are achieved, the optimization region is adjusted by consultation; the objective function t of the optimization region is given by the following formula:
where $\Delta V_k$ is the change in speed of the k-th optimization sub-region, $\Delta D_k$ is the change in saturation of the k-th optimization sub-region, and $\Delta Q_k$ is the change in vehicle queue length of the k-th optimization sub-region. Reinforcement learning methods such as the Nash Q-Learning algorithm, QMIX algorithm, FFQ algorithm, WoLF-PHC algorithm and MFMARL algorithm can be used.
The Nash Q-Learning algorithm extends the two-agent zero-sum game of the Minimax-Q algorithm to multi-agent general-sum games. Whereas Minimax-Q solves for the Nash equilibrium of the stage game by minimax linear programming, Nash Q-Learning solves for the Nash equilibrium by quadratic programming. The convergence condition of the algorithm is that a global optimum or a saddle point can be found in the stage game of every state s; Nash Q-Learning converges only if this condition is met. The algorithm proceeds as follows:
1. Initialize.
2. The i-th agent selects an action $a_i$ for the current state s using an exploration-exploitation strategy, and executes it.
3. Obtain the next state $s'$; agent i observes the rewards $r_1, r_2, \ldots, r_n$ of all agents and the actions $a_1, a_2, \ldots, a_n$ taken by all agents in state s.
4. Update $Q_i(s,a_1,\ldots,a_n)$ as follows:
$Q_i(s,a_1,\ldots,a_n) \leftarrow Q_i(s,a_1,\ldots,a_n) + \alpha\,[\,r_i + \gamma\,NashQ_i(s') - Q_i(s,a_1,\ldots,a_n)\,]$
5. Solve for the Nash equilibrium strategy of state s by quadratic programming, and update $NashQ_i(s)$ and $\pi_i(s,\cdot)$.
At the Nash equilibrium, the strategy of each agent is the optimal strategy given the strategies of the other agents.
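Step 4 of the procedure above is a temporal-difference update and can be sketched on its own. The sketch assumes the Nash value of the next state has already been obtained from the equilibrium solver in step 5, which is not implemented here; the function name, state and action labels, and the values of alpha and gamma are illustrative assumptions.

```python
from collections import defaultdict

def nash_q_update(q_i, state, joint_action, reward_i, nash_value_next,
                  alpha=0.1, gamma=0.9):
    """Apply Q_i(s, a_1..a_n) <- Q_i + alpha * (r_i + gamma * NashQ_i(s') - Q_i).

    q_i maps (state, joint_action) to agent i's Q value.
    nash_value_next is NashQ_i(s'), assumed to be supplied by a separate solver.
    """
    key = (state, joint_action)
    q_i[key] += alpha * (reward_i + gamma * nash_value_next - q_i[key])
    return q_i[key]

# Hypothetical two-intersection example: each agent's action is part of the key.
q_table = defaultdict(float)
new_q = nash_q_update(q_table, "s0", ("extend_green", "hold"),
                      reward_i=1.0, nash_value_next=2.0)
```

Note that the table is indexed by the joint action of all agents, which is what distinguishes this update from single-agent Q-learning.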
QMIX is a multi-agent reinforcement learning algorithm that uses a mixing network to combine the local value functions of the individual agents, and adds global state information during training to improve the performance of the algorithm. To retain the advantage of VDN, QMIX uses centralized learning to obtain decentralized policies. Taking the argmax of the joint action-value function is equivalent to taking the argmax of each agent's individual value function, the two being consistent in monotonicity, as follows, with $u = (a_1,\ldots,a_n)$:
$\arg\max_{u} Q_{tot}(\tau,u) = \big(\arg\max_{a_1} Q_1(\tau_1,a_1),\ \ldots,\ \arg\max_{a_n} Q_n(\tau_n,a_n)\big)$
The decentralized policy is then simply to choose the optimal action greedily from the local $Q_i$. QMIX converts the requirement on $\arg\max_u Q_{tot}(\tau,u)$ into a monotonicity constraint, as follows:
$\dfrac{\partial Q_{tot}}{\partial Q_i} \ge 0, \quad \forall i \in \{1,\ldots,n\}$
To realize this constraint, QMIX uses a mixing network, whose structure is shown in Fig. 4. Part (c) of Fig. 4 shows that each agent uses a DRQN to fit its own Q-value function $Q_i(\tau_i,a_i;\theta_i)$; the DRQN takes the current observation $o_{i,t}$ and the action of the previous step $a_{i,t-1}$ as recurrent input to obtain the Q value. Part (b) of Fig. 4 shows the structure of the mixing network, whose inputs are the outputs of the individual DRQN networks. To satisfy the monotonicity constraint, all weights of the mixing network are non-negative, while the biases are unrestricted; this guarantees that the constraint holds.
To make better use of the system state $s_t$, a hypernetwork takes $s_t$ as input and outputs the weights and biases of the mixing network. To guarantee that the weights are non-negative, a linear network followed by an absolute-value activation function is used so that the output is never negative. The biases are produced in the same way but without the non-negativity constraint; the bias of the last layer of the mixing network is produced by a two-layer network with a ReLU activation function, giving a non-linear mapping. The state $s_t$ is mixed into $Q_{tot}$ through the hypernetwork rather than simply fed in as an input to the mixing network. The benefit is that, as a direct input, the coefficients of $s_t$ would have to be non-negative, so the state information could not be fully used to improve system performance, which would be equivalent to discarding half of the information.
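The effect of the non-negative weights can be seen in a deliberately reduced sketch. It is not the QMIX architecture itself: it uses a single linear mixing layer instead of the multi-layer mixing network, plain Python lists instead of a neural network library, and hand-picked hypernetwork parameters; all function names and numbers are assumptions made for illustration.

```python
def hyper_weights(state, weight_matrix):
    """Linear map of the state followed by absolute value: non-negative weights."""
    return [abs(sum(w * s for w, s in zip(row, state))) for row in weight_matrix]

def hyper_bias(state, bias_vector):
    """Linear map of the state with no sign constraint: the mixing bias."""
    return sum(b * s for b, s in zip(bias_vector, state))

def mix(agent_qs, state, weight_matrix, bias_vector):
    """Q_tot = sum_i w_i(s) * Q_i + b(s), with every w_i(s) >= 0."""
    weights = hyper_weights(state, weight_matrix)
    return sum(w * q for w, q in zip(weights, agent_qs)) + hyper_bias(state, bias_vector)

state = [1.0, -2.0]                       # hypothetical global state s_t
weight_matrix = [[0.5, 1.0], [-1.0, 0.25]]  # one row per agent (assumed values)
bias_vector = [0.2, 0.3]

q_tot = mix([1.0, 2.0], state, weight_matrix, bias_vector)
q_tot_raised = mix([1.5, 2.0], state, weight_matrix, bias_vector)
```

Raising any single agent's Q value can never lower Q_tot here, which is the monotonicity property that lets each agent act greedily on its own Q value. The state still influences Q_tot with either sign, because it enters through the hypernetwork rather than through a non-negative weight.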
QMIX final cost function are as follows:
Renewal process applies the thought of traditional DQN, and wherein b indicates the sample size sampled from experience memory.
WhereinIndicate target network
Meet monotonicity constraint above, to QtotThe calculation amount for carrying out argmax operation is just being not with intelligent body quantity
It is exponentially increased, but with intelligent body quantity linear increase, greatly improve efficiency of algorithm.
The parameters of the training process can also be adjusted through a parameter optimizer; parameters such as sample size, dimension and number of training iterations all affect the training result.
The output after model training is the signal optimization scheme.
4) Determination of the predicted traffic state
Historical traffic state data is used to make multi-step (multi-period) predictions of the future traffic state, i.e. to predict accurately over a certain future horizon.
The method used is: set data classification thresholds; label the traffic state data used to train the traffic state prediction model with class labels according to the classification thresholds; train a deep learning algorithm on the labelled data to obtain the traffic state prediction model, whose input is the actual traffic state data of the previous period and whose output is the traffic state data after the signal optimization scheme is executed, the target being that the gap between the actual traffic state data after execution of the signal optimization scheme and the corresponding traffic state data output by the model meets the requirement; collect traffic state data at multiple time granularities, input it into the traffic state prediction model, and output the predicted traffic state data.
Speed is predicted with a classification-based artificial intelligence method. Grading or classification is more accurate than regression here because the quantity to be predicted takes fewer values and varies less, so the learning process is relatively simple.
In one embodiment, before training, speeds are divided into five levels, 0, 1, 2, 3 and 4, based on the length and vehicle capacity of the road section, speed limit data, average speed data, and the road congestion level data of Baidu, Gaode and Didi Chuxing. The speed ranges of the levels are 0-10, 10-20, 20-30, 30-40, and 40 and above, in km/h, representing severe congestion, congestion, slow traffic, normal and free-flowing respectively, so that the imprecise speed value is replaced by a congestion-level label.
Before training, each speed value is classified against the classification thresholds, i.e. given a class label, and the labelled data is used for training. The speed prediction problem thus becomes a classification problem, and the prediction result is one of the classes 0-4.
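The labelling step can be sketched with a sorted threshold list. The thresholds and level names come from the embodiment above; the function name and the treatment of exact boundary values (a speed equal to a threshold falls into the higher level, consistent with "40 and above") are assumptions.

```python
from bisect import bisect_right

# Classification thresholds in km/h, taken from the embodiment.
THRESHOLDS_KMH = [10, 20, 30, 40]
LEVEL_NAMES = ["severe congestion", "congestion", "slow traffic",
               "normal", "free-flowing"]

def congestion_level(speed_kmh):
    """Map a speed to a class label 0-4; boundary values go to the higher level."""
    return bisect_right(THRESHOLDS_KMH, speed_kmh)

speeds = [5, 10, 27.5, 39.9, 40, 75]
labels = [congestion_level(v) for v in speeds]
print(labels)  # [0, 1, 2, 3, 4, 4]
```

The resulting integer labels are what the classifier is trained on in place of the raw speed values.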
The input data for prediction, i.e. traffic state data at multiple time granularities, includes but is not limited to:
(1) small-network data, i.e. the traffic state data of the target intersection for the time span T before the target time;
(2) large-network data I, i.e. the traffic state data of the intersections adjacent to the target intersection for the time span T before the target time;
(3) large-network data II, i.e. the traffic state data of the target intersection and its adjacent intersections in other periods that are periodically related to the period containing the target time, covering the moment corresponding to the target time, the time span T before it and the time span T after it.
A simple example: the target time is 8:00 today, the target intersection is A and its adjacent intersection is B. The small-network data can be the data of intersection A for 6:00-8:00 today; large-network data I can be the data of intersection B for 6:00-8:00 today; and, today being a Monday, large-network data II can be the data of intersections A and B for 6:00-10:00 on Monday of last week.
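The three input windows in the example can be derived mechanically from the target time. In this sketch the span T of 2 hours and the 7-day period are read off the example above, while the function name, the dictionary keys and the calendar date are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def input_windows(target_time, t_hours=2, period_days=7):
    """Return (start, end) time windows for the three kinds of input data."""
    t = timedelta(hours=t_hours)
    matching = target_time - timedelta(days=period_days)  # same moment, previous period
    return {
        "small_network": (target_time - t, target_time),    # target intersection
        "large_network_1": (target_time - t, target_time),  # adjacent intersections
        "large_network_2": (matching - t, matching + t),    # both, previous period
    }

target = datetime(2024, 1, 8, 8, 0)  # a Monday, 8:00 (arbitrary illustrative date)
windows = input_windows(target)
```

The small-network and large-network I windows coincide in time and differ only in which intersections the data is drawn from.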
Prediction algorithms include but are not limited to deep graph convolutional networks, LSTM and RNN. Taking deep graph convolution as an example, the prediction model is trained in a supervised manner. During training the parameters are tuned automatically until the value of the loss function is below 0.1. Incremental training is used, with retraining carried out subsequently at scheduled times.
The prediction models generated for different intersections may differ. When predicted traffic state data is used to train the optimization scheme generation model, the prediction model serves as one of the functions of the reinforcement learning, used to judge the quality of the actions taken during training of the optimization scheme generation model.
Referring to Fig. 2, the city-level traffic network structure includes multiple crossings and the road sections between them. Some crossings are equipped with traffic devices such as monitoring devices, traffic lights, and detectors. The traffic signal recommendation system runs on one or more computers and provides recommended signal control schemes that are executed by the traffic lights, thereby achieving optimized traffic signal control. The computer may include the hardware of a typical computing device, including a processor, input/output (I/O) devices, a display, and a memory, and can execute the computer instructions of the traffic signal recommendation system to realize the functions disclosed in the embodiments.
Existing traffic signal recommendation systems calculate the traffic signal control scheme for a future period from the historical traffic data of the whole road network in order to achieve a certain signal optimization control objective. However, the optimized crossings or regions are all static, so the optimization effect is poor when the traffic state changes greatly or complex situations arise. Unlike existing systems, this embodiment dynamically divides the road network into multiple optimization subregions in each rolling optimization period, dynamically calculates the traffic signal control schemes of the optimization subregions for the future period, and achieves graded signal optimization control objectives for the optimization subregions and the whole network, so that it can cope with complex and changeable traffic conditions.
This embodiment provides a city-level whole-area traffic signal recommendation system based on rolling optimization, comprising: a database module, a rolling optimization module, an optimization scheme generation model module, a receding horizon module, and an optimization sub-area division module.
The database module stores and provides various types of data.
A) By usage, data are divided into historical data and real-time data. Historical data are used to discover traffic operation patterns, for example: training algorithms on historical data, or statistically analyzing historical data to form traffic operation models; finding traffic operation modes with certain features in historical data; or using historical data as a reference to demonstrate or verify certain traffic operation patterns. Real-time data reflect the current traffic operation, for example: calculating the current traffic state from real-time data. The two kinds of data mainly include: historical and real-time data of actual traffic operation; historical and real-time data from simulation deduction; and historical and real-time data after data processing.
B) By data source, data are divided into: data from traffic devices such as monitoring devices, traffic lights, and detectors, and data from internet geographic and map data providers.
C) By data type, data include but are not limited to: control scheme data, manual operation records, log data, flow, saturation, speed, trajectory data, and geographic information data.
In one embodiment, data can be stored and provided either by direct database queries or through an interface. Because the data volume is huge, the data are of many kinds, and each data table is used and stored at a different frequency, different storage and provision modes are set according to the data volume, the data type, and the usage and storage frequency, so that data can be obtained and used quickly. For example, model training requires a large amount of historical data of many types, so the data needed by the model are obtained by database query; real-time deduction needs only the real-time data of the current moment and no historical data, so the interface mode can be used to obtain the data point of a single moment, which improves the timeliness of the deduction.
The rolling optimization module obtains real-time traffic data, selects the optimization scheme generation model corresponding to the current optimization period and optimization subregion, and outputs the signal optimization scheme recommended by that model.
At the optimization interval corresponding to the current optimization period, the current real-time traffic data and the output signal optimization scheme are added to the data set used to train the model, and signal optimization schemes are output in a rolling manner.
The optimization scheme generation model module obtains the traffic data set, and constructs and trains the hierarchical optimization scheme generation model.
Construction of the hierarchical optimization scheme generation model: at the bottom layer, each single crossing is treated as one agent and each individual optimization subregion is modeled as a multi-agent system, with the inputs, outputs, and objectives selected according to the optimization type of the subregion; at the top layer, a consultative adjustment method is used, in which the optimization region as a whole is adjusted through consultation on the premise that each optimization subregion in the optimization region achieves its own objective. The model is trained with a reinforcement learning algorithm.
Reinforcement learning algorithms include but are not limited to the Nash Q-Learning, QMIX, FFQ, WoLF-PHC, and MFMARL algorithms.
Optimization types include the single-point type, the arterial type, and the sub-area type; different optimization types correspond to different inputs, outputs, and objectives.
Inputs include traffic state data and traffic control data. Traffic state data include but are not limited to actual or predicted speed, actual or predicted flow, and actual or predicted saturation; traffic control data include but are not limited to green split, signal cycle length, and coordination phase offset;
Outputs include but are not limited to green split, signal cycle length, and coordination phase offset;
Objectives include but are not limited to an increase in average speed, a reduction in average saturation, and a reduction in the average vehicle queue length per cycle.
The receding horizon module obtains traffic data and, according to the periodic characteristics of the traffic state data of the optimization region, divides multiple different optimization periods and the corresponding optimization intervals. Methods that may be used:
1) Preset the optimization periods, for example: divide a day into three optimization periods, namely morning peak, evening peak, and off-peak.
2) Extract traffic state data of the optimization region that are continuous in time; form the total optimization periods to be divided and the periodic feature sequences used to divide the optimization periods; divide multiple total optimization periods according to the similarity measure between periodic feature sequences; and divide each total optimization period into multiple optimization periods according to the optimization period division conditions.
3) Extract traffic state data of the optimization region that are continuous in time, or the data after feature extraction, and split them to form multiple period sets for different optimization periods. Taking YLmin as one segmentation unit, split the continuous time period from t1 to tn, compare the traffic data similarity between the first segmentation unit and each of the remaining segmentation units, and count the units that meet similarity threshold IV. If the count exceeds the number threshold Num, extract these periods to form one period set, and continue to split the periods not yet extracted with YLmin as the segmentation unit. If the count does not exceed the number threshold Num, increase YLmin appropriately.
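One pass of method 3) can be sketched as below. This is a simplified illustration under stated assumptions: the similarity measure (1 / (1 + mean absolute difference)) is a stand-in, since the text does not specify one; the function and parameter names are hypothetical; and the step of enlarging YLmin for the leftover units is left to the caller.

```python
def similarity(a, b):
    """Assumed similarity measure: 1 / (1 + mean absolute difference)."""
    mad = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mad)


def extract_period_sets(series, unit_len, sim_threshold, num_threshold):
    """Group segmentation units of length unit_len (YLmin) into period sets.

    Returns (period_sets, leftover), both holding unit start indices.
    Units in leftover found too few similar units; a caller could retry
    them with a larger unit_len, as the text suggests.
    """
    starts = list(range(0, len(series) - unit_len + 1, unit_len))
    period_sets, leftover = [], []
    while starts:
        first, rest = starts[0], starts[1:]
        ref = series[first:first + unit_len]
        # Compare the first unit with every remaining unit (threshold IV).
        similar = [s for s in rest
                   if similarity(ref, series[s:s + unit_len]) >= sim_threshold]
        if len(similar) > num_threshold:  # count exceeds Num
            period_sets.append([first] + similar)
            starts = [s for s in rest if s not in similar]
        else:
            leftover.append(first)
            starts = rest
    return period_sets, leftover


# Toy traffic-state series with a repeating low/high pattern.
series = [1, 1, 5, 5, 1, 1, 5, 5, 1, 1, 9, 9]
period_sets, leftover = extract_period_sets(series, unit_len=2,
                                            sim_threshold=0.9, num_threshold=1)
```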
The optimization sub-area division module obtains traffic data, determines key crossings according to the key characteristics of each crossing in the optimization period, and divides multiple different optimization subregions with their corresponding optimization types according to the connection relationships between key crossings. Methods that may be used:
1) Extract the traffic state data of each crossing in the optimization region for the optimization period; determine the key crossings according to the key crossing determination conditions; and determine the optimization subregions and their optimization types according to the connection relationships between key crossings. Connection relationships include, for example, physical association, arterial coordination, and regional coordination.
2) An optimization region generally contains 30 to 50 signal-controlled crossings, and the goal is to determine 3 to 5 key crossings. Threshold condition III is met as follows: extract the key feature Y = <lox, loy, Q>, where lox is the position of the crossing along the x axis, loy is its position along the y axis, and Q is the mean of the crossing's traffic state data in the optimization period; cluster the 30 to 50 signal-controlled crossings with a density-based clustering method to form 3 to 5 clusters, which are the optimization subregions. The crossing closest to each cluster center is taken as the key crossing, and the optimization subregions and their optimization types are determined according to the connection relationships between key crossings. Connection relationships include, for example, physical association, arterial coordination, and regional coordination.
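The selection of key crossings in method 2) can be sketched as follows. The sketch covers only the last step and assumes that cluster labels have already been produced by some density-based clustering method, which is not implemented here. It also assumes the three features are on comparable scales; the names and sample values are hypothetical.

```python
from collections import defaultdict
from math import dist


def key_crossings(features, labels):
    """Pick, for each cluster, the crossing closest to the cluster center.

    features: crossing id -> (lox, loy, Q), assumed already normalized.
    labels:   crossing id -> cluster id (assumed given by a clustering step).
    """
    clusters = defaultdict(list)
    for crossing, cluster_id in labels.items():
        clusters[cluster_id].append(crossing)
    keys = {}
    for cluster_id, members in clusters.items():
        # Cluster center = mean of the member feature vectors.
        center = tuple(sum(features[m][k] for m in members) / len(members)
                       for k in range(3))
        keys[cluster_id] = min(members, key=lambda m: dist(features[m], center))
    return keys


# Hypothetical normalized features for six crossings in two clusters.
features = {
    "c1": (0.0, 0.0, 0.2), "c2": (0.1, 0.0, 0.2), "c3": (0.2, 0.0, 0.2),
    "c4": (1.0, 1.0, 0.8), "c5": (1.0, 0.8, 0.8), "c6": (0.9, 0.9, 0.8),
}
labels = {"c1": 0, "c2": 0, "c3": 0, "c4": 1, "c5": 1, "c6": 1}
keys = key_crossings(features, labels)
```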
In one embodiment, the system may further include one or more of the following components: a data processing module, a traffic state prediction module, a visualization module, and signal control equipment.
The data processing module uses, but is not limited to, one or more of the following methods:
A. Completion. Scope of application: the data are periodic and the duration of the missing data meets the short-duration threshold. Means: supplement with data from other complete cycles. Example: if the flow data of a crossing in multiple Monday peak periods are identical, or their deviation satisfies the periodicity condition, then when the flow data of that crossing are missing for one Monday peak period, they can be replaced with the flow data of other Monday peak periods.
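The completion rule can be sketched as follows. This is a minimal illustration: the text only says that missing data can be replaced with data from other Monday peak periods, so using the mean of the known values is an assumption, as are the function name and the sample flows.

```python
from statistics import mean


def fill_missing_peak_flow(flows_by_week):
    """Fill missing (None) peak-period flows from the other periodic samples.

    flows_by_week holds one flow value per Monday peak period.
    Assumption: the replacement value is the mean of the known Mondays.
    """
    known = [f for f in flows_by_week if f is not None]
    if not known:
        raise ValueError("no complete cycle available for completion")
    fill = mean(known)
    return [fill if f is None else f for f in flows_by_week]


# Hypothetical Monday morning-peak flows (vehicles/hour); week 2 is missing.
completed = fill_missing_peak_flow([420, None, 440, 430])
```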
B. Repair. Scope of application: the data have hierarchical characteristics and are correlated with other data of the same type, and the duration of the missing data meets the long-duration threshold. Means: set repair rules for the missing data using the hierarchical characteristics of the data and their correlation with other data of the same type. Hierarchical characteristics here mean that the data are subject to constraints of different levels in different environments. For example, flow data exhibit different hierarchical characteristics under different crossing grades, lane grades, lane functions, and lane flow restrictions; vehicle trajectory data, i.e. [time, position] data, exhibit different hierarchical characteristics under different vehicle types and road types. Correlation means association in space, time, or logic: for example, the flows of upstream and downstream crossings are correlated (flow-direction relationship), and the trajectory data of the same vehicle are correlated (feasible paths). Repair rules: Weighted summation: when the flow data of a lane are missing for a long duration, they are calculated from the flows of all upstream-crossing lanes that flow into it, with weights set according to the crossing grade, the lane grade, the lane function, and the flow carried by that lane function. Feasible data search: when the trajectory data of a vehicle are missing for a long duration, the feasible paths of the vehicle are searched using the [time, position] data before and after the gap. If the trajectory data come from a checkpoint (bayonet) system, the feasible-path search is combined with the checkpoint positions; the trajectory data may also come from a GPS system.
C. Matching. Scope of application: the data are associated with one another in time, space, or logic. Means: match the data according to their associations and establish correspondences. Examples of correspondences:
C.1) In time: control scheme data are matched with traffic state data; the control scheme data are returned according to the signal cycle of the crossing.
C.2) In space: this mainly refers to matching speed data of different times to road sections, and matching phase schemes to function lanes. Control scheme data refer to the logs of the crossing signal system; speed data are link speed values that the operator defines according to its own rules, and these values are related to the speed grade division. The data may differ in quantity, and one physical road section may correspond to multiple link speeds. From an understanding of traffic operations, the importance of a data item depends on its distance from the crossing: the closer to the crossing, the more important the data. The present invention first matches the link data of different times to physical road sections, ensuring that every road section adjacent to the crossing has data, that the section data are determinate, and that the matched speed value is determinate. To keep the section data reasonable, the weights of different links are defined dynamically for the cases where a section contains one link, two links, or three links. Another example is lane split matching. Because traffic is a human activity, it follows certain patterns, namely travel patterns. The signal schemes of a crossing are set for different travel patterns, and they differ in the number of phases and in the flow directions the phases serve. To keep the dimension of the crossing input data consistent, the present invention uses a lane-level control scheme: the phase scheme is converted to a lane phase scheme according to the service logic, mainly by a matching calculation based on the correspondence between phases and lane functions. The principle is that, within one cycle, the split of an uncontrolled right-turn lane is 1, the split of a lane released in a single phase is the split of that phase, and the splits of a lane released in several phases are added.
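The conversion from phase splits to a lane split follows directly from the stated principle and can be sketched as below. The function name, the phase-to-lane mapping, and the split values are hypothetical.

```python
def lane_split(lane_phases, phase_splits, uncontrolled_right_turn=False):
    """Convert phase green splits into a lane-level green split.

    lane_phases:  phases (e.g. "a".."g") in which the lane is released.
    phase_splits: phase name -> green split within one cycle.
    """
    if uncontrolled_right_turn:
        # An uncontrolled right-turn lane is treated as always released.
        return 1.0
    # A lane released in one phase takes that phase's split;
    # a lane released in several phases takes the sum of their splits.
    return sum(phase_splits[p] for p in lane_phases)


# Hypothetical four-phase scheme.
phase_splits = {"a": 0.3, "b": 0.2, "c": 0.25, "d": 0.25}
through_lane = lane_split(["a"], phase_splits)        # single phase
shared_lane = lane_split(["a", "b"], phase_splits)    # released twice
right_turn = lane_split([], phase_splits, uncontrolled_right_turn=True)
```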
D. Fusion. Scope of application: from collection and calculation to application, data may pass through multiple processing stages. Means: integrate the data within the same processing stage. In one embodiment, fusion includes data-layer fusion and feature-layer fusion. Data-layer fusion is performed directly on the collected raw data, i.e. the raw detector data are synthesized and analyzed before any preprocessing. Feature-layer fusion is fusion at the intermediate level: features are first extracted from the raw sensor information (features may be the edge, direction, speed, etc. of a target), the feature information is then comprehensively analyzed and processed, and the data are then fused and judged through association processing to obtain a joint inference result. Traffic data and speed data are integrated by cycle end time: traffic data are returned by cycle end time, and for speed data the group closest to the cycle end time is selected.
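The integration by cycle end time can be sketched as a nearest-timestamp lookup. This is an illustrative reading of the rule; the function name, the record layout `(timestamp, speed)`, and the sample values are assumptions.

```python
from datetime import datetime


def match_speed_to_cycle(cycle_end_time, speed_records):
    """Return the speed record whose timestamp is closest to the cycle end time.

    speed_records: list of (timestamp, speed_kmh) tuples.
    """
    return min(speed_records,
               key=lambda rec: abs((rec[0] - cycle_end_time).total_seconds()))


# Hypothetical speed records published every five minutes.
speed_records = [
    (datetime(2024, 1, 8, 8, 0), 32.0),
    (datetime(2024, 1, 8, 8, 5), 28.5),
]
matched = match_speed_to_cycle(datetime(2024, 1, 8, 8, 2), speed_records)
```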
E. Processing of non-standard data, including: cleaning of abnormal data such as negative or zero values; handling of non-standard data such as garbled fields and inconsistent input/output formats; standardization of data such as normalization or mean processing; and construction of the training data set in the data format required by the algorithm. The training data come from the processed, correct, and completely matched data and mainly include lane-level splits and the corresponding speed data. Sample data are as follows:
Here roadsect_id is the road section number; sys_code is the system code, i.e. the number of the crossing in the signal system; dir_name is the direction of the approach (entrance lane); speed is the determined road section speed; cycle_end_time is the cycle end time; cycle_length is the cycle duration; lane_id is the lane number; the columns a-g are named after phases a-g and hold the phase splits; and lane_split is the split converted to the lane.
Table 2: Training data set of the prediction algorithm
F. Logic analysis. When a data processing stage involves service logic relationships, the reasonableness of the data is analyzed according to those relationships. In one embodiment, data logic analysis has two parts: data analysis in the offline training stage, and data analysis in the online visualization process. Offline analysis mainly covers the raw data and the standardized data, to judge the reliability of the training data and the reasonableness of the traffic states. Online analysis mainly builds data curves periodically and in real time, so that the quality of a scheme can be verified visually. During data analysis, data are aggregated by certain rules according to the service logic, for example by time window, by mean, or by median. For example, the present invention tracks the traffic situation: it collects one week of speed data for a key crossing or region before optimization and plots the trend, then plots the weekly speed after optimization, and compares the traffic situation before and after optimization to analyze whether speed has improved overall, especially in the morning and evening peak periods. This traffic situation analysis mainly targets the average speed and optimization effect of the key crossings and regions of the optimization subregions, and mainly uses 5-minute time-window smoothing and mean processing.
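The 5-minute time-window mean processing can be sketched as below. The text does not say whether the windows slide or are fixed, so non-overlapping (tumbling) windows aligned to the clock are an assumption, as are the function name and sample data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def window_means(samples, window_minutes=5):
    """Average speed samples within fixed, clock-aligned time windows.

    samples: list of (timestamp, speed_kmh); window_minutes should divide 60.
    Returns a dict mapping window start time -> mean speed.
    """
    buckets = defaultdict(list)
    for ts, speed in samples:
        # Round the timestamp down to the start of its window.
        start = ts.replace(minute=ts.minute - ts.minute % window_minutes,
                           second=0, microsecond=0)
        buckets[start].append(speed)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical speed samples at a key crossing.
samples = [
    (datetime(2024, 1, 8, 8, 1), 30.0),
    (datetime(2024, 1, 8, 8, 3), 34.0),
    (datetime(2024, 1, 8, 8, 6), 20.0),
    (datetime(2024, 1, 8, 8, 9), 24.0),
]
smoothed = window_means(samples)
```

Computing the same series for the week before and the week after optimization gives two curves that can be compared window by window.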
The traffic state prediction module uses historical traffic state data to predict the traffic state data of the target crossing at the target time. Prediction is performed with a classification-based artificial intelligence method: traffic state data are obtained and input to the traffic state prediction model, which outputs the traffic state data of the target crossing at the target time; the traffic state prediction model is used to identify the class label of the traffic state data and to classify them.
When the traffic signal system executes the signal optimization scheme output by the rolling optimization module, the traffic state prediction model obtains the real-time traffic data of the crossing for the preceding period and can then predict the traffic state after the signal optimization scheme is executed. Through continuous training, the traffic state prediction model narrows the gap between the actual traffic state after the scheme is executed and the predicted traffic state, so as to meet the model training requirements.
The interactive visualization module includes but is not limited to the following units:
1) Simulation deduction unit. Simulation deduction mainly shows deduced effects supported by simulation data, which come from the traffic state prediction module. It mainly includes three parts. First, real-time speed: real-time data are independent of the prediction data and mainly refer to the aggregated average road speed. Real-time speed is mainly compared with the effect after scheme optimization to verify whether the optimization scheme is reasonable; if it is not, i.e. the prediction gap is large and the predicted traffic state class is off by two grades, recalculation may be selected. Second, pre-deduction speed: what is calculated before deduction is the speed, i.e. the traffic state, in the optimization period, showing the trend of change. The pre-deduction speed is a simulated speed, i.e. the output of the prediction function, whose input data are constructed as follows: the control scheme, actual speed, and time data of this crossing for the 30 min of actual operation before the time at which the optimization scheme is calculated; the control scheme, actual speed, and time data of the adjacent crossings for the 2*60 min of actual operation before the optimization time; and the control scheme, actual speed, and time data of the adjacent crossings for the 2*60 min of actual operation after the corresponding optimization time of the previous week. Third, post-deduction speed: these data mainly cover the optimization period being calculated and show the trend of change. The post-deduction data are simulation data, i.e. the output of the prediction function, whose input data are constructed as follows: the optimization scheme, predicted speed, and time period data of this crossing for the 30 min before the time at which the optimization scheme is calculated; and the actual scheme, predicted speed, and time period data of the adjacent crossings for the 2*60 min before the optimization time. The deduction of speed changes and trends before and after optimization mainly helps front-line signal timing experts anticipate the future traffic situation; it is a deduction of the effect after the optimization scheme is issued. The period covered by the deduced effect changes with the optimization scheme period or with the execution period of a recalculated optimization scheme. Deduction also gives a comprehensive view of the overall road conditions in the optimization region, making it easier to judge whether the output control strategy can improve the overall condition of the region, even at the expense of some crossings.
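The reasonableness check in the first part (predicted class off by two grades) can be sketched as a simple comparison of class labels. The function name and the reading of "two grades" as a difference of two or more class levels are assumptions.

```python
def needs_recalculation(predicted_class, actual_class, max_gap=2):
    """Flag a scheme for recalculation when prediction and reality diverge.

    Both arguments are congestion class labels in the range 0-4.
    Assumption: a difference of max_gap (two) or more grades counts as a
    large prediction gap.
    """
    return abs(predicted_class - actual_class) >= max_gap


# Predicted "congestion" (1) but observed "normal" (3): two grades apart.
large_gap = needs_recalculation(predicted_class=1, actual_class=3)
# Predicted "slow" (2) and observed "normal" (3): within tolerance.
small_gap = needs_recalculation(predicted_class=2, actual_class=3)
```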
2) Data display unit, for example: the traffic situation of the optimization region and the optimization subregions, where the overall situation is shown as index curves, including real-time and historical average speed curves, real-time and historical space-time utilization curves, and real-time and historical in-transit vehicle volume curves of the sub-area.
Display of the geographic locations of key crossings and their adjacent signal-controlled crossings: adjacent signal-controlled crossings use a different marker and appear after a key crossing is clicked. Key crossing data mainly include real-time speed, historical speed curves, and space-time utilization data; adjacent crossings show real-time average speed and space-time utilization data. The data can be viewed by clicking a key crossing.
Display of optimization scheme data, organized by sub-area: clicking a sub-area location shows the schemes for the nearest optimization period.
3) Manual intervention unit, including: parameter setting items for the signal recommendation system or method; and an optimization scheme review item, whereby the optimization scheme output by the system is issued to the signal control equipment for execution after manual review.