CN111292534A - Traffic state estimation method based on clustering and deep sequence learning - Google Patents

Traffic state estimation method based on clustering and deep sequence learning

Info

Publication number
CN111292534A
Authority
CN
China
Prior art keywords
data
traffic
road
state
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010090595.XA
Other languages
Chinese (zh)
Inventor
陈阳舟
马鹏飞
师泽宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202010090595.XA
Publication of CN111292534A
Legal status: Pending

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0133 Traffic data processing for classifying traffic situation
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/052 Detecting movement of traffic to be counted or controlled with provision for determining speed or overspeed
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/065 Traffic control systems for road vehicles by counting the vehicles in a section of the road or in a parking area, i.e. comparing incoming count with outgoing count

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Analytical Chemistry (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a traffic state estimation method based on k-means clustering and deep sequence learning. It belongs to the field of intelligent transportation systems and mainly addresses the problem of estimating the traffic state of an entire urban expressway when traffic flow data for some of its road sections cannot be acquired in real time. The method comprises the following steps: (1) dividing the expressway network into road sections; (2) modeling the expressway and collecting data; (3) preprocessing and normalizing the data; (4) computing Euclidean distances among traffic flow data points with the k-means clustering algorithm and determining the traffic state level of each data point; (5) designing a deep sequence learning Seq2Seq model and identifying the traffic state of the whole road network through iterative training of the model. The invention takes full account of the relationship between traffic flows on different road sections, exploits the advantages of machine learning algorithms in the traffic field, obtains the traffic state of the whole road network in a timely manner, and can provide reliable traffic information to drivers.

Description

Traffic state estimation method based on clustering and deep sequence learning
Technical Field
The invention relates to the field of intelligent transportation systems, and in particular to a traffic state estimation method based on k-means clustering and deep sequence learning.
Background
With the continuous development of the economy and the continuous growth of the urban population in China, more and more families own one or more private cars. The rapidly growing number of vehicles has increased traffic pressure in cities across the country, seriously reduced the operating efficiency of urban traffic networks, and lengthened residents' travel times. In addition, vehicles crawling at low speed in congestion waste more energy, and frequent stopping and restarting increases exhaust emissions and pollutes the living environment. How to accurately estimate the traffic flow of an urban road network and relieve traffic pressure while still meeting people's travel needs has therefore become an important direction in traffic management and a focus of academic research.
Traffic state estimation refers to the process of inferring the traffic state of the whole road network from the traffic flow data observed in it. Existing methods fall into two main classes, model-driven and data-driven. Model-driven algorithms generally use a traffic flow model to describe how traffic propagates between road sections and derive changes in the traffic state from mathematical formulas. Data-driven algorithms typically use machine learning to analyze historical traffic flow data and mine relationships in the data in order to estimate or predict the traffic state of road sections.
However, for technical and financial reasons, the detectors on current urban road networks cannot provide seamless coverage and can only measure traffic flow data on some road sections. Most existing techniques therefore focus on a single road section and cannot effectively estimate sections without detector data, so they cannot meet travelers' needs for network-wide traffic information. No effective solution to this problem has yet been proposed.
Disclosure of Invention
The invention aims to provide a traffic state estimation method based on k-means clustering and deep sequence learning that estimates the traffic states of all road sections when traffic flow data for some road sections of an urban expressway cannot be acquired in real time.
The technical scheme of the invention is implemented according to the following steps:
S1, expressway division: dividing an urban expressway into a plurality of homogeneous road sections according to Cell Transmission Model (CTM) theory, ensuring that the traffic density inside each divided road section is uniformly distributed and that the cross-section flow, traffic speed and similar quantities are approximately the same;
S2, data acquisition: modeling the selected expressway with simulation software, setting virtual detectors, and acquiring historical traffic flow parameter data for each road section, wherein the features of the data comprise the cross-section traffic flow, the speed and the time occupancy of the road section;
S3, data preprocessing: removing duplicate and abnormal data from the collected data, and normalizing the different traffic flow features so that the data are converted into values in the interval [0,1];
S4, traffic state division: dividing the traffic state into two state levels, free flow and congested flow, according to the characteristics of the fundamental diagram of road traffic flow; performing cluster analysis on the historical traffic flow data of each road section separately with the k-means clustering algorithm; and determining the category of each data point according to its Euclidean distance in three-dimensional space, thereby labeling the data set of each road section;
S5, traffic state estimation: constructing a training data set from the labeled data according to a certain proportion, and designing a deep sequence learning Seq2Seq model whose input is the traffic flow data sequence of some road sections of the expressway and whose output is the traffic state sequence of all road sections of the expressway; the traffic state of all road sections is estimated through iterative training to obtain the estimation result.
Further, in step S1, the road section division rule of the expressway is:
S1.1, dividing the road network into a plurality of road sections according to the number and positions of the ramps in the expressway network, the positions where the number of lanes changes, and the positions where the radius of curvature of the road changes, each such road section being called a link; the ramps comprise entrance ramps and exit ramps, and a change in the number of lanes is either an increase or a decrease;
S1.2, each link is further divided, according to its length, into a plurality of smaller sections of equal length; each small section is called a cell, and each cell is guaranteed to be homogeneous.
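Rule S1.2 can be illustrated with a minimal Python sketch. The source does not state how the cell length is chosen, so the target cell length, the function name `split_link_into_cells` and the link lengths below are hypothetical and only illustrate splitting one link into equal-length cells.

```python
def split_link_into_cells(link_length_m, target_cell_length_m):
    """Split one link into equal-length cells whose length is close to a target.

    Both arguments are in metres. The target length is an illustrative
    assumption; the source only requires equal-length cells within a link.
    """
    # At least one cell per link, otherwise the nearest whole number of cells.
    n_cells = max(1, round(link_length_m / target_cell_length_m))
    cell_length = link_length_m / n_cells
    return [cell_length] * n_cells


# Hypothetical link lengths in metres (not taken from the patent).
links = [820.0, 400.0, 1150.0]
cells_per_link = [split_link_into_cells(length, 400.0) for length in links]
for length, cells in zip(links, cells_per_link):
    print(length, len(cells), cells[0])
```

Every cell inside a link has the same length, and the cell lengths of a link add up to the link length, which is the only property the division rule requires.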
Further, in step S2, the expressway model is divided according to the road section division rule of S1, and the traffic demand of each road section and the split ratio between the ramps and the main road are set dynamically according to real data, so that the simulated traffic flow evolves in line with the real traffic flow. The historical traffic flow parameter data collected for each road section may cover several consecutive working days, which are simulated by changing the random seed of the software.
Further, in step S3, the scales of the different feature variables of the traffic flow data differ greatly: for example, the traffic flow can reach hundreds or thousands, while the traffic speed is only in the tens. Using the data directly for subsequent model training may therefore produce inaccurate results. The collected data consequently need to be normalized so that the different feature variables fall within a common range, and the normalization formula is as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $x'$ is the value after normalization, $x$ is the value before normalization, $x_{\min}$ is the minimum value in the data set, and $x_{\max}$ is the maximum value in the data set.
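As a minimal sketch of the min-max normalization in formula (1), the following Python function scales one feature column to [0,1]. The sample flow and speed values are hypothetical, and the handling of a constant column (all values equal) is an assumption, since the source does not discuss that case.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using x' = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical raw samples: flow is in the hundreds or thousands, speed in the tens.
flow = [300.0, 1200.0, 2100.0]
speed = [20.0, 50.0, 80.0]

print(min_max_normalize(flow))   # each feature is normalized separately
print(min_max_normalize(speed))
```

Normalizing each feature separately puts flow, speed and occupancy on the same scale, so that no single feature dominates the Euclidean distances used later by the clustering step.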
Further, in step S4, traffic states are classified according to the fundamental diagram formed by the traffic flow data inside each cell and are divided into two types, free flow and congested flow. In the free-flow state traffic runs relatively smoothly and vehicles are hardly affected by external factors; this state is represented by the number 0. In the congested-flow state traffic is extremely unstable and vehicles interfere with each other severely; this state is represented by the number 1.
Further, in step S4, the k-means clustering algorithm is used to cluster the historical traffic flow parameter data and to label each group of data with a state (0 or 1); the means of the data in the different traffic state categories are then calculated and compared. The clustering algorithm comprises the following steps:
S4.1, determining the number k of cluster categories, and randomly selecting k data points from the data set as the initial cluster centers;
S4.2, calculating the Euclidean distance between each data point and every cluster center, and assigning each data point to the category of its nearest cluster center, wherein the calculation formula is expressed as follows:
$$d(x_i, \mu_j) = \lVert x_i - \mu_j \rVert_2, \quad j = 1, \dots, k \tag{2}$$
where $x_i$ is the i-th data point in the data set, $\mu_j$ is the center point of the j-th cluster category, and $k$ is the number of cluster categories.
S4.3, calculating, according to the clustering result, the arithmetic mean of the data points contained in each category and replacing the previous cluster center with this value, wherein the update formula is expressed as:
$$\mu_j = \frac{1}{c_j} \sum_{x_i \in C_j} x_i \tag{3}$$
where $c_j$ is the number of data points contained in the j-th cluster category, and the sum runs over the data points assigned to the j-th cluster $C_j$.
S4.4, comparing the current cluster centers with the centers before the update; if they are the same, the iteration stops and the algorithm ends; otherwise, the process returns to step S4.2 and the iteration continues.
Further, in step S4, the objective function of the algorithm is expressed as:
$$E = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert_2^2 \tag{4}$$
where $E$ represents the squared error of the algorithm and $C_j$ denotes the j-th cluster.
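Steps S4.1 to S4.4 and the objective function can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the sample points are hypothetical normalized (flow, speed, occupancy) triples, the `max_iter` guard and the handling of empty clusters are additions, and mapping the cluster with the higher mean occupancy to state 1 is an assumption, because the source does not say how cluster indices are mapped to the 0/1 states.

```python
import math
import random


def kmeans(points, k=2, seed=0, max_iter=100):
    """Plain k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # S4.1: random initial centers
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S4.2: assign every point to its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # S4.3: replace each center with the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # assumption: keep an empty cluster's center
        # S4.4: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    # Squared error E of the final clustering.
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centers, sse


def to_states(labels, centers, occupancy_index=2):
    """Map cluster indices to states: 1 = cluster with the higher mean occupancy (assumption)."""
    congested = max(range(len(centers)), key=lambda j: centers[j][occupancy_index])
    return [1 if lab == congested else 0 for lab in labels]


# Hypothetical normalized (flow, speed, occupancy) samples for one cell.
points = [(0.20, 0.90, 0.10), (0.30, 0.85, 0.15), (0.25, 0.95, 0.12),
          (0.60, 0.20, 0.80), (0.50, 0.15, 0.90), (0.55, 0.10, 0.85)]
labels, centers, sse = kmeans(points, k=2)
print(to_states(labels, centers))
```

The cluster indices returned by k-means are arbitrary, which is why a separate rule such as `to_states` is needed before the labels can be used as free-flow (0) and congested-flow (1) states.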
Further, in step S5, the deep sequence learning Seq2Seq model is a model with a "many-to-many" structure. The input and output data of the model need to be prepared before training: the traffic flow data and the state data of the expressway network are arranged in spatial order to obtain a (continuous) road network traffic flow data sequence and a (discrete) binary traffic state sequence that correspond to each other section by section, and the Seq2Seq model is trained with the traffic flow data sequence as input and the state sequence as output. In addition, since data for some road sections of an actual road network cannot be acquired in real time, the data of those road sections must be removed from the input sequence when the model is trained, so that the trained model can later estimate the state sequence of the whole expressway from the data of only some road sections.
Further, in step S5, the Seq2Seq model is composed of two layers of LSTM neural networks, which mitigates the vanishing-gradient and exploding-gradient problems faced by plain RNNs. The first LSTM layer serves as the encoder and is responsible for analyzing the input traffic flow data sequence; the formula of the encoder is as follows:
$$h_t = f(h_{t-1}, x_t) \tag{5}$$
where $f$ denotes the activation function of the encoder, $x_t$ represents the traffic flow data sequence at time $t$, and $h_t$ represents the hidden state of the encoder at time $t$.
The second LSTM layer serves as the decoder; it is responsible for analyzing the output of the encoder and computing the traffic state probability distribution sequence of the whole expressway. The formulas for the decoder and the model output are expressed as follows:
$$s_t = f(s_{t-1}, y_{t-1}, c) \tag{6}$$
$$p(y_t \mid y_{t-1}, y_{t-2}, \dots, y_1, c) = g(s_t, y_{t-1}, c) \tag{7}$$
where $f$ denotes the activation function of the decoder, $c$ denotes the output of the encoder, $s_t$ denotes the hidden state of the decoder at time $t$, $y_t$ denotes the output of the decoder at time $t$, and $g$ denotes the activation function of the output layer.
Furthermore, so that the model can retain the traffic flow information of more road sections in the encoding stage and improve the accuracy of the subsequent estimation, the invention introduces an attention mechanism to optimize the existing model. A dynamic weight is applied to the hidden state of the encoder at every time step, so that the encoder output c is updated dynamically with the hidden states and the model can extract the more important information from the input sequence. The update formulas are expressed as:
$$c_i = \sum_{j} a_{ij} h_j \tag{8}$$
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{m} \exp(e_{im})} \tag{9}$$
$$e_{ij} = a(s_{i-1}, h_j) \tag{10}$$
where $c_i$ represents the dynamically updated encoder output, $a_{ij}$ represents the weight between the j-th hidden state of the encoder and the i-th hidden state of the decoder, and $e_{ij}$ is the alignment model, which measures the correlation between the j-th hidden state of the encoder and the i-th hidden state of the decoder; the sums run over all encoder hidden states.
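The attention update of formulas (8) to (10) can be sketched numerically in plain Python. The source does not specify the alignment function a(·), so a dot product between the previous decoder state and each encoder state is used here purely as an assumption; the function names and the sample vectors are likewise hypothetical.

```python
import math


def softmax(scores):
    """Normalize scores into weights that sum to 1 (formula 9)."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(prev_decoder_state, encoder_states):
    """Return (weights a_ij, context c_i) for one decoder step.

    Assumption: the alignment model a(s_{i-1}, h_j) is a dot product.
    """
    # Formula (10): one alignment score per encoder hidden state.
    scores = [sum(s * h for s, h in zip(prev_decoder_state, h_j))
              for h_j in encoder_states]
    weights = softmax(scores)
    # Formula (8): context is the weighted sum of the encoder hidden states.
    dim = len(encoder_states[0])
    context = [sum(w * h_j[d] for w, h_j in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context


# Hypothetical 2-dimensional hidden states for three encoder steps.
encoder_states = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
weights, context = attention_context((1.0, 0.0), encoder_states)
print(weights, context)
```

Encoder states that align better with the previous decoder state receive larger weights, so the context vector differs from one decoder step to the next instead of being a single fixed encoder output.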
Compared with the prior art, the invention has the following beneficial effects:
A traffic state estimation method based on k-means clustering and deep sequence learning is provided for the case where traffic flow data for some road sections of an urban expressway cannot be acquired in real time. First, the historical traffic flow data of the road are labeled with states by the k-means clustering algorithm. Then a deep sequence learning Seq2Seq model is designed and trained on the labeled traffic flow data, and the traffic state sequence of the whole expressway is obtained through iterative training. This addresses the problem, common in existing traffic state estimation, that road detectors cannot provide seamless coverage and can only measure traffic flow data on some road sections. The method takes full account of the relationship between traffic flows on different road sections, exploits the advantages of machine learning algorithms in the traffic field, obtains the traffic state of the whole road network in a timely manner, and provides reliable and accurate information to drivers.
Drawings
FIG. 1 is a flow chart of the traffic state estimation method based on k-means clustering and deep sequence learning according to the present invention;
FIG. 2 is a schematic diagram of the expressway division;
FIG. 3 is a schematic diagram of the simulation model, taking the Jingtong Expressway as an example;
FIG. 4 is a k-means clustering diagram of the traffic flow data;
FIG. 5 is a structural diagram of the deep sequence learning Seq2Seq model.
Detailed Description
In order to illustrate the present invention clearly, it is further described below with reference to an example and the accompanying drawings. The following detailed description is illustrative rather than restrictive and is not intended to limit the scope of the invention.
As shown in FIG. 1, the invention discloses a traffic state estimation method based on k-means clustering and deep sequence learning, which comprises the following steps:
S1, expressway division: in this example, a section (from west to east) of the Beijing Jingtong Expressway up to a distant bridge is selected for analysis. The section is about 7 km long and has 7 exit ramps and 6 entrance ramps, and it contains lane-number changes and curves. The section is divided into a plurality of homogeneous road sections according to CTM theory, with the division result shown in FIG. 2, so that the traffic density inside each divided road section is uniformly distributed and the cross-section flow, traffic speed and similar quantities are approximately the same. The specific division rule is as follows:
S1.1, dividing the road network into a plurality of road sections according to the number and positions of the ramps (including entrance ramps and exit ramps) in the expressway network, the positions where the number of lanes changes (increases or decreases), and the positions where the radius of curvature of the road changes, each such road section being called a link;
S1.2, each link is further divided, according to its length, into a plurality of smaller sections of equal length, and each small section is called a cell.
With this rule the Jingtong Expressway is divided into 18 cells, each of which is homogeneous.
S2, data acquisition: as shown in FIG. 3, simulation software is used to model the selected Jingtong Expressway, and the traffic demand and ramp split ratios are set dynamically according to the changes in traffic flow on the actual road network. The required traffic flow data are obtained by placing a virtual detector in each cell, with each detector recording data every 30 s. Traffic flow data for the morning peak from 6:00 to 10:00 on working days (Monday to Friday) are collected by changing the random seed, giving 43200 groups of data in total. The features of the data comprise the cross-section traffic flow, the speed and the time occupancy of the road section. The data of the first 4 days are selected as the training set, and the data of the last day are used as the test set.
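The figure of 43200 groups is consistent with the stated setup if one group is recorded per cell per 30 s interval over five working days. The text does not spell out this breakdown, so the short Python check below is an interpretation, and the per-split record counts follow from the same assumption.

```python
CELLS = 18          # cells obtained in step S1
INTERVAL_S = 30     # detector sampling interval in seconds
HOURS_PER_DAY = 4   # morning peak, 6:00 to 10:00
DAYS = 5            # Monday to Friday

# Assumption: one data group per cell per sampling interval.
intervals_per_day = HOURS_PER_DAY * 3600 // INTERVAL_S
records_total = intervals_per_day * CELLS * DAYS
train_records = intervals_per_day * CELLS * 4   # first 4 days
test_records = intervals_per_day * CELLS * 1    # last day

print(intervals_per_day, records_total, train_records, test_records)
```

Under this reading each day contributes 480 time steps, and the 4:1 split by day corresponds to an 80/20 split of the records.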
S3, data preprocessing: after data collection is finished, duplicate and abnormal data are first removed. Second, the scales of the feature variables of the collected traffic flow data differ greatly: for example, the traffic flow can reach hundreds or thousands, while the traffic speed is only in the tens, so using the data directly for subsequent model training would give inaccurate results. The collected data are therefore normalized so that the different feature variables fall within the interval [0,1], and the normalization formula is as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $x'$ is the value after normalization, $x$ is the value before normalization, $x_{\min}$ is the minimum value in the data set, and $x_{\max}$ is the maximum value in the data set.
S4, traffic state division: according to the relationships between the road traffic flow features, the data are divided into two state levels, free flow and congested flow, with the k-means clustering algorithm. In the free-flow state traffic runs relatively smoothly and vehicles are hardly affected by external factors; this state is represented by the number 0. In the congested-flow state traffic is extremely unstable and vehicles interfere with each other severely; this state is represented by the number 1. The clustering algorithm comprises the following steps:
S4.1, setting the number k of cluster categories to 2, and randomly selecting 2 data points from the traffic flow data set as the initial cluster centers;
S4.2, calculating the Euclidean distance between each data point and every cluster center, and assigning each data point to the category of its nearest cluster center, wherein the calculation formula is expressed as follows:
$$d(x_i, \mu_j) = \lVert x_i - \mu_j \rVert_2, \quad j = 1, \dots, k \tag{2}$$
where $x_i$ is the i-th data point in the data set, $\mu_j$ is the center point of the j-th cluster category, and $k$ is the number of cluster categories.
S4.3, calculating, according to the clustering result, the arithmetic mean of the data points contained in each category and replacing the previous cluster center with this value, wherein the update formula is expressed as:
$$\mu_j = \frac{1}{c_j} \sum_{x_i \in C_j} x_i \tag{3}$$
where $c_j$ is the number of data points contained in the j-th cluster category, and the sum runs over the data points assigned to the j-th cluster $C_j$.
S4.4, comparing the current cluster centers with the centers before the update; if they are the same, the iteration stops and the algorithm ends; otherwise, the process returns to step S4.2 and the iteration continues.
Further, in step S4, the objective function of the algorithm is expressed as:
$$E = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert_2^2 \tag{4}$$
where $E$ represents the squared error of the algorithm and $C_j$ denotes the j-th cluster.
Specifically, FIG. 4 shows the k-means traffic state clustering result for one of the road sections; in the figure, triangular data points represent congested flow and circular data points represent free flow. As can be seen from the figure, in the free-flow state the occupancy of the road section is relatively low and approximately linear in the flow; vehicles are hardly disturbed by external factors and can keep traveling at high speed. In the congested-flow state the occupancy of the road section begins to rise rapidly and the flow gradually declines after reaching its peak; traffic is then extremely unstable, the data are more scattered, and vehicles interfere with each other and can only travel at low speed. k-means clustering therefore divides the road section traffic flow data into states well: the boundary between the categories is clear, the clustering effect is good, and the result agrees with the fundamental diagram of the road section and with the behavior of actual traffic flow.
S5, traffic state estimation: a deep sequence learning Seq2Seq model with a "many-to-many" structure is designed. The input and output data of the model need to be prepared before training: the traffic flow data and the state data of the expressway network are arranged in spatial order to obtain a (continuous) road network traffic flow data sequence and a (discrete) binary traffic state sequence that correspond to each other section by section, and the Seq2Seq model is trained with the traffic flow data sequence as input and the state sequence as output. In addition, since data for some road sections of the actual road network cannot be obtained in real time, the data of those road sections must be removed from the input sequence when the model is trained. On this basis, the input of the model in this experiment is the traffic flow data sequence of cells 1, 2, 3, 4, 5, 7, 9, 11, 13, 15 and 17, and the output is the binary (0 or 1) traffic state sequence of all 18 cells, giving 2400 sequence pairs in total.
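How one input/output sequence pair might be assembled for a single time step can be sketched as follows. The observed cell list and the cell count come from the text; the dictionary layout, the function name `build_sequence_pair` and the sample values are hypothetical.

```python
# Cells with detector data (from the text) and total number of cells.
OBSERVED_CELLS = [1, 2, 3, 4, 5, 7, 9, 11, 13, 15, 17]
N_CELLS = 18


def build_sequence_pair(features_by_cell, state_by_cell):
    """Build one (input, target) pair for a single time step.

    features_by_cell: {cell_id: (flow, speed, occupancy)}, normalized values
    state_by_cell:    {cell_id: 0 or 1}, the k-means state label
    """
    # Input: features of the observed cells only, in spatial order.
    inputs = [features_by_cell[c] for c in OBSERVED_CELLS]
    # Target: state of every cell, observed or not, in spatial order.
    targets = [state_by_cell[c] for c in range(1, N_CELLS + 1)]
    return inputs, targets


# Hypothetical snapshot: all cells free-flowing except unobserved cell 6.
features = {c: (0.4, 0.7, 0.2) for c in range(1, N_CELLS + 1)}
states = {c: 0 for c in range(1, N_CELLS + 1)}
states[6] = 1
inputs, targets = build_sequence_pair(features, states)
print(len(inputs), len(targets))
```

The input sequence is shorter than the target sequence (11 cells versus 18), which is what forces the model to infer the states of the unobserved cells from their neighbors.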
Specifically, the model structure designed in this experiment is shown in FIG. 5. It is composed of two layers of LSTM neural networks, which mitigates the vanishing-gradient and exploding-gradient problems faced by plain RNNs. The first LSTM layer serves as the encoder and is responsible for analyzing the input traffic flow data sequence; the formula of the encoder is as follows:
$$h_t = f(h_{t-1}, x_t) \tag{5}$$
where $f$ denotes the activation function of the encoder, $x_t$ represents the traffic flow data sequence at time $t$, and $h_t$ represents the hidden state of the encoder at time $t$.
The second LSTM layer serves as the decoder; it is responsible for analyzing the output of the encoder and computing the traffic state probability distribution sequence of the whole expressway. The formulas for the decoder and the model output are expressed as follows:
$$s_t = f(s_{t-1}, y_{t-1}, c) \tag{6}$$
$$p(y_t \mid y_{t-1}, y_{t-2}, \dots, y_1, c) = g(s_t, y_{t-1}, c) \tag{7}$$
where $f$ denotes the activation function of the decoder, $c$ denotes the output of the encoder, $s_t$ denotes the hidden state of the decoder at time $t$, $y_t$ denotes the output of the decoder at time $t$, and $g$ denotes the activation function of the output layer.
Furthermore, so that the model can retain the traffic flow information of more road sections in the encoding stage and improve the accuracy of the subsequent estimation, the invention introduces an attention mechanism to optimize the existing model. A dynamic weight is applied to the hidden state of the encoder at every time step, so that the encoder output c is updated dynamically with the hidden states and the model can extract the more important information from the input sequence. The update formulas are expressed as:
$$c_i = \sum_{j} a_{ij} h_j \tag{8}$$
$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{m} \exp(e_{im})} \tag{9}$$
$$e_{ij} = a(s_{i-1}, h_j) \tag{10}$$
where $c_i$ represents the dynamically updated encoder output, $a_{ij}$ represents the weight between the j-th hidden state of the encoder and the i-th hidden state of the decoder, and $e_{ij}$ is the alignment model, which measures the correlation between the j-th hidden state of the encoder and the i-th hidden state of the decoder; the sums run over all encoder hidden states.
The designed Seq2Seq model is trained on the prepared traffic flow data sequences and validated with five-fold cross-validation. Once the tested model reaches the preset performance index, the traffic state of the road network can be estimated from real-time data collected on the actual road network, yielding the real-time traffic state sequence of the whole expressway.
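The five-fold validation scheme can be sketched in plain Python as an index split over the 2400 sequence pairs mentioned above. The text does not say whether folds are contiguous or shuffled, so the contiguous folds used here, and the function name `k_fold_splits`, are assumptions.

```python
def k_fold_splits(n_samples, n_folds=5):
    """Yield (train_indices, validation_indices) for contiguous folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # Spread any remainder over the first folds.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


# 2400 sequence pairs, as stated in the text.
splits = list(k_fold_splits(2400, 5))
for train, validation in splits:
    print(len(train), len(validation))
```

Each pair is used for validation exactly once and for training in the other four folds; the model would be retrained from scratch for every fold and the five validation scores averaged.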
In summary, the invention provides a traffic state estimation method based on k-means clustering and deep sequence learning. First, the historical traffic flow data of the road are labeled with states by the k-means clustering algorithm. Then a deep sequence learning Seq2Seq model is designed and trained on the labeled traffic flow data, and the traffic state sequence of the whole expressway is obtained through iterative training. This addresses the problem, common in existing traffic state estimation, that road detectors cannot provide seamless coverage and can only measure traffic flow data on some road sections. The invention takes full account of the relationship between traffic flows on different road sections, exploits the advantages of machine learning algorithms in the traffic field, obtains the traffic state of the whole road network in a timely manner, and can provide reliable and accurate information to drivers.
Finally, it should be noted that the above embodiments are only examples given to illustrate the present invention clearly and are not intended to limit its embodiments. Those skilled in the art can make other variations or modifications on the basis of the above description; it is not possible to list all expressway cases exhaustively here, and all obvious variations or modifications derived from the technical scheme of the present invention still fall within its scope of protection.

Claims (9)

1. A traffic state estimation method based on clustering and deep sequence learning, characterized by comprising the following steps:
S1, expressway division: dividing an urban expressway into a plurality of balanced road sections according to Cell Transmission Model (CTM) theory, ensuring that the traffic flow density within each divided road section is uniformly distributed and that the cross-section flow, the traffic flow speed and the like are approximately the same;
S2, data acquisition: modeling the selected expressway with simulation software, setting virtual detectors, and acquiring historical traffic flow parameter data for each road section, wherein the features of the data comprise the cross-section traffic flow, the speed and the time occupancy of the road section;
S3, data preprocessing: removing duplicate and abnormal data from the collected data, and normalizing the different traffic flow feature data so that they are converted into values in the interval [0,1];
S4, traffic state division: dividing the traffic state into two state levels, free flow and congested flow, according to the fundamental diagram characteristics of the road traffic flow, performing cluster analysis on the historical traffic flow data of each road section with a kmeans clustering algorithm, and determining the category of each data point according to the Euclidean distance of the data in three-dimensional space, thereby calibrating the data set of each road section;
S5, traffic state estimation: constructing a training data set from the calibrated data according to a certain proportion, and designing a deep sequence learning Seq2Seq model, wherein the model takes as input the traffic flow data sequence of a part of the road sections of the expressway and outputs the traffic state sequence of all road sections of the expressway, and the traffic state estimation of the whole expressway is realized through iterative learning to obtain the estimation result.
2. The traffic state estimation method based on clustering and deep sequence learning of claim 1, wherein in step S1, the road section division rule of the expressway is:
S1.1, dividing the road network into a plurality of road sections according to the number and positions of the ramps in the expressway network, the positions where the number of lanes changes, and the positions where the curvature radius of the road changes, wherein each such road section is called a link; the ramps comprise on-ramps and off-ramps, and a change in the number of lanes is either an increase or a decrease;
S1.2, further dividing each link, according to its length, into a plurality of smaller sections of equal length, wherein each small section is called a cell and each cell is guaranteed to be balanced.
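Step S1.2 can be illustrated with a short sketch. The claim only requires that the cells of one link have equal length; the `target_cell_length_m` parameter and the rounding rule below are assumptions introduced for illustration.

```python
def split_link_into_cells(link_length_m, target_cell_length_m):
    """Split one link into equal-length cells; returns (start, end) offsets in metres."""
    if link_length_m <= 0 or target_cell_length_m <= 0:
        raise ValueError("lengths must be positive")
    # Choose the cell count whose equal cell length is closest to the target.
    n_cells = max(1, round(link_length_m / target_cell_length_m))
    cell_length = link_length_m / n_cells
    return [(i * cell_length, (i + 1) * cell_length) for i in range(n_cells)]

# A hypothetical 1000 m link with a 300 m target yields three cells of about 333 m.
cells = split_link_into_cells(1000.0, 300.0)
```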
3. The method according to claim 1, wherein in step S2, the expressway model is divided according to the road section division rule of S1, the traffic demand of each road section and the split ratios between the on- and off-ramps and the main road are set dynamically according to real data, and the historical traffic flow parameter data collected for each road section may cover a plurality of consecutive working days, simulated by changing the random seed of the software.
4. The traffic state estimation method based on clustering and deep sequence learning of claim 1, wherein in step S3, the collected data are normalized so that the different feature variables fall within a specific range, and the normalization formula is:
$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$ (1)
wherein $x'$ is the value after normalization, $x$ is the value before normalization, $x_{\min}$ is the minimum value in the data set, and $x_{\max}$ is the maximum value in the data set.
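A minimal sketch of this normalization, assuming the usual min-max form implied by the definitions of the minimum and maximum and by the target interval [0,1]. The function name and the handling of a constant feature are assumptions, not part of the claim.

```python
def min_max_normalize(values):
    """Scale one feature column into [0, 1] using its own minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]

# Hypothetical speed readings in km/h for one road section.
speeds = [20.0, 50.0, 80.0]
normalized = min_max_normalize(speeds)
```

Flow, speed and occupancy would each be normalized separately, so that no feature dominates the Euclidean distance used in the clustering step.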
5. The method according to claim 1, wherein in step S4, the traffic state is classified according to the fundamental diagram formed by the traffic flow data inside the cells and is divided into two types, free flow and congested flow, wherein in the free flow state traffic operation is stable and running vehicles are hardly affected by external factors, this state being represented by the number 0; and in the congested flow state traffic operation is extremely unstable and interference between vehicles is severe, this state being represented by the number 1.
6. The traffic state estimation method based on clustering and deep sequence learning according to claim 1, wherein in step S4, a kmeans clustering algorithm is used to cluster the historical traffic flow parameter data, each group of historical traffic flow parameter data is calibrated (state 0 or 1), and the mean values of the data in the different traffic state categories are calculated and compared, wherein the specific steps of the clustering algorithm are:
S4.1, determining the number k of cluster categories, and randomly selecting k data points from the data set as the initial cluster centers;
S4.2, calculating the Euclidean distance between each data point and each cluster center, and assigning each data point to the category of its nearest cluster center, wherein the calculation formula is expressed as:
$d(x_i, \mu_j) = \lVert x_i - \mu_j \rVert_2$ (2)
S4.3, calculating, according to the clustering result, the arithmetic mean of the data points contained in each category and replacing the previous cluster center with that value, wherein the update formula is expressed as:
$\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i$ (3)
wherein $x_i$ is the ith data point in the data set, $\mu_j$ is the center point of the jth cluster category, and k is the number of cluster categories;
S4.4, comparing the current cluster centers with the centers before the update; if they are the same, stopping the iteration and ending the algorithm; if not, returning to step S4.2 and continuing the iteration;
further, in step S4, the objective function of the algorithm is expressed as:
$E = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2$ (4)
where $E$ represents the squared error of the algorithm and $C_j$ denotes the jth cluster.
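Steps S4.1 to S4.4 can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions rather than the patented implementation: the function names, the seed, the iteration cap, the handling of an empty cluster, and the rule that the cluster with the higher mean occupancy is labeled congested (state 1) are all assumptions. The sample points are made-up normalized (flow, speed, occupancy) triples.

```python
import math
import random

def kmeans(points, k=2, seed=0, max_iter=100):
    """Cluster points (tuples of floats) following steps S4.1 to S4.4."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # S4.1: k random data points as initial centers
    labels = []
    for _ in range(max_iter):
        # S4.2: assign every point to its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # S4.3: replace each center with the arithmetic mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # assumption: keep an empty cluster's center
        # S4.4: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def label_states(labels, centers, occupancy_index=2):
    """Map cluster ids to states: 1 for the higher-occupancy cluster, else 0 (assumption)."""
    congested = max(range(len(centers)), key=lambda j: centers[j][occupancy_index])
    return [1 if lab == congested else 0 for lab in labels]

# Made-up normalized (flow, speed, occupancy) samples for one road section.
points = [(0.30, 0.90, 0.10), (0.35, 0.85, 0.15), (0.40, 0.80, 0.20),
          (0.50, 0.20, 0.80), (0.45, 0.15, 0.90), (0.40, 0.10, 0.85)]
labels, centers = kmeans(points, k=2)
states = label_states(labels, centers)
```

Because kmeans assigns arbitrary cluster ids, some rule like `label_states` is needed to decide which cluster corresponds to free flow and which to congested flow; the claim leaves that rule to a comparison of the category means.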
7. The traffic state estimation method based on clustering and deep sequence learning of claim 1, wherein in step S5, the deep sequence Seq2Seq model has a "many-to-many" structure, and the input and output data of the model need to be fitted before training: the traffic flow data and the state data of the expressway network are arranged in spatial order to obtain, for the road sections, a continuous traffic flow data sequence and a corresponding discrete binary traffic state sequence, and the Seq2Seq model is trained with the traffic flow data sequence as input and the state sequence as output; in addition, since the data of some road sections on an actual road network cannot be acquired in real time, the data of those road sections need to be removed from the input sequence when the model is trained, so that in later estimation the trained model can estimate the state sequence of the whole expressway from the data of only a part of the road sections.
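The data fitting described in this claim can be sketched as follows. The function name `build_training_pair`, the argument names, and the sample values are hypothetical; the sketch only shows the idea that the input keeps the observed road sections while the target keeps the states of all road sections.

```python
def build_training_pair(flow_by_segment, state_by_segment, observed_segments):
    """Return (input_sequence, target_sequence) for one time step.

    flow_by_segment: per-section feature tuples, in spatial order.
    state_by_segment: per-section 0/1 states, in the same order.
    observed_segments: indices of sections that have a detector.
    """
    if len(flow_by_segment) != len(state_by_segment):
        raise ValueError("flow and state sequences must cover the same sections")
    # Input: only the sections whose data can be collected in real time.
    inputs = [flow_by_segment[i] for i in sorted(observed_segments)]
    # Target: the state of every section, observed or not.
    targets = list(state_by_segment)
    return inputs, targets

# Made-up normalized (flow, speed, occupancy) data for four sections.
flows = [(0.30, 0.90, 0.10), (0.40, 0.80, 0.20), (0.50, 0.20, 0.80), (0.45, 0.15, 0.90)]
states = [0, 0, 1, 1]
inputs, targets = build_training_pair(flows, states, observed_segments=[0, 2])
```

The input and target sequences have different lengths, which is one reason an encoder-decoder structure suits this task.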
8. The traffic state estimation method based on clustering and deep sequence learning of claim 1, wherein in step S5, the Seq2Seq model is composed of two layers of LSTM neural networks, so as to address the vanishing gradient and exploding gradient problems faced by RNN neural networks; the first LSTM layer serves as the encoder and is responsible for analyzing the input traffic flow data sequence, and the formula of the encoder is:
$h_t = f(h_{t-1}, x_t)$ (5)
where $f$ denotes the activation function of the encoder, $x_t$ denotes the traffic flow data sequence at time t, and $h_t$ denotes the hidden state of the encoder at time t;
the second LSTM layer serves as the decoder and is responsible for analyzing the output of the encoder and calculating, according to a certain rule, the traffic state probability distribution sequence of the whole expressway, and the output formulas of the decoder and of the model are expressed as:
$s_t = f(s_{t-1}, y_{t-1}, c)$ (6)
$p(y_t \mid y_{t-1}, y_{t-2}, \ldots, y_1, c) = g(s_t, y_{t-1}, c)$ (7)
where $f$ denotes the activation function of the decoder, $c$ denotes the output of the encoder, $s_t$ denotes the hidden state of the decoder at time t, $y_t$ denotes the output of the decoder at time t, and $g$ denotes the activation function of the output layer.
9. The traffic state estimation method according to claim 1, wherein in step S5, an attention mechanism is introduced to optimize the existing model: a dynamic weight is added to the hidden state of the encoder at every time step, so that the output c of the encoder is updated dynamically and the model can obtain the more important information from the input sequence, and the update formulas are expressed as:
$c_i = \sum_{j} a_{ij} h_j$ (8)
$a_{ij} = \frac{\exp(e_{ij})}{\sum_{m} \exp(e_{im})}$ (9)
$e_{ij} = a(s_{i-1}, h_j)$ (10)
wherein $c_i$ denotes the dynamically updated encoder output, $a_{ij}$ denotes the weight between the jth hidden state of the encoder and the ith hidden state of the decoder, and $e_{ij}$ denotes an alignment model that measures the correlation between the jth hidden state of the encoder and the ith hidden state of the decoder.
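The attention update can be sketched in plain Python as below, assuming the common form in which the weights are a softmax over the alignment scores and the context is the weighted sum of the encoder hidden states. The alignment function is not specified in the claim; a dot product is used here purely as a stand-in, and the function name and sample vectors are hypothetical.

```python
import math

def attention_context(prev_decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    prev_decoder_state: the decoder hidden state s_{i-1}, a list of floats.
    encoder_states: the encoder hidden states h_j, each a list of the same length.
    """
    # Alignment scores e_ij = a(s_{i-1}, h_j); dot product is an assumed choice of a.
    scores = [sum(s * h for s, h in zip(prev_decoder_state, h_j)) for h_j in encoder_states]
    # Softmax over the scores, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    # Context c_i: weighted sum of the encoder hidden states.
    dim = len(encoder_states[0])
    context = [sum(w * h_j[d] for w, h_j in zip(weights, encoder_states)) for d in range(dim)]
    return weights, context

# Made-up two-dimensional hidden states for three encoder positions.
weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
```

Encoder positions whose hidden state aligns better with the previous decoder state receive larger weights, so the context vector changes from one decoder step to the next instead of staying fixed.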
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010090595.XA CN111292534A (en) 2020-02-13 2020-02-13 Traffic state estimation method based on clustering and deep sequence learning

Publications (1)

Publication Number Publication Date
CN111292534A true CN111292534A (en) 2020-06-16

Family

ID=71024383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010090595.XA Pending CN111292534A (en) 2020-02-13 2020-02-13 Traffic state estimation method based on clustering and deep sequence learning

Country Status (1)

Country Link
CN (1) CN111292534A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190122373A1 (en) * 2018-12-10 2019-04-25 Intel Corporation Depth and motion estimations in machine learning environments
CN109785629A (en) * 2019-02-28 2019-05-21 北京交通大学 A kind of short-term traffic flow forecast method
CN110210509A (en) * 2019-03-04 2019-09-06 广东交通职业技术学院 A kind of road net traffic state method of discrimination based on MFD+ spectral clustering+SVM
CN110648527A (en) * 2019-08-20 2020-01-03 浙江工业大学 Traffic speed prediction method based on deep learning model

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815046B (en) * 2020-07-06 2024-03-22 北京交通大学 Traffic flow prediction method based on deep learning
CN111815046A (en) * 2020-07-06 2020-10-23 北京交通大学 Traffic flow prediction method based on deep learning
CN111862605B (en) * 2020-07-20 2022-03-08 腾讯科技(深圳)有限公司 Road condition detection method and device, electronic equipment and readable storage medium
CN111862605A (en) * 2020-07-20 2020-10-30 腾讯科技(深圳)有限公司 Road condition detection method and device, electronic equipment and readable storage medium
CN112149588A (en) * 2020-09-28 2020-12-29 北京工业大学 Pedestrian attitude estimation-based intelligent elevator dispatching method
CN112149588B (en) * 2020-09-28 2024-05-28 北京工业大学 Intelligent elevator dispatching method based on pedestrian attitude estimation
CN112885085A (en) * 2021-01-15 2021-06-01 北京航空航天大学 Confluence control strategy applied to reconstruction and extension of highway construction area
CN112884222B (en) * 2021-02-10 2022-06-14 武汉大学 Time-period-oriented LSTM traffic flow density prediction method
CN112884222A (en) * 2021-02-10 2021-06-01 武汉大学 Time-period-oriented LSTM traffic flow density prediction method
CN113762338B (en) * 2021-07-30 2023-08-25 湖南大学 Traffic flow prediction method, equipment and medium based on multiple graph attention mechanism
CN113762338A (en) * 2021-07-30 2021-12-07 湖南大学 Traffic flow prediction method, equipment and medium based on multi-graph attention mechanism
CN113935090A (en) * 2021-10-11 2022-01-14 大连理工大学 Random traffic flow fine simulation method for bridge vehicle-induced fatigue analysis
CN113935090B (en) * 2021-10-11 2022-12-02 大连理工大学 Random traffic flow fine simulation method for bridge vehicle-induced fatigue analysis
CN114999181A (en) * 2022-05-11 2022-09-02 山东高速建设管理集团有限公司 ETC system data-based highway vehicle speed abnormity identification method
CN116543560A (en) * 2023-07-05 2023-08-04 深圳市诚识科技有限公司 Intelligent road condition prediction system and method based on deep learning
CN116543560B (en) * 2023-07-05 2023-09-22 深圳市诚识科技有限公司 Intelligent road condition prediction system and method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200616