CN113744525A - Traffic distribution prediction method based on feature extraction and deep learning - Google Patents

Traffic distribution prediction method based on feature extraction and deep learning

Info

Publication number
CN113744525A
Authority
CN
China
Prior art keywords
traffic
cell
departure
deep learning
feature extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110941891.0A
Other languages
Chinese (zh)
Inventor
王炜
于维杰
秦韶阳
华雪东
赵德
陈思远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-08-17
Filing date
2021-08-17
Publication date
2021-12-03
Application filed by Southeast University
Priority to CN202110941891.0A
Publication of CN113744525A
Legal status: Pending

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data

Abstract

The invention discloses a traffic distribution prediction method based on feature extraction and deep learning. Features of a departure traffic cell and an arrival traffic cell are input into a trained deep learning prediction model to obtain the predicted traffic distribution, namely the traffic volume between the departure traffic cell and the arrival traffic cell. The method predicts the traffic distribution among traffic cells with high precision, provides a basis for traffic planning and traffic control, and has high value for popularization and application.

Description

Traffic distribution prediction method based on feature extraction and deep learning
Technical Field
The invention belongs to the field of urban traffic, and particularly relates to a traffic distribution prediction method.
Background
In recent years, with the increasingly rapid urbanization of China, the level of urbanization has gradually risen, and the contradiction between a growing population and limited land resources has caused a series of urban traffic problems, such as traffic congestion and exhaust pollution. In terms of traffic distribution, the imbalance in the distribution of traffic demand has become increasingly prominent and urban congestion has become more serious; relying only on traffic management schemes and extension of the urban road network cannot fundamentally solve the imbalance in demand distribution. Traffic distribution therefore needs to be predicted on the basis of multivariate urban features and a deep learning algorithm, so that traffic planning and traffic control can be carried out from the perspective of balancing traffic demand, relieving urban traffic congestion at the present stage.
At present, with the rapid development of sensing technology, the comprehensive coverage of mobile communication networks, the popularization of smartphones and the wide use of electronic maps, point of interest (POI) data in cities provides data support for research on residents' travel characteristics and analysis of travel demand, and lays a foundation for predicting the distribution of urban traffic demand. As research on deep learning algorithms has deepened, the accuracy and efficiency of traffic distribution prediction have greatly improved, which provides the conditions for predicting the distribution of urban traffic demand.
Although much research has focused on urban traffic demand prediction and the analysis of its spatiotemporal distribution, various limitations remain. Existing research mostly focuses on the influence of a single land use type on traffic distribution, and few scholars have studied and predicted traffic distribution on the basis of population structure features, traffic demand features, land use features and travel distance features together. The existing research results are therefore limited.
Disclosure of Invention
In order to solve the technical problems mentioned in the background art, the invention provides a traffic distribution prediction method based on feature extraction and deep learning.
To achieve this technical purpose, the technical scheme of the invention is as follows:
A traffic distribution prediction method based on feature extraction and deep learning comprises inputting the features of a departure traffic cell and an arrival traffic cell into a trained deep learning prediction model to obtain the predicted traffic distribution, namely the traffic volume between the departure traffic cell and the arrival traffic cell;
the deep learning prediction model is constructed by the following steps:
(1) data acquisition: collecting historical travel data of residents, traffic cell division data, urban population data and urban point of interest data;
(2) data arrangement: comprising two steps, spatial data matching and traffic distribution extraction;
the spatial data matching matches the departure position and the arrival position of each piece of historical travel data with the traffic cell boundaries to determine the departure traffic cell and the arrival traffic cell;
the traffic distribution extraction counts the departure traffic cells and arrival traffic cells of all historical travel data to obtain the traffic volume between every two traffic cells;
(3) feature extraction: extracting the features that influence the traffic volume between traffic cells, comprising population structure feature extraction, traffic demand feature extraction, land use feature extraction and travel distance feature extraction;
(4) feature screening;
(5) model construction: constructing a deep learning prediction model whose input is the features screened in step (4) and whose output is the traffic distribution.
Further, in step (1), the historical travel data of residents comprises the travel date, departure time, departure position, arrival time and arrival position;
the traffic cell division data comprises the traffic cell number, traffic cell boundary and traffic cell area;
the urban population data comprises the number, age and gender of the permanent population;
the urban point of interest data comprises the point of interest type and the point of interest position.
Further, the point of interest types comprise catering services, shopping services, science, education and culture services, public facilities, corporate enterprises, transportation facility services, financial and insurance services, commercial residences, living services, healthcare services, government agencies and social groups, and lodging services.
Further, in step (1), in the collected historical travel data of residents and urban point of interest data, the departure position, the arrival position and the point of interest position are all expressed as longitude and latitude, and the traffic cell boundaries are the existing street division boundaries.
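As an illustrative sketch only (it is not part of the original disclosure), the four data sets of step (1) could be represented by record types such as the following. All class and field names are hypothetical, the area unit and the grouping of population by traffic cell are assumptions, and the sample point of interest reuses the first row of Table 1 of the embodiment.

```python
from dataclasses import dataclass
from datetime import date, time
from typing import Dict, List, Tuple

# (longitude, latitude) in decimal degrees, as stated in step (1).
LonLat = Tuple[float, float]


@dataclass(frozen=True)
class TripRecord:
    """One piece of historical travel data (five fields)."""
    travel_date: date
    departure_time: time
    departure_position: LonLat
    arrival_time: time
    arrival_position: LonLat


@dataclass(frozen=True)
class TrafficCell:
    """Traffic cell division data."""
    cell_id: int
    boundary: List[LonLat]  # polygon vertices of the street boundary
    area_km2: float         # unit is an assumption; the text gives none


@dataclass(frozen=True)
class CellPopulation:
    """Permanent population data; keying it by cell is an assumption."""
    cell_id: int
    permanent_population: int
    population_by_age_group: Dict[str, int]
    population_by_gender: Dict[str, int]


@dataclass(frozen=True)
class PointOfInterest:
    """Point of interest data: a type and a position."""
    poi_type: str
    position: LonLat


# First row of Table 1 in the embodiment.
example_poi = PointOfInterest("Catering service", (118.713855, 32.215343))
```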
Further, in step (3), the population structure feature extraction extracts the number, density, gender structure and age structure of the permanent population of each traffic cell;
the traffic demand feature extraction extracts the daily departure traffic volume, arrival traffic volume, per-capita number of departures and per-capita number of arrivals of each traffic cell;
the land use feature extraction extracts the point of interest density of each traffic cell and the proportion of each point of interest type;
the travel distance feature extraction extracts the straight-line distance between every two traffic cells.
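Because positions are expressed as longitude and latitude, one possible way to obtain a straight-line distance between two traffic cells is the great-circle (haversine) distance between one representative point per cell. The sketch below is written under that assumption: the text does not state which point of a cell or which distance formula is used, and the function names, the centroid input and the Earth radius value are hypothetical choices.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pairwise_distances(centroids):
    """Distance for every unordered pair of cells.

    `centroids` maps a cell id to a (longitude, latitude) representative point.
    """
    return {
        (a, b): haversine_km(*centroids[a], *centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }
```

Only unordered pairs are stored because the distance is symmetric; a model that treats (departure, arrival) as ordered can look up either order.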
Further, the per-capita number of departures and the per-capita number of arrivals of a traffic cell are obtained by dividing the departure traffic volume and the arrival traffic volume of the traffic cell, respectively, by the permanent population of the traffic cell.
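A minimal sketch of the traffic demand features follows, assuming the traffic volume between cells is already available as a mapping from (departure cell, arrival cell) to a daily trip count. The function name, the dictionary layout and the output keys are hypothetical.

```python
from collections import defaultdict


def demand_features(od_volume, population):
    """Departure/arrival volumes and per-capita rates for each traffic cell.

    od_volume:  {(departure_cell, arrival_cell): trips}
    population: {cell: permanent population}
    """
    departures = defaultdict(int)
    arrivals = defaultdict(int)
    for (origin, destination), trips in od_volume.items():
        departures[origin] += trips       # row sum of the OD matrix
        arrivals[destination] += trips    # column sum of the OD matrix

    features = {}
    for cell, residents in population.items():
        if residents <= 0:
            raise ValueError(f"cell {cell} has no permanent population")
        features[cell] = {
            "departures": departures[cell],
            "arrivals": arrivals[cell],
            # volume divided by permanent population
            "departures_per_capita": departures[cell] / residents,
            "arrivals_per_capita": arrivals[cell] / residents,
        }
    return features
```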
Further, in step (4), the feature screening comprises two steps, correlation analysis and significance analysis;
correlation analysis: for the features extracted in step (3), a correlation coefficient is calculated between every two features; if the correlation coefficient of two features is greater than 0.3, the correlation coefficient between each of the two features and the traffic distribution extracted in step (2) is calculated, and the feature with the smaller correlation coefficient is deleted;
significance analysis: regression analysis is performed between the features retained after the correlation analysis and the traffic distribution extracted in step (2), and features whose significance value (p-value) is greater than 0.05 are removed.
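The correlation analysis step can be sketched as follows. This is an illustration under stated assumptions rather than the applicants' implementation: Pearson correlation is assumed, coefficients are compared by absolute value, feature pairs are visited in insertion order, and the function names are hypothetical. The significance analysis step is not covered by this sketch.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear correlation
    return sxy / math.sqrt(sxx * syy)


def screen_by_correlation(features, target, threshold=0.3):
    """Drop the weaker feature of every pair correlated above `threshold`.

    features: {name: list of values}; target: list of traffic volumes.
    Returns the names of the retained features.
    """
    kept = list(features)
    for a, b in combinations(list(features), 2):
        if a not in kept or b not in kept:
            continue  # one of the two was already deleted
        if abs(pearson(features[a], features[b])) > threshold:
            corr_a = abs(pearson(features[a], target))
            corr_b = abs(pearson(features[b], target))
            # delete the feature less correlated with the traffic volume
            kept.remove(a if corr_a < corr_b else b)
    return kept
```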
Further, in step (5), the mean square error is calculated during model training; when the mean square error falls below 0.03, training is stopped and the model construction is complete.
The above technical scheme brings the following beneficial effects:
First, spatial data matching and traffic distribution extraction are carried out on the basis of historical travel data of residents, traffic cell division data, urban population data and urban point of interest data, and a deep learning prediction model is established through the extraction and screening of population structure features, traffic demand features, land use features and travel distance features. Second, traffic distribution prediction based on feature extraction and deep learning can predict the traffic distribution among all traffic cells with high precision, providing a basis for traffic planning and traffic control.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a schematic diagram of the traffic cell division data in the embodiment;
FIG. 3 is the correlation matrix diagram in the embodiment.
Detailed Description
The technical scheme of the invention is explained in detail below with reference to the accompanying drawings.
This embodiment uses historical travel data of Nanjing residents, traffic cell division data, Nanjing population data and Nanjing point of interest data from 2019, and carries out Nanjing traffic distribution prediction based on feature extraction and deep learning according to the data processing steps of the technical scheme. The method flow is shown in FIG. 1 and comprises the following 6 steps:
(1) Data acquisition: this example takes Nanjing as the study area. Historical travel data of Nanjing residents, traffic cell division data, Nanjing population data and Nanjing point of interest data are collected. The historical travel data comprises five fields: travel date, departure time, departure position, arrival time and arrival position. The population data comprises the number, age and gender of the permanent population. Taking the existing street division boundaries as traffic cell boundaries, the administrative districts of Nanjing are divided into 3324 traffic cells, as shown in FIG. 2. The point of interest data comprises the point of interest type and position, as shown in Table 1, and the point of interest types are divided into 12 categories: catering services, shopping services, science, education and culture services, public facilities, corporate enterprises, transportation facility services, financial and insurance services, commercial residences, living services, healthcare services, government agencies and social groups, and lodging services.
TABLE 1 Point of interest data of Nanjing
Type                    Longitude     Latitude
Catering service        118.713855    32.215343
Catering service        118.713702    32.214973
Living service          118.738974    32.067836
Living service          118.73895     32.06745
……                      ……            ……
Commercial residence    118.865525    31.777959
Commercial residence    118.869361    31.77501
(2) Data arrangement: comprising two steps, spatial data matching and traffic distribution extraction. Spatial data matching: the departure position and the arrival position of each piece of historical travel data are matched with the traffic cell boundaries to determine the numbers of the traffic cells in which they lie, namely the departure traffic cell and the arrival traffic cell. Traffic distribution extraction: the departure traffic cells and arrival traffic cells of all historical travel data are counted to obtain the traffic volume between every two traffic cells.
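The two sub-steps of data arrangement can be sketched as below. The sketch assumes each traffic cell boundary is a simple polygon given as a list of (longitude, latitude) vertices and uses a standard ray-casting point-in-polygon test; the text does not name the matching algorithm, so this choice, the function names, the first-match rule and the skipping of trips that fall outside every cell are all assumptions.

```python
from collections import Counter


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def match_cell(lon, lat, cells):
    """Return the id of the first cell containing the point, else None."""
    for cell_id, polygon in cells.items():
        if point_in_polygon(lon, lat, polygon):
            return cell_id
    return None


def extract_od(trips, cells):
    """Count trips between every (departure cell, arrival cell) pair.

    trips: iterable of ((lon, lat) departure, (lon, lat) arrival)
    cells: {cell_id: polygon}
    """
    od = Counter()
    for departure, arrival in trips:
        origin = match_cell(*departure, cells)
        destination = match_cell(*arrival, cells)
        if origin is not None and destination is not None:
            od[(origin, destination)] += 1
    return od
```

With 3324 cells and a large trip table, a linear scan over all polygons per point would be slow; a spatial index would normally be added, which this sketch omits for brevity.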
(3) Feature extraction: comprising four steps, population structure feature extraction, traffic demand feature extraction, land use feature extraction and travel distance feature extraction. Population structure feature extraction: the number, density, gender structure and age structure of the permanent population of each traffic cell are extracted. The number, gender structure and age structure of the permanent population in this example are shown in Table 2. Traffic demand feature extraction: the daily departure traffic volume, arrival traffic volume, per-capita number of departures and per-capita number of arrivals of each traffic cell are extracted. Land use feature extraction: the point of interest density of each traffic cell and the proportion of each point of interest type are extracted. The proportions of the different point of interest types in this example are shown in Table 3. Travel distance feature extraction: the straight-line distance between every two traffic cells is extracted.
TABLE 2 Population structure feature extraction results
[Table 2 is an image in the original publication: Figures BDA0003215376330000051 and BDA0003215376330000061]
TABLE 3 Statistics of point of interest type proportions
Category                                   Number    Proportion
Catering services                           49276    16.93%
Public facilities                            4576     1.57%
Corporate enterprises                       44181    15.18%
Shopping services                           71271    24.49%
Transportation facility services            23620     8.12%
Science, education and culture services     15134     5.20%
Financial and insurance services             6566     2.26%
Living services                             45251    15.55%
Healthcare services                          5734     1.97%
Government agencies and social groups        9291     3.19%
Lodging services                             5907     2.03%
Commercial residences                       10236     3.52%
Total                                      291043   100%
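The proportions in Table 3 follow directly from the counts, as the short sketch below shows. The counts are those of Table 3 and the category names follow the 12 point of interest types listed in step (1); the function names are hypothetical, and the density helper assumes an area in square kilometres, a unit the text does not specify.

```python
# Point of interest counts taken from Table 3.
poi_counts = {
    "Catering services": 49276,
    "Public facilities": 4576,
    "Corporate enterprises": 44181,
    "Shopping services": 71271,
    "Transportation facility services": 23620,
    "Science, education and culture services": 15134,
    "Financial and insurance services": 6566,
    "Living services": 45251,
    "Healthcare services": 5734,
    "Government agencies and social groups": 9291,
    "Lodging services": 5907,
    "Commercial residences": 10236,
}


def poi_proportions(counts):
    """Percentage share of each point of interest type."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


def poi_density(n_points, area_km2):
    """Points of interest per unit area of a traffic cell."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return n_points / area_km2


shares = poi_proportions(poi_counts)
for category, share in shares.items():
    print(f"{category:<42}{share:6.2f}%")
```

Table 3 is city-wide; applying the same two functions to the points of interest that fall inside a single traffic cell gives the per-cell land use features described in step (3).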
(4) Feature screening: comprising two steps, correlation analysis and significance analysis. Correlation analysis: correlation analysis is carried out on the features extracted in step (3); the correlation matrix is shown in FIG. 3. As the correlation matrix shows, land use intensity, per-capita attraction volume, the shopping POI proportion and the company POI proportion each have a correlation greater than 0.3 with at least one other index, so these four indexes are deleted from the subsequent analysis. Significance analysis: regression analysis is performed between the features remaining after the correlation analysis and the traffic distribution, and features whose significance value (p-value) is greater than 0.05 are removed.
(5) Model construction: a deep neural network model is constructed whose input is the features screened in step (4) and whose output is the traffic volume between every two traffic cells, namely the traffic distribution. The model parameters take the following values: the maximum number of training iterations is 100, the initial learning rate of the network is 0.1, the expected learning error of the network is 0.0004, the input layer has 13 neurons, the hidden layer has 3 neurons and the output layer has 1 neuron. The mean square error is calculated during training; when the mean square error falls below 0.03, training is stopped and the model construction is complete.
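A rough sketch of a network with the stated shape (13 inputs, 3 hidden neurons, 1 output), learning rate 0.1, at most 100 training passes and a stop once the mean square error falls below 0.03 is given below. Everything else is an assumption made for illustration: the tanh activation, the linear output, per-sample gradient descent, the weight initialisation, the function name, and the synthetic data scaled to [0, 1]. The expected learning error of 0.0004 mentioned in the text is not used here.

```python
import math
import random


def train_mlp(samples, targets, n_hidden=3, lr=0.1, max_epochs=100,
              mse_stop=0.03, seed=0):
    """Train a one-hidden-layer network; return (predict, mse, epochs)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
          for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = [0.0]  # one-element list so the nested function sees updates

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(w1, b1)]
        return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2[0]

    mse, epoch = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        for x, t in zip(samples, targets):
            hidden, y = forward(x)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for j in range(n_hidden):
                grad_h = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2[0] -= lr * err
        # Mean square error over the training set after this pass.
        mse = sum((forward(x)[1] - t) ** 2
                  for x, t in zip(samples, targets)) / len(samples)
        if mse < mse_stop:
            break  # stopping rule stated in step (5)
    return (lambda x: forward(x)[1]), mse, epoch


# Synthetic stand-in data: 13 screened features per cell pair, scaled to [0, 1].
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(13)] for _ in range(50)]
y = [sum(row) / 13 for row in X]
predict, final_mse, epochs_run = train_mlp(X, y)
```

Scaling the inputs and the target to a common range before training is assumed here because an absolute threshold such as 0.03 only has a stable meaning on normalised values.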
(6) Distribution prediction: the constructed deep neural network model is used to predict the traffic distribution. The mean square error (MSE) between the predicted values and the actual values is 1.625 × 10^-3 and the mean absolute percentage error (MAPE) is 2.543%, which shows that the method achieves high-precision prediction of traffic distribution based on feature extraction and deep learning.
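For reference, the two error measures can be computed as in the following sketch, using their standard definitions. The function names are hypothetical, and rejecting zero actual values in MAPE is an assumption of this sketch; the text does not say how cell pairs with no observed trips are handled, nor on what scale the reported MSE was computed.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)
```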
The embodiments merely illustrate the technical idea of the present invention and do not limit its scope of protection; any modification made on the basis of the technical scheme in accordance with the technical idea of the present invention falls within the scope of protection of the present invention.

Claims (8)

1. A traffic distribution prediction method based on feature extraction and deep learning, characterized by comprising: inputting the features of a departure traffic cell and an arrival traffic cell into a trained deep learning prediction model to obtain the predicted traffic distribution, namely the traffic volume between the departure traffic cell and the arrival traffic cell;
the deep learning prediction model is constructed by the following steps:
(1) data acquisition: collecting historical travel data of residents, traffic cell division data, urban population data and urban point of interest data;
(2) data arrangement: comprising two steps, spatial data matching and traffic distribution extraction;
the spatial data matching matches the departure position and the arrival position of each piece of historical travel data with the traffic cell boundaries to determine the departure traffic cell and the arrival traffic cell;
the traffic distribution extraction counts the departure traffic cells and arrival traffic cells of all historical travel data to obtain the traffic volume between every two traffic cells;
(3) feature extraction: extracting the features that influence the traffic volume between traffic cells, comprising population structure feature extraction, traffic demand feature extraction, land use feature extraction and travel distance feature extraction;
(4) feature screening;
(5) model construction: constructing a deep learning prediction model whose input is the features screened in step (4) and whose output is the traffic distribution.
2. The traffic distribution prediction method based on feature extraction and deep learning of claim 1, wherein: in step (1), the historical travel data of residents comprises the travel date, departure time, departure position, arrival time and arrival position;
the traffic cell division data comprises the traffic cell number, traffic cell boundary and traffic cell area;
the urban population data comprises the number, age and gender of the permanent population;
the urban point of interest data comprises the point of interest type and the point of interest position.
3. The traffic distribution prediction method based on feature extraction and deep learning of claim 2, wherein: the point of interest types comprise catering services, shopping services, science, education and culture services, public facilities, corporate enterprises, transportation facility services, financial and insurance services, commercial residences, living services, healthcare services, government agencies and social groups, and lodging services.
4. The traffic distribution prediction method based on feature extraction and deep learning of claim 1, wherein: in step (1), in the collected historical travel data of residents and urban point of interest data, the departure position, the arrival position and the point of interest position are all uniformly expressed as longitude and latitude, and the traffic cell boundaries are the existing street division boundaries.
5. The traffic distribution prediction method based on feature extraction and deep learning of claim 1, wherein: in step (3), the population structure feature extraction extracts the number, density, gender structure and age structure of the permanent population of each traffic cell;
the traffic demand feature extraction extracts the daily departure traffic volume, arrival traffic volume, per-capita number of departures and per-capita number of arrivals of each traffic cell;
the land use feature extraction extracts the point of interest density of each traffic cell and the proportion of each point of interest type;
the travel distance feature extraction extracts the straight-line distance between every two traffic cells.
6. The traffic distribution prediction method based on feature extraction and deep learning of claim 5, wherein: the per-capita number of departures and the per-capita number of arrivals of a traffic cell are obtained by dividing the departure traffic volume and the arrival traffic volume of the traffic cell, respectively, by the permanent population of the traffic cell.
7. The traffic distribution prediction method based on feature extraction and deep learning of claim 1, wherein: in step (4), the feature screening comprises two steps, correlation analysis and significance analysis;
correlation analysis: for the features extracted in step (3), a correlation coefficient is calculated between every two features; if the correlation coefficient of two features is greater than 0.3, the correlation coefficient between each of the two features and the traffic distribution extracted in step (2) is calculated, and the feature with the smaller correlation coefficient is deleted;
significance analysis: regression analysis is performed between the features retained after the correlation analysis and the traffic distribution extracted in step (2), and features whose significance value (p-value) is greater than 0.05 are removed.
8. The traffic distribution prediction method based on feature extraction and deep learning of claim 1, wherein: in step (5), the mean square error is calculated during model training; when the mean square error falls below 0.03, training is stopped and the model construction is complete.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110941891.0A CN113744525A (en) 2021-08-17 2021-08-17 Traffic distribution prediction method based on feature extraction and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110941891.0A CN113744525A (en) 2021-08-17 2021-08-17 Traffic distribution prediction method based on feature extraction and deep learning

Publications (1)

Publication Number Publication Date
CN113744525A true CN113744525A (en) 2021-12-03

Family

ID=78731483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110941891.0A Pending CN113744525A (en) 2021-08-17 2021-08-17 Traffic distribution prediction method based on feature extraction and deep learning

Country Status (1)

Country Link
CN (1) CN113744525A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114694378A (en) * 2022-03-21 2022-07-01 东南大学 Two-stage traffic distribution prediction method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276947A (en) * 2019-06-05 2019-09-24 中国科学院深圳先进技术研究院 A kind of traffic convergence analysis prediction technique, system and electronic equipment
CN110910659A (en) * 2019-11-29 2020-03-24 腾讯云计算(北京)有限责任公司 Traffic flow prediction method, device, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
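张政 et al.: "Research on a method for inferring land use types based on travel characteristics", 《交通运输系统工程与信息》, no. 05, 15 October 2020 (2020-10-15) *
张政 et al.: "Research on identification and prediction of spatiotemporal travel characteristics of urban areas based on ride-hailing data", 《交通运输系统工程与信息》, no. 03, 15 June 2020 (2020-06-15) *
李雪琪: "Research on traffic generation prediction based on POI data and its software implementation", 《中国优秀博硕士论文全文数据库(硕士) 技术科学辑》, no. 6, 15 June 2020 (2020-06-15) *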

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114694378A (en) * 2022-03-21 2022-07-01 东南大学 Two-stage traffic distribution prediction method
CN114694378B (en) * 2022-03-21 2023-02-14 东南大学 Two-stage traffic distribution prediction method

Similar Documents

Publication Publication Date Title
CN107247938B (en) high-resolution remote sensing image urban building function classification method
CN110753307B (en) Method for acquiring mobile phone signaling track data with label based on resident survey data
CN112966899B (en) Urban public service facility construction decision method influencing population density
CN108717676A (en) Evaluation space method and system are lived in duty under different scale based on multi-data fusion
CN106503829A (en) A kind of crowding Forecasting Methodology of the Urban Public Open Space based on multi-source data
CN107527240B (en) System and method for identifying public praise marketing effect of operator industry product
CN112954623B (en) Resident occupancy rate estimation method based on mobile phone signaling big data
CN113033110B (en) Important area personnel emergency evacuation system and method based on traffic flow model
CN115641718B (en) Short-time traffic flow prediction method based on bayonet flow similarity and semantic association
CN113554466A (en) Short-term power consumption prediction model construction method, prediction method and device
CN115809378A (en) Medical shortage area identification and layout optimization method based on mobile phone signaling data
CN115034524A (en) Method, system and storage medium for predicting working population based on mobile phone signaling
CN115099542A (en) Cross-city commuting trip generation and distribution prediction method, electronic device and storage medium
CN114662774A (en) City block vitality prediction method, storage medium and terminal
CN111461197A (en) Spatial load distribution rule research method based on feature extraction
CN113744525A (en) Traffic distribution prediction method based on feature extraction and deep learning
CN108399736B (en) Service time-based method for acquiring number of effective bicycles shared by regions
CN113256978A (en) Method and system for diagnosing urban congestion area and storage medium
CN113537596A (en) Short-time passenger flow prediction method for new line station of urban rail transit
CN110852547B (en) Public service facility grading method based on position data and clustering algorithm
CN113408867B (en) Urban burglary crime risk assessment method based on mobile phone user and POI data
CN113298314B (en) Rail transit passenger flow prediction method considering dynamic space-time correlation
Yu et al. Vulnerability assessment and spatiotemporal differentiation of provinces tourism economic system based on the projection pursuit clustering model
CN115730731A (en) Automatic identification method and display platform for urban high-carbon emptying room unit
CN115115233A (en) Method for determining transfer service level between public transport and subway

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211203