CN115658666A - Multi-source data fusion sparse traffic flow completion method and system - Google Patents

Info

Publication number
CN115658666A
Authority
CN
China
Prior art keywords
data
traffic flow
intersection
missing
functional area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211247900.7A
Other languages
Chinese (zh)
Inventor
黄河
孙玉娥
娄陈
杜扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Application filed by Suzhou University
Priority to CN202211247900.7A
Publication of CN115658666A

Abstract

The invention relates to a sparse traffic flow completion method for multi-source data fusion, which comprises: constructing a traffic network as a directed graph and dividing the directed graph into a plurality of functional areas, with intersections belonging to the same functional area grouped together; calculating the traffic conditions of all intersections in each functional area and completing the data of the missing intersections, wherein the traffic conditions are represented by the average speed of public transport vehicles; and, taking the functional area as a unit, constructing a fitting function between the continuous traffic flow and the average speed of public transport vehicles and the weather change data, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and that intersection's average public transport vehicle speed and weather change data. By considering the similarity of measuring points within a functional area and the influence of weather changes on traffic flow, the method fuses multiple kinds of data for completion, has good robustness and accuracy, can recover data for intersections that have no historical traffic flow, and realizes traffic flow statistics for all intersections.

Description

Multi-source data fusion sparse traffic flow completion method and system
Technical Field
The invention relates to the technical field of intelligent traffic, in particular to a sparse traffic flow completion method and system for multi-source data fusion.
Background
With rapid economic development, traffic congestion has become increasingly serious and has a severe negative impact on the social and economic activities of cities. To reduce the burden on the underlying road network, the Intelligent Transportation System (ITS) has emerged. Traffic flow measurement plays a significant role in an ITS; it aims to determine the traffic conditions of different road links and is an important step toward active congestion control. Many tasks, such as trip planning, road engineering, and infrastructure planning, benefit from traffic flow measurements. Internet of Vehicles technology integrates wireless communication and computing into the traffic system, allows wireless data exchange between vehicles and roadside equipment, and enables large-scale, complex traffic flow measurement. However, for cost reasons, traffic information acquisition devices are not installed at relatively small intersections, so their flow information cannot be obtained directly. In addition, sensor failures, communication network problems, limited power, inclement weather, or sensor aging can cause data loss. Many ITS applications depend on complete data, and data loss seriously hinders the adoption of ITS and its applications. It is therefore necessary to recover missing traffic data to better support tasks such as traffic management.
At present, missing traffic flow recovery algorithms fall into three main types: prediction-based methods, interpolation-based methods, and statistical-learning-based methods. Prediction-based methods build a prediction model from historical data and use it to predict the values of the missing data. Early one-directional prediction methods relied mainly on temporal neighborhood information, such as the autoregressive integrated moving average ARIMA (Nihan N L. Aid to determining freeway metering rates and detecting loop errors [J]. Journal of Transportation Engineering, 1997, 123(6): 454-458.), the feedforward neural network FFNN (Vlahogianni E I, Karlaftis M G, Golias J C. Optimized and meta-optimized neural networks for short-term traffic flow prediction: A genetic approach [J]. Transportation Research Part C: Emerging Technologies, 2005, 13(3): 211-234.), and data augmentation DA (Smith B L, Scherer W T, Conklin J H. Exploring imputation techniques for missing data in transportation management systems [J]. Transportation Research Record, 2003, 1836.). These methods use only temporal neighborhood information and do not capture random fluctuations in traffic flow. Therefore, more and more research takes both spatial and temporal dependencies into account and uses matrix-based methods for prediction. Typical examples are the adaptive least absolute shrinkage and selection operator (LASSO) of Sun et al. (Sun S, Huang R, Gao Y. Network-scale traffic modeling and forecasting with graphical lasso and neural networks [J]. Journal of Transportation Engineering, 2012, 138(11): 1358-1367.) and the extended Kalman filter method (Gazis D, Liu C. Kalman filtering estimation of traffic counts for two network links in tandem [J]. Transportation Research Part B: Methodological, 2003, 37(8): 737-745.).
Interpolation-based methods repair missing data using data collected near it. Mainly through regression and clustering of spatio-temporal information, they replace the missing values with the average or weighted average of known multidimensional data from adjacent states of the same collection point or from adjacent collection points, and are divided into temporal-neighbor and spatial-neighbor interpolation. Typical examples are the improved KNN local least squares method of Chang et al. (Chang G, Zhang Y, Yao D. Missing data imputation for traffic flow based on improved local least squares [J]. Tsinghua Science and Technology, 2012, 17(3): 304-309.) and the FCM clustering method of Li et al. (Li D, Gu H, Zhang L. A fuzzy c-means clustering algorithm based on nearest-neighbor intervals for incomplete data [J]. Expert Systems with Applications, 2010, 37(10): 6942-6947.).
Statistical-learning-based methods train a statistical model by fitting and mapping the observed data, then estimate the missing data multiple times and perform statistical inference in an iterative process. A statistical model is a model based on probability theory and mathematical statistics: when a model cannot be derived by theoretical analysis, the functional relationship between variables is obtained from experimental data and mathematical statistics. Classical statistical models include the KPPCA algorithm of Li et al. (Li L, Li Y, Li Z. Efficient missing data imputing for traffic flow by considering temporal and spatial dependence [J]. Transportation Research Part C: Emerging Technologies, 2013, 34.) and the Tucker-decomposition-based interpolation method of Tan et al. (Tan H, Feng G, Feng J, et al. A tensor-based method for missing traffic data completion [J]. Transportation Research Part C: Emerging Technologies, 2013, 28.).
Most existing missing traffic flow recovery algorithms make explicit assumptions in advance, for example that data is lost randomly or continuously at several moments, which means that a measurement point still has full or partial historical data. However, because of installation and maintenance costs, not every intersection can have data acquisition and receiving devices, so the obtained data is sparse and the data loss rate of some measurement points is 100%; existing data recovery algorithms cannot handle such cases at all. In addition, most existing research considers only the spatio-temporal correlation between intersections in prediction and does not fully consider other factors (such as urban functional areas, weather changes, and special times). Therefore, a sparse traffic flow completion method for multi-source data fusion is urgently needed to solve these problems.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to overcome the problems in the prior art, and provide a multi-source data fused sparse traffic flow completion method and system, which can fuse various data for data completion, have good robustness, consider the similarity of a measuring point in a functional area, the influence of weather change on traffic flow and the like when the missing data is completed, improve the accuracy of the missing data completion, can also perform data recovery on intersections without historical traffic flow, can establish road network traffic flow reappearance with higher precision, wider range and stronger timeliness, and support the application of large-scale road network demand real-time reappearance.
In order to solve the technical problem, the invention provides a sparse traffic flow completion method for multi-source data fusion, which comprises the following steps of:
S1: constructing the whole traffic network as a directed graph and dividing the directed graph into a plurality of functional areas, with intersections belonging to the same functional area grouped together;
S2: calculating the traffic conditions of all intersections in each functional area, and completing the data of the missing intersections by using a local least squares interpolation algorithm, wherein the traffic conditions are represented by the average speed of public transport vehicles;
S3: taking the functional area as a unit, constructing a fitting function between the continuous traffic flow and the average speed of public transport vehicles and the weather change data, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and that intersection's average public transport vehicle speed and weather change data.
In an embodiment of the invention, in S1, the whole traffic network is constructed as a directed graph $G = (V, E)$, where $V$ is the set of intersections in the city and $E$ is the set of roads in the city. Each vertex $v \in V$ represents an intersection, and if intersection $v_i$ is adjacent to intersection $v_j$, the directed graph contains a directed edge $v_i \to v_j \in E$.
In one embodiment of the invention, in S2, the traffic conditions of a functional area are represented by a matrix $D$:
$$D = \begin{pmatrix} d_{1,1} & \cdots & d_{1,n} \\ \vdots & \ddots & \vdots \\ d_{m,1} & \cdots & d_{m,n} \end{pmatrix},$$
where each row $d_i^T = (d_{i,1}, \dots, d_{i,n})$ holds the average speed of public transport vehicles passing through the $i$-th intersection in each period, and $D \in \mathbb{R}^{m \times n}$ indicates that a functional area has $m$ intersections, each with $n$ periods per day, with $m > n$.
In an embodiment of the present invention, in S2, a method for completing data of a missing intersection by using a local least squares interpolation algorithm includes:
S2.1: suppose the vector with missing data is $d_1 = (\alpha_1, \dots, \alpha_q, d_{1,q+1}, \dots, d_{1,n})^T$, where $\alpha_1, \dots, \alpha_q$ are the $q$ missing values and $d_{1,q+1}, \dots, d_{1,n}$ are the known values;
S2.2: find the $k$ nearest neighbor vectors of $d_1$ in the same functional area;
S2.3: calculate the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area, and select the $k$ neighbor vectors with the largest absolute coefficients, denoted
$$d_{s_1}, \dots, d_{s_k};$$
S2.4: form a matrix from the vector with missing data and the $k$ neighbor vectors:
$$\begin{pmatrix} d_1^T \\ d_{s_1}^T \\ \vdots \\ d_{s_k}^T \end{pmatrix} = \begin{pmatrix} \alpha^T & w^T \\ B & A \end{pmatrix},$$
where the vector $\alpha = (\alpha_1, \dots, \alpha_q)^T$ is the missing data, matrix $B \in \mathbb{R}^{k \times q}$, vector $w \in \mathbb{R}^{(n-q) \times 1}$, and matrix $A \in \mathbb{R}^{k \times (n-q)}$;
S2.5: based on this matrix, formulate the least squares problem $\min_\theta \|A^T \theta - w\|_2$ and compute the $q$ missing values as $\alpha = B^T \theta = B^T (A^T)^{+} w$, where $(A^T)^{+}$ is the pseudoinverse of $A^T$ and $\theta$ is the vector of coefficients to be solved.
In one embodiment of the invention, in S2.3, missing values are not considered when calculating the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area. The Pearson correlation coefficient $P_{1,j}$ between the two vectors $d'_1 = (d_{1,q+1}, \dots, d_{1,n})^T$ and $d'_j = (d_{j,q+1}, \dots, d_{j,n})^T$ is defined as
$$P_{1,j} = \frac{1}{n-q} \sum_{l=q+1}^{n} \left( \frac{d_{1,l} - \bar{d}'_1}{\sigma_1} \right) \left( \frac{d_{j,l} - \bar{d}'_j}{\sigma_j} \right),$$
where $\bar{d}'_j$ is the mean of $d'_j$ and $\sigma_j$ is the standard deviation of $d'_j$.
In one embodiment of the invention, for intersections with no public transport vehicle data at all, i.e., intersections whose vector $d_i$ is empty, the vector is filled with the average of the speeds of the $k$ vectors physically nearest to the intersection:
$$d_{i,l} = \frac{1}{k} \sum_{j=1}^{k} d_{s_j,l}, \quad l = 1, \dots, n,$$
where $d_{s_1}, \dots, d_{s_k}$ here denote those $k$ physically nearest vectors.
In one embodiment of the present invention, in S3, the weather change data includes precipitation and temperature, and a fitting function between the continuous traffic flow and the average speed of public transport vehicles, the precipitation, and the temperature is constructed with the functional area as a unit.
In one embodiment of the invention, in S3, the method for constructing the fitting function between the continuous traffic flow and the average speed, the precipitation and the temperature of the public transport vehicles by taking the functional zone as a unit is as follows:
S3.1: suppose each intersection measures a continuous traffic flow $y_i$ lasting $t$ ($t \ge 1$) cycles; then $y_i$ corresponds to $t$ average speeds of public transport vehicles $S_i = [S_{i,1}, \dots, S_{i,t}]$, $t$ precipitation amounts $P_i = [P_{i,1}, \dots, P_{i,t}]$, and $t$ temperatures $T_i = [T_{i,1}, \dots, T_{i,t}]$;
S3.2: suppose the function to be fitted is $h_\theta(S_i, P_i, T_i) = \theta_0 + \theta_1 S_{i,1} + \dots + \theta_t S_{i,t} + \theta_{t+1} P_{i,1} + \dots + \theta_{2t} P_{i,t} + \theta_{2t+1} T_{i,1} + \dots + \theta_{3t} T_{i,t}$; then the objective function is
$$\min_\theta \sum_{i} \left( h_\theta(S_i, P_i, T_i) - y_i \right)^2;$$
S3.3: solve the objective function to obtain the parameters $\theta_0, \dots, \theta_{3t}$, and substitute them into the function to be fitted to obtain the fitting function.
In addition, the invention also provides a multi-source data fused sparse traffic flow completion system, which comprises:
the functional area dividing module is used for constructing the whole traffic network into a directed graph, dividing the directed graph into a plurality of functional areas, and dividing intersections belonging to the same functional area together to obtain a functional area;
the traffic condition calculation module is used for calculating the traffic conditions of all intersections in each functional area and complementing the data of the missing intersections by using a local least square interpolation algorithm, wherein the traffic conditions are represented by the average speed of public transport vehicles;
and the traffic flow completion module is used for constructing a fitting function between the continuous traffic flow and the average speed of the public transport vehicles and the weather change data by taking the functional area as a unit, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and the average speed of the public transport vehicles and the weather change data of the intersection to be completed.
In an embodiment of the present invention, the traffic condition calculation module includes a missing data completion sub-module, and the missing data completion sub-module is configured to perform completion on data of a missing intersection by using a local least squares interpolation algorithm, and includes:
suppose the vector with missing data is $d_1 = (\alpha_1, \dots, \alpha_q, d_{1,q+1}, \dots, d_{1,n})^T$, where $\alpha_1, \dots, \alpha_q$ are the $q$ missing values and $d_{1,q+1}, \dots, d_{1,n}$ are the known values;
find the $k$ nearest neighbor vectors of $d_1$ in the same functional area;
calculate the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area, and select the $k$ neighbor vectors with the largest absolute coefficients, denoted
$$d_{s_1}, \dots, d_{s_k};$$
form a matrix from the vector with missing data and the $k$ neighbor vectors:
$$\begin{pmatrix} d_1^T \\ d_{s_1}^T \\ \vdots \\ d_{s_k}^T \end{pmatrix} = \begin{pmatrix} \alpha^T & w^T \\ B & A \end{pmatrix},$$
where the vector $\alpha = (\alpha_1, \dots, \alpha_q)^T$ is the missing data, matrix $B \in \mathbb{R}^{k \times q}$, vector $w \in \mathbb{R}^{(n-q) \times 1}$, and matrix $A \in \mathbb{R}^{k \times (n-q)}$;
based on this matrix, formulate the least squares problem $\min_\theta \|A^T \theta - w\|_2$ and compute the $q$ missing values as $\alpha = B^T \theta = B^T (A^T)^{+} w$, where $(A^T)^{+}$ is the pseudoinverse of $A^T$ and $\theta$ is the vector of coefficients to be solved.
Compared with the prior art, the technical scheme of the invention has the following advantages:
the sparse traffic flow completion method based on multi-source data fusion provided by the invention has the advantages that various data are fused for data completion, the robustness is good, when missing data are completed, the similarity of a measuring point in a functional area, the influence of weather change on traffic flow and the like are considered, the accuracy of missing data completion is improved, data recovery can be carried out on intersections without historical traffic flow, the traffic flow statistics of all intersections in a road network is realized, the road network traffic flow reappearance with higher precision, wider range and stronger timeliness can be established, and the application of large-scale road network demand real-time reappearance is supported.
Drawings
In order that the present invention may be more readily and clearly understood, reference will now be made in detail to the present invention, examples of which are illustrated in the accompanying drawings.
Fig. 1 is a schematic flow chart of a sparse traffic flow completion method for multi-source data fusion according to an embodiment of the present invention.
Detailed Description
The present invention is further described below in conjunction with the following figures and specific examples so that those skilled in the art may better understand the present invention and practice it, but the examples are not intended to limit the present invention.
Referring to fig. 1, a sparse traffic flow completion method for multi-source data fusion provided in an embodiment of the present invention includes the following steps:
S1: constructing the whole traffic network as a directed graph and dividing the directed graph into a plurality of functional areas, with intersections belonging to the same functional area grouped together;
S2: calculating the traffic conditions of all intersections in each functional area, and completing the data of the missing intersections by using a local least squares interpolation algorithm, wherein the traffic conditions are represented by the average speed of public transport vehicles;
S3: taking the functional area as a unit, constructing a fitting function between the continuous traffic flow and the average speed of public transport vehicles and the weather change data, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and that intersection's average public transport vehicle speed and weather change data.
The invention provides a sparse traffic flow completion method based on multi-source data fusion, which fuses various data including road side unit data, public transport vehicle data, weather change and the like, can perform complete traffic flow detection, namely, statistics is performed on the flow of all intersections of a city, so that tasks such as road engineering and the like can be performed better.
Each road side unit RSU sends the information collected anonymously in the measuring period to the central server, the central server analyzes the collected data, estimates the traffic flow, models the relation between public transport vehicles and the traffic flow and recovers the flow information of the missing intersection.
The RSUs are deployed at locations of interest such as street intersections. All RSUs are connected to a central server wirelessly or by wire, and data is collected and processed on the central server to support traffic management. The data obtained by the central server, including vehicle IDs (anonymized), weather, and so on, is anonymous and does not reveal user privacy. Analyzing the vehicle IDs of a single period yields the point traffic flow of an intersection, i.e., the number of vehicles passing through the intersection in one period. Analyzing the vehicle IDs of multiple periods yields the continuous point traffic flow, i.e., the number of vehicles passing through the intersection in each period over a given number of periods, which provides more detailed flow information for a given road segment. For example, one may want to know the continuous traffic flow on the workdays of a week, on the Saturdays of several weeks, or on every day of a month; such data reveals the stable core traffic at a location. Assuming the measured continuous traffic flow corresponds to $t$ ($t \ge 1$) periods, $t = 1$ gives the point traffic flow of a single intersection.
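As a rough illustration of the difference between point traffic flow and continuous traffic flow described above, the following Python sketch counts distinct anonymized vehicle IDs per intersection and period. The record layout `(intersection, period, vehicle_id)` and the function names are assumptions made for this example; the source does not specify a data format.

```python
from collections import defaultdict


def point_flows(records):
    """Count distinct anonymized vehicle IDs per (intersection, period).

    `records` is an iterable of (intersection, period, vehicle_id) tuples;
    this layout is hypothetical and chosen only for illustration.
    """
    seen = defaultdict(set)
    for intersection, period, vehicle_id in records:
        seen[(intersection, period)].add(vehicle_id)
    return {key: len(ids) for key, ids in seen.items()}


def continuous_flow(flows, intersection, periods):
    """Per-period counts of one intersection over the given periods.

    A period with no record is reported as 0 here; in practice it could
    also mean that the data for that period is missing.
    """
    return [flows.get((intersection, p), 0) for p in periods]


records = [
    ("v1", 0, "a"), ("v1", 0, "b"), ("v1", 0, "a"),  # "a" is seen twice in period 0
    ("v1", 1, "c"),
    ("v2", 0, "a"),
]
flows = point_flows(records)
print(flows[("v1", 0)])                         # 2
print(continuous_flow(flows, "v1", [0, 1, 2]))  # [2, 1, 0]
```

With a single period (t = 1) the list returned by `continuous_flow` has one entry, which is the point traffic flow.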
For cost reasons, traffic information acquisition devices are not installed at relatively small intersections, so their flow information cannot be obtained directly. The invention therefore combines the obtained traffic flow with public transport data, weather changes, and other data to recover the flow information of the uncovered road segments.
In S1, the whole traffic network is constructed as a directed graph $G = (V, E)$, where $V$ is the set of intersections in the city and $E$ is the set of roads in the city. Each vertex $v \in V$ represents an intersection, and if intersection $v_i$ is adjacent to intersection $v_j$, the directed graph contains a directed edge $v_i \to v_j \in E$. After the directed graph is constructed, the whole graph is divided into a number of functional areas (such as residential, business, and factory areas); intersections belonging to the same functional area are placed in one subgraph, yielding subgraphs (i.e., functional areas) $G_1 = (V_1, E_1), \dots, G_N = (V_N, E_N)$, where $V_1 \cup \dots \cup V_N = V$ and $V_i \cap V_j = \emptyset$ for $i \ne j$.
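The construction in S1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the functional-area label of each intersection is taken as a given input (`area_of`, a hypothetical mapping, since the source does not say how the labels are obtained), and each subgraph keeps only the edges whose two endpoints lie in the same area, which is one possible reading of $E_1, \dots, E_N$.

```python
def build_digraph(roads):
    """Adjacency sets for G = (V, E); each pair (vi, vj) is a directed edge vi -> vj."""
    adj = {}
    for vi, vj in roads:
        adj.setdefault(vi, set()).add(vj)
        adj.setdefault(vj, set())  # register vj as a vertex even if it has no out-edges
    return adj


def split_by_area(adj, area_of):
    """Group intersections by functional-area label.

    Returns {label: (vertices, edges)}. Edges crossing two areas are dropped,
    which is an assumption of this sketch.
    """
    subgraphs = {}
    for v in adj:
        vertices, _ = subgraphs.setdefault(area_of[v], (set(), set()))
        vertices.add(v)
    for vi, targets in adj.items():
        for vj in targets:
            if area_of[vi] == area_of[vj]:
                subgraphs[area_of[vi]][1].add((vi, vj))
    return subgraphs


roads = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")]
area_of = {"a": "residential", "b": "residential", "c": "business", "d": "business"}
subgraphs = split_by_area(build_digraph(roads), area_of)
for label in sorted(subgraphs):
    vertices, edges = subgraphs[label]
    print(label, sorted(vertices), sorted(edges))
```

Because every intersection receives exactly one label, the vertex sets of the subgraphs are disjoint and their union is the full vertex set.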
In S2, the average traveling speed of public transport vehicles (such as taxis and buses) reflects the traffic conditions of the road segments they pass through. Because such vehicles are not sensitive to location-privacy and trajectory-privacy concerns during operation, their trajectories can be restored directly from the GPS and speed information they upload, and the traffic flow of the road segments they pass can be estimated from their average traveling speed. For flow estimation, the invention divides a day into $n$ periods according to the period length (e.g., 1 hour) and uses a matrix $D \in \mathbb{R}^{m \times n}$ to indicate that a functional area in the city has $m$ intersections, each with $n$ periods per day, with $m > n$. Each element of $D$ is the average traveling speed of vehicles through an intersection in one period. Each row $d_i^T = (d_{i,1}, \dots, d_{i,n})$ holds the speed information of the $i$-th intersection and reflects its traffic conditions:
$$D = \begin{pmatrix} d_{1,1} & \cdots & d_{1,n} \\ \vdots & \ddots & \vdots \\ d_{m,1} & \cdots & d_{m,n} \end{pmatrix}.$$
Public transport vehicles often choose road sections with large pedestrian flows, and still cannot cover all road sections in a transport network. The invention therefore uses a local least squares interpolation algorithm to complete the information of the missing road sections.
Specifically, the method for completing the data of a missing intersection by using the local least squares interpolation algorithm comprises the following steps:
S2.1: suppose the vector with missing data is $d_1 = (\alpha_1, \dots, \alpha_q, d_{1,q+1}, \dots, d_{1,n})^T$, where $\alpha_1, \dots, \alpha_q$ are the $q$ missing values and $d_{1,q+1}, \dots, d_{1,n}$ are the known values;
S2.2: find the $k$ nearest neighbor vectors of $d_1$ in the same functional area;
S2.3: calculate the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area, and select the $k$ neighbor vectors with the largest absolute coefficients, denoted
$$d_{s_1}, \dots, d_{s_k};$$
S2.4: form a matrix from the vector with missing data and the $k$ neighbor vectors:
$$\begin{pmatrix} d_1^T \\ d_{s_1}^T \\ \vdots \\ d_{s_k}^T \end{pmatrix} = \begin{pmatrix} \alpha^T & w^T \\ B & A \end{pmatrix},$$
where the vector $\alpha = (\alpha_1, \dots, \alpha_q)^T$ is the missing data, matrix $B \in \mathbb{R}^{k \times q}$, vector $w \in \mathbb{R}^{(n-q) \times 1}$, and matrix $A \in \mathbb{R}^{k \times (n-q)}$;
S2.5: based on this matrix, formulate the least squares problem $\min_\theta \|A^T \theta - w\|_2$ and compute the $q$ missing values as $\alpha = B^T \theta = B^T (A^T)^{+} w$, where $(A^T)^{+}$ is the pseudoinverse of $A^T$ and $\theta$ is the vector of coefficients to be solved.
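Steps S2.4 and S2.5 can be illustrated with a small, dependency-free Python sketch. It is not the patent's implementation: instead of an explicit pseudoinverse it solves the normal equations $(A A^T)\theta = A w$, which gives the same $\theta$ as $(A^T)^{+} w$ only when the $k$ neighbor rows restricted to the known positions are linearly independent (so $k \le n - q$). The function names `solve` and `lls_impute` and the use of `None` to mark missing entries are assumptions of this example.

```python
def solve(M, b):
    """Solve the square system M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def lls_impute(target, neighbors):
    """Fill the None entries of `target` from k fully observed neighbor rows."""
    missing = [l for l, v in enumerate(target) if v is None]
    known = [l for l, v in enumerate(target) if v is not None]
    w = [target[l] for l in known]                         # known part of d_1
    A = [[row[l] for l in known] for row in neighbors]     # k x (n - q)
    B = [[row[l] for l in missing] for row in neighbors]   # k x q
    k = len(neighbors)
    # Normal equations (A A^T) theta = A w, valid under the full-rank assumption above.
    AAt = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(k)] for i in range(k)]
    Aw = [sum(a * x for a, x in zip(A[i], w)) for i in range(k)]
    theta = solve(AAt, Aw)
    filled = list(target)
    for col, l in enumerate(missing):
        filled[l] = sum(B[i][col] * theta[i] for i in range(k))  # alpha = B^T theta
    return filled


# Synthetic data: the full target row equals 0.5 * n1 + 1.0 * n2.
n1 = [10.0, 20.0, 30.0, 40.0]
n2 = [5.0, 3.0, 8.0, 1.0]
target = [None, 13.0, 23.0, 21.0]
print(lls_impute(target, [n1, n2]))
```

In this synthetic case the known entries are an exact linear combination of the neighbors, so the recovered first entry should be close to 0.5 * 10 + 1.0 * 5 = 10, up to floating-point rounding.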
In S2.3, missing values are not considered when calculating the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area. The Pearson correlation coefficient $P_{1,j}$ between the two vectors $d'_1 = (d_{1,q+1}, \dots, d_{1,n})^T$ and $d'_j = (d_{j,q+1}, \dots, d_{j,n})^T$ is defined as
$$P_{1,j} = \frac{1}{n-q} \sum_{l=q+1}^{n} \left( \frac{d_{1,l} - \bar{d}'_1}{\sigma_1} \right) \left( \frac{d_{j,l} - \bar{d}'_j}{\sigma_j} \right),$$
where $\bar{d}'_j$ is the mean of $d'_j$ and $\sigma_j$ is the standard deviation of $d'_j$.
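The neighbor selection of S2.3 can be sketched as below: the Pearson coefficient is computed only over the positions where the target vector is known, and the $k$ candidates with the largest absolute coefficient are kept. The function names and the `None` marker for missing values are assumptions of this example, and the sketch assumes that no vector is constant over the known positions (otherwise the standard deviation is zero and the coefficient is undefined).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # raises ZeroDivisionError for a constant vector


def top_k_neighbors(target, candidates, k):
    """Indices of the k candidates with the largest |Pearson coefficient|,
    computed only over positions where `target` is not None."""
    known = [l for l, v in enumerate(target) if v is not None]
    t = [target[l] for l in known]
    scores = [
        (abs(pearson(t, [row[l] for l in known])), idx)
        for idx, row in enumerate(candidates)
    ]
    scores.sort(key=lambda s: -s[0])
    return [idx for _, idx in scores[:k]]


target = [None, 1.0, 2.0, 3.0]
candidates = [
    [9.0, 2.0, 4.0, 6.0],  # rises with the target over the known positions
    [9.0, 3.0, 1.0, 2.0],  # only weakly related
    [9.0, 6.0, 4.0, 2.0],  # falls as the target rises (strong negative correlation)
]
print(sorted(top_k_neighbors(target, candidates, 2)))  # [0, 2]
```

Using the absolute value means that a strongly negatively correlated neighbor (index 2 above) is kept as well, since the least squares coefficients can absorb the sign.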
The invention recovers each vector with missing data according to the above algorithm; if a neighbor vector of the vector with missing data itself contains a missing value, that value is filled with the average of the known values in that vector. For intersections with no public transport vehicle data at all, i.e., whose vector $d_i$ is empty or a zero vector, the invention fills the vector with the average of the $k$ vectors physically nearest to the intersection,
$$d_{i,l} = \frac{1}{k} \sum_{j=1}^{k} d_{s_j,l}, \quad l = 1, \dots, n,$$
where $d_{s_1}, \dots, d_{s_k}$ here denote those $k$ physically nearest vectors, i.e., the mean speed of the $k$ intersections in the same period.
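For an intersection with no speed data at all, the fallback described above averages, period by period, the speed vectors of the $k$ physically nearest intersections. The sketch below assumes planar coordinates and Euclidean distance, neither of which is specified in the source; the function name is likewise hypothetical.

```python
from math import hypot


def fill_from_nearest(position, others, k):
    """Average, period by period, the speed vectors of the k intersections
    closest to `position`.

    `others` is a list of ((x, y), speeds) pairs; planar coordinates and
    Euclidean distance are assumptions made for illustration.
    """
    nearest = sorted(
        others,
        key=lambda o: hypot(o[0][0] - position[0], o[0][1] - position[1]),
    )[:k]
    n = len(nearest[0][1])
    return [sum(speeds[l] for _, speeds in nearest) / len(nearest) for l in range(n)]


others = [
    ((0, 1), [30.0, 20.0]),
    ((1, 0), [40.0, 30.0]),
    ((5, 5), [90.0, 90.0]),  # too far away to be used when k = 2
]
print(fill_from_nearest((0, 0), others, k=2))  # [35.0, 25.0]
```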
In S3, the information obtained in step S2 describes only the traffic conditions of individual intersections in the city, whereas the invention needs the continuous traffic flow of each intersection. Assuming a functional area has $r$ intersections equipped with roadside units (RSUs), $r$ continuous traffic flow measurements are obtained in total, denoted $Y = (y_1, \dots, y_r)^T$, where $y_i$ ($1 \le i \le r$) is the continuous traffic flow of the $i$-th intersection. The traffic flow at an intersection is reflected by the speed of public transport vehicles, where speed is inversely proportional to traffic flow. In addition, the influence of rain (rainfall) and temperature on traffic flow is considered, because changes in weather also affect how people travel. For example, relatively few vehicles may travel at high temperatures or during heavy rain. Therefore, in the fitting function of the invention, each intersection corresponds to three values per cycle: the average speed of public transport vehicles, the rainfall, and the temperature.
Assume the traffic flow over $t$ ($t \ge 1$) cycles is measured at each intersection, with $r > t$. Then each $y_i$ corresponds to $t$ average speeds of public transport vehicles $S_i = [S_{i,1}, \dots, S_{i,t}]$, $t$ precipitation amounts $P_i = [P_{i,1}, \dots, P_{i,t}]$, and $t$ temperatures $T_i = [T_{i,1}, \dots, T_{i,t}]$. Let the function to be fitted be $h_\theta(S_i, P_i, T_i) = \theta_0 + \theta_1 S_{i,1} + \dots + \theta_t S_{i,t} + \theta_{t+1} P_{i,1} + \dots + \theta_{2t} P_{i,t} + \theta_{2t+1} T_{i,1} + \dots + \theta_{3t} T_{i,t}$; then the objective function is
$$\min_\theta \sum_{i=1}^{r} \left( h_\theta(S_i, P_i, T_i) - y_i \right)^2.$$
The optimal parameters $\theta_0, \dots, \theta_{3t}$ are obtained by solving the objective function. For an intersection to be completed, the flow $y$ is recovered from its public transport data $S = [S_1, \dots, S_t]$, rainfall $P = [P_1, \dots, P_t]$, and temperature $T = [T_1, \dots, T_t]$, i.e., $y = h_\theta(S, P, T)$.
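The fitting step can be illustrated with an ordinary least squares sketch in plain Python. It treats each measured flow $y_i$ as a single number and solves the normal equations $X^T X \theta = X^T y$, where each row of $X$ is $[1, S_{i,1..t}, P_{i,1..t}, T_{i,1..t}]$; the solver choice, the function names, and the synthetic numbers are assumptions of this example, not the patent's procedure. Note that a unique solution requires at least $3t + 1$ measured intersections with linearly independent feature rows, which is stricter than the condition $r > t$ stated in the text.

```python
def solve(M, b):
    """Solve the square system M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def predict(theta, s, p, t):
    """Evaluate h_theta(S, P, T) = theta_0 + theta . [S, P, T]."""
    features = [1.0] + list(s) + list(p) + list(t)
    return sum(c * f for c, f in zip(theta, features))


def fit(S, P, T, y):
    """Least squares estimate of theta_0..theta_3t via the normal equations."""
    X = [[1.0] + list(s) + list(p) + list(t) for s, p, t in zip(S, P, T)]
    cols = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(cols)] for a in range(cols)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(cols)]
    return solve(XtX, Xty)


# Synthetic example with t = 1 and r = 5 measured intersections.
S = [[30.0], [40.0], [50.0], [35.0], [45.0]]   # average speeds
P = [[0.0], [2.0], [0.0], [5.0], [1.0]]        # precipitation
T = [[20.0], [12.0], [30.0], [15.0], [22.0]]   # temperature
true_theta = [100.0, -1.0, -2.0, 0.5]          # made-up coefficients
y = [predict(true_theta, s, p, t) for s, p, t in zip(S, P, T)]

theta = fit(S, P, T, y)
print([round(c, 6) for c in theta])
print(round(predict(theta, [42.0], [3.0], [18.0]), 6))
```

Because the synthetic flows are generated exactly from `true_theta`, the fitted parameters should reproduce it up to rounding, and the last line estimates the flow of an unmeasured intersection from its speed, rainfall, and temperature alone.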
As an example, the present invention sets the measurement period to 1 hour; when a measuring period begins, each road side unit RSU anonymously records passing vehicles, and sends collected data to a central server after the period is ended, and the central server estimates point traffic flow or continuous traffic flow of each intersection by using collected vehicle IDs (the number of the measured periods is determined by specific requirements); when the complete traffic data (such as traffic management, road planning, etc.) needs to be used, the central server recovers the missing traffic information. Public transport vehicle (such as bus) data and taxi data based on privacy protection are obtained from an official system, temperature and rainfall data of each functional area of a city are obtained from a meteorological department, and flow information of uncovered intersections is recovered.
The sparse traffic flow completion method based on multi-source data fusion provided by the invention has the advantages that various data are fused for data completion, the robustness is good, when missing data are completed, the similarity of a measuring point in a functional area, the influence of weather change on traffic flow and the like are considered, the accuracy of missing data completion is improved, data recovery can be carried out on intersections without historical traffic flow, the traffic flow statistics of all intersections in a road network is realized, the road network traffic flow reappearance with higher precision, wider range and stronger timeliness can be established, and the application of large-scale road network demand real-time reappearance is supported.
In the following, the sparse traffic flow completion system for multi-source data fusion disclosed in the embodiment of the present invention is introduced, and a sparse traffic flow completion system for multi-source data fusion described below and a sparse traffic flow completion method for multi-source data fusion described above may be referred to correspondingly.
The embodiment of the invention also provides a sparse traffic flow completion system for multi-source data fusion, which comprises:
the functional area dividing module is used for constructing the whole traffic network into a directed graph, dividing the directed graph into a plurality of functional areas, and dividing intersections belonging to the same functional area together to obtain a functional area;
the traffic condition calculation module is used for calculating the traffic conditions of all intersections in each functional area and completing the data of the missing intersections by using a local least square interpolation algorithm, wherein the traffic conditions are represented by the average speed of public transport vehicles;
and the traffic flow completion module is used for constructing a fitting function between the continuous traffic flow and the average speed of the public transport vehicles and the weather change data by taking the functional area as a unit, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and the average speed of the public transport vehicles and the weather change data of the intersection to be completed.
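The fitting step performed by the traffic flow completion module can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the linear form recited in the claims (an intercept plus one coefficient per period for speed, precipitation and temperature) and assumes an ordinary least-squares objective. The function names `build_features`, `fit_flow_model` and `predict_flow` are hypothetical.

```python
import numpy as np

def build_features(S, P, T):
    """Stack [1, S_i, P_i, T_i] for every sample.

    S, P, T each have shape (m, t): m measured samples in one functional
    area, t periods per sample (speed, precipitation, temperature).
    """
    S, P, T = (np.asarray(x, dtype=float) for x in (S, P, T))
    ones = np.ones((S.shape[0], 1))           # column for the intercept theta_0
    return np.hstack([ones, S, P, T])         # shape (m, 3t + 1)

def fit_flow_model(S, P, T, y):
    """Estimate theta_0..theta_3t by least squares (assumed objective)."""
    X = build_features(S, P, T)
    theta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return theta

def predict_flow(theta, S, P, T):
    """Recover continuous traffic flow for intersections to be completed."""
    return build_features(S, P, T) @ theta
```

In this sketch one model would be fitted per functional area from intersections with measured flow, then applied to intersections in the same area whose flow is to be completed.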
In an embodiment of the present invention, the traffic condition calculation module includes a missing data completion sub-module, which is configured to complete the data of a missing intersection by using a local least squares interpolation algorithm, including:
suppose the vector with missing data is $d_1 = (\alpha_1, \ldots, \alpha_q, d_{1,q+1}, \ldots, d_{1,n})^T$, where $\alpha_1, \ldots, \alpha_q$ represent the q missing data and $d_{1,q+1}, \ldots, d_{1,n}$ represent the known data;
finding the k neighbor vectors nearest to $d_1$ in the same functional area;
calculating the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area, finding the k neighbor vectors whose coefficients have the largest absolute values, and defining them as
Figure BDA0003887442890000101
The vector with missing data and the k neighbor vectors are formed into a matrix:
$$\begin{pmatrix} \alpha^T & w^T \\ B & A \end{pmatrix}$$
where the vector $\alpha = (\alpha_1, \ldots, \alpha_q)^T$ is the missing data, matrix $B \in \mathbb{R}^{k \times q}$, vector $w \in \mathbb{R}^{(n-q) \times 1}$, and matrix $A \in \mathbb{R}^{k \times (n-q)}$.
Based on the matrix, the least squares problem is formulated as $\min_{\theta} \|A^T \theta - w\|_2$, and the q missing data are calculated as $\alpha = B^T \theta = B^T (A^T)^{+} w$, where $(A^T)^{+}$ is the pseudo-inverse of $A^T$ and $\theta$ is the vector composed of the coefficients to be solved.
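The local least squares completion described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: missing entries are encoded as NaN, only rows without any missing entries are used as neighbor candidates, and the function name `lls_impute` is hypothetical. The variables `A`, `B`, `w`, `theta` and `alpha` follow the notation of the text.

```python
import numpy as np

def lls_impute(D, row, k):
    """Fill the NaN entries of D[row] by local least squares over k neighbours.

    D is an (m, n) speed matrix for one functional area; D[row] is the
    vector d_1 with q missing entries.
    """
    target = D[row]
    missing = np.isnan(target)
    known = ~missing
    w = target[known]                                   # known part of d_1

    # Assumption: neighbour candidates are the other rows with no gaps.
    candidates = [i for i in range(D.shape[0])
                  if i != row and not np.isnan(D[i]).any()]

    # Pearson correlation computed on the known positions only.
    corr = np.array([np.corrcoef(w, D[i, known])[0, 1] for i in candidates])
    top = np.argsort(-np.abs(corr))[:k]                 # largest |coefficient|
    neighbours = [candidates[i] for i in top]

    A = D[neighbours][:, known]                         # k x (n - q)
    B = D[neighbours][:, missing]                       # k x q
    theta = np.linalg.pinv(A.T) @ w                     # argmin ||A^T theta - w||_2
    alpha = B.T @ theta                                 # the q missing values

    filled = target.copy()
    filled[missing] = alpha
    return filled
```

The pseudo-inverse gives the minimum-norm least-squares solution, so the sketch also works when k exceeds the number of known entries.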
The multi-source data fusion sparse traffic flow completion system of this embodiment is used to implement the aforementioned multi-source data fusion sparse traffic flow completion method; for its specific implementation, refer to the description of the corresponding parts of the method embodiment above, which is not repeated here.
In addition, since the sparse traffic flow completion system for multi-source data fusion of the embodiment is used for implementing the sparse traffic flow completion method for multi-source data fusion, the function of the sparse traffic flow completion system corresponds to that of the method, and details are not repeated here.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications may be made without departing from the spirit or scope of the invention.

Claims (10)

1. A multi-source data fusion sparse traffic flow completion method is characterized by comprising the following steps:
S1: constructing the whole traffic network into a directed graph, dividing the directed graph into a plurality of functional areas, and grouping intersections belonging to the same functional area together to obtain the functional areas;
S2: calculating the traffic conditions of all intersections in each functional area, and completing the data of the missing intersections by using a local least square interpolation algorithm, wherein the traffic conditions are represented by the average speed of public transport vehicles;
S3: constructing a fitting function between the continuous traffic flow and the average speed of the public transport vehicles and the weather change data by taking the functional area as a unit, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and the average speed of the public transport vehicles and the weather change data of the intersection to be completed.
2. The sparse traffic flow completion method for multi-source data fusion according to claim 1, characterized in that: in S1, the whole traffic network is constructed into a directed graph $G = (V, E)$, where $V$ represents the set of intersections in a city and $E$ represents the set of roads in the city; each vertex $v \in V$ in the graph represents an intersection, and if intersection $v_i$ is adjacent to intersection $v_j$, there is a directed edge $v_i \to v_j \in E$ in the directed graph.
3. The sparse traffic flow completion method for multi-source data fusion according to claim 1, characterized in that: in S2, the traffic condition of a functional area is represented by a matrix D, where the matrix D is
Figure FDA0003887442880000011
where each row
Figure FDA0003887442880000012
represents the average speed of public transport vehicles passing through the i-th intersection within a period, and
Figure FDA0003887442880000013
indicates that a functional area has m intersections, each intersection has n periods per day, and m > n.
4. The sparse traffic flow completion method for multi-source data fusion according to claim 2, characterized in that: in S2, the method for completing the data of the missing intersection by using the local least square interpolation algorithm includes:
S2.1: suppose the vector with missing data is $d_1 = (\alpha_1, \ldots, \alpha_q, d_{1,q+1}, \ldots, d_{1,n})^T$, where $\alpha_1, \ldots, \alpha_q$ represent the q missing data and $d_{1,q+1}, \ldots, d_{1,n}$ represent the known data;
S2.2: finding the k neighbor vectors nearest to $d_1$ in the same functional area;
S2.3: calculating the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area, finding the k neighbor vectors whose coefficients have the largest absolute values, and defining them as
Figure FDA0003887442880000014
S2.4: the vector with missing data and the k neighbor vectors are formed into a matrix:
$$\begin{pmatrix} \alpha^T & w^T \\ B & A \end{pmatrix}$$
where the vector $\alpha = (\alpha_1, \ldots, \alpha_q)^T$ is the missing data, matrix $B \in \mathbb{R}^{k \times q}$, vector $w \in \mathbb{R}^{(n-q) \times 1}$, and matrix $A \in \mathbb{R}^{k \times (n-q)}$;
S2.5: based on the matrix, formulating the least squares problem as $\min_{\theta} \|A^T \theta - w\|_2$, and calculating the q missing data as $\alpha = B^T \theta = B^T (A^T)^{+} w$, where $(A^T)^{+}$ is the pseudo-inverse of $A^T$ and $\theta$ is the vector composed of the coefficients to be solved.
5. The sparse traffic flow completion method for multi-source data fusion of claim 4, wherein: in S2.3, missing values are not considered when calculating the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area; the Pearson correlation coefficient $P_{1,j}$ between the two vectors $d'_1 = (d_{1,q+1}, \ldots, d_{1,n})^T$ and $d'_j = (d_{j,q+1}, \ldots, d_{j,n})^T$ is defined as
Figure FDA0003887442880000025
where
Figure FDA0003887442880000026
is the mean of $d'_j$ and $\sigma_j$ is the standard deviation of $d'_j$.
6. The sparse traffic flow completion method for multi-source data fusion of claim 4, wherein: for an intersection without public transport vehicle data, i.e. when
Figure FDA0003887442880000027
is empty, the average of the speeds of the k vectors physically nearest to the intersection is selected for completion,
Figure FDA0003887442880000028
7. The sparse traffic flow completion method for multi-source data fusion according to claim 1, characterized in that: in S3, the weather change data comprise precipitation and temperature, and a fitting function between the continuous traffic flow and the average speed of public transport vehicles, the precipitation, and the temperature is constructed by taking the functional area as a unit.
8. The sparse traffic flow completion method for multi-source data fusion of claim 7, wherein: in S3, the method for constructing the fitting function between the continuous traffic flow and the average speed of public transport vehicles, the precipitation, and the temperature by taking the functional area as a unit comprises the following steps:
S3.1: suppose that each intersection measures a traffic flow $y_i$ lasting t ($t \ge 1$) periods; then $y_i$ corresponds to t average speeds of public transport vehicles $S_i = [S_{i,1}, \ldots, S_{i,t}]$, t precipitation amounts $P_i = [P_{i,1}, \ldots, P_{i,t}]$, and t temperatures $T_i = [T_{i,1}, \ldots, T_{i,t}]$;
S3.2: let the function to be fitted be $h_\theta(S_i, P_i, T_i) = \theta_0 + \theta_1 S_{i,1} + \cdots + \theta_t S_{i,t} + \theta_{t+1} P_{i,1} + \cdots + \theta_{2t} P_{i,t} + \theta_{2t+1} T_{i,1} + \cdots + \theta_{3t} T_{i,t}$; then the objective function is
Figure FDA0003887442880000029
S3.3: solving the objective function to obtain the parameters $\theta_0, \ldots, \theta_{3t}$, and substituting the parameters $\theta_0, \ldots, \theta_{3t}$ into the function to be fitted to obtain the fitting function.
9. A multi-source data fusion sparse traffic flow completion system, characterized by comprising:
the functional area dividing module is used for constructing the whole traffic network into a directed graph, dividing the directed graph into a plurality of functional areas, and dividing intersections belonging to the same functional area together to obtain a functional area;
the traffic condition calculation module is used for calculating the traffic conditions of all intersections in each functional area and completing the data of the missing intersections by using a local least square interpolation algorithm, wherein the traffic conditions are represented by the average speed of public transport vehicles;
and the traffic flow completion module is used for constructing a fitting function between the continuous traffic flow and the average speed of the public transport vehicles and the weather change data by taking the functional area as a unit, and recovering the continuous traffic flow of the intersection to be completed based on the fitting function and the average speed of the public transport vehicles and the weather change data of the intersection to be completed.
10. The multi-source data fusion sparse traffic flow completion system of claim 9, wherein: the traffic condition calculation module comprises a missing data completion sub-module, which is used for completing the data of a missing intersection by using a local least square interpolation algorithm, including:
suppose the vector with missing data is $d_1 = (\alpha_1, \ldots, \alpha_q, d_{1,q+1}, \ldots, d_{1,n})^T$, where $\alpha_1, \ldots, \alpha_q$ represent the q missing data and $d_{1,q+1}, \ldots, d_{1,n}$ represent the known data;
finding the k neighbor vectors nearest to $d_1$ in the same functional area;
calculating the Pearson correlation coefficients between $d_1$ and the neighbor vectors in the same functional area, finding the k neighbor vectors whose coefficients have the largest absolute values, and defining them as
Figure FDA0003887442880000031
The vector with missing data and the k neighbor vectors are formed into a matrix:
$$\begin{pmatrix} \alpha^T & w^T \\ B & A \end{pmatrix}$$
where the vector $\alpha = (\alpha_1, \ldots, \alpha_q)^T$ is the missing data, matrix $B \in \mathbb{R}^{k \times q}$, vector $w \in \mathbb{R}^{(n-q) \times 1}$, and matrix $A \in \mathbb{R}^{k \times (n-q)}$.
Based on the matrix, the least squares problem is formulated as $\min_{\theta} \|A^T \theta - w\|_2$, and the q missing data are calculated as $\alpha = B^T \theta = B^T (A^T)^{+} w$, where $(A^T)^{+}$ is the pseudo-inverse of $A^T$ and $\theta$ is the vector composed of the coefficients to be solved.
CN202211247900.7A 2022-10-12 2022-10-12 Multi-source data fusion sparse traffic flow completion method and system Pending CN115658666A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211247900.7A CN115658666A (en) 2022-10-12 2022-10-12 Multi-source data fusion sparse traffic flow completion method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211247900.7A CN115658666A (en) 2022-10-12 2022-10-12 Multi-source data fusion sparse traffic flow completion method and system

Publications (1)

Publication Number Publication Date
CN115658666A true CN115658666A (en) 2023-01-31

Family

ID=84988419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211247900.7A Pending CN115658666A (en) 2022-10-12 2022-10-12 Multi-source data fusion sparse traffic flow completion method and system

Country Status (1)

Country Link
CN (1) CN115658666A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116403409A (en) * 2023-06-06 2023-07-07 中国科学院空天信息创新研究院 Traffic speed prediction method, traffic speed prediction device, electronic equipment and storage medium
CN116403409B (en) * 2023-06-06 2023-08-15 中国科学院空天信息创新研究院 Traffic speed prediction method, traffic speed prediction device, electronic equipment and storage medium
CN116628435A (en) * 2023-07-21 2023-08-22 山东高速股份有限公司 Road network traffic flow data restoration method, device, equipment and medium
CN116628435B (en) * 2023-07-21 2023-09-29 山东高速股份有限公司 Road network traffic flow data restoration method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination