CN114120635A

CN114120635A - Tensor decomposition-based urban road network linear missing flow estimation method and system

Info

Publication number: CN114120635A
Application number: CN202111308250.8A
Authority: CN
Inventors: 邢吉平; 柳伟; 李荣成; 汪磊; 张爱华
Original assignee: Xingmin Zhitong Wuhan Automobile Technology Co ltd
Current assignee: Xingmin Zhitong Wuhan Automobile Technology Co ltd
Priority date: 2021-11-05
Filing date: 2021-11-05
Publication date: 2022-03-01
Anticipated expiration: 2041-11-05
Also published as: CN114120635B

Abstract

The invention relates to a tensor decomposition-based method and system for estimating linear missing flow in an urban road network. Road sections of the road network to be analyzed that are equipped with license plate photo recognition (LPR) are obtained; license plate photo recognition data of the road sections in a specific time period are acquired and preprocessed to obtain the traffic flow; mobile phone positioning data of the road sections in a specific time period are acquired and preprocessed to obtain the flow of in-vehicle mobile phone users on the road sections; the constructed 4-dimensional tensor model is filled with the acquired traffic flow and the flow of in-vehicle mobile phone users; and the DFCP tensor model is decomposed and reconstructed to complete the estimation of the missing traffic flow. The invention uses mobile phone positioning data, a mobile-detector source with full spatio-temporal coverage, to fill linear gaps in the traffic volume recorded by LPR detectors, a fixed-detector source, based on the spatio-temporal correlation among the multi-source data, thereby obtaining complete LPR data with full spatio-temporal coverage and ensuring the integrity and accuracy of the data.

Description

Tensor decomposition-based urban road network linear missing flow estimation method and system

Technical Field

The invention relates to the technical field of positioning methods, in particular to a tensor decomposition-based urban road network linear missing flow estimation method and system.

Background

In recent years, digital intelligent transportation and Internet of Vehicles systems have emerged rapidly in large cities, and their normal operation requires a large amount of accurate and timely traffic information. For example, an urban intelligent traffic control system requires sufficient traffic flow data (flow, density and speed) to formulate a reasonable traffic management strategy, and the prediction accuracy of an urban traffic planning and guidance system degrades greatly when data are missing. However, because of software or hardware failures of detection devices, congested communication networks, power supply failures and lack of regular maintenance, data loss during data acquisition in the traffic field is difficult to avoid.

The problem of missing data exists widely in fields such as statistics, sociology and epidemiology. It also persistently hampers in-depth research in the traffic field, for example through missing data on traffic flow, travel time, exhaust emissions and vehicle noise.

In research on missing data in the traffic field, the problem can be further divided according to the mechanism and the pattern of the missing data. According to the distribution of the missing positions, missing traffic data can be classified into three types: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). According to the length of the missing data, it is classified into point-like missing, linear missing and planar missing. Conventional studies mainly predict from historical traffic data under the non-random missing pattern, while studies on the completely random and random missing patterns are few. In addition, although the length of the missing data has been considered in past studies, studies of the linear missing case remain relatively rare. Furthermore, when estimating missing flow in an urban road network, it is necessary to consider the influence of the complex road network structure and of the non-fixed positions of the missing flow in the network.

The conventional research methods for filling missing data can be classified into three types, namely a machine learning-based method, a space-time interpolation-based method and a statistical learning-based method. There are the following problems:

(1) In studies of large-scale missing data, most work focuses on the influence of the data missing rate on flow estimation, and little work distinguishes between missing patterns. Among studies of different missing patterns, few address the scenario in which a road section suffers linear missing. This scenario is particularly common when an LPR detector fails to detect and store the LPR signal.

(2) In the study of estimating partial missing flow in a complex urban road network, when a tensor decomposition model is constructed by using real data acquired by different fixed detectors, the difference between road sections where the fixed detectors are located is not considered. When the detectors are arranged sparsely, the difference between the fixed detectors is large, and the correlation of the spatial dimension in the data in the input tensor is poor at the moment, so that certain influence is caused on the interpolation accuracy.

(3) When a CP tensor decomposition model is created for missing value filling, data of neighboring road sections are often selected, based on the similarity between data, to construct the tensor within the same dimension, but the similarity between different tensor dimensions is not compared. For example, when the tensor is constructed from data across different days, different road sections and different months, the CP tensor model applies no differentiated weights even if the similarity between road sections is lower than that along the other two dimensions. In addition, when the structures of the different dimensions of the tensor model are built, the tensor structure is not determined after pre-screening within the same dimension, although the influence among different road sections is particularly important in a complex urban road network.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a tensor decomposition-based urban road network linear missing flow estimation method and system, which are used for acquiring LPR data in a full space-time mode and ensuring the integrity and accuracy of the data.

In order to achieve the above object, the present invention provides a method for estimating linear missing traffic of an urban road network based on tensor decomposition, comprising:

acquiring a road section for installing license plate photo recognition in a road network to be analyzed;

acquiring license plate photo identification data of the road section at a specific time period, and preprocessing the data to acquire traffic flow; mobile phone positioning data of the road section in a specific time period are obtained, and preprocessing is carried out to obtain the traffic of a vehicle-mounted mobile phone user of the road section;

constructing a 4-dimensional tensor model, wherein dimensions comprise days, time periods, road sections and data types; the data types comprise traffic flow and the flow of a vehicle-mounted mobile phone user; filling the 4-dimensional tensor model by using the acquired traffic flow and the traffic flow of the vehicle-mounted mobile phone user;

and repeatedly carrying out tensor decomposition and recovery to complete the estimation of the missing traffic flow.

Further, the data preprocessing is carried out to obtain the traffic flow, and the method comprises the following steps:

carrying out image recognition to obtain license plate numbers and driving time;

selecting the records whose license plate numbers are accurately recognized, eliminating redundant data, converting the data format and storing the data in a database; and mapping the LPR detectors onto the map of the road network to be analyzed to obtain the traffic flow of each road section in each time period of each day.

Further, preprocessing is performed to obtain the traffic of the vehicle-mounted mobile phone user in the road section, and the method comprises the following steps: mapping the mobile phone positioning data to a road network map, calculating the average speed in unit time, and rejecting the mobile phone positioning data if the average speed is lower than a set threshold value; and counting the flow of the vehicle-mounted mobile phone users in each time period of each road section of each day based on the remaining mobile phone positioning data.

Further, the average velocity is calculated as follows:

wherein m represents the number of positioning points of the mobile phone user in the unit time period T, T_k represents the time interval between the k-th moment and the previous moment, and x(k) represents the longitude and latitude coordinates of the position at the k-th moment.

Further, after the mobile phone positioning data are mapped to the road network map, the maximum instantaneous speed is calculated, and the mobile phone positioning data of which the maximum instantaneous speed exceeds the highest speed limit of each road section are removed.

Further, the maximum instantaneous speed is calculated as follows:

wherein T_k represents the time interval between the k-th moment and the (k-1)-th moment of the mobile phone user, and x(k) and x(k-1) respectively represent the longitude and latitude of the geographic position of the mobile phone user at the k-th moment and the (k-1)-th moment.

Further, repeating the tensor decomposition and restoration includes:

constructing a weighted original matrix based on the tensor matrix χ:

wherein, in the weighting matrix ω, the element value at a position where data of the tensor matrix χ are missing is 0, and otherwise the element value is 1;

performing CP decomposition on the tensor matrix χ to obtain

A is a decomposed time factor matrix, B is a decomposed road section factor matrix, C is a decomposed day factor matrix, and D is a decomposed data category factor matrix;

the recovered weighted tensor of the same size is:

wherein x_ijkl is the element of the tensor matrix χ holding the mobile phone positioning data or license plate data of the k-th road section in the j-th time period of the i-th day, where l = 1 denotes mobile phone positioning data and l = 2 denotes license plate data; ω_ijkl is the weight in the weighting matrix ω of the mobile phone or license plate data of the k-th road section in the j-th time period of the i-th day; a_ir, b_jr, c_kr and d_lr are elements of the time factor matrix A, the road section factor matrix B, the day factor matrix C and the data category factor matrix D, respectively; r is the rank of the tensor decomposition; and n₁, n₂, n₃ and n₄ are the total number of days, the total number of time periods, the total number of road sections and the total number of data types, respectively;

updating the gradient G⁽ⁿ⁾ = -2Y_(n)A^(-n) + 2Z_(n)A^(-n),

wherein Y_(n) is the mode-n unfolding of the weighted original tensor Y, Z_(n) is the mode-n unfolding of the recovered weighted tensor Z, and G⁽ⁿ⁾ is the updated gradient for the n-th factor matrix;

A^(-n) = D⊙C⊙B⊙A, where ⊙ denotes the Khatri-Rao product of matrices;

and calculating a loss function, if the precision requirement is met, completing iteration, and if the precision requirement is not met, returning to the step of performing CP decomposition on the tensor matrix χ.

Further, the loss function of the CP decomposition process is:

wherein T_A is the traffic matrix for the time dimension, obtained by averaging over all data categories, road sections and days; T_B is the traffic matrix for the road section dimension, obtained by averaging over all data categories, time periods and days; T_C is the traffic matrix for the day dimension, obtained by averaging over all data categories, time periods and road sections; T_type is the traffic matrix for the data category dimension, obtained by averaging over all time periods, days and road sections; λ₁, λ₂, λ₃, λ₄ and p are the weight coefficients of the respective regularization terms; and the traffic volume is the average of the traffic flow and the flow of in-vehicle mobile phone users.

The second aspect provides a tensor decomposition-based linear missing traffic estimation system for an urban road network, which comprises:

the road section selection module is used for acquiring a road section which is to be analyzed and is provided with license plate photo recognition;

the data acquisition module is used for acquiring license plate photo identification data of the road section at a specific time period, and carrying out data preprocessing to acquire traffic flow; mobile phone positioning data of the road section in a specific time period are obtained, and preprocessing is carried out to obtain the traffic of a vehicle-mounted mobile phone user of the road section;

the tensor construction module is used for constructing a 4-dimensional tensor model, and dimensions comprise days, time periods, road sections and data types; the data types comprise traffic flow and the flow of a vehicle-mounted mobile phone user; filling the 4-dimensional tensor model by using the acquired traffic flow and the traffic flow of the vehicle-mounted mobile phone user;

and the tensor decomposition module is used for repeatedly carrying out tensor decomposition and recovery to finish the estimation of the missing traffic flow.

A third aspect provides a computer-readable storage medium, where program instructions are stored, and when the program instructions are executed by a processor, the method for estimating linear missing traffic of an urban road network based on tensor decomposition is implemented.

The technical scheme of the invention has the following beneficial technical effects:

(1) the invention uses the mobile phone positioning data in the mobile detector as a full-time-space coverage and the time-space correlation among multi-source data to fill the linear loss of the traffic volume of the LPR detector in the fixed detector, thereby acquiring the complete LPR data of the full-time-space coverage and ensuring the integrity and the accuracy of the data.

(2) According to the traffic missing flow filling method, the tensor decomposition method is selected for filling the traffic missing flow, so that the advantages of the tensor decomposition method can be kept, and the inherent relevance of multi-dimensional traffic data can be better mined.

(3) According to the method, the relationship between the weight among data and the space-time weight in the flow filling estimation is better considered by adding the regularization penalty coefficient in the constructed tensor model.

(4) The method carries out interpolation estimation on road sections with linear flow loss in the urban road network, wherein the research on the accuracy of flow interpolation under different loss lengths and different loss rates is considered. In addition, the accuracy of flow interpolation under different tensor methods is compared, and tensor estimation models under different dimensions are tried to be built.

Drawings

FIG. 1 is a schematic diagram of a linear missing traffic estimation flow of an urban road network based on tensor decomposition;

FIG. 2 is a schematic diagram of a linear missing traffic estimation system of an urban road network based on tensor decomposition;

FIG. 3 is a schematic diagram of LPR point location arrangement in a road network;

FIG. 4 is a schematic diagram of positioning distribution of mobile phones in a road network;

FIG. 5 is a schematic representation of the four-dimensional tensor.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.

Missing data are divided into three types according to the distribution of the missing positions: missing completely at random, missing at random and missing not at random.

In the random missing pattern, a missing data point is correlated with its neighboring data points. The missing pattern is further divided according to the length of the missing data into three types: point-like missing, linear missing and planar missing. Each type is defined as follows:

point-like missing: the traffic flow data acquired on a road section are missing only at a single moment, and the data at adjacent positions are complete;

linear missing: the acquired traffic flow data are missing at consecutive moments, and the data at the remaining positions are complete;

planar missing: the acquired flow data are missing at consecutive moments, and the data at positions adjacent to the missing flow data are also missing.

Investigation of the road sections equipped with LPR detectors shows that continuous linear missing flow and point-like missing flow exist on some road sections.

Traffic data of road sections in the same spatio-temporal range are acquired from mobile phone data and LPR data. The LPR data capture the complete traffic volume in the road network, whereas the mobile phone data capture only part of the traffic volume because of the limited market share of the data source. The mobile phone data are, however, uniformly distributed over the spatio-temporal range of the whole road network, so the two kinds of data are highly similar and can be regarded as isomorphic. When the LPR data suffer linear missing, the mobile phone data, which record trips at all times, can be used to fill the missing LPR data.
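The point-like and linear definitions above can be illustrated for a single road section by measuring runs of consecutive missing periods. This is only a minimal sketch: the function name and the use of None to mark a missing observation are assumptions made for illustration, and planar missing (which also involves adjacent positions) is not covered.

```python
from itertools import groupby

def classify_missing_runs(series):
    """Return (start_index, run_length, label) for each run of missing values.

    `series` holds the flow of one road section over consecutive time
    periods; None marks a missing observation (a hypothetical encoding).
    """
    runs = []
    index = 0
    for is_missing, group in groupby(series, key=lambda v: v is None):
        length = len(list(group))
        if is_missing:
            # One isolated gap is point-like; consecutive gaps are linear.
            label = "point" if length == 1 else "linear"
            runs.append((index, length, label))
        index += length
    return runs

flows = [12, None, 15, 14, None, None, None, 20]
print(classify_missing_runs(flows))  # [(1, 1, 'point'), (4, 3, 'linear')]
```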

In some embodiments, a tensor decomposition-based urban road network linear missing flow estimation method is provided, and includes the following steps:

S100, acquiring the road sections, in the road network to be analyzed, on which license plate photo recognition is installed.

In a road network, license plate photo recognition LPRs are installed on part of road sections, and missing flow in the road sections where the license plate photo recognition LPRs are installed is estimated.

S200, acquiring license plate photo identification data of the road section at a specific time period, and preprocessing the data to acquire traffic flow; and acquiring mobile phone positioning data of the road section at a specific time period, and preprocessing the mobile phone positioning data to acquire the traffic of the vehicle-mounted mobile phone user of the road section.

The data preprocessing is carried out to obtain the traffic flow, and the method comprises the following steps:

(1) Carrying out image recognition to obtain the license plate number and the driving time.

(2) Selecting the records whose license plate numbers are accurately recognized, eliminating redundant data, converting the data format and storing the data in a database; and mapping the LPR detectors onto the map of the road network to be analyzed to obtain the traffic flow of each road section in each time period of each day.
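The counting part of this preprocessing might look like the sketch below. The record layout (road section id, plate, timestamp), the treatment of an empty plate as an unrecognized read, and the treatment of exact duplicate records as redundant data are all assumptions for illustration; the 15-minute period matches the example given later in the description.

```python
from collections import defaultdict
from datetime import datetime

PERIOD_MINUTES = 15  # 96 periods per day

def count_lpr_flow(records):
    """Count LPR reads per (road_section, date, period_index).

    `records` is an iterable of (road_section_id, plate, timestamp) tuples;
    this layout is hypothetical.
    """
    seen = set()
    flow = defaultdict(int)
    for section, plate, ts in records:
        if not plate:                       # plate not recognized: drop
            continue
        if (section, plate, ts) in seen:    # exact duplicate record: drop
            continue
        seen.add((section, plate, ts))
        period = (ts.hour * 60 + ts.minute) // PERIOD_MINUTES
        flow[(section, ts.date(), period)] += 1
    return dict(flow)

records = [
    ("S1", "A12345", datetime(2021, 11, 5, 8, 3)),
    ("S1", "A12345", datetime(2021, 11, 5, 8, 3)),   # duplicate read
    ("S1", "A67890", datetime(2021, 11, 5, 8, 10)),
    ("S1", "", datetime(2021, 11, 5, 8, 12)),        # unrecognized plate
    ("S1", "A12345", datetime(2021, 11, 5, 8, 20)),
]
flow = count_lpr_flow(records)
```

In this example the period starting at 08:00 (index 32) receives two vehicles and the period starting at 08:15 (index 33) receives one.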

Preprocessing is carried out to obtain the traffic of the vehicle-mounted mobile phone user in the road section, and the method comprises the following steps: mapping the mobile phone positioning data to a road network map, calculating the average speed in unit time, and rejecting the mobile phone positioning data if the average speed is lower than a set threshold value; calculating the maximum instantaneous speed, and eliminating the mobile phone positioning data of which the maximum instantaneous speed exceeds the highest speed limit of each road section; and counting the flow of the vehicle-mounted mobile phone users in each time period of each road section of each day based on the remaining mobile phone positioning data.

Mapping the mobile phone positioning data to a road network map, specifically comprising: firstly, mobile phone positioning data is mapped to each road section in a research road network under the condition of considering the positioning error precision, namely the road network mapping process of the mobile phone positioning data is obtained. The step is a precondition for obtaining traffic parameter information, and is mainly a process for associating mobile phone positioning data containing geographic position information with road network map information.

The original coordinate system of the mobile phone triangulation positioning data is GCJ-02, a geographic coordinate system established by the Chinese national surveying and mapping bureau. The mobile phone positioning data in GCJ-02 coordinates are converted into WGS84 coordinates using a coordinate conversion algorithm published on GitHub. In one embodiment, the Transform_files module, developed as a Python toolkit, is used to convert between the two coordinate systems.

The mobile phone triangulation location data only records the longitude and latitude position information of the mobile phone and does not have any label related to the road section position information. Therefore, the mobile phone positioning data needs to be matched with the road network in the electronic map, so that the mobile phone positioning data and the road network geographic coordinate information are correlated. The matching is realized by a topological algorithm, an automatic dotting method, a probability method and the like.
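As a simple stand-in for the matching methods named above (it is not one of them), a positioning point can be assigned to the nearest road section within an allowed positioning error. The sketch assumes planar, already projected coordinates in metres and straight road sections given by two end points; the function names and the `max_error` parameter are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                      # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def match_to_section(point, sections, max_error):
    """Return the id of the nearest section, or None if none is within max_error."""
    best_id, best_dist = None, max_error
    for section_id, (a, b) in sections.items():
        d = point_segment_distance(point, a, b)
        if d <= best_dist:
            best_id, best_dist = section_id, d
    return best_id

sections = {"S1": ((0, 0), (100, 0)), "S2": ((0, 50), (100, 50))}
print(match_to_section((30, 10), sections, max_error=30))   # S1
print(match_to_section((30, 200), sections, max_error=30))  # None
```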

The average moving speed of each mobile phone user is calculated first, and non-motor-vehicle travel is screened out by setting speed thresholds for the different travel modes. Then the maximum instantaneous speed of the user is checked against the maximum speed limit of the road on which the user is located, to judge whether the data are erroneous or drift data. If the maximum speed limit threshold is exceeded, the data may be drift or error data of the mobile phone user and are also removed.

The average speed calculation formula based on the mobile phone positioning data is as follows:

wherein k represents the k-th moment of each mobile phone user, m represents the number of positioning points of the mobile phone user in the time period T, T_k represents the time interval between the k-th moment and the previous moment, and x(k) represents the longitude and latitude coordinates of the position at the k-th moment.

Considering the actual traffic conditions of the urban road network in Nanjing, the study area, the speed range for walking is set to 0-7 km/h and the speed range for cycling to 0-15 km/h. When road sections in the urban road network are not congested, the average speed of vehicles is generally above 16 km/h. Therefore, an average speed of 16 km/h is selected as the lowest speed threshold for in-vehicle mobile phone users. If the calculated average speed is greater than 16 km/h, the mobile phone positioning data are considered to have been sent by an in-vehicle mobile phone user.

In addition, the maximum instantaneous speed of the cell phone data is also used to determine whether the cell phone positioning data is from a continuous signal transmitted in the moving vehicle. The maximum instantaneous speed calculation formula is as follows:

wherein T_k represents the time interval between the k-th moment and the (k-1)-th moment of the mobile phone user, and x(k) and x(k-1) respectively represent the longitude and latitude of the geographic position of the mobile phone user at the k-th moment and the (k-1)-th moment.

All roads in the city are divided into main roads, secondary roads and branch roads. And the maximum speed requirements demanded at different road grades are different. In one embodiment, the instantaneous speed of the handset user of 80km/h is selected as the maximum speed threshold. If the maximum speed exceeds 80km/h, the positioning data of the mobile phone user is considered to be discontinuous, and then the positioning data needs to be eliminated.
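The two speed screens can be sketched as follows. The 16 km/h and 80 km/h thresholds come from the text; everything else is an assumption for illustration: the track layout, the haversine distance between consecutive points, and the reading of average speed as total distance over total time and of maximum instantaneous speed as the largest per-interval speed.

```python
import math

EARTH_RADIUS_KM = 6371.0
MIN_AVG_SPEED_KMH = 16.0    # lowest average speed for in-vehicle users
MAX_INST_SPEED_KMH = 80.0   # maximum instantaneous speed threshold

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def speeds_kmh(track):
    """Return (average speed, maximum instantaneous speed) for one user.

    `track` is a time-ordered list of (t_seconds, lat, lon) tuples with
    strictly increasing timestamps.
    """
    dists, hours = [], []
    for (t0, *p0), (t1, *p1) in zip(track, track[1:]):
        dists.append(haversine_km(p0, p1))
        hours.append((t1 - t0) / 3600.0)
    avg = sum(dists) / sum(hours)
    vmax = max(d / h for d, h in zip(dists, hours))
    return avg, vmax

def is_vehicle_track(track):
    """Keep a track only if it passes both speed screens."""
    if len(track) < 2:
        return False
    avg, vmax = speeds_kmh(track)
    return avg > MIN_AVG_SPEED_KMH and vmax <= MAX_INST_SPEED_KMH

car = [(0, 32.0, 118.8), (60, 32.005, 118.8), (120, 32.010, 118.8)]
walker = [(0, 32.0, 118.8), (60, 32.0005, 118.8), (120, 32.001, 118.8)]
drift = [(0, 32.0, 118.8), (60, 32.005, 118.8), (120, 32.055, 118.8)]
```

Here `car` moves at roughly 33 km/h and is kept, `walker` at roughly 3 km/h fails the average speed screen, and `drift` contains one jump far above 80 km/h and fails the instantaneous speed screen.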

S300, constructing a 4-dimensional tensor model, wherein dimensions comprise days, time periods, road sections and data types; the data types comprise traffic flow and the flow of a vehicle-mounted mobile phone user; and filling the 4-dimensional tensor model by using the acquired traffic flow and the traffic flow of the vehicle-mounted mobile phone user.

In conjunction with fig. 5, a 4-dimensional tensor model is formed, with dimensions of days × period × link × data type. And filling the data acquired in the step S200 into a 4-dimensional tensor model to acquire an original tensor matrix χ.

The number of days may be selected, for example, as one week; the time periods are the intervals into which each day is divided, for example, if 15 minutes is selected as one time period, the whole day is divided into 96 time periods; the road sections are those selected in step S100. The acquired traffic flow and the flow of in-vehicle mobile phone users in each time period are filled into the corresponding positions.
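A minimal sketch of filling such a tensor with NumPy is shown below. The sizes follow the example in the text (one week, 96 periods); the number of road sections, the dictionary inputs and the use of NaN for missing cells are assumptions made for illustration.

```python
import numpy as np

N_DAYS, N_PERIODS, N_SECTIONS, N_TYPES = 7, 96, 3, 2
PHONE, LPR = 0, 1   # data-type axis: l = 1 (phone) and l = 2 (LPR) in the text

def build_tensor(lpr_flow, phone_flow):
    """Fill a days x periods x sections x types tensor; NaN marks missing cells.

    Both inputs map (day_index, period_index, section_index) -> count;
    this interface is hypothetical.
    """
    chi = np.full((N_DAYS, N_PERIODS, N_SECTIONS, N_TYPES), np.nan)
    for (d, p, s), count in phone_flow.items():
        chi[d, p, s, PHONE] = count
    for (d, p, s), count in lpr_flow.items():
        chi[d, p, s, LPR] = count
    # Weighting array: 1 where a value was observed, 0 where it is missing.
    omega = (~np.isnan(chi)).astype(float)
    return chi, omega

lpr = {(0, 0, 0): 40}
phone = {(0, 0, 0): 9, (0, 1, 0): 11}
chi, omega = build_tensor(lpr, phone)
```

In this toy input the LPR value for day 0, period 1, section 0 is absent, so the corresponding cell stays NaN and its weight is 0.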

And S400, tensor decomposition and recovery are repeatedly carried out, and estimation of the missing traffic flow is completed.

The invention is based on prior information in traffic flow data filling, and improves the CP decomposition method. A Data fusion based CP strategy decomposition method (DFCP for short) is proposed by introducing a regularization penalty coefficient to describe the weight of factor matrixes of different days, different road sections and different Data.

First, the tensor matrix χ is decomposed:

The decomposition takes the CP tensor decomposition form and yields the factor matrices A, B, C and D.

The method converts the low-rank approximation problem into an unconstrained optimization problem based on an optimization method in sparse tensor decomposition, takes the residual square sum of the recovered tensor and a true value as the unconstrained optimization problem, and solves a target function by using a gradient descent method, wherein the form of the method is as follows:

Taking the partial derivatives of the objective function with respect to the decision variables a_ir, b_jr, c_kr and d_lr gives:

wherein x_ijkl is the element of the tensor matrix χ holding the mobile phone positioning data or license plate data of the k-th road section in the j-th time period of the i-th day, where l = 1 denotes mobile phone positioning data and l = 2 denotes license plate data; ω_ijkl is the weight in the weighting matrix ω of the mobile phone or license plate data of the k-th road section in the j-th time period of the i-th day; a_ir, b_jr, c_kr and d_lr are elements of the time factor matrix A, the road section factor matrix B, the day factor matrix C and the data category factor matrix D, respectively; r is the rank of the tensor decomposition; and n₁, n₂, n₃ and n₄ are the total number of days, the total number of time periods, the total number of road sections and the total number of data types, respectively. According to the gradient descent method, the update formula in each iteration is:

where α denotes the step size of the gradient descent, also called the learning rate. The parameters a_ir, b_jr, c_kr and d_lr of an ordinary dense matrix can be updated step by step according to the above update formulas, and the iteration terminates when the objective function falls below a set value, whereupon all results are output.
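The expressions referred to in this passage can be written out in a form consistent with the symbols defined above. The following is a reconstruction under assumptions, not the patent's own typesetting: R is used here for the rank of the decomposition, and the objective is taken without a factor 1/2 so that its gradient carries the same factor 2 as equation (16). Only the update for a_ir is shown; those for b_jr, c_kr and d_lr follow by symmetry.

```latex
\begin{align}
\hat{x}_{ijkl} &= \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}\, d_{lr} \\
f(A,B,C,D) &= \sum_{i=1}^{n_1}\sum_{j=1}^{n_2}\sum_{k=1}^{n_3}\sum_{l=1}^{n_4}
  \omega_{ijkl}\,\bigl(x_{ijkl} - \hat{x}_{ijkl}\bigr)^{2} \\
\frac{\partial f}{\partial a_{ir}} &= -2 \sum_{j=1}^{n_2}\sum_{k=1}^{n_3}\sum_{l=1}^{n_4}
  \omega_{ijkl}\,\bigl(x_{ijkl} - \hat{x}_{ijkl}\bigr)\, b_{jr}\, c_{kr}\, d_{lr} \\
a_{ir} &\leftarrow a_{ir} - \alpha\, \frac{\partial f}{\partial a_{ir}}
\end{align}
```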

Thereafter, the decomposed factor matrices A, B, C and D are used to restore a tensor from the original tensor and to fill it. Assuming ω is a weighting matrix of the same size as the original tensor matrix χ, the weighted tensor Y is:

Y = ω∗χ

where ∗ denotes the Hadamard product, obtained by multiplying the corresponding elements; in the weighting matrix ω, the element value at a data missing position is 0, and otherwise 1.

The CP method uses the objective function to optimize the estimation error of the observed position in the original data tensor, but does not update the other missing positions separately.
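A minimal sketch of this weighted CP fitting by plain gradient descent is given below, assuming NumPy. The step size, iteration count, random initialisation and the synthetic rank-2 data are illustrative choices, not values from the patent; only observed cells contribute to the loss, and the missing cells are read off the reconstructed tensor afterwards.

```python
import numpy as np

def reconstruct(A, B, C, D):
    """Rebuild the 4-way tensor from its CP factor matrices."""
    return np.einsum('ir,jr,kr,lr->ijkl', A, B, C, D)

def weighted_cp(chi, omega, rank, alpha=0.005, n_iter=500, seed=0):
    """Fit CP factors to the observed cells of chi by gradient descent.

    Returns the factor matrices and the loss recorded before each update.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.random((n, rank)) for n in chi.shape]
    Y = omega * np.nan_to_num(chi)                    # weighted original tensor
    history = []
    for _ in range(n_iter):
        A, B, C, D = factors
        resid = omega * reconstruct(A, B, C, D) - Y   # Z - Y, zero where missing
        history.append(float(np.sum(resid ** 2)))
        grads = [
            2 * np.einsum('ijkl,jr,kr,lr->ir', resid, B, C, D),
            2 * np.einsum('ijkl,ir,kr,lr->jr', resid, A, C, D),
            2 * np.einsum('ijkl,ir,jr,lr->kr', resid, A, B, D),
            2 * np.einsum('ijkl,ir,jr,kr->lr', resid, A, B, C),
        ]
        factors = [F - alpha * G for F, G in zip(factors, grads)]
    return factors, history

# Synthetic rank-2 tensor with a run of consecutive missing cells.
rng = np.random.default_rng(1)
truth = reconstruct(*[rng.random((n, 2)) for n in (4, 5, 3, 2)])
omega = np.ones_like(truth)
omega[2, 1:4, 0, 1] = 0                               # linear gap in one data type
observed = np.where(omega == 1, truth, np.nan)
factors, history = weighted_cp(observed, omega, rank=2)
estimate = reconstruct(*factors)[2, 1:4, 0, 1]        # filled-in values for the gap
```

Plain gradient descent is used here only because the text describes it; how close `estimate` comes to the true values depends on the rank, step size and number of iterations chosen.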

Suppose that

In order to be the weighted original matrix, the matrix is,

for the same size restored weighted tensor matrix, the original CP decomposition computes the loss function as follows:

the DFCP method adds regularization term constraint on the basis of the CP decomposition method according to the correlation mode among different dimensions in traffic flow data. The core idea is that because all road sections in the urban network are connected with each other, the traffic volume between the road sections has certain correlation in a space-time range. According to the space-time correlation among the road sections, the weight of the road section for adjusting the flow to be filled is larger than that of the similar road section, and the weight of the road section for adjusting the flow to be filled is smaller than that of the dissimilar road section, so that the filled flow is kept in a reasonable range. Therefore, assuming that A is a decomposed time factor matrix, B is a decomposed road section factor matrix, C is a decomposed day factor matrix, and D is a decomposed data category factor matrix, a regularization item for reflecting the time-varying characteristics of the road section is added on the basis of considering the original objective function, and the regularization item comprises the fitting time period characteristic | | A-T_AI, road section characteristic B-T_BC-T, | characteristics of days | |_CAnd data class characteristics D-T_typeFour kinds of constraints, | | | A | | luminance need to be added to prevent the regular term of overfitting in addition²+||B||²+||C||²+||D||²So as to improve the completion effect of the algorithm on unknown data. The loss function after adding various regularization terms described above can be expressed as:

wherein, T_ATraffic matrix, T, for averaging all data categories per road segment in each day in a corresponding time dimension_BTraffic matrix, T, for all data classes in each time period of the average day in the corresponding road section dimension_CTraffic matrix, T, for all data classes in each time segment in each road segment averaged over the corresponding dimension of days_typeFor the traffic matrix, lambda, in the corresponding data category dimension, averaged over all time periods in each day in each road section₁，λ₂，λ₃，λ₄And p is a weight coefficient of each partial regularization term. The traffic volume is the sum of the traffic volume of the mobile phone user and the traffic volume, and then is divided by 2 to be the average value of the traffic volume and the traffic volume.
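One way to write a loss with the terms named in the description is shown below. It is a reconstruction under assumptions rather than the patent's own equation: Y and Z stand for the weighted original tensor and the recovered weighted tensor, the squared Frobenius norm on the four fitting terms is assumed, and the coefficient p is assumed to weight the overfitting term.

```latex
\begin{equation}
\begin{aligned}
L(A,B,C,D) ={}& \lVert Y - Z \rVert_F^{2}
 + \lambda_1 \lVert A - T_A \rVert_F^{2}
 + \lambda_2 \lVert B - T_B \rVert_F^{2}
 + \lambda_3 \lVert C - T_C \rVert_F^{2}
 + \lambda_4 \lVert D - T_{type} \rVert_F^{2} \\
 &+ p \left( \lVert A \rVert_F^{2} + \lVert B \rVert_F^{2}
 + \lVert C \rVert_F^{2} + \lVert D \rVert_F^{2} \right)
\end{aligned}
\end{equation}
```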

The correlation between the partial flow obtained from mobile phone positioning data on the study road sections and the real flow obtained by the LPR detectors, together with the continuity and completeness of the mobile detector data over the space-time range, can be used to better describe the linearly missing data. This in turn effectively compensates for the sparse layout of fixed detectors in the road network.

The tensor matrix χ restored using equation (15),

For each decomposed factor matrix A⁽ⁿ⁾, the gradient is recalculated in each iteration as follows:

G⁽ⁿ⁾＝-2Y_(n)A^(-n)+2Z_(n)A^(-n) (16)

In the formula, G⁽ⁿ⁾ is the updated gradient for the n-th factor matrix, Y_(n) is the mode-n matrix of the weighted original tensor (the original values), and Z_(n) is the mode-n matrix of the restored weighted tensor (the restored values).

Here A^(-n) is calculated as A^(-n) = D ⊙ C ⊙ B ⊙ A, with the n-th factor matrix left out of the product, and ⊙ denotes the Khatri-Rao product of matrices. The matrices are updated iteratively, and the procedure can be terminated early once the objective function meets the requirement.
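The gradient in equation (16) can be sketched in NumPy as follows. This is a minimal illustration that assumes a binary weight tensor (1 for observed cells, 0 for missing ones) and leaves out the regularization terms. The helper names `khatri_rao`, `unfold`, `reconstruct` and `weighted_gradient` are hypothetical. The Khatri-Rao ordering used here matches NumPy's row-major unfolding, so it differs in order, not in content, from the D ⊙ C ⊙ B ⊙ A form written above.

```python
import numpy as np


def khatri_rao(matrices):
    """Column-wise Kronecker product; the first matrix varies slowest along the rows."""
    rank = matrices[0].shape[1]
    result = matrices[0]
    for m in matrices[1:]:
        result = (result[:, None, :] * m[None, :, :]).reshape(-1, rank)
    return result


def unfold(tensor, mode):
    """Mode-n matrix in row-major order: the remaining modes keep their original order."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def reconstruct(factors):
    """Rebuild a 4-way tensor from its four CP factor matrices."""
    return np.einsum("ir,jr,kr,lr->ijkl", *factors)


def weighted_gradient(x, w, factors, mode):
    """G^(n) = -2 Y_(n) A^(-n) + 2 Z_(n) A^(-n), assuming w contains only 0s and 1s."""
    y = w * x                      # weighted original tensor
    z = w * reconstruct(factors)   # weighted restored tensor
    others = [f for n, f in enumerate(factors) if n != mode]
    a_minus_n = khatri_rao(others)  # Khatri-Rao product without the mode-n factor
    return -2 * unfold(y, mode) @ a_minus_n + 2 * unfold(z, mode) @ a_minus_n


# Hypothetical toy problem: 3 days, 4 time periods, 5 road sections, 2 data types.
rng = np.random.default_rng(0)
shape, rank = (3, 4, 5, 2), 2
x = rng.random(shape)
w = (rng.random(shape) > 0.3).astype(float)
factors = [rng.random((size, rank)) for size in shape]

grad_time = weighted_gradient(x, w, factors, mode=1)
print(grad_time.shape)  # (4, 2)
```

A gradient-based optimizer would call `weighted_gradient` once per factor matrix in each iteration and stop when the objective no longer improves.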

Once the iterative process is complete, the tensor matrix has been filled. The complete flow data are then read out by day, time period and road section.

With reference to FIG. 2, the invention provides a tensor decomposition-based urban road network linear missing flow estimation system, which comprises a road section selection module, a data acquisition module, a tensor construction module and a tensor decomposition module.

The road section selection module is used for acquiring the road sections to be analyzed that are equipped with license plate photo recognition.

The data acquisition module is used for acquiring license plate photo identification data of the road section at a specific time period, and carrying out data preprocessing to acquire traffic flow; and acquiring mobile phone positioning data of the road section at a specific time period, and preprocessing the mobile phone positioning data to acquire the traffic of the vehicle-mounted mobile phone user of the road section.

The tensor construction module is used for constructing a 4-dimensional tensor model, and dimensions comprise days, time periods, road sections and data types; the data types comprise traffic flow and the flow of a vehicle-mounted mobile phone user; and filling the 4-dimensional tensor model by using the acquired traffic flow and the traffic flow of the vehicle-mounted mobile phone user.
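The construction and filling of the 4-dimensional tensor can be sketched as below. This is only an illustration: the record layout, the function name `build_flow_tensor`, the use of NaN for unobserved cells and the binary weight tensor are all assumptions made for the example, and the flow values are invented.

```python
import numpy as np


def build_flow_tensor(records, n_days, n_periods, n_sections, n_types=2):
    """Fill a (days, time periods, road sections, data types) tensor.

    Cells without a record stay NaN. The returned weight tensor marks
    observed cells with 1 and missing cells with 0.
    """
    tensor = np.full((n_days, n_periods, n_sections, n_types), np.nan)
    for day, period, section, data_type, flow in records:
        tensor[day, period, section, data_type] = flow
    weights = (~np.isnan(tensor)).astype(float)
    return tensor, weights


# Hypothetical records: (day, time period, road section, data type, flow).
# Data type 0 = vehicle-mounted mobile phone user flow, 1 = LPR traffic flow.
records = [
    (0, 0, 0, 0, 35.0),
    (0, 0, 0, 1, 120.0),
    (0, 1, 0, 0, 40.0),  # the LPR count for this cell is missing
]
tensor, weights = build_flow_tensor(records, n_days=1, n_periods=2, n_sections=1)
```

Keeping the observation pattern in a separate weight tensor lets the decomposition step fit only the observed cells while still producing estimates for the missing ones.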

There is provided a computer readable storage medium having stored therein program instructions, which when executed by a processor, implement the tensor decomposition-based urban road network linear missing flow estimation method.

The computer-readable storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a portable compact disc read only memory (CD-ROM), a USB memory, or any combination of the above storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.

Examples

The urban road network within 3 kilometers north of Nanjing South Railway Station in Jiangsu Province is selected as the research object. The road sections in this network are diverse and include road sections of different grades, such as main roads, secondary roads and branch roads. The study road network has 36 bidirectional road sections, 20 of which are equipped with LPR detector devices; the positions of the LPR layout points in the road network are shown in FIG. 3. The road sections covered by these LPR detectors are regarded as sections on which the real flow can be obtained, and filling the linear missing flow that occurs on these sections is the estimation objective. In addition, all mobile phone positioning data in the study road network covering the same time period as the LPR data are collected. The distribution of the mobile phone positioning data is shown in FIG. 4.

The invention performs decomposition and reconstruction based on the DFCP model, and selects the alternating least squares CP decomposition algorithm (CP-ALS), the high-accuracy low-rank tensor completion algorithm (HaLRTC) and the Tucker decomposition filling algorithm as comparison methods.

The methods are implemented on the Python platform using the open-source tensorly package. The computer is configured with an Intel(R) Core i7-6500U CPU @ 2.50 GHz (2.59 GHz) and 8 GB of installed memory. To ensure the reliability of the experimental results, each experiment was run 10 times and the average of the 10 runs was taken.

The flow-filling MAPE results for different linear missing lengths are given in Table 1. Across the different missing lengths and missing rates in Table 1, the overall results show that the DFCP decomposition method proposed by the invention is significantly better than the other three methods, which rank in order as the CP-ALS, Tucker (5,5,5) and HaLRTC estimation methods. The estimation errors of all four methods increase as the missing length and the missing rate increase, but the DFCP estimates remain comparatively stable. By contrast, the accuracy of the Tucker (5,5,5) and HaLRTC methods varies greatly with the missing length: a larger error occurs at a missing length of 3 with a missing rate of 50%, and the MAPE falls when the missing length increases but the missing rate decreases. This shows that these two methods are relatively sensitive to the missing rate. When the missing rate is high, the proposed DFCP decomposition method has a greater advantage. In particular, when the missing length is 12 units and the missing rate is 10%, the DFCP estimates show a clear advantage over the other estimation methods.
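The evaluation described here can be illustrated with a small sketch that hides a run of consecutive time periods and scores the filled values with MAPE. The function names `mape` and `linear_missing_mask`, the single-series layout and the numbers are hypothetical and do not reproduce the values in Table 1.

```python
import numpy as np


def mape(actual, estimate, held_out):
    """Mean absolute percentage error (%) over the held-out cells only."""
    a, e = actual[held_out], estimate[held_out]
    return 100.0 * np.mean(np.abs((a - e) / a))


def linear_missing_mask(n_periods, length, start):
    """Boolean mask that hides `length` consecutive time periods beginning at `start`."""
    mask = np.zeros(n_periods, dtype=bool)
    mask[start:start + length] = True
    return mask


# Invented flows for one road section over six time periods.
actual = np.array([100.0, 120.0, 80.0, 90.0, 110.0, 95.0])
estimate = np.array([100.0, 114.0, 88.0, 99.0, 110.0, 95.0])

held_out = linear_missing_mask(len(actual), length=3, start=1)
error = mape(actual, estimate, held_out)
print(f"MAPE over the hidden run: {error:.2f}%")
```

Scoring only the hidden cells keeps the observed entries, which any completion method reproduces closely, from diluting the error.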

Table 1 flow filling MAPE estimation results for different linear missing lengths

In summary, the invention relates to a tensor decomposition-based urban road network linear missing flow estimation method and system. Road sections to be analyzed that are equipped with license plate photo recognition in the road network are acquired; license plate photo recognition data of the road sections in a specific time period are acquired and preprocessed to obtain the traffic flow; mobile phone positioning data of the road sections in the specific time period are acquired and preprocessed to obtain the flow of vehicle-mounted mobile phone users on the road sections; a 4-dimensional tensor model is constructed and filled with the acquired traffic flow and vehicle-mounted mobile phone user flow; and tensor decomposition and recovery are carried out repeatedly to complete the estimation of the missing traffic flow. The invention uses the full space-time coverage of the mobile phone positioning data from mobile detectors, together with the space-time correlation among multi-source data, to fill the linear loss of traffic volume that occurs in the LPR detectors among the fixed detectors, thereby obtaining LPR data with full space-time coverage and ensuring the integrity and accuracy of the data.

It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims

1. A tensor decomposition-based linear missing traffic estimation method for an urban road network is characterized by comprising the following steps:

2. The tensor decomposition-based urban road network linear missing traffic estimation method according to claim 1, wherein the data preprocessing is performed to obtain traffic flow, and the method comprises the following steps:

3. The tensor decomposition-based urban road network linear missing flow estimation method according to claim 1 or 2, wherein the preprocessing for obtaining the flow of the vehicle-mounted mobile phone user on the road section comprises: mapping the mobile phone positioning data to a road network map, calculating the average speed in unit time, and rejecting the mobile phone positioning data if the average speed is lower than a set threshold value; and counting the flow of the vehicle-mounted mobile phone users in each time period of each road section of each day based on the remaining mobile phone positioning data.

4. The tensor decomposition-based urban road network linear missing flow estimation method according to claim 3, wherein the average speed is calculated as follows:

5. The tensor decomposition-based linear missing urban road network flow estimation method as recited in claim 3, wherein after mobile phone positioning data are mapped into a road network map, the method further comprises calculating a maximum instantaneous speed, and eliminating mobile phone positioning data of each road section, of which the maximum instantaneous speed exceeds the highest speed limit of the road section.

6. The tensor decomposition-based urban road network linear missing flow estimation method according to claim 5, wherein the maximum instantaneous speed is calculated as follows:

7. The method for estimating linear missing traffic of urban road network based on tensor decomposition as claimed in claim 1 or 2, wherein the repeating tensor decomposition and recovery comprises:

constructing a weighted original matrix based on the tensor matrix χ:

performing CP decomposition on the tensor matrix χ to obtain

the weighted tensor after the same size recovery is:

wherein x_ijkl is the element of the tensor matrix χ holding the mobile phone positioning data or license plate data of the k-th road section in the j-th time period of the i-th day, where l = 1 denotes mobile phone positioning data and l = 2 denotes license plate data; ω_ijkl is the weight value in the weighting matrix Ω for the mobile phone or license plate data of the k-th road section on the i-th day; a_ir, b_jr, c_kr, d_lr are elements of the time factor matrix A, the road section factor matrix B, the day factor matrix C and the data factor matrix D, respectively; R is the rank of the tensor decomposition; and n₁, n₂, n₃, n₄ are the total number of days, the total number of time periods, the total number of road sections and the total number of data types, respectively;

updating the gradient G⁽ⁿ⁾＝-2Y_(n)A^(-n)+2Z_(n)A^(-n)

8. The method for estimating linear missing traffic of urban road network based on tensor decomposition as recited in claim 7, wherein the loss function of the CP decomposition process is as follows:

wherein T_A is the traffic matrix in the time dimension, averaged over all days, road sections and data categories; T_B is the traffic matrix in the road section dimension, averaged over all days, time periods and data categories; T_C is the traffic matrix in the day dimension, averaged over all time periods, road sections and data categories; T_type is the traffic matrix in the data category dimension, averaged over all days, time periods and road sections; λ₁, λ₂, λ₃, λ₄ and p are the weight coefficients of the respective regularization terms; and the traffic volume is the average value of the traffic flow and the flow of the vehicle-mounted mobile phone users.

9. A tensor decomposition-based linear missing traffic estimation system for an urban road network is characterized by comprising the following steps:

10. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, which when executed by a processor, implement the tensor decomposition-based urban road network linear missing flow estimation method according to one of claims 1 to 8.