WO2018122801A1

WO2018122801A1 - Method for detecting traffic anomaly of urban road

Info

Publication number: WO2018122801A1
Application number: PCT/IB2017/058531
Authority: WO
Inventors: 杜豫川; 邓富文
Original assignee: 同济大学; 杜豫川; 许军
Priority date: 2016-12-30
Filing date: 2017-12-30
Publication date: 2018-07-05
Also published as: GB2587588A; GB2582532A; CN109716414A; GB201909409D0; WO2018122585A1; CN110168520A; GB2572717B; GB2572717A; GB2569924A; CN109844832B; WO2018122806A1; GB201905907D0; GB2582532B; WO2018122805A1; CN109923595B; WO2018122802A1; CN109923595A; CN109791729B; CN109844832A; GB202100341D0

Abstract

A method for detecting traffic anomalies on urban roads. Spatial position information at different time points is obtained from the vehicle-mounted GNSS positioning devices of floating vehicles, and traffic anomaly events on urban roads are detected automatically by analyzing and mining a large quantity of floating vehicle trajectory information. In the detection method, the traffic state is represented by the probability distribution of travel speed, and differences between traffic states are reflected by measures of the difference between probability distributions. The principle is clear, the implementation is simple and convenient, and the detection rate is high.

Description

Urban road traffic anomaly detection method

Technical field

The invention belongs to the technical field of traffic detection. In particular, the present invention relates to a method for real-time detection of urban road traffic anomalies. Spatial position information at different time points is obtained through the on-board GNSS positioning devices of floating cars. After data preprocessing, map matching and data fusion, the travel speed probability distribution of a specific time and space range is obtained; urban road traffic anomalies can then be effectively identified from changes in the speed distribution.

Background art

Traffic anomaly detection is an important part of urban traffic management and one of the core functions of intelligent transportation systems. Traffic anomalies mainly include traffic accidents, vehicle breakdowns, fallen cargo, damage to or malfunction of road traffic facilities, and other special events that disturb traffic flow. Such incidents readily cause traffic congestion, reduce road capacity, and severely affect the normal operation of the entire road traffic system. Through traffic anomaly detection, traffic managers can learn of traffic anomalies in time and take appropriate guidance and control measures to reduce their adverse effects.

Traffic anomaly detection can be divided into manual and automatic modes. Manual methods include patrol cars, emergency telephone reporting, and video surveillance; because they consume considerable human and material resources and have poor real-time performance, they cannot meet traffic management needs. Automatic methods rely on automatic incident detection (AID) algorithms, whose basic principle is to identify traffic anomalies by detecting changes in road traffic at different locations. Currently used AID algorithms include pattern recognition algorithms (such as the California algorithm and the Monica algorithm), statistical prediction algorithms (such as exponential smoothing and Kalman filtering), traffic flow model algorithms (such as the McMaster algorithm), and intelligent recognition algorithms (such as artificial neural networks and fuzzy logic algorithms).

However, the current detection methods have disadvantages such as high requirements on facilities, high computational complexity, and an inability to further characterize the abnormal condition once it is detected. The invention uses the trajectory data returned by taxi and bus GNSS positioning devices to establish a historical traffic state database and a real-time traffic state database, and detects traffic anomaly events by analyzing the difference between the traffic flow characteristics reflected by the two. The method offers good real-time performance, parallel processing, a high recognition rate and low requirements for detection facilities, and is suitable for detecting urban road traffic anomalies wherever real-time floating vehicle positioning data are available.

At present, for the monitoring of traffic anomalies, the following representative technologies are available:

A US patent application, US 20160148512, discloses the composition and implementation of a traffic anomaly detection and reporting system. The system consists of a sensor, a communication module, a mobile processing module, and a user interaction module. The sensor collects relevant data around the vehicle; the communication module transmits the vehicle's data and receives data from surrounding vehicles; the mobile processing module processes and analyzes the data of the relevant vehicles in a certain area and generates a traffic event report; the user interaction module presents traffic incident reports to the user. The scheme is a traffic anomaly detection technology based on a vehicle-to-vehicle communication network, which can use the various types of information collected by sensors to identify abnormal events. However, the sensor and the communication unit need to be separately installed and debugged, so implementation is difficult; the processing capacity of the mobile processing unit is limited; and both mobile and fixed message receivers are required, so the system itself has a probability of failure and its reliability is poor.

A Chinese patent application, CN 104809878 A, discloses a method for detecting abnormal states of urban road traffic using bus GPS data. The scheme obtains a link delay time index from historical GPS data, obtains the instantaneous speed, the cycle average speed, the weighted moving average speed and the multi-vehicle average speed from the current GPS data, and uses a canonical variate analysis algorithm to detect abnormality. This scheme requires no additional detection facilities and is easy to implement. However, its characterization of the traffic situation is too simplistic, and it cannot analyze the characteristics and causes of traffic anomalies. It gives no basis for the division of traffic scenarios, and it cannot account for the influence of weather and other factors on changes in the traffic situation.

Summary of the invention

In order to explain the contents of the present invention more clearly, the technical terms involved will first be explained as follows:

Floating car: Also known as the probe car. Refers to buses and taxis that have on-board positioning devices and are driving on city roads.

GNSS: Global Navigation Satellite System. Including GPS, GLONASS, GALILEO and Beidou satellite navigation systems.

Space-time sub-zone: A zone delimited in two dimensions, time and space, that is, a certain space within a certain period of time. The day is divided into several time segments, for example 0:00-0:10, 0:10-0:20, ..., each of which is called a time sub-zone. The implementation area of urban road traffic anomaly detection is divided into a number of spatial segments, for example longitude 121.58° E-121.59° E, latitude 31.16° N-31.17° N, each of which is called a spatial sub-zone. The space-time segment formed by the intersection of any one time sub-zone and any one spatial sub-zone is called a space-time sub-zone, for example the segment covering 0:00-0:10 in the region between longitude 121.58° E-121.59° E and latitude 31.16° N-31.17° N.

Historical trajectory data: Historical trajectory data is trajectory data accumulated over a long period of time and stored in a database. Historical trajectory data is dynamically changing data that needs to be updated in a timely manner and periodically reprocessed and analyzed to ensure the accuracy of historical traffic feature extraction. The data for each time-space sub-area can be processed in parallel to increase efficiency. In the present invention, it may be simply referred to as historical data.

Real-time trajectory data: The real-time trajectory data is a trajectory data set within a time zone that is closest to the current time. In the present invention, it may be simply referred to as real-time data.

Traffic situation: A general term for the comprehensive situation of traffic operations within a certain period of time and within a certain space.

Traffic anomalies: Traffic flow disturbances caused by traffic accidents, vehicle breakdowns, cargo falling from trucks, and damage to or malfunction of road traffic facilities.

Abnormal traffic severity: The degree of traffic flow disturbance, that is, the difference between the traffic flow characteristics under normal conditions and those after a traffic anomaly occurs.

Traffic Anomaly Index: A measure of the severity of a traffic anomaly. The range is 0 to 10. The larger the value, the more serious the traffic anomaly.

Traffic environment: The sum of all external influences and forces acting on road traffic participants. This includes road conditions, transportation facilities, landforms, meteorological conditions, and traffic activities of other transportation participants.

Map Matching: The process of associating geographic coordinates with a city road network.

Peak hourly traffic: The maximum hourly traffic flow in a city's road section.

Finite Mixing Model: A mathematical method of approximating a complex density with a combination of simple densities. A finite mixture model of a variable y with K components can be expressed as:

$$p(y) = \sum_{l=1}^{K} \eta_l \, p_l(y)$$

where $p_l(y)$ is the density of the l-th component and $\eta_l$ is the proportion of that component in the mixture.

Response variable: A variable that changes according to an independent variable, also called a dependent variable.

Bayesian information criterion: A criterion for choosing among candidate statistical models that balances goodness of fit against the number of parameters; the model with the smaller value is preferred. Its calculation method is:

$$\mathrm{BIC} = -2\ln L + k\ln n$$

where $L$ is the maximum value of the likelihood function, $k$ is the number of unknown parameters, and $n$ is the sample size.

Likelihood function: The likelihood function is a function of the parameters of the statistical model. Given the observed outcome $x$, the likelihood of the parameter $\theta$ is equal (in numerical value) to the probability of observing $x$ given $\theta$: $L(\theta \mid x) = P(X = x \mid \theta)$.

Parameter Estimation: A method of estimating unknown parameters contained in the overall distribution based on samples taken from the population.

EM algorithm: The Expectation Maximization Algorithm is an iterative algorithm for the maximum likelihood estimation or the maximum posterior probability estimation of a probability parameter model with implicit variables.

Kullback-Leibler divergence: A measure of the difference between two probability distributions, P and Q.

Jensen-Shannon divergence: is a symmetrized form of Kullback-Leibler divergence.

K-Medoids algorithm: A clustering algorithm that, in each iteration, selects from the current cluster the point with the smallest sum of distances to all other points in that cluster as the new center point.

The object of the present invention is to establish a scheme based on a floating vehicle trajectory recording system that uses historical GNSS positioning data and real-time GNSS positioning data, combined with traffic environment information, to identify road traffic anomalies.

In order to achieve the above object, the present invention provides the following technical solutions:

The premise of the present invention is: a floating car (a taxi, a bus, etc.) equipped with a GNSS track recorder; a data center having a large-scale storage, calculation, and real-time task processing capability.

The scope of application of the present invention is: Urban roads (including ground roads and elevated roads) through which the above-mentioned floating vehicles pass.

The implementation steps of the present invention include:

1) Determine the spatio-temporal range of the test and establish the spatio-temporal sub-area.

Based on actual application requirements, determine the time range and spatial extent within which traffic anomaly events need to be detected. The time range can be set to the whole day, that is, 0:00-24:00, or to a specific period. For example, to detect traffic anomalies during the period from 17:00 to 20:00, the detection time range is set to 17:00-20:00. This is only one example; many other settings are possible and are not enumerated here. The spatial scope can be set as a certain city or district according to administrative division, such as Beijing, Shanghai, or Huangpu District. It can also be set as a certain urban functional area according to the urban spatial structure, such as the central business district or an industrial area of a city.

The establishment of the spatiotemporal sub-area refers to dividing the detected time range into a number of smaller time segments, and dividing the detected spatial range, that is, the implementation area of the urban road traffic anomaly detection, into a plurality of smaller spatial segments. For the establishment of spatiotemporal sub-areas, a variety of empirical division methods can be used, including equidistant space-time division method and non-equidistant space-time division method.
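As an illustration of the equidistant case, the following minimal sketch assigns one GNSS fix to a space-time sub-zone. The function name `subzone_key`, the 10-minute time segment and the 0.01° grid cell are assumptions taken from the example values in the term definitions above, not values prescribed by the method.

```python
import math
from datetime import datetime


def subzone_key(ts, lon, lat, minutes=10, cell_deg=0.01):
    """Return (time_index, lon_index, lat_index) for one GNSS fix.

    ts       : datetime of the fix
    lon, lat : coordinates in decimal degrees
    minutes  : span of a time sub-zone (assumed value)
    cell_deg : edge of a spatial sub-zone in degrees (assumed value)
    """
    # Index of the time segment within the day, e.g. 0:00-0:10 -> 0.
    time_index = (ts.hour * 60 + ts.minute) // minutes
    # Round before flooring to absorb floating-point noise at cell edges.
    lon_index = math.floor(round(lon / cell_deg, 6))
    lat_index = math.floor(round(lat / cell_deg, 6))
    return time_index, lon_index, lat_index


key = subzone_key(datetime(2017, 1, 1, 0, 5), 121.585, 31.165)
```

Fixes that share the same key belong to the same space-time sub-zone, so the key can serve directly as a grouping index when the sub-zones are processed in parallel.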

2) Data preprocessing.

The GNSS positioning data undergoes data cleaning, data integration, data conversion, and data reduction to improve the structure of the data. GNSS, the Global Navigation Satellite System, is a space-based radio navigation and positioning system that provides all-weather three-dimensional coordinates, speed and time information anywhere on the Earth's surface or in near-Earth space. It mainly includes GPS (Global Positioning System) of the United States, GLONASS (Global Navigation Satellite System) of Russia, GALILEO of the European Union and China's Beidou satellite navigation system. It also includes regional navigation and positioning systems such as QZSS of Japan and IRNSS of India, as well as satellite positioning augmentation systems such as WAAS of the United States and MSAS of Japan. In order to establish a unified data distribution standard among the devices of different navigation and positioning systems, the National Marine Electronics Association (NMEA) of the United States has developed the unified NMEA communication protocol to regulate GNSS data broadcasting. Therefore, although the member systems of GNSS, such as GPS and GLONASS, are established and maintained by different countries and organizations, they have a consistent data distribution format, so there is no need to transform the data format.

Within the selected space, there are many vehicles equipped with GNSS positioning equipment, such as taxis, buses, freight vehicles, and private cars. Given the current state of urban traffic data applications, urban taxis are often used in practice as the floating vehicles that supply data to traffic anomaly detection systems. The collected GNSS positioning information contains some unreasonable records. To ensure the accuracy of traffic abnormal state detection and discrimination, it is first necessary to identify the abnormal data and ensure the reliability of the data. These anomalous data include data that fall outside the time and space range of detection, and spatial position jumps that are clearly beyond a reasonable range. A "spatial position jump beyond a reasonable range" is illustrated as follows. Suppose the positioning point uploaded by a floating vehicle positioning device at 10:30:00 on a certain day is recorded as A, the positioning point uploaded by the same device at 10:30:30 is recorded as B, and the distance between position A and position B is 1500 meters. The speed of the floating car is then at least 180 km/h, which is implausible on an urban road, so this is an abnormal spatial position jump and should be eliminated in data processing.
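The position-jump check can be sketched as follows. This is a minimal illustration rather than the method's prescribed procedure: the function names, the great-circle (haversine) distance and the 120 km/h speed limit are assumptions introduced for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, used by the haversine formula


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def drop_position_jumps(fixes, max_speed_kmh=120.0):
    """Remove fixes that imply an implausible speed from the last kept fix.

    fixes: list of (t_seconds, lon, lat) for one vehicle, sorted by time.
    max_speed_kmh: assumed plausibility limit, not taken from the source.
    """
    kept = []
    for fix in fixes:
        if kept:
            t0, lon0, lat0 = kept[-1]
            dt = fix[0] - t0
            if dt <= 0:
                continue  # duplicate or out-of-order timestamp
            speed_kmh = haversine_m(lon0, lat0, fix[1], fix[2]) / dt * 3.6
            if speed_kmh > max_speed_kmh:
                continue  # abnormal spatial position jump
        kept.append(fix)
    return kept


# Point B is about 1500 m from A after 30 s (roughly 180 km/h) and is dropped.
track = [(0, 121.58, 31.1600), (30, 121.58, 31.1735), (60, 121.58, 31.1605)]
cleaned = drop_position_jumps(track)
```

Comparing each fix with the last kept fix, rather than with its immediate predecessor, prevents a single bad point from also discarding the valid point that follows it.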

3) Quick map matching.

After the GNSS positioning data has been preprocessed, it is combined with the urban road network data: the GNSS positioning points are mapped onto the city map by a map matching algorithm, the matching relationship between positioning points and road segments is established, and the error caused by positioning drift is corrected.

At present, the electronic maps of most geographical regions are relatively detailed. Such electronic maps can be derived from the city's geographic information system or obtained in other ways. These electronic maps record the urban road information in detail, and the roads can be divided into a number of sections. By matching the positioning points to the road segments according to distance, angle and similar criteria, the positioning information is matched to the actual geographical environment.

4) The representation of the floating vehicle path and the matching of different vehicle paths.

Given a starting point and an ending point, the path of a vehicle may not be unique. The complex urban traffic network consists of a number of sections, which are numbered, for example, L1, L2, etc. A road may have two directions of travel; in this case, the two directions of travel should be represented as two different sections and given different numbers.

The starting point and ending point can usually be taken at intersections of road segments in the urban road network. Once the path of one floating car is known, the paths identical to it are selected from the path information sent by other floating cars, so as to obtain the group of identical paths between the starting point and the ending point.
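One simple way to represent a path and to find identical paths is to store each path as an ordered sequence of directed section numbers. The sketch below assumes that representation; the function names and the sample vehicle identifiers are hypothetical.

```python
from collections import defaultdict


def group_by_path(vehicle_paths):
    """Group vehicles whose paths are the same ordered sequence of sections.

    vehicle_paths: dict mapping a vehicle id to a list of directed section
    numbers such as ["L1", "L2"].
    """
    groups = defaultdict(list)
    for vehicle_id, sections in vehicle_paths.items():
        # A tuple is hashable, so the whole path can act as a dictionary key.
        groups[tuple(sections)].append(vehicle_id)
    return dict(groups)


def same_path_vehicles(vehicle_paths, reference_id):
    """Return the sorted ids of all vehicles sharing the reference vehicle's path."""
    reference = tuple(vehicle_paths[reference_id])
    return sorted(v for v, s in vehicle_paths.items() if tuple(s) == reference)


paths = {
    "taxi_a": ["L1", "L2", "L5"],
    "taxi_b": ["L1", "L3", "L5"],  # same start and end, different route
    "bus_c": ["L1", "L2", "L5"],
}
matches = same_path_vehicles(paths, "taxi_a")
```

Because opposite travel directions carry different section numbers, two vehicles driving the same road in opposite directions are never placed in the same group.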

5) Data sampling.

The positioning data of the floating car includes information such as position coordinates, instantaneous vehicle speed, and recording time. In the urban road traffic anomaly detection method based on floating car data proposed in this patent, data sampling refers to selecting part of the floating car data for subsequent analysis and processing; the selection is made according to the computing power of the data center and the accuracy requirements set in advance. Different data sampling methods can be used for different computing capabilities and accuracy requirements. For example, when the computing power of the data center is strong and the required detection accuracy is high, all the floating vehicle positioning data can be taken as the processing object and analyzed in full. When the computing power of the data center is limited, suppose the data center can process 500 records for each spatial sub-area within 1 minute, while each spatial sub-area actually generates 2000 floating vehicle positioning records within 1 minute. In that case 500 records can be drawn at random from the 2000 for analysis, so as to obtain processing results of limited accuracy within the computing power of the data center.

Depending on how the floating car data is used, it is possible to sample different properties of the floating car data, such as the travel speed and travel time. The urban road traffic anomaly detection method based on floating car data proposed in this patent uses the travel speed as the basis for urban road traffic anomaly detection. Therefore, data sampling refers to sampling the speed of the journey.
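The sampling of travel speeds can be sketched as below. The sketch assumes that a vehicle's travel speed in a sub-zone is its path length between consecutive fixes divided by the elapsed time; the function names, the `seed` parameter and the default capacity of 500 records (the figure from the example above) are illustrative assumptions.

```python
import random


def travel_speed_kmh(distances_m, t_first_s, t_last_s):
    """Travel speed from consecutive point-to-point distances and elapsed time."""
    elapsed = t_last_s - t_first_s
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return sum(distances_m) / elapsed * 3.6  # m/s -> km/h


def sample_speeds(speeds, capacity=500, seed=None):
    """Keep all speeds if they fit the capacity, else draw a random subset."""
    speeds = list(speeds)
    if len(speeds) <= capacity:
        return speeds
    rng = random.Random(seed)  # seed only makes the sketch reproducible
    return rng.sample(speeds, capacity)


# 2000 speed records arrive but only 500 can be processed in time.
population = [20.0 + (i % 40) for i in range(2000)]
subset = sample_speeds(population, capacity=500, seed=0)
```

Sampling without replacement keeps every selected record distinct, so the reduced set remains an unbiased sample of the speeds observed in the sub-zone.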

6) Historical trajectory data analysis and feature extraction.

Historical trajectory data refers to the floating vehicle trajectory data accumulated during long-term urban road traffic operation. Using historical floating vehicle trajectory data, an urban road traffic feature model can be established to reflect the general characteristics of urban traffic operation. The urban road traffic feature model mentioned here can refer to specific indicators, such as average speed or weighted average speed; it can also refer to statistical models, such as the probability distribution of travel speed. Many previous models used a single indicator (such as the historical average speed) to represent the traffic characteristics of a section or area. Although this approach is simple, its accuracy is low and its sensitivity is poor, and it often fails to give good results in traffic abnormal state detection. Therefore, this patent proposes to describe the traffic characteristics of each spatiotemporal sub-area by the probability distribution of a traffic characteristic variable, to establish a traffic feature model, and to perform parameter estimation.

The traffic characteristic variables that can be collected include the travel speed and the travel time; the probability distribution of the traffic characteristic variable described in this patent refers to the probability distribution of the travel speed.

7) Real-time trajectory data analysis and feature extraction.

Real-time trajectory data refers to the trajectory data of floating cars in traffic operation during a period close to the current time. Real-time floating car trajectory data make it possible to follow the dynamics of traffic characteristics and reflect the current state of traffic operation. This patent uses the travel speed in the current space-time sub-zone to describe the current traffic characteristics.

8) Anomaly detection.

The idea of system state anomaly detection was first proposed by Denning: by monitoring abnormal usage of the system as recorded in the system audit records, it is possible to detect events that violate security and may cause a system abnormality. Denning's model is independent of any particular system, application environment, system vulnerability, or fault type, and is therefore a general anomaly detection model. The model consists of six parts: subjects, objects, audit records, profiles, anomaly records, and activity rules. A profile describes the normal behavior of a subject with respect to an object and is represented by metrics and statistical models. Denning's model defines three metrics, namely the event counter, the interval timer, and the resource measure, and proposes five statistical models, namely the operational model, the mean and standard deviation model, the multivariate model, the Markov process model, and the time series model. Through analysis of the system audit data, the model establishes a statistically based profile of the normal behavior of each subject. During detection, the audit data in the system are compared with the established normal-behavior profile, and a deviation exceeding a certain threshold is considered an abnormal event. This model laid the foundation for anomaly detection, and many later anomaly detection methods and systems were developed on its basis.

In recent years, more artificial intelligence methods have been introduced into anomaly detection to improve its performance. These methods mainly include data mining, artificial neural networks, and fuzzy evidence theory. Data mining methods are used to determine which features are most important in a large data set. In anomaly detection the technique is mainly used to find a more concise definition of the normal pattern, rather than simply enumerating all normal patterns as conventional anomaly detection methods do. By identifying the main features of the normal pattern, a detection system built on data mining can generalize to normal patterns that are not included in the training data. With artificial neural networks, the anomaly detection problem can be regarded as a general data classification problem. In the statistical anomaly detection mentioned above, user behavior data are divided into two categories, abnormal behavior and normal behavior, according to certain statistical criteria. Because statistics-based methods have difficulty extracting and abstracting audit instances, they may produce large errors; they must rely on assumed probability distributions, and the measures of user behavior generally have to be chosen by experience and intuition. The artificial neural network method was introduced for these reasons. An artificial neural network is self-learning and self-adaptive: it is trained with sample points representing normal user behavior, and through repeated learning it extracts the normal user or system activity patterns from the data and encodes them into the network structure. During detection, the audit data are passed through the trained neural network to judge whether the system is normal. Because the criterion for judging an anomaly is to some degree ambiguous, fuzzy evidence theory has also been introduced into anomaly detection. For example, an intrusion detection framework based on a fuzzy expert system has been established, which reduces both the false negative rate and the false positive rate.

This patent proposes an anomaly detection scheme based on statistical features. The basic idea is to measure the difference between historical traffic characteristics and real-time traffic characteristics by the Jensen-Shannon divergence in order to detect abnormal traffic conditions. The scheme has the advantages of good interpretability and a small computational burden. It overcomes the inaccuracy and delay of detection based on a single statistic, and it avoids the heavy computational load and high hardware requirements of artificial neural networks and similar methods.
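The Jensen-Shannon divergence between a historical and a real-time speed distribution can be computed as in the following sketch. It assumes that both distributions have been discretized into counts over the same speed bins, which is a simplification introduced here for illustration; the function names are likewise assumptions.

```python
import math


def normalize(counts):
    """Turn non-negative bin counts into a probability vector."""
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("at least one bin must be non-empty")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits; terms with p_i = 0 vanish."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence: symmetrized KL against the mixture M."""
    p = normalize(p_counts)
    q = normalize(q_counts)
    m = [0.5 * (a + b) for a, b in zip(p, q)]  # m_i > 0 wherever p_i or q_i > 0
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Counts of travel speeds in bins 0-20, 20-40, 40-60 km/h (made-up numbers).
historical = [10, 60, 30]
real_time = [55, 35, 10]
score = js_divergence(historical, real_time)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (distributions with no common support), which makes it convenient to compare against a fixed threshold.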

9) Quantitative characterization of abnormal severity and release of abnormal information.

The severity of traffic anomalies should be released to the public in a clear and concise manner, so that travelers can avoid areas of possible congestion and the efficiency of urban traffic improves. The severity of the abnormal condition is characterized by the traffic anomaly index, which ranges from 0 to 10, where 0 means no abnormality and 10 means highly abnormal.

The location of the anomaly is projected onto the electronic map and published through mobile apps on smart devices or similar channels.

10) System performance evaluation.

The evaluation of system performance refers to evaluating the accuracy of traffic abnormal state detection; the evaluation indicators are the false positive rate and the false negative rate. The lower the false positive rate and the false negative rate, the better the performance of the system.

In step 1), the division of the spatiotemporal sub-area may specifically adopt one of the following methods:

11) Equidistant space-time division method. Determine the segment scale of the time dimension; the time segment span is a fixed value, usually 30 min. Determine the segment scale of the spatial dimension; the spatial segment span is a fixed value, usually a spatial grid of 200 m × 200 m;

12) Non-equidistant space-time division method based on road network density: with road network density as the judgment index, when the road network density is greater than or equal to 2 km/km², take a 30 min time segment and a 200 m × 200 m spatial segment; when the density is less than 2 km/km², take a 30 min time segment and a 400 m × 400 m spatial segment;

13) Non-equidistant space-time division method based on peak hour flow: with peak hour flow as the judgment index, when the peak hour flow is greater than or equal to 1000 vehicles/hour, take a 30 min time segment and a 200 m × 200 m spatial segment; when the flow is less than 1000 vehicles/hour, take a 30 min time segment and a 400 m × 400 m spatial segment.

Step 3) specifically includes the following steps:

31) Divide the spatial area to be processed into a grid of a certain size. The range of each grid area can be expressed as

$$A_s = \{(x_s, y_s) \mid x_{s,1} \le x_s \le x_{s,2},\ y_{s,1} \le y_s \le y_{s,2}\}$$

Each grid area contains several road segments; the set of these road segments is denoted $R_s$, each road segment in the set is denoted ij, and a number is assigned to each road segment;

32) Determine the grid area in which the positioning point A is located, and use the distance and azimuth angle to search the road segment set $R_s$ for the section on which A lies. The matching schemes include:

321) Single point matching scheme:

Search for the road segment nearest to point A. Let $\alpha_A$ be the travel direction angle of A, $\alpha_{ij}$ the direction angle of the road segment ij, and $\alpha_0$ the threshold. When the difference between the two angles is less than the threshold, that is, $|\alpha_A - \alpha_{ij}| < \alpha_0$, the matching is complete; the threshold may be 2.5°, 5°, 10°, etc. If $|\alpha_A - \alpha_{ij}| < \alpha_0$ is not satisfied, delete the segment from the search space and continue to search other segments until the condition is met. The matching method is shown in Figure 3.

322) Point sequence matching scheme:

This scheme is suitable for high-frequency floating car data. The floating vehicle GNSS data acquisition frequency is expressed as $f_0 = 1/t_0$. The points $P(t_A - t_0)$ and $P(t_A + t_0)$, which are adjacent to A in time, are defined as the 1-adjacent points of A; $P(t_A - 2t_0)$ and $P(t_A + 2t_0)$ are defined as the 2-adjacent points of A; and so on, so that $P(t_A - kt_0)$ and $P(t_A + kt_0)$ are defined as the k-adjacent points of A. When $f_0 < 1$ Hz, take k = 1 or 2. Take the road segment ij nearest to A and its k-adjacent points, and calculate the mean value $\bar{\alpha}$ of the driving direction angles of A and its k-adjacent points. If $|\bar{\alpha} - \alpha_{ij}| < \alpha_0$, where $\alpha_{ij}$ is the direction angle of the road segment and $\alpha_0$ is the threshold, the matching is complete; otherwise, search other road segments until $|\bar{\alpha} - \alpha_{ij}| < \alpha_0$ is satisfied.

33) Using the straight-line equation of the road segment (a curved road segment is approximated by several straight segments), calculate the projection coordinates of the GNSS positioning point on the road segment to reduce the error caused by GNSS positioning drift. The linear projection of a GNSS positioning point $A = (x_A, y_A)$ is computed as follows.

Determine the straight-line equation of the section ij with endpoints $(x_i, y_i)$ and $(x_j, y_j)$ (if the section is a curve, divide it into several straight sections):

$$y - y_i = k\,(x - x_i)$$

where the slope is:

$$k = \frac{y_j - y_i}{x_j - x_i}$$

The projection line equation is:

$$y - y_A = -\frac{1}{k}\,(x - x_A)$$

Solving for the projected coordinates $P = (x_P, y_P)$ gives:

$$x_P = \frac{k y_A - k y_i + k^2 x_i + x_A}{k^2 + 1}$$

$$y_P = \frac{k^2 y_A + y_i + k x_A - k x_i}{k^2 + 1}$$
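The single point matching scheme and the linear projection can be combined as in the sketch below. It makes several simplifying assumptions that are not part of the source: coordinates are planar and in meters, direction angles are measured clockwise from north, distance is measured to the infinite line through each segment rather than to the segment itself, and all function names are hypothetical.

```python
import math


def project_point(ax, ay, xi, yi, xj, yj):
    """Orthogonal projection of A onto the line through (xi, yi) and (xj, yj)."""
    if xj == xi:  # vertical segment: the slope k is undefined
        return float(xi), float(ay)
    k = (yj - yi) / (xj - xi)
    xp = (k * ay - k * yi + k * k * xi + ax) / (k * k + 1)
    yp = (k * k * ay + yi + k * ax - k * xi) / (k * k + 1)
    return xp, yp


def angle_diff(a_deg, b_deg):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def match_single_point(ax, ay, heading_deg, segments, threshold_deg=5.0):
    """Return (segment_id, projected_point) or None if no segment qualifies.

    segments: dict id -> (xi, yi, xj, yj), directed from i to j.
    """
    candidates = []
    for seg_id, (xi, yi, xj, yj) in segments.items():
        px, py = project_point(ax, ay, xi, yi, xj, yj)
        candidates.append((math.hypot(ax - px, ay - py), seg_id, (px, py)))
    # Try the nearest segment first, then fall back to farther ones.
    for _, seg_id, proj in sorted(candidates):
        xi, yi, xj, yj = segments[seg_id]
        seg_heading = math.degrees(math.atan2(xj - xi, yj - yi)) % 360.0
        if angle_diff(heading_deg, seg_heading) < threshold_deg:
            return seg_id, proj
    return None


roads = {"L1": (0, 0, 100, 0), "L1r": (100, 0, 0, 0), "L2": (0, 10, 0, 110)}
eastbound = match_single_point(50, 3, 90.0, roads)
westbound = match_single_point(50, 3, 270.0, roads)
```

In the usage example the eastbound and westbound fixes lie at the same position but are matched to different directed sections, which is the purpose of the direction-angle test.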

After the map matching process, each positioning point is matched to a space-time sub-region by combining its coordinates with its timestamp.

Step 5) may specifically adopt one of the following methods:

51) Full sample plan for speed information. The travel speed data of all floating vehicles in a space-time sub-region constitute the population. The implementation method is to calculate the travel speed of each vehicle in the space-time sub-region:

$$v = \frac{d_{1,2} + d_{2,3} + \cdots + d_{n-1,n}}{t_n - t_1}$$

where $d_{1,2}, \ldots, d_{n-1,n}$ are the distances between the 1st and 2nd GNSS positioning points, ..., the (n-1)th and nth GNSS positioning points of the vehicle in the space-time sub-region, and $t_1, \ldots, t_n$ are the timestamps of the 1st, ..., nth GNSS positioning points in the space-time sub-region. The data in each space-time sub-region are not filtered and form a set $V$ for subsequent processing.

52) Time-smooth sampling plan for speed information. Specify the length of the time segment and set an upper limit $N_{\max}$ on the number of speed records in one time segment; search the speed data in each time segment of a space-time sub-region, and if the number of speed records in the time segment exceeds the upper limit, take only $N_{\max}$ of them for subsequent processing. The implementation method is to calculate the travel speed of each vehicle in the space-time sub-region:

$$v = \frac{d_{1,2} + d_{2,3} + \cdots + d_{n-1,n}}{t_n - t_1}$$

where $d_{1,2}, \ldots, d_{n-1,n}$ are the distances between the 1st and 2nd GNSS positioning points, ..., the (n-1)th and nth GNSS positioning points of the vehicle in the space-time sub-region, and $t_1, \ldots, t_n$ are the timestamps of the 1st, ..., nth GNSS positioning points. Specify the upper limit $N_{\max}$ on the number of records per time segment; search the speed data in each time segment of the space-time sub-region, and if the number of speed records in the time segment exceeds $N_{\max}$, randomly draw $N_{\max}$ records, which are added to the set $V$ and used for subsequent processing.

Step 6) may specifically adopt one of the following methods:

61) Simple historical trajectory data fusion method. The historical data collected when no traffic anomaly was present is treated as a whole for establishing the traffic feature model and estimating its parameters. The method uses a finite mixture model to establish the traffic feature model and perform parameter estimation. Specifically, one of the following three options can be used:

611) Gaussian mixture model with a fixed number of components

This scheme uses a Gaussian mixture model with a fixed number of components to describe the probability distribution of vehicle speed. The number of components $K$ is specified manually according to the distribution pattern of vehicle speed in typical cases. To ensure that the probability distribution is reliable, the number of components cannot be too small; generally $K$ = 4 to 6.

612) Gaussian mixture model with a variable number of components

This scheme uses a model-based evaluation method to select the appropriate number of components, as follows: determine the maximum possible number of components $K$, and estimate the parameters of the Gaussian mixture models with $\kappa = 1, 2, \ldots, K$ components separately. Among the $K$ models, the best one is determined by the Bayesian information criterion (BIC). The maximum number of components is generally chosen according to the accuracy requirement, but the more components there are, the more slowly the expectation-maximization algorithm converges. The maximum number of components selected here is $K = 5$, so $\kappa \in \{1, 2, \ldots, 5\}$ and a total of 5 mixture models are estimated. The BIC of each of the five models is defined as:

$$\mathrm{BIC} = -2\ln L + k \ln n$$

where $L$ is the maximum value of the likelihood function, $k$ is the number of parameters in the model, and $n$ is the total amount of data.

After that, the mixture model with the smallest BIC is selected, and its parameter vectors $\eta$, $\mu$ and $\sigma$ are recorded as the feature record of the present space-time sub-region, where $\eta$ is the vector of proportions of the sub-components in the historical traffic feature model, $\mu$ is the vector of their means, and $\sigma$ is the vector of their standard deviations. The shape of the density curve of the mixture model is shown in Figure 6.

613) Finite mixture model with a variable number of components and a variable distribution type

This scheme uses the same model-based evaluation method as 612), but both the distribution type of the sub-components and the number of components are variable. The method is as follows:

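A probability distribution family is chosen as the distribution type of the sub-components, including but not limited to the normal distribution, the gamma distribution and the Weibull distribution. When using a normal distribution, the sub-distribution function is:

$$f(v) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(v-\mu)^2}{2\sigma^2}\right)$$

When using a gamma distribution, the sub-distribution function is:

$$f(v) = \frac{v^{\alpha-1} e^{-v/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \quad \text{where } \Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\,dt$$

When using the Weibull distribution, the sub-distribution function is the standard Weibull density with shape parameter $c$ and scale parameter $\lambda$:

$$f(v) = \frac{c}{\lambda}\left(\frac{v}{\lambda}\right)^{c-1}\exp\left(-\left(\frac{v}{\lambda}\right)^{c}\right)$$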

Assuming that all sub-components of the mixture model share the same distribution type, determine the maximum possible number of components $K$. With $M$ candidate sub-component distribution types and $K$ candidate component counts, $M \times K$ combinations are formed; the BIC value of each combination is calculated, and the model with the smallest BIC is taken as the best model.
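The selection of the number of components by BIC described in 612) can be illustrated with a minimal sketch. It assumes one-dimensional speed data, Gaussian sub-components only, a plain expectation-maximization loop with a quantile-based start, and a parameter count of $3\kappa - 1$ for $\kappa$ components; the function names `fit_gmm`, `bic` and `select_by_bic`, the iteration count and the variance floor are illustrative assumptions, not part of the described method.

```python
import math


def normal_pdf(v, mu, sigma):
    z = (v - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(v, weights, means, sigmas):
    return sum(w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, sigmas))


def fit_gmm(data, n_comp, n_iter=100, min_sigma=1e-3):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, sigmas, log_lik)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start (assumption): means at evenly spaced quantiles.
    means = [xs[int((j + 0.5) * n / n_comp)] for j in range(n_comp)]
    spread = max((xs[-1] - xs[0]) / (2.0 * n_comp), min_sigma)
    sigmas = [spread] * n_comp
    weights = [1.0 / n_comp] * n_comp
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for v in xs:
            p = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(p)
            if total <= 0.0:  # numerical underflow fallback
                resp.append([1.0 / n_comp] * n_comp)
            else:
                resp.append([pj / total for pj in p])
        # M-step: update proportions, means and standard deviations.
        for j in range(n_comp):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, xs)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)  # floor avoids collapse
    log_lik = sum(
        math.log(max(mixture_pdf(v, weights, means, sigmas), 1e-300)) for v in xs
    )
    return weights, means, sigmas, log_lik


def bic(log_lik, n_params, n_samples):
    """BIC = -2 ln L + k ln n."""
    return -2.0 * log_lik + n_params * math.log(n_samples)


def select_by_bic(data, max_comp=5):
    """Fit mixtures with 1..max_comp components and keep the smallest BIC."""
    best = None
    for n_comp in range(1, max_comp + 1):
        w, m, s, ll = fit_gmm(data, n_comp)
        score = bic(ll, 3 * n_comp - 1, len(data))
        if best is None or score < best["bic"]:
            best = {"bic": score, "n_components": n_comp,
                    "eta": w, "mu": m, "sigma": s}
    return best
```

Calling `select_by_bic(speeds, max_comp=5)` on the speed set of one space-time sub-region returns the proportions, means and standard deviations of the selected model; extending the sketch to gamma or Weibull sub-components as in 613) would require a different M-step.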

62) Historical trajectory data classification method by context. According to temperature, precipitation, visibility and traffic control measures, the historical data without traffic anomalies is divided into different categories, and a model is established and its parameters estimated for each category. The implementation method is as follows:

According to temperature, precipitation, visibility and traffic control measures, the traffic environment is divided into 5 to 8 categories. Each historical record is assigned to the category matching the traffic environment under which it was collected. For each category, the processing described in 5) is performed separately, thereby establishing a mapping relationship $R(E, T)$, where $E$ is the traffic environment and $T$ is the traffic situation category.

63) Historical data clustering method. For the historical data, the differences between the speed distributions of the space-time sub-regions are quantified, and the quantified differences are used for clustering. Using temperature, precipitation, visibility and traffic control measures as characteristic factors, multinomial Logit regression is performed to establish a mapping relationship between traffic environment and categories. See Figure 4 for the implementation process. The implementation steps are as follows:

631) According to the method described in 5), a traffic feature model is established, and parameter estimation is performed.

632) Based on the preceding finite mixture model parameter estimates, write the probability density function $p_i(v)$ of the travel speed distribution of the space-time sub-region on each date. For the Gaussian mixture model:

$$p_i(v) = \sum_{j=1}^{K} \eta_j\, N(v;\ \mu_j, \sigma_j)$$

where $K$ is the number of sub-components of the travel speed distribution, $\eta_j$ is the proportion of the $j$-th sub-component in the travel speed distribution, $\mu_j$ is the mean of that sub-component, and $\sigma_j$ is its standard deviation.

633) Calculate the Jensen-Shannon divergence $d_{ij}$ between each pair of distributions:

$$d_{ij} = \mathrm{JSD}(P \,\|\, Q) = \frac{1}{2} D(P \,\|\, M) + \frac{1}{2} D(Q \,\|\, M)$$

where $P$ and $Q$ are two different probability distributions, $M = \frac{1}{2}(P + Q)$, and $D$ is the Kullback-Leibler divergence:

$$D(P \,\|\, Q) = \sum_{k} P(x_k) \log\frac{P(x_k)}{Q(x_k)}$$

For finite mixture models this value cannot be expressed in closed form, but it can be approximated by Monte Carlo sampling:

$$D_{MC}(f \,\|\, g) = \frac{1}{n}\sum_{i=1}^{n} \log\frac{f(x_i)}{g(x_i)} \;\to\; D(f \,\|\, g)$$

where $D_{MC}$ represents the Kullback-Leibler divergence approximated by Monte Carlo sampling, $f$ and $g$ represent any two distribution functions, and $x_1, \ldots, x_n$ are samples drawn from $f$.
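The Monte Carlo approximation of the Jensen-Shannon divergence between two mixture distributions can be sketched as follows. The sketch assumes Gaussian sub-components, represents each mixture as a list of (proportion, mean, standard deviation) tuples, and uses the natural logarithm, so the result lies between 0 and ln 2; the function names, the sample size and the fixed seed are illustrative assumptions.

```python
import math
import random


def mixture_pdf(v, params):
    """Density of a Gaussian mixture; params is a list of (weight, mean, sigma)."""
    return sum(
        w * math.exp(-0.5 * ((v - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in params
    )


def sample_mixture(params, rng):
    """Draw one value: pick a sub-component by its weight, then sample from it."""
    u, acc = rng.random(), 0.0
    for w, m, s in params:
        acc += w
        if u <= acc:
            return rng.gauss(m, s)
    return rng.gauss(params[-1][1], params[-1][2])


def kl_mc(f, g, n_samples, rng):
    """Monte Carlo estimate of D(f || g) using samples drawn from f."""
    total = 0.0
    for _ in range(n_samples):
        x = sample_mixture(f, rng)
        total += math.log(mixture_pdf(x, f) / mixture_pdf(x, g))
    return total / n_samples


def jsd_mc(p, q, n_samples=2000, seed=0):
    """JSD(P || Q) = 0.5 D(P || M) + 0.5 D(Q || M) with M = 0.5 (P + Q)."""
    rng = random.Random(seed)
    # M is itself a mixture: all sub-components of P and Q with halved weights.
    m = [(0.5 * w, mu, s) for w, mu, s in p] + [(0.5 * w, mu, s) for w, mu, s in q]
    return 0.5 * kl_mc(p, m, n_samples, rng) + 0.5 * kl_mc(q, m, n_samples, rng)
```

Because $M$ contains every sub-component of the sampled distribution, the ratio inside the logarithm never exceeds 2, which keeps the estimate numerically stable even when the two speed distributions barely overlap.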

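634) Express the divergences between pairs of distributions as a distance matrix:

$$D = \begin{pmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{pmatrix}$$

This matrix satisfies $d_{ij} = d_{ji}$ and $d_{ij} = 0$ for $i = j$.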

635) Using the distance matrix as input to the K-Medoids algorithm, the clustering results are obtained, and the categories are indexed.
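Clustering on a precomputed distance matrix can be sketched as below. This is a simple alternating K-Medoids variant, assuming a symmetric matrix with a zero diagonal; the farthest-point initialisation, the iteration limit and the function name `k_medoids` are assumptions made for illustration, since the text does not specify them.

```python
def k_medoids(dist, k, max_iter=100):
    """Cluster items given a symmetric distance matrix (list of lists).

    Returns (medoids, labels): medoid item indices and, for every item,
    the position of its cluster in the medoid list.
    """
    n = len(dist)

    def assign(medoids):
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # Deterministic start (assumption): the most central item first, then
    # repeatedly the item farthest from the medoids chosen so far.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(
            max(range(n), key=lambda i: min(dist[i][m] for m in medoids))
        )

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster is empty
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

In the setting of step 635), `dist[i][j]` would hold the Jensen-Shannon divergence $d_{ij}$ between the speed distributions of two dates, and the returned labels serve as the category index.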

636) Using the category index as the response variable and the traffic environment data (including temperature, precipitation, visibility, etc.) as independent variables, perform multinomial Logit regression to obtain the mapping relationship $R(E, T)$ between the traffic environment $E$ and the traffic situation category $T$.

637) Aggregate the data of the same category, re-establish the mixture model on the aggregated data set, and perform parameter estimation to obtain the final historical traffic characteristic data set.

The step 7) may specifically adopt the following methods:

71) Simple real-time data processing. This method is used together with 61). A model is established and its parameters are estimated for the real-time traffic data to obtain the characteristic function of the current traffic condition. The implementation steps are exactly the same as in 61), except that the data used is real-time traffic data.

72) Classification processing. This method is used together with 62) or 63). Obtain the characteristic function of the traffic condition, obtain current information such as temperature, precipitation, visibility and traffic control measures, and determine the category of the current traffic condition. See Figure 5 for the implementation process. The implementation steps are as follows:

721) Calculate the travel speeds in the space-time sub-region, which constitute the real-time travel speed population $V_i^{rt}$;

722) Establish a travel speed probability distribution model $p_{rt}(v) = \sum_{j} \eta_{rt,j}\, N(v;\ \mu_{rt,j}, \sigma_{rt,j})$, and perform parameter estimation;

723) Using the current traffic environment data (including temperature, precipitation, visibility, etc.) as the input, use the mapping relationship $R(E, T)$ to obtain the category $T$ of the current traffic situation.

The step 8) specifically includes the following steps:

81) When step 72) is adopted, locate the historical traffic characteristic data under the current traffic situation category $T$; otherwise this step is skipped;

82) Calculate the difference between the two speed distributions according to the description parameters $\eta_{rt}$, $\mu_{rt}$, $\sigma_{rt}$ of the current traffic characteristics and the description parameters $\eta$, $\mu$, $\sigma$ of the historical traffic characteristics:

$$\mathrm{diff}\left[(\eta_{rt}, \mu_{rt}, \sigma_{rt}),\ (\eta, \mu, \sigma)\right] = \mathrm{JSD}(P_{rt} \,\|\, P)$$

where $\eta_{rt}$ is the vector of proportions of the sub-components in the real-time traffic feature model, $\mu_{rt}$ is the vector of their means, and $\sigma_{rt}$ is the vector of their standard deviations; $\eta$ is the vector of proportions of the sub-components in the historical traffic feature model, $\mu$ is the vector of their means, and $\sigma$ is the vector of their standard deviations. When the historical traffic characteristics and the real-time traffic characteristics (that is, the historical travel speed distribution and the real-time travel speed distribution) are similar, a smaller Jensen-Shannon divergence value is obtained, indicating a small difference between the two; when they differ greatly, a larger Jensen-Shannon divergence value is obtained, indicating a large difference and hence a high probability that an anomaly exists; see FIG. 7.

The step 9) specifically includes the following steps:

91) Normalize the speed distribution difference of each space-time sub-region to a value $a_d$ between 0 and 1:

$$a_d = \frac{\mathrm{diff}_i - \min(\mathrm{diff})}{\max(\mathrm{diff}) - \min(\mathrm{diff})}$$

92) Calculate the traffic anomaly index of each space-time sub-region as $a_d \times 10$;
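The normalization and the anomaly index can be sketched in a few lines. The sketch assumes the differences of all space-time sub-regions are available as a list, and it returns an index of 0 for every sub-region when all differences are equal; that edge-case rule and the function names are assumptions.

```python
def anomaly_indices(diffs):
    """Min-max normalise the differences to 0..1 and scale them to 0..10."""
    lo, hi = min(diffs), max(diffs)
    if hi == lo:
        # All sub-regions differ equally; no sub-region stands out (assumption).
        return [0.0 for _ in diffs]
    return [10.0 * (d - lo) / (hi - lo) for d in diffs]


def flag_anomalies(diffs, threshold=5.0):
    """Positions of the sub-regions whose anomaly index exceeds the threshold."""
    return [i for i, idx in enumerate(anomaly_indices(diffs)) if idx > threshold]
```

For example, `flag_anomalies([0.0, 0.1, 0.5])` marks only the third sub-region, whose index is 10.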

93) Project the locations of the regions with an anomaly index higher than 5 onto the electronic map, and publish them to the public through a smart mobile device app, so that drivers can avoid potential congestion points and the efficiency of urban road traffic is improved.

The step 10) specifically includes the following steps:

101) Calculate the false negative rate of traffic anomalies:

$$\alpha_1 = \frac{n_m}{n_a} \times 100\%$$

102) Calculate the false alarm rate of traffic anomalies:

$$\alpha_2 = \frac{n_f}{n_a} \times 100\%$$

In the above two formulas, $n_m$ is the total number of missed events per unit time, $n_f$ is the total number of false alarm events per unit time, and $n_a$ is the total number of anomalies that actually occurred per unit time.

The present invention has the following advantages over similar technologies in the same field:

(1) It makes full use of existing floating vehicle operation data (GNSS trajectory data), detects traffic state changes through historical traffic feature extraction and real-time traffic situation analysis, and realizes real-time, low-cost, intelligent detection of urban road traffic anomaly events;

(2) It takes the probability distribution of traffic characteristic parameters as the description of traffic characteristics, so the characteristics reflected are more comprehensive, the one-sidedness and instability of describing traffic characteristics with a single index are avoided, and the reliability of detection is higher;

(3) Because traffic characteristics are affected by the traffic environment (such as weather conditions), a clustering and multinomial Logit regression algorithm is introduced to establish the mapping relationship between traffic environment characteristics and traffic situation categories;

(4) According to tests on actual data, the urban road traffic anomaly detection technology based on floating car data proposed by the present invention can detect abnormal events with high accuracy: the detection rate exceeds 90%, the false negative rate is less than 15%, and the false alarm rate is lower than 20%. It achieves good detection results and can be applied to intelligent urban traffic management and service.

DRAWINGS

The details and advantages of the present invention will become apparent and readily understood in conjunction with the following drawings in which:

Figure 1 shows a schematic diagram of the components and basic principles of the present invention;

Figure 2 is a schematic view showing the overall flow of the present invention in the implementation process;

Figure 3 shows a schematic diagram of an implementation of the fast map matching algorithm of the present invention;

Figure 4 shows a schematic flow chart of the historical traffic feature extraction scheme implemented by the present invention;

Figure 5 shows a schematic flow chart of the real-time traffic feature extraction scheme implemented by the present invention;

Figure 6 shows a schematic diagram of the shape of the Gaussian mixture model probability distribution;

Figure 7 shows the measured difference between historical traffic characteristics and real-time traffic characteristics.

Specific implementation

In order to set out the objects, technical solutions and advantages of the present invention more clearly, the specific embodiments of the present invention are described in detail below.

As shown in Fig. 1, the overall system architecture of the present invention includes: on-board GNSS track recorders mounted on floating vehicles, a data center, GNSS satellites, and a communication system. GNSS here includes GPS, GLONASS, GALILEO, Beidou, IRNSS, QZSS and any similar navigation satellite positioning system. The GNSS track recorders fitted to floating cars, buses, etc. record the position of the vehicle at successive points in time at a certain sampling frequency $f$ (0.1 Hz is generally required) and send the position information to the data center in real time over the GPRS mobile communication network (wireless network technologies such as WCDMA and TD-LTE can also be used, at correspondingly higher cost). The data center establishes a historical road traffic characteristic database through data preprocessing, data fusion and a specific algorithm; establishes a real-time traffic feature database from the most recently received real-time data; determines whether the current traffic characteristics are abnormal through the mapping relationship between the historical database and the real-time database; and displays the result visually on the processing terminal and generates a traffic anomaly event report.

The overall process of the scheme is shown in Figure 2 and includes the acquisition and storage of GNSS trajectory data, the establishment of space-time sub-regions, historical traffic feature extraction, real-time traffic feature extraction, and anomaly identification. Collecting and storing GNSS trajectory data is the data foundation of the whole scheme. Because the amount of data is huge, a distributed storage scheme should be adopted; mature technologies exist for distributed storage, and they are not part of the present invention. The basic assumption behind establishing space-time sub-regions is that traffic characteristics are the same within a certain area and a specific time period; long-term observation shows that this assumption generally holds. The principle of historical traffic feature extraction is to calculate travel speeds from the GNSS trajectory data, use the large amount of travel speed data in the same space-time sub-region to establish a probability distribution model of vehicle speed, and estimate its parameters, so that the traffic characteristics are described by a small number of parameters. The principle of real-time traffic feature extraction is to process and analyze the speed data of the current time period and likewise establish a probability distribution model of the current vehicle speed. Anomaly identification uses the difference measurement index to judge how much the real-time feature has changed compared with the historical feature, and determines whether a traffic anomaly event has occurred according to whether the index reaches the threshold.

Implementations that combine the options described above are given below.

Embodiment 1

Step 11. Using the equidistant space-time division method, determine the segment scale of the time dimension: the time segment span is a fixed value, usually 30 min per time segment. Determine the segment scale of the spatial dimension: the spatial segment span is a fixed value, usually a 200 m × 200 m spatial grid cell per spatial segment.
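The equidistant division can be illustrated by a small sketch that maps a positioning point to its space-time sub-region. It assumes the coordinates have already been projected to planar metres relative to a grid origin and that the time segments restart every day; the names `subregion_key`, `x0` and `y0` are hypothetical.

```python
from datetime import datetime

CELL_SIZE_M = 200.0   # spatial segment edge length given in the text
SEGMENT_MIN = 30      # time segment length given in the text


def subregion_key(x_m, y_m, timestamp, x0=0.0, y0=0.0):
    """Return (time_index, column, row) of the sub-region containing a point.

    x_m, y_m: planar coordinates in metres (assumption: already projected).
    timestamp: datetime of the GNSS fix; the time index counts 30-minute
    segments since midnight, so it ranges from 0 to 47.
    """
    column = int((x_m - x0) // CELL_SIZE_M)
    row = int((y_m - y0) // CELL_SIZE_M)
    minutes = timestamp.hour * 60 + timestamp.minute
    return (minutes // SEGMENT_MIN, column, row)
```

A fix at (450 m, 199.9 m) recorded at 08:45 falls into time segment 17, column 2, row 0.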

Step 12. Perform data preprocessing to perform data cleaning, data integration, data conversion, and data reduction on the GNSS positioning data to improve the structural degree of the data.

Step 13. Divide the spatial area to be processed into grid cells of a certain size, so that the range of each grid cell can be expressed as the set of points whose coordinates lie within the bounds of that cell.

Determine the grid cell in which the anchor point is located, and use distance and azimuth to search for the road section on which the anchor point lies. Search for the section nearest to point $A$ and take the threshold $\Delta\theta = 2.5^\circ$. When the difference between the travel direction angle $\theta_A$ of point $A$ and the direction angle $\theta_l$ of the section is less than the threshold, that is, $|\theta_A - \theta_l| < \Delta\theta$, the match is completed. If $|\theta_A - \theta_l| < \Delta\theta$ is not satisfied, the section is deleted from the search space and the other sections are searched until the condition is satisfied.

Using the straight-line equation of the road section (a curved section is approximately split into straight segments), calculate the projection coordinates of the GNSS positioning point on the road section to reduce the error caused by GNSS positioning drift. The specific method is as follows. Determine the straight-line equation of the road section through its end points $(x_i, y_i)$ and $(x_j, y_j)$:

$$y - y_i = k(x - x_i)$$

where the slope is:

$$k = \frac{y_j - y_i}{x_j - x_i}$$

The projection line equation is:

$$y - y_A = -\frac{1}{k}(x - x_A)$$

Solving for the projected coordinates $P$ gives:

$$x_P = \frac{k y_A - k y_i + k^2 x_i + x_A}{k^2 + 1}$$

$$y_P = \frac{k^2 y_A + y_i + k x_A - k x_i}{k^2 + 1}$$

After the map matching process, the anchor point is matched to the space-time sub-region using the timestamp of its coordinates.
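The direction check and the projection of a positioning point onto a road section can be sketched as follows. The sketch assumes planar coordinates and direction angles in degrees, and it uses the parametric form of the perpendicular foot, which gives the same point as the slope-based formulas but also handles vertical sections where the slope is undefined; the function names are illustrative.

```python
def heading_matches(theta_point, theta_section, threshold_deg=2.5):
    """True if the two direction angles (degrees) differ by less than the threshold."""
    diff = abs(theta_point - theta_section) % 360.0
    diff = min(diff, 360.0 - diff)  # wrap around, e.g. 359 deg vs 1 deg
    return diff < threshold_deg


def project_point(xa, ya, xi, yi, xj, yj):
    """Foot of the perpendicular from A onto the line through the section end points."""
    dx, dy = xj - xi, yj - yi
    if dx == 0 and dy == 0:
        raise ValueError("section end points coincide")
    # Position of the foot along the section direction, as a fraction of its length.
    t = ((xa - xi) * dx + (ya - yi) * dy) / (dx * dx + dy * dy)
    return xi + t * dx, yi + t * dy
```

For a section from (0, 0) to (2, 2) and a point at (2, 0), the slope is 1 and both the slope-based formulas and `project_point` give the projected point (1, 1).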

Step 14. All travel speed data of every floating-vehicle trip in a space-time sub-region constitutes the population. Calculate the travel speed of each vehicle in the space-time sub-region $i$:

$$v_i = \frac{d_{1,2} + d_{2,3} + \cdots + d_{n-1,n}}{t_n - t_1}$$

where $d_{1,2}, \ldots, d_{n-1,n}$ are the distances between the first and second GNSS anchor points, ..., and between the $(n-1)$-th and $n$-th GNSS anchor points in the space-time sub-region $i$, and $t_1, \ldots, t_n$ are the timestamps of the first, ..., $n$-th GNSS anchor points in that sub-region. The data in each space-time sub-region is not filtered and forms a set $V_i$ for subsequent processing.
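The travel speed of one vehicle inside one space-time sub-region can be computed as in the following sketch. It assumes the map-matched anchor points are given as (time in seconds, x in metres, y in metres) tuples sorted by time; the function name and the choice to return `None` for unusable tracks are assumptions.

```python
import math


def travel_speed(points):
    """Travel speed in m/s: summed anchor-to-anchor distance over elapsed time.

    points: list of (t_seconds, x_m, y_m) for one vehicle in one sub-region,
    sorted by time. Returns None if a speed cannot be computed.
    """
    if len(points) < 2:
        return None
    elapsed = points[-1][0] - points[0][0]
    if elapsed <= 0:
        return None
    distance = sum(
        math.hypot(b[1] - a[1], b[2] - a[2]) for a, b in zip(points, points[1:])
    )
    return distance / elapsed
```

Three fixes taken 10 s apart and 50 m apart give 100 m over 20 s, that is, 5 m/s.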

Step 15. Treat the historical data collected when no traffic anomaly was present as a whole, establish a traffic feature model and estimate its parameters. The method uses a finite mixture model to establish the traffic feature model and perform parameter estimation. Take the maximum number of components $K = 5$, and estimate the parameters of the Gaussian mixture models with $\kappa = 1, 2, \ldots, K$ components separately; the best model is determined by the Bayesian information criterion (BIC). That is, for $\kappa \in \{1, 2, \ldots, 5\}$ a total of 5 mixture models are estimated, and the BIC of each of the five models is calculated:

$$\mathrm{BIC} = -2\ln L + k \ln n$$

After that, the mixture model with the smallest BIC is selected, and its parameter vectors $\eta$, $\mu$ and $\sigma$ are recorded as the feature record of the present space-time sub-region.

Step 16. Establish the model and estimate the parameters for the real-time traffic data to obtain the characteristic function of the current traffic condition. The method is the same as in Step 15, and the parameter vectors $\eta_{rt}$, $\mu_{rt}$, $\sigma_{rt}$ are recorded.

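Step 17. Calculate the difference between the two speed distributions according to the description parameters $\eta_{rt}$, $\mu_{rt}$, $\sigma_{rt}$ of the current traffic characteristics and the description parameters $\eta$, $\mu$, $\sigma$ of the historical traffic characteristics:

$$\mathrm{diff}\left[(\eta_{rt}, \mu_{rt}, \sigma_{rt}),\ (\eta, \mu, \sigma)\right] = \mathrm{JSD}(P_{rt} \,\|\, P)$$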

Step 18: Normalize the speed distribution difference of each space-time sub-region to a value $a_d$ between 0 and 1:

$$a_d = \frac{\mathrm{diff}_i - \min(\mathrm{diff})}{\max(\mathrm{diff}) - \min(\mathrm{diff})}$$

Calculate the traffic anomaly index of each space-time sub-region as $a_d \times 10$.

Embodiment 2

Step 21: Using the equidistant space-time division method, determine the segment scale of the time dimension: the time segment span is a fixed value, usually 30 min per time segment. Determine the segment scale of the spatial dimension: the spatial segment span is a fixed value, usually a 200 m × 200 m spatial grid cell per spatial segment.

Step 22. Perform data preprocessing to perform data cleaning, data integration, data conversion, and data reduction on the GNSS positioning data to improve the structural degree of the data.

Step 23. The spatial area is divided into grid cells of a certain size, so that the range of each grid cell can be expressed as the set of points whose coordinates lie within the bounds of that cell.

Determine the grid cell in which the anchor point is located, and use distance and azimuth to search for the road section on which the anchor point lies. Search for the section nearest to point $A$ and take the threshold $\Delta\theta = 2.5^\circ$. When the difference between the travel direction angle $\theta_A$ of point $A$ and the direction angle $\theta_l$ of the section is less than the threshold, that is, $|\theta_A - \theta_l| < \Delta\theta$, the match is completed. If $|\theta_A - \theta_l| < \Delta\theta$ is not satisfied, the section is deleted from the search space and the other sections are searched until the condition is satisfied.

Using the straight-line equation of the road section (a curved section is approximately split into straight segments), calculate the projection coordinates of the GNSS positioning point on the road section to reduce the error caused by GNSS positioning drift. The specific method is as follows. Determine the straight-line equation of the road section through its end points $(x_i, y_i)$ and $(x_j, y_j)$:

$$y - y_i = k(x - x_i)$$

where the slope is:

$$k = \frac{y_j - y_i}{x_j - x_i}$$

The projection line equation is:

$$y - y_A = -\frac{1}{k}(x - x_A)$$

Solving for the projected coordinates $P$ gives:

$$x_P = \frac{k y_A - k y_i + k^2 x_i + x_A}{k^2 + 1}$$

$$y_P = \frac{k^2 y_A + y_i + k x_A - k x_i}{k^2 + 1}$$

After the map matching process, the anchor point is matched to the space-time sub-region using the timestamp of its coordinates.

Step 24: Calculate the travel speed of each vehicle in the space-time sub-region $i$:

$$v_i = \frac{d_{1,2} + d_{2,3} + \cdots + d_{n-1,n}}{t_n - t_1}$$

where $d_{1,2}, \ldots, d_{n-1,n}$ are the distances between the first and second GNSS anchor points, ..., and between the $(n-1)$-th and $n$-th GNSS anchor points in the space-time sub-region $i$, and $t_1, \ldots, t_n$ are the timestamps of the first, ..., $n$-th GNSS anchor points in that sub-region. Specify the time segment length and the upper limit $p_{\max}$ on the number of data items in one time segment; search for the speed data in each time segment of the space-time sub-region, and if the number of speed data items in a time segment exceeds the upper limit $p_{\max}$, randomly select $p_{\max}$ items and add them to $V_i$.
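The time-smoothed sampling of speed data can be sketched as follows. The sketch assumes each record is a (time in seconds, speed) pair belonging to one space-time sub-region; the function name, the argument names and the optional seed are illustrative assumptions.

```python
import random
from collections import defaultdict


def time_smooth_sample(records, segment_s, p_max, seed=None):
    """Keep at most p_max speed values per time segment of length segment_s.

    records: iterable of (t_seconds, speed). Segments holding more than p_max
    values are randomly thinned; smaller segments are kept unchanged.
    """
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for t, v in records:
        by_segment[int(t // segment_s)].append(v)
    sampled = []
    for segment in sorted(by_segment):
        speeds = by_segment[segment]
        if len(speeds) > p_max:
            speeds = rng.sample(speeds, p_max)  # sampling without replacement
        sampled.extend(speeds)
    return sampled
```

The cap prevents a short burst of many vehicles from dominating the speed distribution of the whole sub-region.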

Step 25. Treat the historical traffic data collected when no traffic anomaly was present as a whole, establish a traffic feature model and estimate its parameters. The method uses a finite mixture model to establish the traffic feature model and perform parameter estimation. Take the maximum number of components $K = 5$, and estimate the parameters of the Gaussian mixture models with $\kappa = 1, 2, \ldots, K$ components separately; the best model is determined by the Bayesian information criterion (BIC). That is, for $\kappa \in \{1, 2, \ldots, 5\}$ a total of 5 mixture models are estimated, and the BIC of each of the five models is calculated:

$$\mathrm{BIC} = -2\ln L + k \ln n$$

After that, the mixture model with the smallest BIC is selected, and its parameter vectors $\eta$, $\mu$ and $\sigma$ are recorded as the feature record of the present space-time sub-region. According to the parameter estimation result, the probability density function $p_i(v)$ of the travel speed distribution of the space-time sub-region on each date is written:

$$p_i(v) = \sum_{j=1}^{K} \eta_j\, N(v;\ \mu_j, \sigma_j)$$

Calculate the Jensen-Shannon divergence $d_{ij}$ between each pair of distributions:

$$d_{ij} = \mathrm{JSD}(P \,\|\, Q) = \frac{1}{2} D(P \,\|\, M) + \frac{1}{2} D(Q \,\|\, M)$$

where $P$ and $Q$ are two different probability distributions, $M = \frac{1}{2}(P + Q)$, and $D$ is the Kullback-Leibler divergence:

$$D(P \,\|\, Q) = \sum_{k} P(x_k) \log\frac{P(x_k)}{Q(x_k)}$$

In the case of a finite mixture model, Monte Carlo sampling is used to approximate the calculation:

$$D_{MC}(f \,\|\, g) = \frac{1}{n}\sum_{i=1}^{n} \log\frac{f(x_i)}{g(x_i)} \;\to\; D(f \,\|\, g)$$

The divergences between pairs of distributions are expressed as a distance matrix:

$$D = \begin{pmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{pmatrix}$$

The matrix satisfies $d_{ij} = d_{ji}$ and $d_{ij} = 0$ for $i = j$.

The distance matrix is used as the input of the K-Medoids algorithm to obtain the clustering results, and the categories are indexed.

Using the category index as the response variable and the traffic environment data (including temperature, precipitation, visibility, etc.) as independent variables, multinomial Logit regression is performed to obtain the mapping relationship $R(E, T)$ between the traffic environment $E$ and the traffic situation category $T$.

The data of the same category is aggregated, the mixture model is re-established on the aggregated data set, and parameter estimation is performed to obtain the final historical traffic characteristic data set.

Step 26: Obtain the characteristic function of the traffic condition, obtain current information such as temperature, precipitation, visibility and traffic control measures, and determine the category of the current traffic condition.

Calculate the travel speeds in the space-time sub-region, which form the real-time travel speed population $V_i^{rt}$; establish the travel speed probability distribution model $p_{rt}(v) = \sum_{j} \eta_{rt,j}\, N(v;\ \mu_{rt,j}, \sigma_{rt,j})$ and perform parameter estimation; with the current traffic environment data (including temperature, precipitation, visibility, etc.) as the input, obtain the category $T$ of the current traffic situation using the mapping relationship $R(E, T)$.

Step 27. According to the category $T$ of the current traffic situation, locate the historical traffic characteristic data of that category; calculate the difference between the two speed distributions according to the description parameters $\eta_{rt}$, $\mu_{rt}$, $\sigma_{rt}$ of the current traffic characteristics and the description parameters $\eta$, $\mu$, $\sigma$ of the historical traffic characteristics:

$$\mathrm{diff}\left[(\eta_{rt}, \mu_{rt}, \sigma_{rt}),\ (\eta, \mu, \sigma)\right] = \mathrm{JSD}(P_{rt} \,\|\, P)$$

Step 28: Normalize the speed distribution difference of each space-time sub-region to a value $a_d$ between 0 and 1:

$$a_d = \frac{\mathrm{diff}_i - \min(\mathrm{diff})}{\max(\mathrm{diff}) - \min(\mathrm{diff})}$$

Calculate the traffic anomaly index of each space-time sub-region as $a_d \times 10$.

Embodiment 3

Step 31: Using a non-equidistant space-time division method: for central urban areas where the road network density is greater than 2 km/km² or the peak-hour flow is greater than 1000 vehicles/hour, a 30 min time segment and a 200 m × 200 m spatial segment are used; for suburban areas where the road network density is less than 2 km/km² or the peak-hour flow is less than 1000 vehicles/hour, a 30 min time segment and a 400 m × 400 m spatial segment are used.
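The choice of segment scale can be written as a small rule. Because the two conditions in the text overlap (an area can have a low road network density but a high peak-hour flow), the sketch assumes that the central-area rule takes precedence and that values exactly on a limit are treated as suburban; both choices, and the function name, are assumptions.

```python
def segment_scale(road_density_km_per_km2, peak_flow_veh_per_h):
    """Return the time and space segment scale for an area.

    Assumption: an area counts as central if either limit is exceeded;
    everything else, including values exactly on a limit, counts as suburban.
    """
    central = road_density_km_per_km2 > 2.0 or peak_flow_veh_per_h > 1000
    return {
        "time_segment_min": 30,
        "cell_size_m": 200 if central else 400,
    }
```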

Step 32: Perform data preprocessing, perform data cleaning, data integration, data conversion, and data reduction on the GNSS positioning data to improve the structural degree of the data.

Step 33: Divide the spatial area to be processed into grid cells of a certain size, so that the range of each grid cell can be expressed as the set of points whose coordinates lie within the bounds of that cell.

Denote the floating vehicle GNSS data acquisition frequency as $f_0$. The points $P(t_A - 1/f_0)$ and $P(t_A + 1/f_0)$ are defined as the 1-adjacent points of $A$, the points $P(t_A - 2/f_0)$ and $P(t_A + 2/f_0)$ are defined as the 2-adjacent points of $A$, and so on, so that $P(t_A - n/f_0)$ and $P(t_A + n/f_0)$ are defined as the $n$-adjacent points of $A$. When $f_0 < 1$ Hz, take $n$ = 1 or 2. Take the road section nearest to $A$ and its adjacent points, and calculate the mean value $\bar{\theta}_A$ of the driving direction angles of $A$ and its $n$-adjacent points; the threshold is $\Delta\theta = 5^\circ$. If $|\bar{\theta}_A - \theta_l| < \Delta\theta$ is satisfied, where $\theta_l$ is the direction angle of the section, the matching is completed; otherwise, other sections are searched until the condition is met.

Using the straight-line equation of the road section (a curved section is approximately split into straight segments), calculate the projection coordinates of the GNSS positioning point on the road section to reduce the error caused by GNSS positioning drift. The specific method is as follows.

Determine the straight-line equation of the road section through its end points $(x_i, y_i)$ and $(x_j, y_j)$ (if the road section is a curve, divide it into several straight segments):

$$y - y_i = k(x - x_i)$$

The slope is:

$$k = \frac{y_j - y_i}{x_j - x_i}$$

The projection line equation is:

$$y - y_A = -\frac{1}{k}(x - x_A)$$

Solving for the projected coordinates $P$ gives:

$$x_P = \frac{k y_A - k y_i + k^2 x_i + x_A}{k^2 + 1}$$

$$y_P = \frac{k^2 y_A + y_i + k x_A - k x_i}{k^2 + 1}$$

After the map matching process, the anchor point is matched to the space-time sub-region using the timestamp of its coordinates.

Step 34: Calculate the travel speed of each vehicle in the space-time sub-region $i$:

$$v_i = \frac{d_{1,2} + d_{2,3} + \cdots + d_{n-1,n}}{t_n - t_1}$$

where $d_{1,2}, \ldots, d_{n-1,n}$ are the distances between the first and second GNSS anchor points, ..., and between the $(n-1)$-th and $n$-th GNSS anchor points in the space-time sub-region $i$, and $t_1, \ldots, t_n$ are the timestamps of the first, ..., $n$-th GNSS anchor points in that sub-region. Specify the time segment length and the upper limit $p_{\max}$ on the number of data items in one time segment; search for the speed data in each time segment of the space-time sub-region, and if the number of speed data items in a time segment exceeds the upper limit $p_{\max}$, randomly select $p_{\max}$ items and add them to $V_i$.

Step 35: Treat the historical data collected when no traffic anomaly was present as a whole, establish a traffic feature model and estimate its parameters. The method uses a finite mixture model to establish the traffic feature model and perform parameter estimation. Take the maximum number of components $K = 5$, and estimate the parameters of the Gaussian mixture models with $\kappa = 1, 2, \ldots, K$ components separately; the best model is determined by the Bayesian information criterion (BIC). That is, for $\kappa \in \{1, 2, \ldots, 5\}$ a total of 5 mixture models are estimated, and the BIC of each of the 5 models is calculated:

$$\mathrm{BIC} = -2\ln L + k \ln n$$

After that, the mixture model with the smallest BIC is selected, and its parameter vectors $\eta$, $\mu$ and $\sigma$ are recorded as the feature record of the present space-time sub-region. According to the parameter estimation result, the probability density function $p_i(v)$ of the travel speed distribution of the space-time sub-region on each date is written.

Calculate the Jensen-Shannon divergence $d_{ij}$ between each pair of distributions:

$$d_{ij} = \mathrm{JSD}(P \,\|\, Q) = \frac{1}{2} D(P \,\|\, M) + \frac{1}{2} D(Q \,\|\, M)$$

where $P$ and $Q$ are two different probability distributions, $M = \frac{1}{2}(P + Q)$, and $D$ is the Kullback-Leibler divergence:

$$D(P \,\|\, Q) = \sum_{k} P(x_k) \log\frac{P(x_k)}{Q(x_k)}$$

In the case of a finite mixture model, Monte Carlo sampling is used to approximate the calculation.

The divergences between pairs of distributions are expressed as a distance matrix:

$$D = \begin{pmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{pmatrix}$$

This matrix satisfies $d_{ij} = d_{ji}$ and $d_{ij} = 0$ for $i = j$.

Using the category index as the response variable and the traffic environment data (including temperature, precipitation, visibility, etc.) as independent variables, multinomial Logit regression is performed to obtain the mapping relationship $R(E, T)$ between the traffic environment $E$ and the traffic situation category $T$. The data of the same category is aggregated, the mixture model is re-established on the aggregated data set, and parameter estimation is performed to obtain the final historical traffic characteristic data set.

Step 36: Obtain the characteristic function of the traffic condition, obtain current information such as temperature, precipitation, visibility and traffic control measures, and determine the category of the current traffic condition.

Calculate the travel speeds in the space-time sub-region, which form the real-time travel speed population $V_i^{rt}$; establish the travel speed probability distribution model $p_{rt}(v) = \sum_{j} \eta_{rt,j}\, N(v;\ \mu_{rt,j}, \sigma_{rt,j})$ and perform parameter estimation; with the current traffic environment data (including temperature, precipitation, visibility, etc.) as the input, obtain the category $T$ of the current traffic situation using the mapping relationship $R(E, T)$.

Step 37: Locating the historical traffic characteristic data of the category according to the current traffic situation category ;; calculating two parameters according to the description parameters τ , μ „, (J _rt and the description parameters η, μ, σ of the historical traffic feature) The difference between the velocity distributions: diff [(η, , μ _Γί , σ _Γί ) , (η, μ, σ)] = JSD(P _RT 11 Ρ).

Step 38: Normalize the difference in speed distribution of each space-time sub-region to a value between 0 and 1:

$\mathrm{diff}'_i = \dfrac{\mathrm{diff}_i - \min(\mathrm{diff})}{\max(\mathrm{diff}) - \min(\mathrm{diff})}$

Calculate the traffic anomaly index of each space-time sub-region as the normalized difference multiplied by 10.
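The normalization and index calculation can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the scale factor of 10 is an assumption, and returning zeros when all differences are equal is a design choice made here to avoid division by zero.

```python
def normalize(diffs):
    """Min-max normalize divergence values to the range 0..1."""
    lo, hi = min(diffs), max(diffs)
    if hi == lo:
        # All sub-regions differ equally; treat none as more anomalous (assumption).
        return [0.0 for _ in diffs]
    return [(d - lo) / (hi - lo) for d in diffs]

def anomaly_index(diffs, scale=10.0):
    """Scale normalized differences to an index; scale=10 is an assumed value."""
    return [scale * v for v in normalize(diffs)]

# Hypothetical divergence values for three space-time sub-regions.
example_diffs = [1.0, 2.0, 3.0]
example_index = anomaly_index(example_diffs)
```

A sub-region whose real-time speed distribution deviates most from its historical distribution receives the highest index, and the least deviating one receives zero.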

Claims

1. An urban road traffic anomaly detection method, comprising the following steps:

1) Establish space-time sub-regions: divide the day into several time segments, each time segment being called a time sub-region; divide the implementation area of urban road traffic anomaly detection into several spatial segments, each spatial segment being called a spatial sub-region; the intersection of any one time sub-region and any one spatial sub-region is called a space-time sub-region;

2) Preprocessing of historical trajectory data: process the floating vehicle GNSS positioning historical data into sampled vehicle speed data of the historical trajectory; preprocessing of real-time trajectory data: process the floating vehicle GNSS positioning real-time data into sampled vehicle speed data of the real-time trajectory;

3) Historical trajectory data analysis and feature extraction: using the sampled vehicle speed data of the historical trajectory, establish a historical travel speed probability distribution and obtain a historical traffic feature model $P_A$; the implementation method is: taking the historical data under the condition of no traffic abnormality as a whole, a finite mixture model method is used to establish the traffic feature model, and parameter estimation is performed;

Real-time trajectory data analysis and feature extraction: using the sampled vehicle speed data of the real-time trajectory, establish a real-time travel speed probability distribution and obtain a real-time traffic feature model;

4) Anomaly detection: the Jensen-Shannon divergence is used to measure the difference between the historical traffic feature model and the real-time traffic feature model; the Jensen-Shannon divergence between the two models is calculated to obtain the difference value between the historical and real-time traffic characteristics;

5) Quantitative characterization of abnormal severity: Calculate the abnormality index of traffic conditions by using the difference between the historical and real-time traffic characteristics;

6) System performance evaluation: Evaluate the accuracy of traffic abnormality detection and measure the stability of system operation.

2. The urban road traffic anomaly detecting method according to claim 1, wherein step 1) adopts one of the following methods:

1a) Equidistant space-time division method: determine the segment scale of the time dimension, the time segment span being a fixed value, taking 30 min as a time segment; determine the segment scale of the spatial dimension, the spatial segment span being a fixed value, taking a spatial grid of 200 m × 200 m as a spatial segment;

1b) Non-equidistant space-time division method based on road network density: with the road network density as the judgment indicator, when the road network density is greater than or equal to 2 km/km², take a 30 min time segment and a 200 m × 200 m spatial segment; when the road network density is less than 2 km/km², take a 30 min time segment and a 400 m × 400 m spatial segment;

1c) Non-equidistant space-time division method based on peak hour flow: with the peak hour flow as the judgment indicator, when the peak hour flow is greater than or equal to 1000 vehicles/hour, take a 30 min time segment and a 200 m × 200 m spatial segment; when it is less than 1000 vehicles/hour, take a 30 min time segment and a 400 m × 400 m spatial segment.
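The three division rules can be sketched in Python as below. The function names, the method labels and the use of planar coordinates in metres with time in seconds since midnight are assumptions introduced for illustration; only the thresholds and segment sizes are taken from the claim.

```python
import math

def segment_sizes(method, road_density_km_per_km2=None, peak_flow_veh_per_h=None):
    """Return (time_segment_min, grid_side_m) for the chosen division method."""
    if method == "equidistant":
        return 30, 200
    if method == "density":
        return (30, 200) if road_density_km_per_km2 >= 2.0 else (30, 400)
    if method == "peak_flow":
        return (30, 200) if peak_flow_veh_per_h >= 1000 else (30, 400)
    raise ValueError(f"unknown division method: {method}")

def subregion_index(t_seconds, x_m, y_m, time_segment_min, grid_side_m):
    """Map a time of day and a planar position to a (time, column, row) cell."""
    return (
        int(t_seconds // (time_segment_min * 60)),
        math.floor(x_m / grid_side_m),
        math.floor(y_m / grid_side_m),
    )

# Hypothetical sparse-network area: 1.5 km of road per square kilometre.
sizes = segment_sizes("density", road_density_km_per_km2=1.5)
cell = subregion_index(3600, 450.0, 199.0, 30, 200)
```

Each GNSS record is thereby assigned to exactly one space-time sub-region, which is the unit on which the later speed distributions are estimated.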

3. The urban road traffic anomaly detecting method according to claim 1, wherein the preprocessing of the historical trajectory data in step 2) comprises:

2a) Data structuring: perform data cleaning, data integration, data conversion, and data reduction on the floating vehicle GNSS positioning historical data to obtain structured GNSS positioning historical data;

2b) Fast map matching: combining the urban road network data, map the structured GNSS positioning historical data onto the urban road network through a map matching algorithm, establish the matching relationship between the positioning points in the structured GNSS positioning historical data and the road segments, and correct the error caused by positioning drift;

2c) Vehicle speed calculation and sampling of the historical trajectory: calculate the traffic operation characteristic parameters from the structured GNSS positioning historical data to obtain the vehicle speed data of the historical trajectory, and sample the vehicle speed data of the historical trajectory to obtain the sampled vehicle speed data of the historical trajectory.

4. The method of detecting urban road traffic anomaly according to claim 3, wherein the fast map matching according to step 2b) comprises:

2b1) Divide the spatial area to be processed into grids of a certain size; the range of each grid area can be expressed as

$A_s = \{(x_s, y_s) \mid x_s \in [x_f, x_{f+1}),\ y_s \in [y_r, y_{r+1})\}$

Each grid area contains several road segments, the set of which is represented as $R_s$, and each road segment in the set $R_s$ is assigned a number;

2b2) Determine the grid area in which the positioning point is located, and use the distance and azimuth angle to search the set of road segments for the road segment on which positioning point A lies;

2b3) Calculate the projection coordinates of the GNSS anchor point on the road segment using the GNSS anchor point linear projection method.
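Steps 2b1) and 2b3) can be sketched as follows. The sketch assumes planar coordinates in metres and straight road segments given by two end points; the function names are hypothetical, and the projection shown is an ordinary orthogonal projection clamped to the segment, offered as one plausible reading of the linear projection method rather than its definitive form.

```python
import math
from collections import defaultdict

def grid_cell(x, y, cell_size):
    """Index (column, row) of the grid cell containing the point (x, y)."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))

def build_grid_index(segments, cell_size):
    """Map each grid cell to the numbered road segments whose bounding box touches it.

    segments: {segment_number: ((x1, y1), (x2, y2))}
    """
    index = defaultdict(list)
    for seg_id, (a, b) in segments.items():
        (c1x, c1y), (c2x, c2y) = grid_cell(*a, cell_size), grid_cell(*b, cell_size)
        for cx in range(min(c1x, c2x), max(c1x, c2x) + 1):
            for cy in range(min(c1y, c2y), max(c1y, c2y) + 1):
                index[(cx, cy)].append(seg_id)
    return dict(index)

def project_point(p, a, b):
    """Orthogonal projection of point p onto segment ab, clamped to the segment."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

# Hypothetical road network with two numbered segments and a 200 m grid.
segments = {1: ((0.0, 0.0), (250.0, 0.0)), 2: ((300.0, 300.0), (300.0, 390.0))}
index = build_grid_index(segments, 200.0)
candidates = index[grid_cell(220.0, 30.0, 200.0)]
projected = project_point((220.0, 30.0), *segments[candidates[0]])
```

Restricting the search to the segments registered in the point's own grid cell is what keeps the matching fast: only a handful of candidates are compared instead of the whole road network.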

5. The urban road traffic anomaly detecting method according to claim 4, wherein step 2b2) adopts one of the following methods:

2b21) Single point matching method: search for the road segment nearest to a positioning point A. The implementation method is: for a road segment ij in the set of road segments $R_s$, when the absolute difference between the travel direction angle of point A and the direction angle of road segment ij is smaller than the threshold, the match is completed; if the condition is not satisfied, continue to search the other road segments in the set $R_s$ until it is satisfied;

2b22) Point sequence matching method: this scheme is applicable to high-frequency floating car data. The time interval between two adjacent floating car GNSS records is expressed as $t_0$, and the floating car GNSS data acquisition frequency as $f_0 = 1/t_0$. With the recording time of a positioning point A denoted $t_A$, the points $P(t_A - t_0)$ and $P(t_A + t_0)$ that are temporally adjacent to A are defined as the 1-adjacent points of A, $P(t_A - 2t_0)$ and $P(t_A + 2t_0)$ as the 2-adjacent points of A, and so on, so that $P(t_A - kt_0)$ and $P(t_A + kt_0)$ are defined as the k-adjacent points of A. When $f_0 < 1$ Hz, take k = 1 or 2. Take the road segment nearest to positioning point A and its adjacent points, and calculate the mean of the travel direction angles of the positioning point and its adjacent points; if the absolute difference between this mean and the direction angle of the road segment is smaller than the threshold, the matching is completed; otherwise, the other road segments are searched until the condition is satisfied.
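Both matching variants can be sketched with one routine, since single point matching is the special case of a one-element heading list. The sketch below assumes bearings in degrees, a candidate list that already carries each segment's distance and bearing, and a circular mean for averaging headings; these choices and all names are assumptions for illustration, and two-way roads (where the opposite bearing would also be acceptable) are not handled.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def mean_heading(headings):
    """Circular mean of bearings in degrees, so that 350 and 10 average to 0."""
    s = sum(math.sin(math.radians(h)) for h in headings)
    c = sum(math.cos(math.radians(h)) for h in headings)
    return math.degrees(math.atan2(s, c)) % 360.0

def match_segment(headings, candidates, threshold_deg):
    """Return the nearest candidate whose bearing agrees with the travel direction.

    headings: bearing of point A alone (single point matching), or of A and
              its k-adjacent points (point sequence matching).
    candidates: list of (segment_number, distance_m, segment_bearing_deg).
    """
    heading = mean_heading(headings)
    for seg_id, _, bearing in sorted(candidates, key=lambda c: c[1]):
        if angle_diff(heading, bearing) < threshold_deg:
            return seg_id
    return None  # no candidate satisfies the direction condition

# Hypothetical case: the nearest segment runs north, but the vehicle heads east.
single = match_segment([88.0], [("a", 5.0, 0.0), ("b", 12.0, 90.0)], 15.0)
sequence = match_segment([350.0, 10.0], [("n", 3.0, 0.0)], 5.0)
```

The example shows why the direction check matters: near an intersection the geometrically nearest segment may be a cross street, and the heading condition rejects it in favour of the segment actually being travelled.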

6. The urban road traffic anomaly detecting method according to claim 3, wherein the vehicle speed calculation and sampling of the historical trajectory of step 2c) adopts one of the following methods:

2c1) Full sample method: the speed population consists of the speed data of all floating vehicles in a space-time sub-region. The implementation method is to calculate the travel speed of each vehicle in the space-time sub-region:

$v = \dfrac{d_{12} + \cdots + d_{(n-1)n}}{t_n - t_1}$

where $d_{12}, \ldots, d_{(n-1)n}$ are the distances between the 1st and 2nd GNSS anchor points, ..., the (n-1)-th and n-th GNSS anchor points in the space-time sub-region, and $t_1, \ldots, t_n$ are the time stamps of the 1st, ..., n-th GNSS anchor points; the travel speed data of each space-time sub-region is not filtered and constitutes the set used for subsequent processing;

2c2) Time-smooth sampling method: specify the length of the time slice and set an upper limit on the number of data items in the same time slice; search the speed data in each time slice of a space-time sub-region, and if the number of speed data items in a time slice exceeds the upper limit, randomly take the upper-limit number of data items for subsequent processing. The implementation method is to calculate the travel speed of each vehicle in the space-time sub-region:

$v = \dfrac{d_{12} + \cdots + d_{(n-1)n}}{t_n - t_1}$

where $d_{12}, \ldots, d_{(n-1)n}$ are the distances between consecutive GNSS anchor points in the space-time sub-region and $t_1, \ldots, t_n$ are the time stamps of the GNSS anchor points; given the length of the time slice and the upper limit on the number of data items in the same time slice, search the speed data in each time slice of the space-time sub-region, and if the number of speed data items in a time slice exceeds the upper limit, randomly select the upper-limit number of data items and add them to the set used for subsequent processing.
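The travel speed calculation and the time-smooth sampling can be sketched as follows. The sketch assumes planar coordinates in metres, time stamps in seconds and a fixed random seed for reproducibility; the function and parameter names are hypothetical.

```python
import math
import random
from collections import defaultdict

def travel_speed(points):
    """Path length divided by elapsed time for one vehicle in one sub-region.

    points: [(t_seconds, x_m, y_m), ...] in time order. Returns m/s, or None
    when fewer than two points or no elapsed time is available.
    """
    if len(points) < 2:
        return None
    elapsed = points[-1][0] - points[0][0]
    if elapsed <= 0:
        return None
    dist = sum(math.dist(a[1:], b[1:]) for a, b in zip(points, points[1:]))
    return dist / elapsed

def time_smooth_sample(records, slice_seconds, upper_limit, seed=0):
    """Keep at most upper_limit speed values per time slice, chosen at random.

    records: [(t_seconds, speed), ...]
    """
    rng = random.Random(seed)
    slices = defaultdict(list)
    for t, v in records:
        slices[int(t // slice_seconds)].append(v)
    sampled = []
    for key in sorted(slices):
        speeds = slices[key]
        if len(speeds) > upper_limit:
            speeds = rng.sample(speeds, upper_limit)
        sampled.extend(speeds)
    return sampled

# Hypothetical track: 100 m covered in 20 s.
speed = travel_speed([(0, 0.0, 0.0), (10, 30.0, 40.0), (20, 60.0, 80.0)])
kept = time_smooth_sample([(0, 1.0), (1, 2.0), (2, 3.0), (70, 4.0)], 60, 2)
```

Capping each time slice prevents a burst of vehicles in one short interval from dominating the speed distribution of the whole sub-region, whereas the full sample method keeps every value.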

7. The urban road traffic anomaly detecting method according to any one of claims 1 to 6, wherein step 3) adopts one of the following methods:

3a) Gaussian mixture model method with a fixed number of components: a Gaussian mixture model with a fixed number of components is used to describe the probability distribution of the vehicle speed; the number of components is manually specified according to the distribution pattern of the vehicle speed under typical conditions and is 4 to 6; the model is established and its parameters are estimated from the real-time traffic data to obtain a characteristic function of the current traffic condition;

3b) Variable Gaussian mixture model method: a variable number of components, or a variable number of components together with variable distribution types, is used; a classification processing method is used to obtain the characteristic function of the traffic condition, together with current information such as temperature, precipitation, visibility, traffic control measures, and the type of the current traffic condition.

8. The traffic anomaly detection method according to claim 7, wherein the variable Gaussian mixture model method of step 3b) comprises one of the following methods:

3b1) Gaussian mixture model method with a variable number of components: a method based on model evaluation is used to select the appropriate number of components, as follows: determine the maximum possible number of components K, and estimate the parameters of a Gaussian mixture model for each number of components 1, 2, ..., K; among these models, the best model is determined by the Bayesian information criterion (BIC);

3b2) Finite mixture model method in which both the number of components and the distribution types are variable: the distribution form and the number of the sub-components are both variable;

the classification processing described in step 3b) includes:

3b3) Calculate the travel speeds in the space-time sub-region, which constitute the real-time travel speed population;

3b4) Establish a travel speed probability distribution model $p(v_{rt}) = \sum_j \tau_j f_j(v_{rt}; \mu_j, \sigma_j)$, where the number of terms in the sum is the number of sub-components of the real-time traffic feature model, $\tau_j$ represents the proportion of the j-th sub-component, $\mu_j$ represents its mean value and $\sigma_j$ represents its standard deviation; perform parameter estimation to obtain the description parameters $\tau_{rt}, \mu_{rt}, \sigma_{rt}$ of the current real-time traffic feature, where $\tau_{rt}$ is the proportion vector of the sub-components in the real-time traffic feature model, $\mu_{rt}$ is the mean vector of the sub-components, and $\sigma_{rt}$ is the standard deviation vector of the sub-components;

3b5) Taking the current traffic environment data (including temperature, precipitation, visibility, etc.) as an input parameter, use the mapping relationship R(E) to obtain the category T of the current traffic situation, where E represents the traffic environment data.
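The BIC-based selection of the number of components in 3b1) can be sketched as below. This is a minimal sketch under stated assumptions: a plain expectation-maximization fit of a one-dimensional Gaussian mixture with a quantile-based start, a fixed iteration count and a floor on the standard deviation, none of which is specified in the claim. The function names and the synthetic speed data are hypothetical. BIC is computed as $-2 \ln L + p \ln n$ with $p = 3k - 1$ free parameters for k components.

```python
import math
import random

def normal_pdf(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def fit_gmm(data, k, iters=200, min_sd=0.5):
    """EM fit of a 1-D Gaussian mixture; returns (weights, means, sds, loglik)."""
    xs = sorted(data)
    n = len(xs)
    weights = [1.0 / k] * k
    # Deterministic start: means at evenly spaced quantiles, one shared spread.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    sds = [max((xs[-1] - xs[0]) / (2 * k), min_sd)] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate proportion, mean and standard deviation.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor avoids degenerate spikes
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, sds)) or 1e-300)
        for x in xs
    )
    return weights, means, sds, loglik

def select_by_bic(data, max_k):
    """Fit mixtures with 1..max_k components; return (best_k, {k: BIC})."""
    n = len(data)
    scores = {}
    for k in range(1, max_k + 1):
        loglik = fit_gmm(data, k)[3]
        scores[k] = -2.0 * loglik + (3 * k - 1) * math.log(n)
    return min(scores, key=scores.get), scores

# Synthetic speeds (km/h) drawn from two regimes, for illustration only.
rng = random.Random(1)
speeds = [rng.gauss(20, 4) for _ in range(200)] + [rng.gauss(60, 4) for _ in range(200)]
best_k, scores = select_by_bic(speeds, max_k=3)
```

The penalty term grows with the number of components, so an extra component is accepted only when it improves the likelihood enough to justify its three additional parameters.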

9. The urban road traffic anomaly detecting method according to any one of claims 1 to 6, wherein the anomaly detection of step 4) comprises:

4a) According to the category of the current traffic situation, locate the historical traffic characteristic data under that category; if no category division has been made, no category needs to be distinguished;

4b) Calculate the difference between the two speed distributions according to the description parameters $\tau_{rt}, \mu_{rt}, \sigma_{rt}$ of the current traffic feature and the description parameters η, μ, σ of the historical traffic feature: $\mathrm{diff}[(\tau_{rt}, \mu_{rt}, \sigma_{rt}), (\eta, \mu, \sigma)] = JSD(P_{RT} \| P_A)$, where $\tau_{rt}$ is the proportion vector of the sub-components in the real-time traffic feature model, $\mu_{rt}$ is the mean vector of the sub-components in the real-time traffic feature model, and $\sigma_{rt}$ is the standard deviation vector of the sub-components in the real-time traffic feature model; η is the proportion vector of the sub-components in the historical traffic feature model, μ is the mean vector of the sub-components in the historical traffic feature model, and σ is the standard deviation vector of the sub-components in the historical traffic feature model.

10. The urban road traffic anomaly detecting method according to any one of claims 1 to 6, wherein the quantitative characterization of abnormal severity of step 5) comprises:

5a) Normalize the difference in speed distribution of each space-time sub-region to a value between 0 and 1;

5b) Calculate the traffic anomaly index of each space-time sub-region as the normalized difference multiplied by 10.