CN110968617A - Road network key road section correlation analysis method based on position field

Info

Publication number: CN110968617A
Application number: CN201910984130.6A
Authority: CN
Inventors: 董宏辉; 贾利民; 秦勇; 吴金锁
Original assignee: Beijing Jiaotong University
Current assignee: Beijing Jiaotong University
Priority date: 2019-10-16
Filing date: 2019-10-16
Publication date: 2020-04-07
Anticipated expiration: 2039-10-16
Also published as: CN110968617B

Abstract

The invention provides a road network key road section correlation analysis method based on position fields, and belongs to the technical field of urban road network operation management. The method comprises: obtaining running track data of urban trip vehicles and determining the position fields that each vehicle passes during its trip; dividing each driving track into transactions according to how position fields repeat; determining the frequent item sets of the divided transactions using the FP-growth algorithm; extracting the key road sections of the road network according to the relation between sub-tracks and super-tracks in the frequent item sets; and analyzing the correlation among nodes or road sections in the urban road network through confidence calculation on the obtained key road sections. The method determines the key road sections in the urban road network from path information comprising vehicle travel position fields, provides a reliable basis for urban road section grading, resource allocation, police dispatching and the like, analyzes the potential degree of association between position areas and road sections through correlation analysis, and provides concrete reference data for making traffic planning and policy measures.

Description

Road network key road section correlation analysis method based on position field

Technical Field

The invention relates to the technical field of urban road network operation management, in particular to a road network key road section correlation analysis method based on a position field.

Background

Traffic serves as the bridge that sustains communication between people, the circulation of goods and connections between cities, and plays a vital role in regional construction and economic development. With the growth of China's comprehensive national strength in recent years, urban road traffic construction has developed rapidly and large numbers of people have poured into cities with limited land. Many cities can no longer relieve urban road traffic problems by traditional means such as building roads and bridges, and the resulting problems, such as traffic congestion and environmental pollution, seriously affect the overall competitiveness of cities and people's living standards. Guaranteeing the orderly operation of urban road traffic has therefore become a basis for promoting urban development and improving living standards. Because of the car-following characteristics of urban road vehicles, in particular transmissibility and delay, an accident on a key road section or important node that carries a city's main traffic puts traffic pressure on adjacent road sections and nodes and easily causes large-area delays. Meanwhile, the rapid development of cities has tightened the connections among regions, and cooperation and communication among regions are reflected in changes in traffic flow on road sections or nodes of the road network. The correlation analysis of road sections or nodes in the road network is therefore also basic reference material for studying regional connections, traffic channelization, policy making and the future development of cities.

Disclosure of Invention

The present invention is directed to provide a method for analyzing relevance of key road segments of a road network based on location fields, so as to solve at least one technical problem in the background art.

In order to achieve the purpose, the invention adopts the following technical scheme:

the invention provides a road network key road section correlation analysis method based on a position field, which comprises the following steps:

step S110: obtaining the running track data of urban trip vehicles, wherein the track of a vehicle C_i is C_i = {p₁, p₂, ..., p_x}, in which p₁, p₂, ..., p_x are the position fields that the vehicle passes;

step S120: performing transaction division on the driving track according to the repetition of position fields;

step S130: determining the frequent item sets of the divided transactions according to the FP-growth algorithm;

step S140: extracting the key road sections of the road network according to the relation between sub-tracks and super-tracks in the obtained frequent item sets;

step S150: generating association rules according to the obtained key road sections, and analyzing the correlation among nodes or road sections in the urban road network through confidence calculation.

Preferably, the step S120 specifically includes:

if two or more consecutive position fields in the traveling track of the vehicle are identical, the repeated position fields are used as the basis for division: the earlier repeated point becomes the last item of the previous transaction, the later repeated point becomes the first item of the next transaction, and the track chain is divided into a plurality of transactions without repeated items;

if the vehicle passes through a certain road section twice or more, each track data item is traversed and checked for repetition: the item immediately preceding the first repeated position becomes the last item of the current transaction, that item is copied as the first item of the next transaction, and subsequent items are checked for repetition in sequence and added to the transaction, until the whole track is divided into a plurality of transactions.

Preferably, the step S130 specifically includes:

step S131: counting the occurrences of each position field across the vehicle tracks to obtain the support count of every position field, setting a suitable threshold, filtering out the position fields whose support count falls below the threshold, sorting the remaining position fields in descending order of support count, and constructing an item head table;

step S132: constructing an FP-tree and a node linked list; setting the root node to the null set, and drawing each track into the FP-tree according to a drawing rule;

step S133: obtaining the conditional pattern base of a node from the FP-tree, that is, the set of paths that end at the item being searched and lead back to the root node, also called the prefix paths between the item and the root node;

step S134: constructing a conditional FP-tree from the conditional pattern base, screening the prefix paths that meet the condition, and obtaining the frequent item sets.

Preferably, the drawing rule is:

if an item of data appears for the first time, a node is created for it and a pointer to that node is added to the item head table; otherwise, the data of each existing node along the corresponding path is updated, and nodes in different branches that hold the same item are joined by connecting lines to represent the relation between nodes and data.

Preferably, in step S140, every non-empty subset of a frequent item set is also a frequent item set, and if an item set is not frequent, every superset (super-track item set) of it is also not frequent; accordingly, the frequent item sets with the largest number of items are selected as the key road sections in the road network.

Preferably, in step S150, the correlation is analyzed using the following confidence formula:

confidence(X → Y) = P(Y | X) = support(X ∪ Y) / support(X)

where X and Y denote two related frequent item sets of key road sections; strong association rules are obtained by setting a minimum confidence, which determines the strong association relations existing among the key road sections in the road network.

Preferably, in step S110, vehicle driving track data is acquired through an urban intelligent traffic management and service system based on the RFID technology.

The invention has the following beneficial effects: key road sections in an urban road network are determined from path information comprising vehicle travel position fields, which gives the urban management department a reliable basis for grading road sections, allocating resources and dispatching police; correlation analysis reveals the potential degree of association between position areas and road sections, and provides concrete reference data for making traffic planning and policy measures.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for analyzing relevance of a key road segment of a road network based on a location field according to embodiment 1 of the present invention.

Fig. 2 is a flowchart of a method for analyzing relevance of a key road segment of a road network based on a location field according to embodiment 2 of the present invention.

Fig. 3 is a flowchart of transaction division on a travel track according to a repetition condition of a location field according to embodiment 2 of the present invention.

Fig. 4 is a space-time diagram of vehicle travel containing consecutively repeated position fields according to embodiment 2 of the present invention.

Fig. 5 is a space-time diagram of vehicle travel containing non-consecutively repeated position fields according to embodiment 2 of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by way of the drawings are illustrative only and are not to be construed as limiting the invention.

It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

For the purpose of facilitating an understanding of the present invention, the present invention will be further explained by way of specific embodiments with reference to the accompanying drawings, which are not intended to limit the present invention.

It should be understood by those skilled in the art that the drawings are merely schematic representations of embodiments and that the elements shown in the drawings are not necessarily required to practice the invention.

Example 1

Embodiment 1 of the invention provides a road network key road section correlation analysis method based on position fields, which comprises transaction division based on repeated position fields, the FP-growth algorithm, frequent item sets, association rule calculation and the like.

Transaction division based on repeated position fields handles the repeated position fields that occur in the travel road sections of urban trip vehicles, dividing each track into separate transactions to which the FP-growth algorithm can be applied. The FP-growth algorithm is the core algorithm of the method: an association rule algorithm used to mine the key road sections of the road network and analyze their correlation. Frequent item set mining extracts the frequent item sets from the track data of a large number of position fields, after which the key road sections of the road network are extracted according to the relation between sub-tracks and super-tracks. Association rule calculation analyzes the correlation between road sections and nodes by applying the confidence formula to the non-empty subsets generated from a frequent item set, and strong association rules reveal the road sections or nodes that are strongly correlated.

The method takes the tracks of a plurality of trip vehicles and their position fields as the items in transactions, mines the frequent item sets, and extracts the key road sections and association rules from the frequent item sets. The specific steps are as follows:

Step S110: obtain the running track data of urban trip vehicles, wherein the track of a vehicle C_i is C_i = {p₁, p₂, ..., p_x}, in which p₁, p₂, ..., p_x are the position fields that the vehicle passes.

Step S120: process the acquired running tracks of the urban trip vehicles according to the repetition of position fields.

When two or more consecutive position fields in a vehicle's track are identical, the vehicle has probably stopped at that position during its trip. Although the track records the actual travel of the vehicle, two consecutive identical position fields mean that the vehicle made a long or short stop at that position; as far as the busyness of the road section is concerned, this part is equivalent to a "static" state with no travel along a road section. Two or more consecutive repeated positions can therefore be used as the basis for dividing transactions: the earlier repeated point becomes the last item of the previous transaction, the later repeated point becomes the first item of the next transaction, and the track chain is divided into a plurality of transactions without repeated items. For example, if the track of a vehicle is {a, b, c, d, d, e}, the vehicle may have stopped at d, which does not affect the busyness of the road section, so one copy of the repeated item is retained on each side of the split, and the track is processed into transaction 1 = {a, b, c, d} and transaction 2 = {d, e}.

If a vehicle passes through a certain road section twice or more, the track can be understood, from the point of view of that road section, as two or more vehicles passing through it. Each data item is therefore traversed and checked for repetition: the item immediately preceding the first repeated position becomes the last item of the current transaction, that item is copied as the first item of the next transaction, and subsequent items are checked for repetition in sequence and added to the transaction, until the whole track is divided into a plurality of transactions.

For example, suppose the track of a vehicle is {a, b, c, b, b, a}. Item b is the first repeated item and the item preceding it is c, which gives transaction 1 = {a, b, c}. Item c is copied as the first item of the next transaction and checking continues; because the next two positions repeat consecutively, only one copy of b is kept in this transaction, which gives transaction 2 = {c, b}. Continuing yields transaction 3 = {b, a}, so the track is finally divided into three transactions that contain no repeated items.
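The two division rules of step S120 can be sketched in a few lines of Python. This is only an illustrative reading of the description, not the patented implementation: the function name `split_into_transactions` is hypothetical, and the handling of a run of three or more identical consecutive fields (collapsed into a single split) is an assumption that the text does not spell out.

```python
from typing import Hashable, List, Sequence


def split_into_transactions(track: Sequence[Hashable]) -> List[list]:
    """Split one vehicle track into transactions with no repeated item."""
    transactions: List[list] = []
    current: list = []
    for field in track:
        if not current:
            current.append(field)
        elif field == current[-1]:
            # Consecutive repeat: the vehicle is treated as stationary, so the
            # earlier copy ends this transaction and the later copy starts the next.
            # Assumption: longer runs of the same field collapse into one split.
            if len(current) > 1:
                transactions.append(current)
                current = [field]
        elif field in current:
            # Non-consecutive repeat: close the transaction and copy its last
            # item as the first item of the next transaction.
            transactions.append(current)
            current = [current[-1], field]
        else:
            current.append(field)
    if current:
        transactions.append(current)
    return transactions


print(split_into_transactions(list("abcdde")))  # [['a', 'b', 'c', 'd'], ['d', 'e']]
print(split_into_transactions(list("abcbba")))  # [['a', 'b', 'c'], ['c', 'b'], ['b', 'a']]
```

Both calls reproduce the worked examples given in the text for the tracks {a, b, c, d, d, e} and {a, b, c, b, b, a}.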

Step S130: mine the frequent item sets of the processed transactions using the FP-growth algorithm.

Step S131: scan the data for the first time. Count the occurrences of each position field across the vehicle tracks to obtain the support count of every position field, set a suitable threshold, filter out the position fields whose support count falls below the threshold, sort the remaining position fields in descending order of support count, and construct an item head table.
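The first scan of step S131 can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `build_header_table` and the alphabetical tie-break between equally frequent fields are hypothetical choices, and the toy transactions are the ones obtained from the two example tracks in step S120.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def build_header_table(transactions: Sequence[Sequence[str]],
                       min_support: int) -> List[Tuple[str, int]]:
    """Return (position field, support count) pairs, most frequent first."""
    counts: Counter = Counter()
    for transaction in transactions:
        # A transaction contains no repeated item, so each field counts once.
        counts.update(set(transaction))
    frequent = [(item, n) for item, n in counts.items() if n >= min_support]
    # Descending support count; ties broken by name (an assumption).
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return frequent


transactions = [list("abcd"), list("de"), list("abc"), list("cb"), list("ba")]
print(build_header_table(transactions, min_support=2))
# [('b', 4), ('a', 3), ('c', 3), ('d', 2)]
```

Field e occurs in only one transaction and is dropped by the threshold of 2.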

Step S132: construct an FP-tree and a node linked list. Set the root node to the null set and draw each track into the FP-tree according to the following rule: if an item of data appears for the first time, a node is created for it and a pointer to that node is added to the item head table; otherwise, the data of each existing node along the corresponding path is updated, and nodes in different branches that hold the same item are joined by connecting lines to represent the relation between nodes and data.

Step S133: obtain the conditional pattern base of a node from the FP-tree. The conditional pattern base is the set of paths that end at the item being searched and lead back to the root node, also called the prefix paths of the item. The search generally starts from the node with the lowest support count. When the data volume is large, using the pointers of the item head table greatly improves search efficiency.

Step S134: construct the conditional FP-tree from the conditional pattern base. Constructing the conditional FP-tree means screening the prefix paths that meet the condition in order to find the frequent item sets. The user generally sets a minimum count threshold, filters out the nodes that do not meet it, and solves the conditional FP-tree of each node in turn, which completes the mining of frequent item sets by the FP-growth algorithm.
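Steps S132 to S134 follow the textbook FP-growth procedure, which can be sketched compactly as below. The class and function names (`Node`, `build_tree`, `fp_growth`) are hypothetical, the header table is kept as a plain list of nodes per item rather than a linked list, and ties in support are broken alphabetically; none of these details come from the patent text.

```python
from collections import Counter, defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count = item, parent, 0
        self.children = {}


def build_tree(weighted_transactions, min_support):
    """Return (header, counts); header maps each frequent item to its tree nodes."""
    counts = Counter()
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    keep = {i for i, n in counts.items() if n >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted_transactions:
        # Order items by descending support, as in the item head table.
        ordered = sorted((i for i in set(items) if i in keep),
                         key=lambda i: (-counts[i], i))
        node = root
        for item in ordered:
            if item not in node.children:      # first occurrence on this path
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])  # pointer from head table
            node = node.children[item]
            node.count += weight               # shared prefix: update the count
    return header, counts


def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Yield (frequent item set, support count) pairs."""
    header, counts = build_tree(weighted_transactions, min_support)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, counts[item]
        # Conditional pattern base: prefix path from each node up to the root.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        # The conditional FP-tree is built inside the recursive call.
        yield from fp_growth(base, min_support, itemset)


transactions = [list("abcd"), list("de"), list("abc"), list("cb"), list("ba")]
frequent = dict(fp_growth([(t, 1) for t in transactions], min_support=2))
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Each transaction is passed with a weight of 1; the recursion reuses the same function on conditional pattern bases, where the weight is the count of the node that ends the prefix path.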

Step S140: extract the key road sections of the road network according to the relation between sub-tracks and super-tracks in the obtained frequent item sets. Frequent item sets have the following property: if an item set is frequent, every non-empty subset of it is also frequent, and if an item set is not frequent, every superset of it is also not frequent. The frequent item sets with the largest number of items can therefore be selected as the key road sections in the road network.
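One way to read step S140 is to keep only the frequent item sets that are not contained in any larger frequent item set (the super-tracks), since every sub-track of such a set is frequent anyway. The sketch below takes that reading; the function name `key_itemsets` and the ranking by length and then support are assumptions, and the support counts are toy values rather than data from the embodiment.

```python
from typing import Dict, FrozenSet, List, Tuple


def key_itemsets(frequent: Dict[FrozenSet[str], int]) -> List[Tuple[FrozenSet[str], int]]:
    """Keep frequent item sets that are not a proper subset of another one."""
    maximal = [(s, n) for s, n in frequent.items()
               if not any(s < other for other in frequent)]
    # Longest item sets first, then highest support (ranking is an assumption).
    maximal.sort(key=lambda pair: (-len(pair[0]), -pair[1]))
    return maximal


# Toy support counts for illustration only.
frequent = {
    frozenset("a"): 3, frozenset("b"): 4, frozenset("c"): 3, frozenset("d"): 2,
    frozenset("ab"): 3, frozenset("ac"): 2, frozenset("bc"): 3,
    frozenset("abc"): 2,
}
for itemset, support in key_itemsets(frequent):
    print(sorted(itemset), support)
```

Here {a, b, c} absorbs all of its sub-tracks, while {d} survives because no frequent super-track contains it.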

Step S150: generate association rules from the obtained key road sections, and analyze the correlation among nodes or road sections using the confidence formula. A rule has the form X → Y, and its confidence is calculated as follows:

confidence(X → Y) = P(Y | X) = support(X ∪ Y) / support(X)

where X and Y denote two related frequent item sets of key road sections. Finally, strong association rules are obtained by setting a minimum confidence, and the strongly correlated relations in the road network are mined.
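Rule generation in step S150 can be sketched as follows, assuming the standard association rule definition confidence(X → Y) = support(X ∪ Y) / support(X). The function name `rules_from_itemset` and the toy support counts are hypothetical; the support of every non-empty subset is assumed to be available from the frequent item set mining step.

```python
from itertools import combinations


def rules_from_itemset(itemset, support, min_confidence=0.0):
    """Return (X, Y, confidence) for every rule X -> Y drawn from one item set."""
    items = sorted(itemset)
    whole = support[frozenset(items)]
    rules = []
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            x = frozenset(antecedent)
            y = frozenset(items) - x
            confidence = whole / support[x]   # support(X ∪ Y) / support(X)
            if confidence >= min_confidence:
                rules.append((x, y, confidence))
    rules.sort(key=lambda rule: -rule[2])     # highest confidence first
    return rules


# Toy support counts for illustration only.
support = {
    frozenset("a"): 3, frozenset("b"): 4, frozenset("c"): 3,
    frozenset("ab"): 3, frozenset("ac"): 2, frozenset("bc"): 3,
    frozenset("abc"): 2,
}
for x, y, conf in rules_from_itemset("abc", support):
    print(sorted(x), "->", sorted(y), round(conf, 3))
```

A 3-item set yields six candidate rules; strong rules can then be chosen either with `min_confidence` or by taking the first few entries of the sorted list.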

Example 2

Embodiment 2 of the invention provides a road network key road section correlation analysis method based on position fields, which analyzes the vehicle travel data recorded on one working day in a city by an urban intelligent traffic management and service system based on RFID technology. As shown in fig. 2, the method comprises the following steps:

Step 1: select for analysis the vehicle travel data recorded by the city's intelligent traffic management and service system on a working day in April 2016. By 2016, the system had installed electronic tags on more than 3.5 million vehicles, covering essentially all motor vehicles registered in the main urban area, and more than 840 roadside collection points had been built on roads in the main urban area and the suburbs.

Step 2: obtain in advance, through the relevant data processing, the set of position fields along which each vehicle travels, that is, the running track of the vehicle. Vehicle travel tracks containing position fields are the data basis for mining key road sections of the urban road network and analyzing their correlation.

Step 3: process the vehicle travel tracks according to how their position fields repeat, to obtain transactions without repeated items. The processing depends mainly on the type of repetition. A flowchart for dividing transactions based on repeated position fields is shown in fig. 3.

Step 31: when two or more consecutive position fields in a vehicle's track are identical, the vehicle has probably stopped at that position during its trip. Although the track records the actual travel of the vehicle, two consecutive identical position fields mean that the vehicle made a long or short stop at that position; as far as the busyness of the road section is concerned, this part is equivalent to a "static" state with no travel along a road section. Two or more consecutive repeated positions can therefore be used as the basis for dividing transactions: the earlier repeated point becomes the last item of the previous transaction, the later repeated point becomes the first item of the next transaction, and the track chain is divided into a plurality of transactions without repeated items. For example, if the track of a vehicle is {a, b, c, d, d, e}, the vehicle may have stopped at d, which does not affect the busyness of the road section, so one copy of the repeated item is retained on each side of the split, and the track is processed into transaction 1 = {a, b, c, d} and transaction 2 = {d, e}. A space-time diagram of vehicle travel containing consecutively repeated position fields is shown in fig. 4.

Step 32: when the position fields along a vehicle's trip repeat non-consecutively, the path can be regarded, from the point of view of the road section, as several vehicles passing through the repeated road section, none of which carries repeated position information, so the track can be processed accordingly and divided into a plurality of transactions without repeated position fields. For example, suppose the track of a vehicle is {a, b, c, b, b, a}. Item b is the first repeated item and the item preceding it is c, which gives transaction 1 = {a, b, c}. Item c is copied as the first item of the next transaction and checking continues; because the next two positions repeat consecutively, only one copy of b is kept in this transaction, which gives transaction 2 = {c, b}. Continuing yields transaction 3 = {b, a}, so the track is finally divided into three transactions that contain no repeated items. A space-time diagram of vehicle travel containing non-consecutively repeated position fields is shown in fig. 5.

Step 4: mine the frequent items using the FP-growth algorithm. The transaction data of the whole day are analyzed. The study shows that when the minimum support is set too high, the frequent items become too short; after repeated experiments and analysis, a minimum support of 1000 was found to be the most reasonable. This yields approximately 70 of the most frequently occurring 3-item sets and 4-item sets in the road network.

Step 5: calculate the similarity of the tracks and, using it as a feature, classify the hot tracks into 12 classes. Taking the road section from Yangguan Bridge to Red-Trough House as an example, the frequent items containing this section are {Yangguan Bridge to Red-Trough House, High Beach Rock to West Ring, Stone River to Yangguan Bridge: 18564}, {Yangguan Bridge to Red-Trough House, High Beach Rock to West Ring, Stone River to Yangguan Bridge, Red-Trough House to High Beach Rock: 17382}, {Yangguan Bridge to Red-Trough House, High Beach Rock to West Ring, Red-Trough House to High Beach Rock: 26054} and {Yangguan Bridge to Red-Trough House, High Beach Rock to West Ring, Red-Trough House to High Beach Rock, West Ring to Phoenix: 13103}. These highly similar frequent item sets are grouped into one class, and the item set with the highest support or the largest number of items is selected as a key road section of the road network. The ten leading key road sections in Chongqing on that day, obtained through this analysis, are listed in table 1.

TABLE 1 the top ten key road sections in the urban road network

Step 6: analyze the association rules generated by the key road sections. Taking the first key road section as an example, the non-empty subsets generated by the frequent item {airport lotterie, airport lou saint road, airport two-way city carousel} and their support counts are shown in table 2.

Table 2 non-empty subsets of key road segments 1

Meanwhile, the confidence of each rule is calculated according to a confidence formula, and each rule and confidence of the first key road section are obtained and shown in table 3.

TABLE 3 rules and confidence for Key road segment 1

To find the strong association relations existing within a key road section, the 3 rules with the highest confidence are defined as strong association rules. The strong association rules of key road section 1 are therefore the 3rd, 4th and 6th rules. The 6th rule has the highest confidence, which indicates that approximately 80% of the vehicles traveling in the city that pass through both airport lotterie and airport two-way city carousel also pass through airport lou saint road.

The strong association rules of the ten leading key road sections on that day are calculated in the same way. Because there are many rules, only the ten rules with the highest confidence are shown and analyzed; the specific information is given in table 4.

TABLE 4 Strong association rules for Key road segments

In summary, the method according to the embodiment of the present invention comprises an association rule algorithm, frequent pattern mining, association rules, and transaction division based on repeated positions in vehicle running tracks. Key road section mining and correlation analysis operate on data about spatial objects that carry position field features; such data may take the form of mobile phone signaling data containing position fields, GPS data, RFID data, or emerging text data such as Weibo and WeChat. The association rule algorithm (the FP-growth algorithm) is the main method for mining key road sections in the urban road network and analyzing their correlation, and it requires that transactions contain no repeated items. Because the position fields in a vehicle travel path may contain repeated items, such paths must be processed into transactions without repeated items. Two cases are considered: consecutively repeated position fields and non-consecutively repeated position fields. When the position fields are consecutively repeated, the vehicle is in a "stationary" state from the point of view of the road section, and the track is divided directly. When the position fields are non-consecutively repeated, the path can be regarded, from the point of view of the road section, as several vehicles passing through the repeated road section, none of which carries repeated position information, so the path is divided into a plurality of transactions without repeated position fields. The FP-growth algorithm mines the frequent item sets, and the key road sections in the road network are determined from the relation between sub-tracks and super-tracks in the frequent item sets. Association rules are then generated from the non-empty subsets of the key road sections, and strong association rules are obtained according to a minimum confidence threshold, so as to analyze the strong correlations among the key road sections.

The method provided by the embodiment of the invention aims to mine undirected paths of key road sections in the urban road network from position field information, so the paths based on position fields may be unordered. Taking the position fields of the road sections traveled by vehicles as the main basis, the method processes the data accordingly, determines the key road sections in the road network using the FP-growth algorithm, and analyzes the potential association relations among positions through confidence to obtain strong association rules. The method helps road traffic management departments in China grade the importance of road sections and nodes in the road network, carry out targeted prevention and control, and improve the road structure; it also helps summarize the operating patterns of urban vehicles and perceive the urban traffic state, offers a new way to keep the road network operating normally and relieve urban congestion with limited resources, and lays a theoretical and practical foundation for subsequent research on key road sections and nodes.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A road network key road section correlation analysis method based on a position field is characterized by comprising the following steps:

step S110: obtaining the running track data of urban trip vehicles, wherein the track of a vehicle C_i is C_i = {p₁, p₂, ..., p_x}, in which p₁, p₂, ..., p_x are the position fields that the vehicle passes;

step S120: performing transaction division on the driving track according to the repeated condition of the position field;

step S130: determining a frequent item set of the divided transactions according to an FP-growth algorithm;

step S140: extracting a road network key road section according to the obtained relation between the frequent item set sub-track and the super-track;

step S150: and generating an association rule according to the obtained key road sections, and analyzing the correlation among all nodes or road sections in the urban road network through confidence calculation.

2. The method for analyzing relevance of key road segments of road network based on location field as claimed in claim 1, wherein said step S120 specifically comprises:

3. The method for analyzing relevance of key road segments of road network based on location field as claimed in claim 2, wherein said step S130 specifically comprises:

4. The method according to claim 3, wherein the drawing rule is: if an item of data appears for the first time, a node is created for it and a pointer to that node is added to the item head table; otherwise, the data of each existing node along the corresponding path is updated, and nodes in different branches that hold the same item are joined by connecting lines to represent the relation between nodes and data.

5. The method according to claim 3, wherein in step S140, every non-empty subset of a frequent item set is also a frequent item set, and if an item set is not frequent, every superset (super-track item set) of it is also not frequent; the frequent item sets with the largest number of items are selected as the key road sections in the road network.

6. The method according to claim 5, wherein in step S150, the correlation is analyzed using the following confidence formula:

confidence(X → Y) = P(Y | X) = support(X ∪ Y) / support(X)

where X and Y denote two related frequent item sets of key road sections.

7. The method for analyzing the relevance of key road segments according to claim 1, wherein in step S110, the vehicle driving track data is obtained through an intelligent city traffic management and service system based on RFID technology.