CN113920352A - Travel mode identification method and system based on mobile phone positioning data - Google Patents

Publication number
CN113920352A
Authority
CN
China
Prior art keywords
mobile phone
vehicle
mounted mobile
travel
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111248470.6A
Other languages
Chinese (zh)
Inventor
邢吉平
霍锦彪
张媛
杨逊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xing Jiping
Original Assignee
Xing Jiping

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0108 Measuring and analyzing of parameters relative to traffic conditions based on the source of data
    • G08G1/012 Measuring and analyzing of parameters relative to traffic conditions based on the source of data from other sources than vehicle or roadside beacons, e.g. mobile networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Discrete Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Traffic Control Systems (AREA)
  • Navigation (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention relates to a travel mode identification method and system based on mobile phone signaling data. Mobile phone triangulation location data are acquired for the area to be analyzed over a period of time; the data of vehicle-mounted mobile phone users traveling on actual roads are screened out to form the travel track of each vehicle-mounted mobile phone user; the area differences between the tracks of the vehicle-mounted mobile phone users are taken as input, and a support tree is pruned by a minimum support tree (minimum spanning tree) algorithm to form a clustering result; the vehicle-mounted mobile phone users that fall into the same class in the clustering result are regarded as travelers riding in the same vehicle at the same time, and the travel mode of each vehicle-mounted mobile phone user is identified as independent travel, shared travel or bus travel. For the screened vehicle-mounted mobile phone users, the invention proposes a graph-theoretic minimum support tree clustering method, which can accurately divide travel modes according to the positioning tracks of the vehicle-mounted mobile phone users.

Description

Travel mode identification method and system based on mobile phone positioning data
Technical Field
The invention relates to the technical field of intelligent transportation, in particular to a travel mode identification method and system based on mobile phone positioning data.
Background
With improvements in mobile phone positioning, mobile phone triangulation data now offer a positioning accuracy in the range of 50 m to 100 m. Such data are mainly used to analyze the travel behavior of travelers and the traffic state (traffic speed, travel time and the like) of expressways. However, previous research on traffic state estimation based on mobile phone data has generally treated each mobile phone user as one independently operating vehicle. In reality, this ignores both the difference between the number of mobile phone users and the number of actual vehicles and the differences among the travel modes of individual mobile phone users.
There are also problems as follows:
1. In research that extracts traffic information from mobile phone data, the irregular positioning frequency of mobile phones is a limiting factor, so the extracted traffic parameters are mostly traffic speed, travel distance and the like, while traffic volume is rarely studied. The few studies of traffic volume are limited by the positioning accuracy and market share of the mobile phone data; they mostly address expressways, and urban roads are rarely studied. These studies approximate traffic volume by the number of vehicle-mounted mobile phone users and ignore the effect of several users sharing one vehicle. When travelers choose different traffic modes within an urban road network, this introduces errors into the traffic statistics.
2. In previous research on travel mode division, the modes distinguished are mainly those with large differences, such as subway, airplane, walking and motor vehicle travel, which are identified from characteristics such as travel distance and travel speed. Differences within motor vehicle travel are rarely distinguished. In addition, in traffic mode identification research based on mobile phone data, the identified modes are rarely validated against real data, because real data are difficult to collect.
3. In current research using mobile phone signaling data, the data used are mostly raw network signaling handover data or simulated mobile phone data, which suffer from low positioning accuracy or from positioning errors that differ from those of real data. Previous applications of travel mode division have mostly concentrated on long-distance travel between urban areas or on highway sections with few entrances and exits and little vehicle interference. Research on real mobile phone data processed by a positioning algorithm is rare.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a positioning method based on mobile signal tower point monitoring.
In order to achieve the above object, the present invention provides a travel mode identification method based on mobile phone positioning data, which comprises:
acquiring mobile phone triangulation data in an area to be analyzed within a period of time; screening out vehicle-mounted mobile phone user data running in an actual road to form a running track of a vehicle-mounted mobile phone user;
taking the area difference value between the tracks of each vehicle-mounted mobile phone user as input, and pruning data through a minimum support tree algorithm to form a clustering result;
and regarding the vehicle-mounted mobile phone users of the same class in the clustering result as travelers riding in the same vehicle at the same time, and identifying the travel mode of the vehicle-mounted mobile phone users as independent travel, shared travel or bus travel according to the number of users in the class.
Further, screening out the data of vehicle-mounted mobile phone users traveling on actual roads comprises: mapping the mobile phone positioning data to a road network map, calculating the average speed on each road section per unit time, and rejecting the mobile phone positioning data whose average speed is lower than a set threshold;
the average velocity is calculated as follows:
Figure BDA0003321888360000021
wherein m represents the number of positioning points of the mobile phone user in the unit time period T, $T_k$ represents the time interval between the k-th moment and the previous moment, and x(k) represents the latitude and longitude coordinates of the position at the k-th moment.
Further, calculating the maximum instantaneous speed, and eliminating the mobile phone positioning data of which the maximum instantaneous speed exceeds the highest speed limit of each road section;
the maximum instantaneous speed is as follows:
$$v_{\max} = \max_{k} \frac{\lvert x(k) - x(k-1) \rvert}{T_k}$$
wherein $T_k$ represents the time interval between the k-th moment and the (k-1)-th moment of the mobile phone user, and x(k) and x(k-1) respectively represent the longitude and latitude of the geographic position of the mobile phone user at the k-th moment and the (k-1)-th moment.
Further, mobile phone positioning data with fewer than two positioning points are rejected.
Further, taking the area differences between the tracks of the vehicle-mounted mobile phone users as input and pruning through the minimum support tree algorithm to form a clustering result comprises:
each vehicle-mounted mobile phone user on a given road section serves as a vertex, forming a fully connected undirected graph G = (A, E, ω); A is the vertex set, namely the set of areas enclosed by the tracks of the mobile phone users; E represents the set of edges connecting pairs of vertices; the weight of the edge between two vertices is the difference between the areas enclosed by the corresponding vehicle-mounted mobile phone user tracks; the edge connecting the tracks of two different vehicle-mounted mobile phone users is $e = (A_u, A_v)$, u = 1, 2, …, n, v = 1, 2, …, n, u ≠ v, and the weight of the edge is $\omega(e) = \omega(A_u, A_v)$, calculated as:
$$\omega(e) = |A_u - A_v|$$
wherein $A_u$ and $A_v$ respectively represent the total area enclosed between the abscissa axis and the tracks of vehicle-mounted mobile phone user u and vehicle-mounted mobile phone user v; the differences between the track areas of all vehicle-mounted mobile phone users are calculated as the distances between vertices, namely the edge weights, which gives the weight matrix:
$$\begin{bmatrix} 0 & |A_1 - A_2| & \cdots & |A_1 - A_n| \\ |A_2 - A_1| & 0 & \cdots & |A_2 - A_n| \\ \vdots & \vdots & \ddots & \vdots \\ |A_n - A_1| & |A_n - A_2| & \cdots & 0 \end{bmatrix}$$
the undirected graph G is divided by the Prim algorithm into several subset graphs $S_i$, each in the form of a minimum support tree;
for each subset graph $S_i$, the edge with the maximum weight in $S_i$ is repeatedly removed, so that the intra-class distance is minimized and the inter-class distance is maximized, until the number of clustering subsets reaches the optimal cluster number of $S_i$, at which point clustering is complete.
Further, generating the minimum support tree of each subset graph $S_i$ comprises:
(1) if a cluster subset graph $S_i$ is acyclic, the subset graph is directly regarded as a minimum support tree and the procedure goes to step (5); otherwise it continues with step (2);
(2) in a cluster subset graph $S_i$ that contains cycles, find the edge $\langle A_u, A_v \rangle$ with the smallest weight;
(3) put the edge $\langle A_u, A_v \rangle$ and the new vertex connected by this edge into a set T; if the set T contains all vertices of the graph G, go to step (5); otherwise go to step (4);
(4) among the edges that connect a vertex in T with a vertex outside T, find the edge $\langle A_u, A_v \rangle$ with the smallest weight, and go to step (3);
(5) the resulting set T is the minimum support tree of the cluster subset graph $S_i$.
Further, determining the optimal cluster number of the subset graph $S_i$ comprises: searching the interval $[c_{\min}, c_{\max}]$ for the number of clusters at which the VRC value reaches its maximum and taking it as the optimal cluster number, where $c_{\max}$ is the total number of vehicle-mounted mobile phone users and $c_{\min}$ equals 1;
the VRC is calculated as follows:
$$\mathrm{VRC}_k = \frac{\mathrm{BGSS}/(k-1)}{\mathrm{WGSS}/(n-k)}$$
$$\mathrm{BGSS} = \sum_{j=1}^{k} n_j (\mu_j - \mu)^2$$
$$\mathrm{WGSS} = \sum_{j=1}^{k} \sum_{x \in c_j} (x - \mu_j)^2$$
wherein BGSS represents the sum of the distances between mobile phone users of different classes and has k-1 degrees of freedom; WGSS represents the sum of the distances between mobile phone users within the same class and has n-k degrees of freedom; n represents the total number of vehicle-mounted mobile phone users used to construct the support tree; k represents the number of classes into which the constructed minimum support tree is split, namely the number of clusters; $n_j$ represents the number of mobile phone users in the j-th class after the minimum support tree is split; μ represents the mean weight of the edges in the minimum support tree constructed from all vehicle-mounted mobile phone users; $\mu_j$ represents the mean weight of the edges inside class j after clustering; $c_j$ represents the set of weights of the edges inside class j; and x represents the weight of each edge inside class j;
the k value at which the WGSS value is smallest and the BGSS value is largest is taken as the optimal cluster number.
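The ratio described above has the form of the Calinski-Harabasz variance ratio criterion. The following is a minimal sketch of that ratio, not the patent's exact computation: as a stated substitution, it evaluates the ratio on one scalar per user (for example the track area) instead of on edge weights, and the function name `vrc` and its arguments are hypothetical. Note that the ratio is undefined for k = 1 and k = n, so the sketch returns NaN at those ends of the search interval.

```python
def vrc(values, clusters):
    """Variance ratio for a partition of scalar values.

    values:   one number per user (assumed here: the track area A_i)
    clusters: list of lists of indices into `values`, one list per class
    """
    n, k = len(values), len(clusters)
    if k < 2 or k >= n:
        return float("nan")  # k-1 or n-k would be zero
    mu = sum(values) / n  # overall mean
    bgss = 0.0  # between-group sum of squares
    wgss = 0.0  # within-group sum of squares
    for members in clusters:
        mu_j = sum(values[i] for i in members) / len(members)
        bgss += len(members) * (mu_j - mu) ** 2
        wgss += sum((values[i] - mu_j) ** 2 for i in members)
    if wgss == 0:
        return float("inf")  # perfectly tight classes
    return (bgss / (k - 1)) / (wgss / (n - k))
```

A larger value indicates tighter, better separated classes, so a search over candidate partitions would keep the one with the highest finite value.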
Further, identifying the travel mode of the vehicle-mounted mobile phone user comprises: when the number of the vehicle-mounted mobile phone users in the category after clustering is 1, representing independent travel; when the number of the vehicle-mounted mobile phone users in the category is equal after clustering, the vehicle-mounted mobile phone users are represented as a carpooling trip; and when the number of the vehicle-mounted mobile phone users in the category is more than 7 after clustering, the bus trip is represented.
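The size rule above can be sketched as a small lookup. The text gives a class of size 1 as independent travel and a class of more than 7 as bus travel, but the size range for carpooling is not legible in this text; the sketch assumes, purely for illustration, that every size from 2 to 7 counts as shared travel. The function name `travel_mode` and the parameter `bus_min` are hypothetical.

```python
def travel_mode(cluster_size, bus_min=8):
    """Map the number of users in one class to a travel mode label.

    ASSUMPTION: sizes 2..bus_min-1 are treated as shared travel; the source
    text states only the rules for size 1 and for sizes greater than 7.
    """
    if cluster_size < 1:
        raise ValueError("a class must contain at least one user")
    if cluster_size == 1:
        return "independent"
    if cluster_size >= bus_min:
        return "bus"
    return "shared"
```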
On the other hand, a travel mode identification system based on mobile phone positioning data is provided, which comprises:
the data screening module is used for acquiring mobile phone triangulation location data in an area to be analyzed within a period of time; screening out vehicle-mounted mobile phone user data running in an actual road;
the track forming module is used for forming the running track of each vehicle-mounted mobile phone user;
the clustering module is used for performing data pruning through a minimum support tree algorithm by taking the area difference value between the tracks of each vehicle-mounted mobile phone user as input to form a clustering result;
and the travel mode identification module is used for regarding the vehicle-mounted mobile phone users of the same class in the clustering result as travelers riding in the same vehicle at the same time, and identifying the travel mode of the vehicle-mounted mobile phone users as independent travel, shared travel or bus travel.
A third aspect provides a computer-readable storage medium, in which program instructions are stored, and when the program instructions are executed by a processor, the method for identifying a travel mode based on positioning data of a mobile phone is implemented.
The technical scheme of the invention has the following beneficial technical effects:
(1) To address the mismatch between the number of vehicle-mounted mobile phone users and the actual traffic volume, the invention uses mobile phone triangulation location data, which have high positioning accuracy and a large market share. Travel can be accurately identified as one of three modes (independent travel, shared travel or bus travel), which provides more accurate information on the travel modes in the urban road network and on their proportions.
(2) The positioning period of mobile phone data depends on how often the mobile phone user exchanges data traffic, so positioning intervals are irregular. To address this, the invention proposes a method based on the minimum area difference between track areas. Subject to a minimum length for the research road section, road intersections are used as the start and end points for studying the travel mode, which determines the research area. The travel tracks of all mobile phone users in the area are analyzed in units of the minimum passing time interval. Longitude and latitude are mapped to the horizontal and vertical coordinates, and the difference between the areas enclosed by each mobile phone user's track and the coordinate axis is used as the clustering input. The people riding in the same vehicle can thus be determined more accurately.
(3) To address the problem of processing a complex urban traffic network with mobile phone positioning information, the invention provides a two-stage traffic information processing method for mobile phone data. First, non-motorized mobile phone users are screened out by setting a travel speed threshold for vehicle-mounted mobile phone users, and the research range of the road section is determined from the positioning accuracy of the mobile phone data. Then, for the screened vehicle-mounted mobile phone users, the invention proposes a graph-theoretic minimum support tree clustering method, an effective unsupervised learning method that can accurately divide travel modes according to the positioning tracks of the vehicle-mounted mobile phone users.
Drawings
FIG. 1 is a schematic diagram of a travel mode identification process based on mobile phone positioning data;
FIG. 2 is a schematic diagram of area calculation after connection of mobile phone tracks;
FIG. 3 is a schematic diagram of a split of a tree in a minimal support tree cluster;
FIG. 4 is a diagram illustrating an embodiment of mobile phone location data distribution;
FIG. 5 is a schematic diagram of a travel mode identification system based on mobile phone positioning data.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
Minimum support tree clustering is used to divide the travel modes of the users in the mobile phone triangulation data and to identify each travel mode.
A travel mode identification method based on mobile phone positioning data, described with reference to FIG. 1, comprises the following steps:
S100, acquiring mobile phone triangulation location data in an area to be analyzed within a period of time, and screening out the data of vehicle-mounted mobile phone users traveling on actual roads to form the travel track of each vehicle-mounted mobile phone user.
Mapping the mobile phone positioning data to the road network map specifically comprises the following. First, taking the positioning error into account, the mobile phone positioning data are mapped to the road sections of the research road network; this is the road network mapping process of the mobile phone positioning data. This step is a precondition for obtaining traffic parameter information, and it essentially associates mobile phone positioning data, which contain geographic position information, with road network map information.
The original coordinate system of the mobile phone triangulation location data is GCJ-02, a geographic coordinate system established by the national surveying and mapping bureau of China. The GCJ-02 mobile phone positioning data are converted to WGS84 coordinates using a coordinate conversion algorithm published on GitHub. In one embodiment, the Transform_files module, developed as a Python toolkit, is used to convert between the two coordinate systems.
The mobile phone triangulation location data only records the longitude and latitude position information of the mobile phone and does not have any label related to the road section position information. Therefore, the mobile phone positioning data needs to be matched with the road network in the electronic map, so that the mobile phone positioning data and the road network geographic coordinate information are correlated. The matching is realized by a topological algorithm, an automatic dotting method, a probability method and the like.
The average moving speed of each mobile phone user is calculated first, and non-motor-vehicle travel is screened out by setting speed thresholds for the different travel modes. Then the maximum instantaneous speed of the mobile phone user is checked against the maximum speed limit of the road on which the user is located, to judge whether the data are erroneous or drifting. If the maximum speed limit threshold is exceeded, the data are likely drift or error data, and that mobile phone user's data are also removed.
The average speed calculation formula based on the mobile phone positioning data is as follows:
Figure BDA0003321888360000071
wherein k represents the k-th moment of each mobile phone user, m represents the number of positioning points of the mobile phone user in a time period T, $T_k$ represents the time interval between the k-th moment and the previous moment, and x(k) represents the latitude and longitude coordinates of the position at the k-th moment.
Considering actual traffic conditions on the urban road network of Nanjing, the study area, the speed range for walking is set here to 0-7 km/h and the speed range for cycling to 0-15 km/h. When road sections in the urban road network are not congested, the average speed of vehicles is generally above 16 km/h. An average speed of 16 km/h is therefore selected as the minimum speed threshold for vehicle-mounted mobile phone users. If the calculated average speed is greater than 16 km/h, the mobile phone positioning data are considered to have been sent by a vehicle-mounted mobile phone user.
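The average-speed screening can be sketched as follows. The exact average-speed formula is given only as a figure in this text, so the sketch assumes one plausible reading: total path length divided by total elapsed time, with great-circle (haversine) distance between consecutive fixes. The function names, the `(timestamp_s, lat, lon)` tuple layout and the Earth radius constant are hypothetical choices; the 16 km/h default is the threshold stated above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_speed_kmh(fixes):
    """fixes: list of (timestamp_s, lat, lon), sorted by time, length >= 2."""
    dist = sum(haversine_km(a[1:], b[1:]) for a, b in zip(fixes, fixes[1:]))
    hours = (fixes[-1][0] - fixes[0][0]) / 3600.0
    return dist / hours if hours > 0 else 0.0


def is_vehicle_user(fixes, min_kmh=16.0):
    """Keep a user only if there are two or more fixes and the speed exceeds the threshold."""
    return len(fixes) >= 2 and average_speed_kmh(fixes) > min_kmh
```

For instance, two fixes 60 s apart and 0.01 degrees of latitude apart (about 1.1 km) correspond to roughly 67 km/h and would be kept, while a 0.001 degree step over the same time would be rejected as non-motorized.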
In addition, the maximum instantaneous speed of the cell phone data is also used to determine whether the cell phone triangulation data is from a continuous signal transmitted in the moving vehicle. The maximum instantaneous speed calculation formula is as follows:
$$v_{\max} = \max_{k} \frac{\lvert x(k) - x(k-1) \rvert}{T_k}$$
wherein $T_k$ represents the time interval between the k-th moment and the (k-1)-th moment of the mobile phone user, and x(k) and x(k-1) respectively represent the longitude and latitude of the geographic position of the mobile phone user at the k-th moment and the (k-1)-th moment.
All roads in the city are divided into main roads, secondary roads and branch roads, and the maximum speed limit differs by road grade. In one embodiment, an instantaneous speed of 80 km/h is selected as the maximum speed threshold. If the maximum instantaneous speed exceeds 80 km/h, the positioning data of the mobile phone user are considered discontinuous and are eliminated.
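The drift check can be sketched in the same style: compute the speed of every consecutive pair of fixes and compare the largest one with the threshold. The haversine distance, the function names and the `(timestamp_s, lat, lon)` layout are assumptions of this sketch; the 80 km/h default is the value from the embodiment above.

```python
import math


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def max_instantaneous_speed_kmh(fixes):
    """Largest speed over consecutive fixes; fixes are (timestamp_s, lat, lon)."""
    best = 0.0
    for prev, cur in zip(fixes, fixes[1:]):
        dt_h = (cur[0] - prev[0]) / 3600.0
        if dt_h <= 0:
            continue  # skip duplicate or out-of-order timestamps
        best = max(best, haversine_km(prev[1:], cur[1:]) / dt_h)
    return best


def is_continuous(fixes, max_kmh=80.0):
    """True if no consecutive pair of fixes implies a speed above the limit."""
    return max_instantaneous_speed_kmh(fixes) <= max_kmh
```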
Each mobile phone user in the research road section must have at least two positioning points, so that a closed polygon can be formed from the track. User data with fewer than two positioning points are removed.
To avoid computing over every mobile phone user that ever passes through the same research road section, a unit research time interval is set before the data are input to the model, and the mobile phone users traveling on the target road section are pre-screened by time interval.
The track starts at the intersection where the vehicle enters the road section and ends at the intersection where the vehicle exits. The positioning track of each mobile phone can thus be extracted.
S200, taking the area difference value between the tracks of the vehicle-mounted mobile phone users as input, and pruning data through a minimum support tree algorithm to form a clustering result.
The minimum support tree is built from the distances between mobile phone users, and the travel mode of each mobile phone user is identified after clustering.
The minimum support tree (minimum spanning tree) is defined on an undirected graph G whose edges carry weights. An undirected graph G consists of a set of vertices P and a set of edges E connecting the vertices, and can be denoted G = (P, E). Each edge $e_i$ in the undirected graph G is given a weight $\omega(e_i)$. A subgraph that connects all vertices of G by edges and contains no cycles is called a support tree. The minimum support tree $T^{*}$ is the support tree satisfying the following condition, where T ranges over all support trees of G:
$$\omega(T^{*}) = \min_{T} \sum_{e_i \in T} \omega(e_i)$$
i.e. the weighted sum of all edges in the composed target support tree is minimal.
In one embodiment, the minimum support tree model is constructed with the area differences between tracks as weights, so a reference coordinate system must first be established and the areas enclosed by the tracks of the different mobile phone users calculated.
The positioning track of a mobile phone user is the position track formed by connecting, in time order, the user's positioning points at successive moments. Here, longitude in the geodetic coordinate system is used as the abscissa and latitude as the ordinate. The smallest longitude and latitude values in the data set $\{Lon_i, Lat_i\}$ of all vehicle-mounted mobile phone positioning data in the research road section are taken as the origin of the coordinate system, namely $(\min(Lon_i), \min(Lat_i))$. The mobile phone positioning data are mapped to this coordinate system; each point shown in FIG. 2 is a mapped mobile phone positioning point.
The n mobile phone users whose travel modes are to be identified are represented as the n vertices of a fully connected undirected graph. The weight of the edge between two vertices is the difference between the areas formed by the users' tracks, as shown in FIG. 2.
Suppose there are n vehicle-mounted mobile phone users whose travel modes are to be divided, and each mobile phone user has m longitude and latitude positions, denoted $P_{it}$. $P_{it} = (x_{it}, y_{it})$ represents the coordinate position of mobile phone user i at time t, where i = 1, 2, …, n and t = 1, 2, …, m. The m positioning points of the same mobile phone user are connected by straight lines, and the tracks whose travel modes are to be identified are projected into the same coordinate system, giving the area enclosed with the longitude coordinate axis, as shown in FIG. 2, where $P_S$ and $P_E$ respectively represent the intersections at the start point and end point of the research road section. The area of the trapezoid enclosed by the position track data can then be calculated using the following equation:
$$S_{i,t} = \frac{\left[(x_{i,t} - x_{base}) + (x_{i,t+1} - x_{base})\right] \times (y_{i,t+1} - y_{i,t})}{2} \quad (4)$$
wherein $x_{base}$ represents the origin on the latitude and longitude coordinate axes, taken as the minimum latitude value among all mobile phone positioning data in the research road section, namely $\min(Lat_i)$; $x_{i,t}$ and $x_{i,t+1}$ represent the latitude of mobile phone user i at times t and t+1 respectively; $y_{i,t}$ and $y_{i,t+1}$ represent the longitude of mobile phone user i at times t and t+1 respectively, and $y_{i,t+1} - y_{i,t}$ is the height of the trapezoid enclosed between the two moments when projected onto the longitude coordinate axis. $S_{i,t}$ is thus the area of the trapezoid formed between the abscissa and the track of mobile phone user i from time t to time t+1.
The total area enclosed by the whole track of each mobile phone user i and the abscissa can be expressed as follows:
$$A_i = \sum_{t=1}^{m-1} S_{i,t}$$
wherein $A_i$ represents the sum of the areas formed between the abscissa and each pair of consecutive track points of mobile phone user i in the research road section.
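Equation (4) and the sum over consecutive track points can be sketched as below. The sketch follows the symbol convention stated above (x is latitude, y is longitude, and the base line is the minimum latitude), keeps the area signed exactly as equation (4) is written, and works in raw degree units; the function name `track_area` and the `(lat, lon)` tuple layout are hypothetical.

```python
def track_area(track, x_base):
    """Total trapezoid area between a track and the base line x = x_base.

    track:  list of (lat, lon) points in time order, at least two points
    x_base: minimum latitude over all tracks in the road section
    The result is signed: segments with decreasing longitude contribute
    negative area, as in equation (4) taken literally.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        # parallel sides are the latitude offsets, height is the longitude step
        total += ((x0 - x_base) + (x1 - x_base)) * (y1 - y0) / 2.0
    return total
```

For example, the track `[(1.0, 0.0), (3.0, 1.0), (3.0, 3.0)]` with `x_base = 1.0` gives 1.0 for the first segment and 4.0 for the second, 5.0 in total.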
The track area $A_i$ of each mobile phone user in the coordinate system is calculated in this way. The difference $A_{ij}$ between the track areas of two different mobile phone users is then calculated as follows:
$$A_{ij} = |A_i - A_j|, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, n \quad (5)$$
wherein $A_{ij}$ represents the difference between the track area of mobile phone user i and the track area of mobile phone user j. Calculating the differences between all mobile phone user track areas within a unit period for the whole research road section gives the area difference matrix:
$$\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix} \quad (6)$$
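As a small illustration of equation (5), the area difference matrix can be built from a list of per-user track areas. The function name `area_difference_matrix` is hypothetical; plain nested lists are used so that no third-party library is assumed.

```python
def area_difference_matrix(areas):
    """Return the n x n matrix with entry [i][j] = |A_i - A_j|."""
    return [[abs(a - b) for b in areas] for a in areas]
```

The matrix is symmetric with a zero diagonal, so users whose tracks enclose nearly the same area are joined by low-weight edges.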
Combining the calculation of the area enclosed by the mobile phone positioning tracks, the travel mode identification method based on minimum support tree clustering provided by the invention comprises the following specific steps:
Step 1: Assume there are n mobile phone users on the research road section. The area between each user's track and the coordinate axis is calculated, and the n users are represented as the n vertices of a fully connected undirected graph G. The weight of the edge between two vertices is the difference between the areas enclosed by the corresponding mobile phone tracks, as given by the area difference matrix above. The connected undirected graph can be written G = (A, E, ω), where A is the vertex set, namely the areas enclosed by the mobile phone user tracks, and E represents the set of edges connecting pairs of vertices. Each edge, connecting the tracks of two different mobile phone users, is defined as $e = (A_u, A_v)$, u = 1, 2, …, n, v = 1, 2, …, n, u ≠ v. The weight of an edge is $\omega(e) = \omega(A_u, A_v)$, calculated as
$$\omega(e) = |A_u - A_v| \quad (7)$$
wherein $A_u$ and $A_v$ respectively represent the total area enclosed between the coordinate axis and the tracks of mobile phone user u and mobile phone user v. The differences between the track areas of all mobile phone users are then calculated to obtain the weight matrix, as shown in formula (6).
Step 2: Partition the undirected graph G into several subset graphs $S_i$ using the Prim algorithm, and generate the minimum support tree of each subset graph $S_i$.
The subset graphs $S_i$ are partitioned from the connected graph G by applying the Prim algorithm. The Prim algorithm starts from an initial vertex, repeatedly selects the edge with the minimum weight, and adds the remaining vertices to the support tree one at a time. Assume that in the connected undirected graph $G = (V, E)$ the vertex set V is divided into two subsets T' and T, where T' contains the vertices that do not yet belong to the current support tree and T contains the vertices that do. The two subsets together make up the vertex set, i.e., $T \cup T' = V$.
The construction result is shown in Fig. 3, and the algorithm proceeds as follows:
(1) If a cluster subset graph $S_i$ is acyclic, the subset graph can be taken directly as a minimum support tree; go to step (5). Otherwise, continue with step (2).
(2) Select the edge $\langle A_u, A_v \rangle$ with the smallest weight in the cyclic cluster subset graph $S_i$.
(3) Put the edge $\langle A_u, A_v \rangle$ and the new vertex connected by this edge into the set T. If T contains all the vertices of the graph G, go to step (5); otherwise, go to step (4).
(4) Among the edges formed between each vertex in T and the vertices outside T, find the edge $\langle A_u, A_v \rangle$ with the minimum weight, then go to step (3).
(5) End. At this point T is the minimum support tree of the cluster subset graph $S_i$.
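The steps above can be sketched with a standard heap-based Prim implementation. This is an illustrative sketch, not the code used by the inventors: the function name `prim_mst`, the choice of vertex 0 as the starting vertex, and the sample areas are all assumptions.

```python
import heapq


def prim_mst(weights):
    """Build a minimum spanning tree of a complete undirected graph.

    weights[u][v] is the weight of edge (u, v). The tree is grown from
    vertex 0 and returned as a list of (weight, u, v) edges.
    """
    n = len(weights)
    if n == 0:
        return []
    in_tree = [False] * n  # membership in the set T
    in_tree[0] = True
    # Candidate edges leading from T to vertices outside T.
    heap = [(weights[0][v], 0, v) for v in range(1, n)]
    heapq.heapify(heap)
    tree_edges = []
    while heap and len(tree_edges) < n - 1:
        w, u, v = heapq.heappop(heap)
        if in_tree[v]:
            continue  # stale candidate: v joined T earlier
        in_tree[v] = True
        tree_edges.append((w, u, v))
        for x in range(n):
            if not in_tree[x]:
                heapq.heappush(heap, (weights[v][x], v, x))
    return tree_edges


# Hypothetical trajectory areas; edge weights follow formula (7).
areas = [10.0, 10.5, 30.0, 31.0]
weights = [[abs(a - b) for b in areas] for a in areas]
mst = prim_mst(weights)
print(mst)
```

For a complete graph on n vertices the tree always has n - 1 edges; with these sample areas the two light edges join users with similar areas and one heavy edge bridges the two groups.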
Step 3: Repeatedly remove the edge with the maximum weight in the subset graph $S_i$ until the number of cluster subsets reaches the optimal cluster number of $S_i$, at which point clustering is complete.
Find the edge with the maximum weight, $\max(T_i(e_{max}))$, in the minimum support tree of each cluster subset graph $S_i$ and remove it, thereby obtaining c cluster subsets. If $c < c_{max}$, set $c = c + 1$ and repeat this step; otherwise, the classification ends. $c_{max}$ is the total number of vehicle-mounted mobile phone users.
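A minimal sketch of this splitting step is shown below. It assumes the support tree is given as a list of (weight, u, v) edges and uses a small union-find structure to label the connected components that remain once the heaviest edges are dropped. The function name `split_mst` and the sample tree are hypothetical.

```python
def split_mst(n, tree_edges, c):
    """Split a spanning tree over n vertices into c clusters by dropping
    its c - 1 heaviest edges. Returns one cluster label per vertex."""
    keep = max(0, len(tree_edges) - (c - 1))
    kept = sorted(tree_edges)[:keep]  # the lightest edges survive
    parent = list(range(n))

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in kept:
        parent[find(u)] = find(v)

    roots = {}
    return [roots.setdefault(find(x), len(roots)) for x in range(n)]


# Hypothetical support tree over four users: (weight, u, v).
tree = [(0.5, 0, 1), (19.5, 1, 2), (1.0, 2, 3)]
for c in range(1, 5):
    print(c, split_mst(4, tree, c))
```

Dropping the c - 1 heaviest edges in one pass yields the same partition as removing the current heaviest edge c - 1 times, which is how the step is worded above.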
Step 4: Evaluate a total of $c_{max}$ candidate clusterings, where $c_{max}$ is the total number of vehicle-mounted mobile phone users and $c_{min}$ equals 1. The optimal cluster number (from which the number of mobile phone users in each vehicle follows) is then obtained using the clustering criterion: a clustering validity index function f(c) is established, and the interval $[c_{min}, c_{max}]$ is searched for the cluster number $c^*$ at which the VRC value reaches its maximum. $c^*$ is then defined as the optimal number of clusters, i.e., the number of vehicles finally obtained. After the minimum support tree is constructed, it is split into the optimal cluster subsets.
The clustering validity function can be used to evaluate the clustering result or to find the optimal cluster number. The ideal clustering result is one in which the intra-class distance is minimal and the inter-class distance is maximal. The CH (Calinski-Harabasz) function is selected to evaluate the optimal cluster number among the mobile phone users travelling on the road segment. The CH function is defined as follows:
Figure BDA0003321888360000121
Figure BDA0003321888360000122
Figure BDA0003321888360000123
where BGSS represents the between-group distance (mobile phones in different moving vehicles) and has k-1 degrees of freedom; WGSS represents the within-group distance (mobile phones in the same moving vehicle) and has n-k degrees of freedom; n is the total number of mobile phone users used to construct the support tree, i.e., the sample size; k is the number of classes obtained after the constructed minimum support tree is split, i.e., the number of clusters; $n_j$ is the number of mobile phone users in the j-th category after clustering (i.e., after the constructed minimum support tree is split); $\mu$ is the mean weight of the edges in the minimum support tree constructed from all mobile phone users; $\mu_j$ is the mean weight of the edges within clustered category j; $c_j$ is the set of weights of the edges within clustered category j; and x is the weight of an individual edge in clustered category j. In summary, the main principle of the clustering algorithm is to minimize the within-group distance (WGSS) and maximize the between-group distance (BGSS) while splitting the minimum support tree.
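For orientation, the standard Calinski-Harabasz (variance ratio criterion) definition, written with the symbols defined above, takes the form below. This is offered as an assumption based on the symbol definitions and on the standard form of the index, not as a transcription of the formulas in the original figures.

```latex
\begin{align}
\mathrm{VRC}_k &= \frac{\mathrm{BGSS}/(k-1)}{\mathrm{WGSS}/(n-k)} \\
\mathrm{BGSS}  &= \sum_{j=1}^{k} n_j \,(\mu_j - \mu)^2 \\
\mathrm{WGSS}  &= \sum_{j=1}^{k} \sum_{x \in c_j} (x - \mu_j)^2
\end{align}
```

Under this form the ratio is undefined for k = 1 (division by k - 1 = 0) and for k = n (division by n - k = 0), so in practice the search for the maximum runs over the interior of the interval.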
Finally, the VRC function expresses the ratio between the inter-class distance and the intra-class distance; the number of classes at which the VRC reaches its maximum is the optimal cluster number.
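The selection of the cluster number by maximum VRC can be sketched as follows. As a simplifying assumption, this sketch applies the standard Calinski-Harabasz index to the per-user trajectory areas rather than to the edge weights of the support tree described above; the function name `vrc`, the sample areas, and the two candidate labelings are hypothetical.

```python
def vrc(values, labels):
    """Variance ratio criterion (Calinski-Harabasz index) for 1-D data.

    values: one number per user (assumed here to be the trajectory area).
    labels: the cluster label of each user.
    """
    n = len(values)
    groups = {}
    for x, lab in zip(values, labels):
        groups.setdefault(lab, []).append(x)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("VRC is only defined for 2 <= k <= n - 1")
    mu = sum(values) / n
    bgss = 0.0  # between-group sum of squares
    wgss = 0.0  # within-group sum of squares
    for members in groups.values():
        mu_j = sum(members) / len(members)
        bgss += len(members) * (mu_j - mu) ** 2
        wgss += sum((x - mu_j) ** 2 for x in members)
    if wgss == 0.0:
        return float("inf")  # perfectly tight clusters
    return (bgss / (k - 1)) / (wgss / (n - k))


# Hypothetical areas for six users and two candidate splits.
values = [10.0, 10.5, 11.0, 30.0, 30.5, 31.0]
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 2],
}
scores = {k: vrc(values, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In this example the two-cluster split scores higher than the three-cluster split, so two vehicles would be inferred.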
S300: Treat the vehicle-mounted mobile phone users in the same cluster as travelers riding in the same vehicle at the same time, and identify the travel mode of the vehicle-mounted mobile phone users as independent travel, shared (carpooling) travel, or bus travel.
In the study of travel mode estimation on urban road networks, the number of vehicle-mounted mobile phone users is used as the basis for dividing travel modes, which fall mainly into three types: independent travel, carpooling travel, and bus travel. The criterion for each travel mode is as follows:
Independent travel: when the number of vehicle-mounted mobile phone users in a category after clustering is 1, the category represents independent travel;
Carpooling travel: when the number of vehicle-mounted mobile phone users in a category after clustering falls between the independent-travel case (1) and the bus-travel case (more than 7), the category represents a carpooling trip;
Bus travel: when the number of vehicle-mounted mobile phone users in a category after clustering is more than 7, the category represents a bus trip.
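The criteria above amount to a lookup on cluster size, which can be sketched as follows. The function name `classify_travel_modes` and the sample labels are hypothetical, and treating every size from 2 up to the bus threshold as carpooling is an assumption, since the text only states the values for independent travel (1) and bus travel (more than 7).

```python
from collections import Counter


def classify_travel_modes(labels, bus_threshold=7):
    """Map each cluster label to a travel mode based on cluster size.

    Size 1 means independent travel and size above bus_threshold means
    bus travel. Sizes in between are treated as carpooling (assumed).
    """
    modes = {}
    for cluster, size in Counter(labels).items():
        if size == 1:
            modes[cluster] = "independent"
        elif size > bus_threshold:
            modes[cluster] = "bus"
        else:
            modes[cluster] = "carpool"
    return modes


# Hypothetical cluster labels: one solo user, three sharing, eight on a bus.
labels = [0] + [1] * 3 + [2] * 8
print(classify_travel_modes(labels))
```

Keeping the threshold as a parameter makes it easy to adapt the rule to local vehicle capacities.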
Examples
The road network within 5 kilometers of the Nanjing high-speed railway station in Nanjing, Jiangsu Province, was selected for travel mode division and traffic volume estimation; the study road network covers an area of about 80 square kilometers. There are 99 bidirectional road segments in the study area, 27 of which are equipped with LPR detectors and can therefore provide the real traffic flow of the segment. The real flow on the remaining 72 road segments is unknown and can be estimated by applying the method provided by the present invention. The road network includes segments of different road classes, such as main roads, secondary main roads, and branch roads.
The data used for travel mode identification and traffic flow estimation are mainly mobile phone positioning data processed with mobile phone triangulation positioning technology (TDOA), with a positioning accuracy of about 50-150 m. The data were collected from 2016/09/26 to 2016/10/09, a period of 14 days. The amount of data acquired in this area per day is about 600 thousand or more. Each record mainly comprises the IMSI identification code of the mobile phone user, a timestamp, and the longitude and latitude; a sample of the data is shown in Table 1 below.
Table 1. Mobile phone user data fields
IMSI             Timestamp                    Longitude    Latitude
460030230759353  03-SEP-16 11.36.52.000000PM  118.7702866  31.9927197
460031698566428  03-SEP-16 11.36.54.000000PM  118.8031311  31.9786892
460030741044661  03-SEP-16 11.36.40.000000PM  118.7701569  31.9926796
460037590970936  03-SEP-16 11.36.51.000000PM  118.77034    31.9930305
Whether a mobile phone record belongs to a vehicle-mounted mobile phone user is determined by calculating the travel speed of the mobile phone user. An Oracle database was chosen for data screening. All mobile phone data were loaded into the Oracle database, the positioning data of mobile phone users travelling on the same road segment were sorted by timestamp, and the average speed and instantaneous speed were calculated. Non-vehicular user data and discontinuous signals were excluded.
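The speed-based screening can be sketched in Python as follows. This is one plausible reading, not the patented formulas: the average speed is taken as total distance over total time, the instantaneous speed as the speed over each pair of consecutive fixes, and distances are computed with the haversine formula. The function names, the thresholds `min_avg` and `max_inst` (in m/s), and the sample tracks are all hypothetical.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def speeds(track):
    """track: list of (t_seconds, lon, lat) tuples sorted by time.

    Returns (average speed, maximum instantaneous speed) in m/s, or None
    if the track has no usable pair of consecutive fixes.
    """
    segments = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt > 0:
            segments.append((haversine_m((x0, y0), (x1, y1)), dt))
    if not segments:
        return None
    total_d = sum(d for d, _ in segments)
    total_t = sum(dt for _, dt in segments)
    return total_d / total_t, max(d / dt for d, dt in segments)


def is_vehicle(track, min_avg=3.0, max_inst=33.0):
    """Keep a track only if it is fast enough on average to be a vehicle
    and never exceeds the assumed speed limit between consecutive fixes."""
    result = speeds(track)
    if result is None:
        return False
    avg, peak = result
    return avg >= min_avg and peak <= max_inst


# Hypothetical tracks sampled every 10 s near latitude 32 degrees.
driving = [(10 * i, 118.77 + 0.001 * i, 31.99) for i in range(5)]
walking = [(10 * i, 118.77 + 0.0001 * i, 31.99) for i in range(5)]
print(is_vehicle(driving), is_vehicle(walking))
```

A track with a single fix yields no speed estimate and is rejected, which matches the requirement that each user have at least two positioning points.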
Each mobile phone user on the study road segment must have at least two positioning points so that a combined polygon can be formed between the trajectories. User data with fewer than two positioning points are removed.
Urban roads are divided into segments according to the positions of intersections, and each road segment typically carries traffic in two directions. The intersection through which a vehicle enters a segment is taken as the starting point, and the intersection through which it exits is taken as the end point. The positioning trajectory of each mobile phone can then be extracted.
Python was selected as the program running platform, and the minimum support tree was constructed with a graph clustering package available on GitHub.
Using the travel mode identification method based on mobile phone positioning data, the proportions of the different travel modes on the three classes of road segments can be calculated, as shown in Table 2 below. The estimation results show that, relative to the branch road segments, the proportion of bus travel is highest, at about 10%. In addition, the proportion of carpooling on the main roads is 37%, whereas it is lower on the branch roads. On the branch roads, the proportion of independent travel is large, accounting for about 61%. The reason is that few buses run on the branch roads, which within the urban road network mainly serve individual traffic zones and play only a minor role in inter-zone travel. Furthermore, for different types of study road segments, differences in the number of lanes may also lead to different travel mode proportions.
Table 2. Analysis of estimated travel pattern errors in different road segments
Figure BDA0003321888360000141
Figure BDA0003321888360000151
With reference to Fig. 5, a travel mode identification system based on mobile phone positioning data is provided, comprising a data screening module, a trajectory forming module, a clustering module, and a travel mode identification module.
The data screening module is used for acquiring mobile phone GPS data in the area to be analyzed within a period of time, and for screening out the data of vehicle-mounted mobile phone users travelling on actual roads;
the trajectory forming module is used for forming the travel trajectory of each vehicle-mounted mobile phone user;
the clustering module is used for taking the area difference between the trajectories of every two vehicle-mounted mobile phone users as input and pruning the data through the minimum support tree algorithm to form a clustering result;
and the travel mode identification module is used for treating the vehicle-mounted mobile phone users of the same class in the clustering result as travelers riding in the same vehicle at the same time, and for identifying the travel mode of the vehicle-mounted mobile phone users as independent travel, shared travel, or bus travel.
There is provided a computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, and when the program instructions are executed by a processor, the method for identifying a travel mode based on positioning data of a mobile phone is implemented.
In summary, the present invention relates to a travel mode identification method and system based on mobile phone positioning data, which obtains mobile phone GPS data in an area to be analyzed within a period of time; screening out vehicle-mounted mobile phone user data running in an actual road to form a running track of a vehicle-mounted mobile phone user; taking the area difference value between each track between two vehicle-mounted mobile phone users as input, and pruning data through a minimum support tree algorithm to form a clustering result; and regarding the number of the vehicle-mounted mobile phone users of the same type in the clustering result as travelers taking the same vehicle at the same time, and identifying the travel mode of the vehicle-mounted mobile phone users as independent travel, shared travel or bus travel. The invention innovatively provides a minimum support tree clustering method based on graph theory aiming at the screened vehicle-mounted mobile phone users, and the travel modes can be accurately divided according to the positioning tracks of the vehicle-mounted mobile phone users.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (10)

1. A travel mode identification method based on mobile phone positioning data is characterized by comprising the following steps:
acquiring mobile phone triangulation data in an area to be analyzed within a period of time; screening out vehicle-mounted mobile phone user data running in an actual road to form a running track of a vehicle-mounted mobile phone user;
taking the area difference value between the tracks of each vehicle-mounted mobile phone user as input, and pruning data through a minimum support tree algorithm to form a clustering result;
and regarding the number of the vehicle-mounted mobile phone users of the same type in the clustering result as travelers taking the same vehicle at the same time, and identifying the travel mode of the vehicle-mounted mobile phone users as independent travel, shared travel or bus travel.
2. The travel mode identification method based on mobile phone positioning data according to claim 1, characterized in that screening out the vehicle-mounted mobile phone user data running in an actual road comprises: mapping the mobile phone positioning data onto a road network map, calculating the average speed on each road segment per unit time, and rejecting the mobile phone positioning data if the average speed is lower than a set threshold;
the average velocity is calculated as follows:
Figure FDA0003321888350000011
wherein m represents the number of positioning points of the mobile phone user within the unit time period T, and $T_k$ and $x(k)$ represent the time and the longitude and latitude coordinates of the position at the k-th moment.
3. A travel mode identification method based on mobile phone positioning data according to claim 2, characterized in that the maximum instantaneous speed is calculated, and mobile phone positioning data of which the maximum instantaneous speed exceeds the highest speed limit of each road section is removed;
the maximum instantaneous speed is as follows:
Figure FDA0003321888350000012
wherein $T_k$ represents the time interval between the k-th moment and the (k-1)-th moment of the mobile phone user, and $x(k)$ and $x(k-1)$ respectively represent the longitude and latitude of the geographic position of the mobile phone user at the k-th moment and the (k-1)-th moment.
4. A travel mode identification method based on mobile phone positioning data according to claim 3, characterized by rejecting mobile phone positioning data with less than two positioning points.
5. A travel mode identification method based on mobile phone positioning data according to one of claims 1 to 4, characterized in that, taking the area difference between the tracks of each vehicle-mounted mobile phone user as input, data pruning is performed through a minimum support tree algorithm to form a clustering result, comprising:
each vehicle-mounted mobile phone user on a given road segment serves as a vertex, forming a fully connected undirected graph $G = (A, E, \omega)$; A is the vertex set, i.e., the set of areas enclosed by the mobile phone user trajectories; the weight of the edge between two vertices is the difference between the areas enclosed by the trajectories of the vehicle-mounted mobile phone users; E represents the set of edges connecting two vertices; the edge connecting the trajectories of different vehicle-mounted mobile phone users is $e = (A_u, A_v)$, $u = 1, 2, \dots, n$, $v = 1, 2, \dots, n$, $u \neq v$, and the weight of the edge is $\omega(e) = \omega(A_u, A_v)$, calculated as:
$\omega(e) = |A_u - A_v|$
wherein $A_u$ and $A_v$ respectively represent the total area enclosed by the abscissa axis and the trajectory of vehicle-mounted mobile phone user u and of vehicle-mounted mobile phone user v; the difference between the trajectory areas of the vehicle-mounted mobile phone users is taken as the distance between vertices, i.e., the weight of the edge, and the differences are calculated for the trajectories of all users, thereby obtaining the weight matrix:
Figure FDA0003321888350000021
partitioning the undirected graph G by the Prim algorithm into several subset graphs $S_i$, each in the form of the minimum support tree of $S_i$;
for each subset graph $S_i$, repeatedly removing the edge with the maximum weight in $S_i$, so that the intra-class distance is minimized and the inter-class distance is maximized, until the number of cluster subsets reaches the optimal cluster number of the subset graph $S_i$, whereupon clustering is complete.
6. A travel mode identification method based on mobile phone positioning data according to claim 5, characterized in that generating the minimum support tree of each subset graph $S_i$ comprises:
(1) if a cluster subset graph $S_i$ is acyclic, the subset graph is taken directly as a minimum support tree and the method goes to step (5); otherwise, the method continues with step (2);
(2) finding the edge $\langle A_u, A_v \rangle$ with the smallest weight in a cyclic cluster subset graph $S_i$;
(3) putting the edge $\langle A_u, A_v \rangle$ and the new vertex connected by the edge into a set T; if the set T contains all the vertices of the graph G, going to step (5); otherwise, going to step (4);
(4) among the edges formed between each vertex in T and the vertices outside T, finding the edge $\langle A_u, A_v \rangle$ with the minimum weight, and going to step (3);
(5) the resulting set T is the minimum support tree of the cluster subset graph $S_i$.
7. A travel mode identification method based on mobile phone positioning data according to claim 5, characterized in that determining the optimal cluster number of the subset graph $S_i$ comprises: searching the interval $[c_{min}, c_{max}]$ for the cluster number at which the VRC value reaches its maximum and taking it as the optimal cluster number, where $c_{max}$ is the total number of vehicle-mounted mobile phone users and $c_{min}$ equals 1;
calculating:
Figure FDA0003321888350000031
Figure FDA0003321888350000032
Figure FDA0003321888350000033
wherein BGSS represents the sum of the distances between mobile phone users of different classes and has k-1 degrees of freedom; WGSS represents the sum of the distances between mobile phone users in the same class and has n-k degrees of freedom; n represents the total number of vehicle-mounted mobile phone users used to construct the support tree; k represents the number of categories obtained after the constructed minimum support tree is split, i.e., the number of clusters; $n_j$ represents the number of mobile phone users in the j-th category after the constructed minimum support tree is split; $\mu$ represents the mean weight of the edges in the minimum support tree constructed from all vehicle-mounted mobile phone users; $\mu_j$ represents the mean weight of the edges within clustered category j; $c_j$ represents the set of weights of the edges within clustered category j; and x represents the weight of each edge in clustered category j;
the corresponding k value when the WGSS value is minimum and the BGSS value is maximum is calculated as the optimal cluster number.
8. A travel mode identification method based on mobile phone positioning data according to one of claims 1 to 4, characterized in that identifying the travel mode of the vehicle-mounted mobile phone user comprises: when the number of vehicle-mounted mobile phone users in a category after clustering is 1, the category represents independent travel; when the number of vehicle-mounted mobile phone users in a category after clustering falls between the independent-travel case (1) and the bus-travel case (more than 7), the category represents a carpooling trip; and when the number of vehicle-mounted mobile phone users in a category after clustering is more than 7, the category represents a bus trip.
9. A travel mode identification system based on mobile phone positioning data, characterized by comprising:
the data screening module is used for acquiring mobile phone triangulation location data in an area to be analyzed within a period of time; screening out vehicle-mounted mobile phone user data running in an actual road;
the track forming module is used for forming the running track of each vehicle-mounted mobile phone user;
the clustering module is used for performing data pruning through a minimum support tree algorithm by taking the area difference value between the tracks of each vehicle-mounted mobile phone user as input to form a clustering result;
and the travel mode identification module is used for regarding the number of the vehicle-mounted mobile phone users of the same type in the clustering result as travelers taking the same vehicle at the same time, and identifying the travel mode of the vehicle-mounted mobile phone users as independent travel, shared travel or bus travel.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, and when the program instructions are executed by a processor, the method for identifying travel modes based on positioning data of mobile phones according to one of claims 1 to 8 is implemented.
CN202111248470.6A 2021-10-26 2021-10-26 Travel mode identification method and system based on mobile phone positioning data Pending CN113920352A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111248470.6A CN113920352A (en) 2021-10-26 2021-10-26 Travel mode identification method and system based on mobile phone positioning data

Publications (1)

Publication Number Publication Date
CN113920352A true CN113920352A (en) 2022-01-11

Family

ID=79242888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111248470.6A Pending CN113920352A (en) 2021-10-26 2021-10-26 Travel mode identification method and system based on mobile phone positioning data

Country Status (1)

Country Link
CN (1) CN113920352A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608896A (en) * 2016-03-14 2016-05-25 西安电子科技大学 Traffic bottleneck identification method in urban traffic network
CN109462814A (en) * 2018-11-28 2019-03-12 南京理工大学 Locating base station selection method based on minimum spanning tree clustering algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIPING XING; ZHIYUAN LIU; CHUNLIANG WU; SHUYAN CHEN: "Traffic Volume Estimation in Multimodal Urban Networks Using Cell Phone Location Data", IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE, vol. 11, no. 3, 25 June 2019 (2019-06-25), pages 93 - 104, XP011736213, DOI: 10.1109/MITS.2019.2919593 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination