CN111860699B - Commuting trip mode identification method based on fluctuation rate - Google Patents
- Publication number: CN111860699B
- Application number: CN202010872239.3A
- Authority: CN (China)
- Prior art keywords: clustering, formula, site, calculating, center
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Abstract
The invention relates to a commuting travel mode identification method based on fluctuation rate. The urban area is first divided: stations are clustered with a K-means algorithm in which the cluster centers are selected randomly, the influence distance from each station in the clustering data set to each cluster center is calculated, each station is assigned to the class of the cluster center to which its influence distance is smallest, and the cluster centers and all stations in each class are output. The commuting travel mode is then identified: a passenger flow fluctuation rate is introduced, the number q of fluctuation rates larger than a threshold is counted, and if q = 4 and the four fluctuation rate peaks correspond to the start and end times of the morning peak and the evening peak, the pair of study areas is identified as following the commuting travel mode. The method identifies the commuting travel mode accurately, which improves the accuracy of station passenger flow prediction and in turn allows effective early warning of congestion or anomalies.
Description
Technical Field
The invention relates to a data preprocessing method for an LSTM network that predicts regional OD (origin-destination) passenger flow, and in particular to a commuting travel mode identification method based on fluctuation rate.
Background
As cities around the world modernize and business districts within them emerge and grow, urban economies continue to flourish. At the same time, urban populations are growing rapidly and the number of motor vehicles on the road increases day by day, putting great pressure on urban road traffic. Because the road network cannot meet residents' travel demand, urban road congestion is becoming more serious. Road congestion severely restricts urban economic development and has become a major obstacle to urban modernization. In recent years, frequent cultural exchange in cities, including large-scale events of many kinds, together with increased travel by residents during holidays, has made sudden short-term surges in passenger flow very likely. As the quality of life of urban residents improves, their expectations for comfortable and convenient travel also rise. Urban rail transit, being convenient, fast, punctual and high in capacity, is one of the important means of relieving urban road congestion.
Rail transit is now an important travel mode for residents of Chongqing. It has become the main artery of the city's traffic and an important means of relieving congestion. In Chongqing, more than 2 million passenger entries and exits are recorded on urban rail transit every day. As the urban rail network grows more complex, analysis of future traffic trends receives more and more attention. Based on the results of regional OD passenger flow prediction, traffic operation plans can be made and early warnings of congestion or anomalies can be issued, improving the operating efficiency and service quality of rail transit. Such prediction has therefore become one of the key technologies of the Intelligent Transportation System (ITS).
Regional OD passenger flow prediction is studied here with historical passenger flow as the starting point: the station area division of urban rail transit and the travel modes of regional passenger flow are identified, so that early warning of congestion or anomalies can be given effectively.
Disclosure of Invention
In view of the problems in the prior art, the technical problem to be solved by the invention is to provide a method that effectively identifies the commuting travel mode.
To solve this technical problem, the invention adopts the following technical scheme: a commuting travel mode identification method based on fluctuation rate, comprising the following steps:
S10: divide the urban area;
S11: determine the number of clusters and the clustering ranges from the city's existing administrative and functional area divisions: all n study stations x are taken as the clustering data set Ω, Ω = {x_1, x_2, x_3, …, x_n};
all study stations are assigned to k station area sets Θ_i, Θ_i = {x_{i,1}, x_{i,2}, x_{i,3}, …}, i ∈ {1, 2, 3, 4, …, k};
S12: cluster with a K-means algorithm: randomly select the cluster centers; for each station in the clustering data set Ω, calculate the influence distance from the station to each cluster center, determine the cluster center to which the influence distance of station x_i is smallest, and assign station x_i to the class of that cluster center; output the cluster centers and all stations in each class;
S20: identify the commuting travel mode;
S21: each cluster center together with all stations of its class forms a study area, and two study areas form a study area group;
for a study area group, hourly passenger flow statistics a_i are randomly extracted for a number of working days; the 24 hourly passenger flow values form a data set Ψ, Ψ = {a_1, a_2, a_3, …, a_24};
S22: calculate the 24-n passenger flow fluctuation rates s_i, search them, and count the number q of fluctuation rates larger than a threshold:
if q = 4 and the four fluctuation rate peaks correspond to the start and end times of the morning peak and the evening peak, this study area group is identified as following the commuting travel mode.
As an improvement, the cluster centers in S12 are randomly selected as follows:
1) randomly select a station from the clustering data set Ω as the initial cluster center C_1; calculate the Euclidean distance between station x_i and cluster center C_j by formula (1), and calculate the probability P(x_i) that station x_i is selected as the next cluster center by formula (2);
where k is the dimension of the coordinate parameters, and x_{i,k} and c_{j,k} denote the k-th dimension of station x_i and of cluster center C_j, respectively;
2) the roulette-wheel area of each station is determined from its P(x_i), and the next cluster center is selected by the roulette-wheel method; each time a cluster center is selected, the stations x_i belonging to the same Θ_i are removed from the wheel; k cluster centers are selected in turn to form the cluster center set Φ, Φ = {c_1, c_2, c_3, …, c_t, …, c_k}.
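The seeding procedure above can be sketched in Python as follows. This is only an illustrative sketch: formulas (1) and (2) are not reproduced in this text, so the sketch assumes the ordinary Euclidean distance and a selection probability proportional to the squared distance to the nearest already-chosen center (a k-means++ style choice, which may differ from the patent's formula (2)). The function name `select_initial_centers` and its parameters are hypothetical.

```python
import numpy as np

def select_initial_centers(coords, preset_region, k, seed=None):
    """Pick k initial centers (as station indices), one per preset region."""
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    preset_region = np.asarray(preset_region)
    n = len(coords)
    on_wheel = np.ones(n, dtype=bool)          # stations still on the roulette wheel

    # First center: a uniformly random station.
    chosen = [int(rng.integers(n))]
    on_wheel[preset_region == preset_region[chosen[0]]] = False

    while len(chosen) < k:
        # Euclidean distance from every station to its nearest chosen center.
        diffs = coords[:, None, :] - coords[chosen][None, :, :]
        nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
        # ASSUMPTION: wheel area proportional to squared distance.
        weights = np.where(on_wheel, nearest ** 2, 0.0)
        total = weights.sum()
        if total == 0:
            raise ValueError("no selectable station left on the wheel")
        nxt = int(rng.choice(n, p=weights / total))
        chosen.append(nxt)
        # Remove every station of the same preset region from the wheel.
        on_wheel[preset_region == preset_region[nxt]] = False
    return chosen
```

Removing a whole preset region from the wheel after each pick is what keeps the k centers in k different preset regions.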
As an improvement, the influence distance from each station to each cluster center in S12 is calculated as follows:
I) calculate, by formula (3), the mean Euclidean distance S_i from each non-center station to the cluster center, as the Euclidean distance feature value;
calculate, by formula (4), the feature value R_i of each non-center station's membership in the preset region of the cluster center under study;
i ∈ {1, 2, 3, 4, …, n-k} (4);
II) normalize S_i and R_i by formula (5) and formula (6) to obtain S'_i and R'_i;
III) calculate, by formula (7) and formula (8) respectively, the entropy values e_S and e_R of S and R for the cluster center under study;
where S''_i and R''_i are two intermediate values with no practical meaning, calculated by formula (9) and formula (10) respectively;
IV) calculate, by formula (11) and formula (12) respectively, the information entropy redundancies d_S and d_R of S and R for the cluster center under study;
d_S = 1 - e_S (11);
d_R = 1 - e_R (12);
V) calculate, by formula (13) and formula (14) respectively, the information entropy weights w_S and w_R of S and R for the cluster center under study;
VI) repeat steps I) to V) to obtain the information entropy weights w_{S,i} and w_{R,i} of the k cluster centers;
VII) perform the clustering with the K-means clustering algorithm, calculating the influence distance from each station to each cluster center by formula (15).
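The entropy weight steps can be illustrated with a short Python sketch. Only formulas (11) and (12) survive in this text, so the rest follows the textbook entropy weight method as an assumption: min-max normalization, column proportions as the intermediate values, entropy scaled by ln(m), and weights proportional to the redundancies. The function name `entropy_weights` and the sample values are hypothetical.

```python
import numpy as np

def entropy_weights(features):
    """Return (w_S, w_R) for an (m, 2) array whose columns are S_i and R_i."""
    x = np.asarray(features, dtype=float)
    m = x.shape[0]                      # number of non-center stations, m >= 2
    lo, hi = x.min(axis=0), x.max(axis=0)
    constant = (hi - lo) == 0           # a constant column carries no information

    # Step II (assumed min-max normalization).
    normalized = (x - lo) / np.where(constant, 1.0, hi - lo)

    # Intermediate proportions (assumed form of the intermediate values).
    totals = normalized.sum(axis=0)
    p = np.where(constant, 1.0 / m, normalized / np.where(totals == 0, 1.0, totals))

    # Step III: entropy per column, treating 0 * ln(0) as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    e = -plogp.sum(axis=0) / np.log(m)

    # Step IV: redundancy d = 1 - e, as in formulas (11) and (12).
    d = 1.0 - e

    # Step V (assumed): weights proportional to redundancy.
    if d.sum() == 0:
        return np.full(x.shape[1], 1.0 / x.shape[1])
    return d / d.sum()

# Hypothetical feature values for four stations: distance S_i, membership R_i.
w_s, w_r = entropy_weights([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]])
```

Under these assumptions the feature whose values are more unevenly distributed across stations receives the larger weight.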
As an improvement, S12 outputs the cluster centers and all stations in each class as follows:
a) calculate the influence distance from each station to each cluster center, determine the cluster center to which the influence distance of station x_i is smallest, and assign station x_i to the class of that cluster center;
b) for each class i obtained by the reassignment in a), calculate the new cluster center c_i of that class by formula (16);
c) repeat the cluster center selection and the calculation of the influence distance from each station to each cluster center until the position of the cluster center of every class no longer changes; the division of the regional stations is then complete, and the cluster centers and all stations in each class are output.
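Steps a) to c) can be sketched as the loop below. Formulas (15) and (16) are not reproduced in this text, so the sketch assumes that the influence distance is a weighted sum of a scaled Euclidean distance and a 0/1 region-mismatch term, and that the new center of a class is the mean of its stations. All function and parameter names are hypothetical.

```python
import numpy as np

def assign_stations(coords, station_region, centers, center_region, w_s, w_r):
    """Step a): label each station with the center of smallest influence distance."""
    euclid = np.linalg.norm(coords[:, None, :] - centers[None, :, :], axis=2)
    scale = euclid.max()
    if scale > 0:
        euclid = euclid / scale          # bring distances into [0, 1]
    # ASSUMPTION: region term is 0 inside the center's preset region, else 1.
    mismatch = (station_region[:, None] != center_region[None, :]).astype(float)
    influence = w_s * euclid + w_r * mismatch   # weights may be scalars or per-center arrays
    return influence.argmin(axis=1)

def cluster_stations(coords, station_region, init_centers, center_region,
                     w_s, w_r, max_iter=100):
    coords = np.asarray(coords, dtype=float)
    station_region = np.asarray(station_region)
    center_region = np.asarray(center_region)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        labels = assign_stations(coords, station_region, centers,
                                 center_region, w_s, w_r)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = coords[labels == j]
            if len(members):             # step b): assumed mean of the class
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):   # step c): centers stopped moving
            break
        centers = new_centers
    return centers, labels
```

An empty class keeps its previous center in this sketch; the source does not say how that case is handled.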
As an improvement, the passenger flow fluctuation rates s_i in S22 are calculated as follows:
S221: calculate the 23 logarithmic parameters b_i of the hourly passenger flow by formula (17);
S222: calculate the 24-n passenger flow fluctuation rates s_i by formula (18), where n is the fluctuation observation range and s_1 is the fluctuation rate at time 1+(n-1)/2;
As an improvement, the number q of fluctuation rates larger than the threshold in S22 is counted as follows:
calculate the mean and the standard deviation d_s of the 24-n passenger flow fluctuation rates s_i by formula (20);
search the 24-n passenger flow fluctuation rates s_i and count the number q of fluctuation rates larger than the threshold:
Compared with the prior art, the invention has at least the following advantages:
The method divides urban rail transit station areas by combining two factors, the city's administrative and functional area divisions and the geographic positions of the stations, using the entropy weight method. Once the area division is settled, the fluctuation rate is introduced to identify the morning and evening peaks of regional passenger flow and thereby the commuting travel mode. As a result of this preprocessing, area pairs identified as following the commuting travel mode can use working-day data, with holidays excluded, as the historical same-period data series for calculation and prediction, giving a more accurate regional OD passenger flow prediction.
Drawings
FIG. 1 shows the K-means clustering result of the present invention.
FIG. 2 is a region-divided graph obtained by the method of the present invention.
Detailed Description
The present invention is described in further detail below.
Dividing the urban area according to urban rail transit is the basis for extracting the travel modes of urban rail transit passengers. To divide the urban area, a K-means clustering algorithm is applied on the basis of an analysis of the rail transit network structure and of the city's administrative and functional area divisions.
A commuting travel mode identification method based on fluctuation rate comprises the following steps:
S10: divide the urban area;
S11: determine the number of clusters and the clustering ranges from the city's current administrative and functional area divisions: all n study stations x are taken as the clustering data set Ω, Ω = {x_1, x_2, x_3, …, x_n};
all study stations are assigned to k station area sets Θ_i, Θ_i = {x_{i,1}, x_{i,2}, x_{i,3}, …}, i ∈ {1, 2, 3, 4, …, k};
S12: cluster with a K-means algorithm: randomly select the cluster centers; for each station in the clustering data set Ω, calculate the influence distance from the station to each cluster center, determine the cluster center to which the influence distance of station x_i is smallest, and assign station x_i to the class of that cluster center; output the cluster centers and all stations in each class;
S20: identify the commuting travel mode;
S21: each cluster center together with all stations of its class forms a study area, and two study areas form a study area group;
for a study area group, hourly passenger flow statistics a_i are randomly extracted for a number of working days; the 24 hourly passenger flow values form a data set Ψ, Ψ = {a_1, a_2, a_3, …, a_24};
S22: calculate the 24-n passenger flow fluctuation rates s_i, search them, and count the number q of fluctuation rates larger than a threshold:
if q = 4 and the four fluctuation rate peaks correspond to the start and end times of the morning peak and the evening peak (in this example 7 am and 9 am, and 6 pm and 8 pm), this study area group is identified as following the commuting travel mode.
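The decision rule in S22 can be sketched as below, assuming the fluctuation rates have already been computed. The threshold formula is not reproduced in this text; the sketch assumes a threshold of mean plus `z` standard deviations, and `z`, `tolerance` and the function name are hypothetical. The mapping from index to clock time uses the source's statement that s_1 corresponds to time 1+(n-1)/2.

```python
import numpy as np

def is_commuting_mode(rates, n, peak_hours=(7, 9, 18, 20), z=1.0, tolerance=1.0):
    """Return True if exactly four rates exceed the threshold at the peak boundaries."""
    rates = np.asarray(rates, dtype=float)
    # ASSUMPTION: threshold = mean + z * standard deviation.
    threshold = rates.mean() + z * rates.std()
    above = np.flatnonzero(rates > threshold)
    if len(above) != 4:                  # q must equal 4
        return False
    # s_1 (index 0) sits at time 1 + (n - 1) / 2, s_2 one hour later, and so on.
    hours = above + 1 + (n - 1) / 2
    return all(abs(h - p) <= tolerance for h, p in zip(hours, peak_hours))
```

For example, with n = 3 the 21 fluctuation rates cover 2 o'clock to 22 o'clock, so spikes at indices 5, 7, 16 and 18 map to 7, 9, 18 and 20 o'clock.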
As an improvement, the cluster centers in S12 are randomly selected as follows:
1) randomly select a station from the clustering data set Ω as the initial cluster center C_1; calculate the Euclidean distance between station x_i and cluster center C_j by formula (1), and calculate the probability P(x_i) that station x_i is selected as the next cluster center by formula (2);
where k is the dimension of the coordinate parameters, and x_{i,k} and c_{j,k} denote the k-th dimension of station x_i and of cluster center C_j, respectively;
2) the roulette-wheel area of each station is determined from its P(x_i), and the next cluster center is selected by the roulette-wheel method; each time a cluster center is selected, the stations x_i belonging to the same Θ_i are removed from the wheel, which ensures that the final k cluster centers lie in different preset Θ_i; k cluster centers are selected in turn to form the cluster center set Φ, Φ = {c_1, c_2, c_3, …, c_t, …, c_k}.
So that the final area division reflects the city's administrative and functional areas, entropy weights are introduced to determine the weights of the Euclidean distance and of the preset-region condition; the influence distance is the sum of each feature value multiplied by its weight.
Because the influence distance is the sum of the Euclidean distance feature value and the preset-region membership feature value, each multiplied by its weight, the information entropy weights of the two feature values must first be obtained for each cluster center. The process for one cluster center is given below (the entropy weights must be calculated for every cluster center):
As an improvement, the influence distance from each station to each cluster center in S12 is calculated as follows:
I) calculate, by formula (3), the mean Euclidean distance S_i from each non-center station to the cluster center, as the Euclidean distance feature value;
calculate, by formula (4), the feature value R_i of each non-center station's membership in the preset region of the cluster center under study;
i ∈ {1, 2, 3, 4, …, n-k} (4);
II) normalize S_i and R_i by formula (5) and formula (6) to obtain S'_i and R'_i;
III) calculate, by formula (7) and formula (8) respectively, the entropy values e_S and e_R of S and R for the cluster center under study;
where S''_i and R''_i are two intermediate values with no practical meaning, calculated by formula (9) and formula (10) respectively;
IV) calculate, by formula (11) and formula (12) respectively, the information entropy redundancies d_S and d_R of S and R for the cluster center under study;
d_S = 1 - e_S (11);
d_R = 1 - e_R (12);
V) calculate, by formula (13) and formula (14) respectively, the information entropy weights w_S and w_R of S and R for the cluster center under study;
VI) repeat steps I) to V) to obtain the information entropy weights w_{S,i} and w_{R,i} of the k cluster centers;
VII) perform the clustering with the K-means clustering algorithm, calculating the influence distance from each station to each cluster center by formula (15).
As an improvement, S12 outputs the cluster centers and all stations in each class as follows:
a) calculate the influence distance from each station to each cluster center, determine the cluster center to which the influence distance of station x_i is smallest, and assign station x_i to the class of that cluster center;
b) for each class i obtained by the reassignment in a), calculate the new cluster center c_i of that class by formula (16);
c) repeat the cluster center selection and the calculation of the influence distance from each station to each cluster center until the position of the cluster center of every class no longer changes; the division of the regional stations is then complete, and the cluster centers and all stations in each class are output.
After the area division is completed, the passenger flow travel modes are identified, chiefly the commuting travel mode, so that the prediction can be improved by extracting historical same-period data.
Because the commuting travel mode is the subject here, attention must focus on the morning and evening peaks; accordingly, a commuting travel mode identification based on the fluctuation rate is proposed.
By definition, the commuting travel mode can only be identified on working days. It has two passenger flow peaks: according to the data statistics, the morning peak runs from about 7 am to 9 am and the evening peak from 6 pm to 8 pm. A regional OD passenger flow with these characteristics is called the commuting travel mode.
As an improvement, the passenger flow fluctuation rates s_i in S22 are calculated as follows:
S221: calculate the 23 logarithmic parameters b_i of the hourly passenger flow by formula (17), where b_1 is the parameter corresponding to 1 o'clock, and so on;
S222: calculate the 24-n passenger flow fluctuation rates s_i by formula (18), where n is the fluctuation observation range and s_1 is the fluctuation rate at time 1+(n-1)/2;
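Steps S221 and S222 can be sketched as below. Formulas (17) and (18) are not reproduced in this text, so the sketch assumes the usual volatility definition: b_i is the natural logarithm of the ratio of consecutive hourly flows, and s_i is the sample standard deviation of n consecutive b values. This assumption is consistent with the counts given in the source (23 parameters, 24-n fluctuation rates), but it remains an assumption. The function name is hypothetical.

```python
import numpy as np

def fluctuation_rates(hourly_flow, n):
    """Return the 24 - n fluctuation rates for 24 hourly flow values (n >= 2)."""
    a = np.asarray(hourly_flow, dtype=float)
    if np.any(a <= 0):
        raise ValueError("hourly flows must be positive to take logarithms")
    # S221 (assumed): b_i = ln(a_{i+1} / a_i), giving 23 values for 24 hours.
    b = np.log(a[1:] / a[:-1])
    # S222 (assumed): standard deviation over each window of n consecutive b values.
    windows = np.lib.stride_tricks.sliding_window_view(b, n)
    return windows.std(axis=1, ddof=1)
```

A window of n over 23 values yields 23 - n + 1 = 24 - n results, matching the count stated in S222. Hours with zero flow would need separate handling, which the source does not describe.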
The number q of fluctuation rates larger than the threshold in S22 is counted as follows:
calculate the mean and the standard deviation d_s of the 24-n passenger flow fluctuation rates s_i by formula (20);
search the 24-n passenger flow fluctuation rates s_i and count the number q of fluctuation rates larger than the threshold:
Experimental verification:
The experiment takes Chongqing as an example and uses rail transit data from the Chongqing urban area as the original experimental data set.
The experimental results show that the optimized clustering algorithm adapts better to its environment and divides the areas well, avoiding the misclassification of stations that are geographically close but far apart along the track.
The GPS positioning data of the stations are used for the spatial clustering of the rail stations. The attributes in Table 1 are, in order: id, station number, station name, longitude, latitude.
TABLE 1 GPS positioning data
id | ostation | StationName | oLongitude | oLatitude |
1 | 101 | Chaotianmen | 106.5844 | 29.55976 |
2 | 102 | Xiaoshizi | 106.5791 | 29.56167 |
3 | 103 | Jiaochangkou | 106.5686 | 29.5564 |
4 | 104 | Qixinggang | 106.5596 | 29.55797 |
5 | 105 | Lianglukou | 106.5457 | 29.55557 |
6 | 106 | Eling | 106.5302 | 29.5508 |
7 | 107 | Daping | 106.5149 | 29.54346 |
8 | 108 | Shiyoulu | 106.5063 | 29.54199 |
9 | 109 | Xietaizi | 106.4928 | 29.5379 |
10 | 110 | Shiqiaopu | 106.4813 | 29.53553 |
11 | 111 | Gaomiaocun | 106.465 | 29.53917 |
12 | 112 | Majiayan | 106.4648 | 29.548 |
13 | 113 | Xiaolongkan | 106.4643 | 29.55621 |
… | … | … | … | … |
Applying the entropy-weight-based station area division method to Chongqing divides the stations into the following 10 cluster areas:
Area 0 is a commercial and tourist area represented by the Hongqihegou transfer station.
Area 1 is the Banan district area represented by Yudong, with numerous scenic spots and ancient towns.
Area 3 is the University Town campus area centered on University Town.
Area 4 is the western Yuzhong transport and industrial area represented by Daping and Yuanjiagang.
Area 7 is the Chongqing North Station-Jiangbei Airport area in the direction of Jiangbei Airport, containing the railway station and the airport.
Area 8 is the Shapingba science, education and culture area centered on Shapingba.
Area 9 is the area around the confluence of the Jialing and Yangtze rivers, represented by Lianglukou and Nanping.
FIG. 1 shows the experimental result of the K-means clustering algorithm based on the spatio-temporal influence distance. Two points are clear from it. First, geographic factors still strongly influence the cluster division: the stations within each cluster are geographically close to one another, and no cluster spans large geographic distances merely to satisfy the time dimension. Second, the distribution of the clustered stations does not depend entirely on straight-line geographic distance: the stations of each cluster lie at nearby positions along the rail transit lines.
The urban area division step and the commuting travel mode identification step complement each other. A comparative example is given below, with the results shown in Table 2:
TABLE 2 Accuracy of regional OD commuting travel mode passenger flow prediction before and after preprocessing
Network model | The method of the invention | Comparative example |
LSTM | 95.6% | 89.2% |
The only difference between the comparative example and the method of the invention is that the method of the invention preprocesses the acquired study station data by urban area division, whereas the comparative example does not.
The table shows that dividing the urban area with the entropy weight method substantially improves the accuracy of commuting travel mode passenger flow prediction.
Finally, the above embodiments only illustrate the technical solutions of the present invention and do not limit it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the invention without departing from their spirit and scope, and all such modifications and substitutions are covered by the claims of the present invention.
Claims (4)
1. A commuting travel mode identification method based on fluctuation rate, characterized by comprising the following steps:
S10: divide the urban area;
S11: determine the number of clusters and the clustering ranges from the city's existing administrative and functional area divisions: all n study stations x are taken as the clustering data set Ω, Ω = {x_1, x_2, x_3, …, x_n};
all study stations are assigned to k station area sets Θ_i, Θ_i = {x_{i,1}, x_{i,2}, x_{i,3}, …}, i ∈ {1, 2, 3, 4, …, k};
S12: cluster with a K-means algorithm: randomly select the cluster centers; for each station in the clustering data set Ω, calculate the influence distance from the station to each cluster center, determine the cluster center to which the influence distance of station x_i is smallest, and assign station x_i to the class of that cluster center; output the cluster centers and all stations in each class;
the cluster centers in S12 are randomly selected as follows:
1) randomly select a station from the clustering data set Ω as the initial cluster center C_1; calculate the Euclidean distance between station x_i and cluster center C_j by formula (1), and calculate the probability P(x_i) that station x_i is selected as the next cluster center by formula (2);
where k is the dimension of the coordinate parameters, and x_{i,k} and c_{j,k} denote the k-th dimension of station x_i and of cluster center C_j, respectively;
2) the roulette-wheel area of each station is determined from its P(x_i), and the next cluster center is selected by the roulette-wheel method; each time a cluster center is selected, the stations x_i belonging to the same Θ_i are removed from the wheel; k cluster centers are selected in turn to form the cluster center set Φ, Φ = {c_1, c_2, c_3, …, c_t, …, c_k};
wherein the method for calculating the influence distance from each site to each cluster center in step S12 comprises:
I) calculating, by formula (3), the mean Euclidean distance $S_i$ from each non-center site to the cluster center as the characteristic Euclidean distance value;
calculating, by formula (4), the membership characteristic value $R_i$ of each non-center site with respect to the preset area of the cluster center under study;
II) normalizing $S_i$ and $R_i$ by formula (5) and formula (6) to obtain $S'_i$ and $R'_i$;
III) calculating the information entropies $e_S$ and $e_R$ of S and R for the cluster center under study by formula (7) and formula (8), respectively;
where $S''_i$ and $R''_i$ are two intermediate values with no practical meaning, calculated by formula (9) and formula (10), respectively;
IV) calculating the information entropy redundancies $d_S$ and $d_R$ of S and R for the cluster center under study by formula (11) and formula (12), respectively:
$d_S = 1 - e_S$ (11);
$d_R = 1 - e_R$ (12);
V) calculating the information entropy weights $w_S$ and $w_R$ of S and R for the cluster center under study by formula (13) and formula (14), respectively;
VI) repeating calculation steps I) to V) to obtain the information entropy weights $w_{S,i}$ and $w_{R,i}$ of the k cluster centers;
VII) performing the clustering operation with the K-means clustering algorithm, wherein the influence distance from each site to each cluster center is calculated by formula (15);
S20, identifying a commuting travel mode;
S21, each cluster center together with all the sites in its class forms a study area, and two study areas form a group of study areas;
for a group of study areas, randomly extracting the hourly passenger flow statistics $a_i$ of a plurality of working days, the 24 hourly passenger flow values forming a data set $\Psi$, $\Psi = \{a_1, a_2, a_3, \ldots, a_{24}\}$;
S22, calculating the 24-n passenger flow fluctuation rates $s_i$, retrieving the 24-n passenger flow fluctuation rates $s_i$, and counting the number q of fluctuation rates greater than a threshold:
if q = 4 and the four fluctuation rate peaks correspond to the start and end times of the morning peak and the evening peak, respectively, the group of study areas is identified as having a commuting travel mode.
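Steps I) to V) of claim 1 have the general shape of the entropy weight method. The following is a minimal sketch for one cluster center, not the patented formulas: only the redundancy step d = 1 - e comes from formulas (11) and (12) as printed above. Min-max normalisation (for formulas (5) and (6)), proportion-based intermediate values (for formulas (9) and (10)), the normalised Shannon entropy (for formulas (7) and (8)), and weights proportional to redundancy (for formulas (13) and (14)) are all assumptions, since those formulas are not reproduced in this text. The function names are hypothetical.

```python
import math

def indicator_entropy(values):
    """Normalised entropy of one indicator (assumed form of formulas (5) to (10))."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 1.0  # a constant indicator carries no information
    # Assumed min-max normalisation, standing in for formulas (5)/(6).
    scaled = [(v - lo) / (hi - lo) for v in values]
    total = sum(scaled)
    # Proportions play the role of the intermediate values S''_i and R''_i.
    props = [v / total for v in scaled]
    # 0 * ln(0) is treated as 0 by skipping zero proportions.
    return -sum(p * math.log(p) for p in props if p > 0) / math.log(len(values))

def entropy_weights(s_values, r_values):
    """Return (w_S, w_R) for one cluster center."""
    e_s, e_r = indicator_entropy(s_values), indicator_entropy(r_values)
    d_s, d_r = 1 - e_s, 1 - e_r  # formulas (11) and (12) from the claim
    total = d_s + d_r
    if total == 0:
        return 0.5, 0.5  # assumed fallback when neither indicator discriminates
    # Assumed form of formulas (13)/(14): weight proportional to redundancy.
    return d_s / total, d_r / total
```

Under these assumptions, an indicator that varies more across sites has lower entropy and receives a larger weight, so the two weights for a cluster center always sum to 1.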
2. The commuting travel mode identification method based on fluctuation rate of claim 1, wherein outputting the cluster centers and all the sites in each class in S12 comprises:
a) calculating the influence distance from each site to each cluster center, determining the cluster center to which site $x_i$ has the smallest influence distance, and assigning site $x_i$ to the class of that cluster center;
b) for each class i obtained by the repartitioning in a), calculating a new cluster center $c_i$ for that class by formula (16);
c) repeating the random selection of cluster centers and the calculation of the influence distance from each site to each cluster center until the position of the cluster center of each class no longer changes, thereby completing the division of sites into areas, and outputting the cluster centers and all the sites in each class.
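The seeding of claim 1 and the loop of claim 2 can be illustrated with a short sketch. Several parts are assumptions rather than the patented formulas: the roulette wheel uses the squared distance to the nearest chosen center as its area (the k-means++ convention, standing in for formula (2)), plain Euclidean distance stands in for the influence distance of formula (15), and the class mean stands in for formula (16). The names `seed_centers`, `kmeans`, and `area_of` are hypothetical.

```python
import math
import random

def seed_centers(sites, area_of, k, rng):
    """Roulette-wheel seeding; sites sharing an area with a chosen center leave the wheel."""
    first = rng.randrange(len(sites))
    chosen = [first]
    wheel = [i for i in range(len(sites)) if area_of[i] != area_of[first]]
    while len(chosen) < k and wheel:
        # Assumed wheel area: squared distance to the nearest chosen center.
        weights = [min(math.dist(sites[i], sites[c]) ** 2 for c in chosen) for i in wheel]
        if sum(weights) > 0:
            pick = rng.choices(wheel, weights=weights, k=1)[0]
        else:
            pick = rng.choice(wheel)  # all candidates coincide with a center
        chosen.append(pick)
        wheel = [i for i in wheel if area_of[i] != area_of[pick]]
    return [sites[i] for i in chosen]

def kmeans(sites, centers, distance=math.dist, max_iter=100):
    """Assign each site to its nearest center, then recompute centers until they stop moving."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Step a): nearest center under the chosen distance function.
        labels = [min(range(len(centers)), key=lambda j: distance(p, centers[j]))
                  for p in sites]
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(sites, labels) if lab == j]
            if members:
                # Step b): assumed update rule, the coordinate-wise mean of the class.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty class where it was
        if new_centers == centers:
            break  # step c): no center moved
        centers = new_centers
    return centers, labels
```

A weighted influence distance could be passed in through the `distance` parameter, which is why the loop does not hard-code `math.dist`.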
3. The commuting travel mode identification method based on fluctuation rate of claim 2, wherein the passenger flow fluctuation rates $s_i$ in S22 are calculated as follows:
S221: calculating the 23 hourly logarithmic parameters $b_i$ of the passenger flow by formula (17);
S222: calculating the 24-n passenger flow fluctuation rates $s_i$ by formula (18), where n is the fluctuation observation range and $s_1$ is the fluctuation rate at time point 1 + (n-1)/2.
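Formulas (17) and (18) are not reproduced in this text, so the following sketch rests on two assumptions borrowed from the usual definition of volatility: the logarithmic parameter is taken to be the log ratio of consecutive hourly flows, and the fluctuation rate is taken to be the sample standard deviation of n consecutive logarithmic parameters. These choices do reproduce the counts stated in the claim (23 parameters and 24-n rates from 24 hourly values). The function names are hypothetical, and the hourly flows are assumed to be strictly positive.

```python
import math
import statistics

def log_parameters(flows):
    """Assumed form of formula (17): b_i = ln(a_{i+1} / a_i)."""
    return [math.log(nxt / cur) for cur, nxt in zip(flows, flows[1:])]

def fluctuation_rates(flows, n):
    """Assumed form of formula (18): rolling sample standard deviation over n parameters."""
    if n < 2:
        raise ValueError("the observation range n must be at least 2")
    b = log_parameters(flows)
    # 23 parameters give 23 - n + 1 = 24 - n windows for 24 hourly values.
    return [statistics.stdev(b[i:i + n]) for i in range(len(b) - n + 1)]
```

For example, a day with constant hourly flow yields fluctuation rates of zero throughout, while a sharp rise at the start of a peak period produces a spike in the windows that contain it.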
4. The commuting travel mode identification method based on fluctuation rate of claim 3, wherein the number q of fluctuation rates greater than the threshold in S22 is counted as follows:
calculating the mean and the standard deviation $d_s$ of the 24-n passenger flow fluctuation rates $s_i$ by formula (20);
retrieving the 24-n passenger flow fluctuation rates $s_i$ and counting the number q of fluctuation rates greater than the threshold.
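The threshold formula itself is missing from this text. As an illustration only, the sketch below assumes a threshold of the mean plus a multiple of the standard deviation; the `factor` parameter, the use of the population standard deviation, and the function names are hypothetical choices, not values from the patent.

```python
import statistics

def count_above_threshold(rates, factor=1.0):
    """Return (q, indices) for rates above an assumed threshold of mean + factor * d_s."""
    mean = statistics.mean(rates)
    d_s = statistics.pstdev(rates)  # assumed: population standard deviation
    threshold = mean + factor * d_s
    hits = [i for i, s in enumerate(rates) if s > threshold]
    return len(hits), hits

def is_commuting_pattern(rates, expected_indices, factor=1.0):
    """True when exactly four rates exceed the threshold at the expected positions."""
    q, hits = count_above_threshold(rates, factor)
    return q == 4 and hits == sorted(expected_indices)
```

Here `expected_indices` would hold the positions in the fluctuation rate series that correspond to the start and end of the morning and evening peaks; how those positions map to clock times depends on n and is left to the caller.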
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010872239.3A CN111860699B (en) | 2020-08-26 | 2020-08-26 | Commuting trip mode identification method based on fluctuation rate |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111860699A (en) | 2020-10-30 |
CN111860699B (en) | 2021-04-13 |
Family
ID=72967959
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010872239.3A Active CN111860699B (en) | 2020-08-26 | 2020-08-26 | Commuting trip mode identification method based on fluctuation rate |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111860699B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112791997B (en) * | 2020-12-16 | 2022-11-18 | 北方工业大学 | Method for cascade utilization and screening of retired battery |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832779A (en) * | 2017-12-11 | 2018-03-23 | 北方工业大学 | Track station classification system |
CN110232398A (en) * | 2019-04-24 | 2019-09-13 | 广东交通职业技术学院 | A kind of road network sub-area division and its appraisal procedure based on Canopy+Kmeans cluster |
WO2020018679A1 (en) * | 2018-07-17 | 2020-01-23 | Nvidia Corporation | Regression-based line detection for autonomous driving machines |
CN111125184A (en) * | 2019-11-23 | 2020-05-08 | 同济大学 | Bus passenger flow dynamic monitoring method based on time sequence structural variable point identification |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170178044A1 (en) * | 2015-12-21 | 2017-06-22 | Sap Se | Data analysis using traceable identification data for forecasting transportation information |
Non-Patent Citations (2)
Title |
---|
Using NARX Neural Network for Prediction of Urban Rail Transit Passenger Flow;Xiaochao Zhao等;《2018 IEEE 9th International Conference on Software Engineering and Service Science》;20181125;117-121 * |
城市轨道交通短时客流不确定性预测模型;郭旷等;《城市轨道交通研究》;20200131;22-26 * |
Also Published As
Publication number | Publication date |
---|---|
CN111860699A (en) | 2020-10-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |