CN105718946A - Passenger going-out behavior analysis method based on subway card-swiping data - Google Patents


Publication number
CN105718946A
Authority
CN
China
Prior art keywords
passenger
trip
going
station
pedestrian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610037044.0A
Other languages
Chinese (zh)
Inventor
尹宝才
李莹
张勇
刘浩
赵霞
葛启彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201610037044.0A
Publication of CN105718946A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06Q50/40

Abstract

The invention discloses a passenger travel behavior analysis method based on subway card-swiping data. The method classifies subway travel behavior effectively, yields classification results with distinct, easily observed characteristics, and can be widely applied to passenger travel behavior analysis in intelligent transportation. The method comprises: S1, data preprocessing: the raw data are merged and organized to obtain passenger trip records, each comprising the entry station, the entry card-swipe time, the exit station, and the exit card-swipe time; S2, feature extraction: based on the trip records, the entry times of each passenger are clustered to obtain the passenger's number of fixed-trip days, and the passenger's travel features are then extracted; S3, passenger clustering: passengers are clustered according to their travel features, and the clustering result is obtained and analyzed.

Description

A passenger travel behavior analysis method based on subway card-swiping data
Technical field
The invention belongs to the technical field of intelligent transportation, and in particular relates to a passenger travel behavior analysis method based on subway card-swiping data.
Background art
With industrialization and urbanization accelerating, millions of people are pouring into large cities, placing enormous pressure on city management and urban transport. Rail transit, as the backbone of urban transport, can effectively relieve traffic congestion and improve the efficiency of urban transport. Abroad, rail transit has developed over a long period, has proved to play an important and pivotal role in urban development, and carries a major share of public transport. The study of the travel behavior patterns of subway passengers is an important foundation of rail transit research. Being able to use existing data to analyze and grasp the rail travel patterns of urban residents accurately bears directly on the soundness and accuracy of the development strategy, planning, and policy-making of urban rail transit. With the development of urban rail transit construction in China and the rapid advance of urbanization, how to meet residents' growing travel demand through sound rail transit design has become an urgent problem. Traditional rail travel behavior analysis models and methods, which rely on direct observation of passenger flow and station throughput, struggle to meet the demand for greater accuracy and finer granularity. At the same time, residents' rail travel patterns reflect changes in the urban social space well and provide a valuable reference for sound urban planning.
Research abroad on passenger travel has produced substantial results. Sumi et al. proposed a stochastic equilibrium model for passengers' choice of departure time and route; the model's assumptions rest mainly on the operating characteristics of the public transport system and the passengers' desired arrival times. Vovsha analyzed travelers' choice of travel time in the morning and evening peaks. Bhat proposed a series of travel choice models that can predict a traveler's travel mode and travel time, as well as the choice of departure time and trip purpose. Stefaan and Therese mainly studied how travel time and spatial factors affect the choice of public transport as a travel mode. There is also a good deal of domestic research on public transport travel modes. Yin Huanhuan et al., using travel survey data for Jinan residents, analyzed the characteristics of urban residents' public transport travel from the perspective of travel behavior, built a public transport choice behavior model, and analyzed the differences in travel choice behavior between public and private transport. Zou Zhiyun et al. built a quantitative model of residents' per-capita trip count by studying the relationship between economic indicators and per-capita trip count, and carried out a cluster analysis of residents' trip intensity. Leng Biao and Zhao Wenyuan extracted passenger travel patterns and station passenger flow patterns from subway station passenger flow data, and built a clustering model of subway passenger flow travel characteristics based on the latent Dirichlet allocation topic model, clustering subway stations by the similarity of their travel patterns.
Summary of the invention
The technical problem solved by the invention is to overcome the deficiencies of the prior art by providing a passenger travel behavior analysis method based on subway card-swiping data that classifies subway travel behavior effectively, yields classification results with distinct, easily observed characteristics, and can be widely applied to passenger travel behavior analysis in intelligent transportation.
The technical solution of the invention is a passenger travel behavior analysis method based on subway card-swiping data, comprising the following steps:
(1) data preprocessing: the raw data are merged and organized to obtain passenger trip records, each of which comprises the entry station, the entry card-swipe time, the exit station, and the exit card-swipe time;
(2) feature extraction: based on the trip records, the entry times of each passenger are clustered to obtain the passenger's number of fixed-trip days, and the passenger's travel features are extracted;
(3) passenger clustering: passengers are clustered according to their travel features to obtain a clustering result, and the clustering result is analyzed.
Working on subway card-swiping data, the invention extracts the travel behavior features of each passenger and determines them by clustering, then clusters the passengers' features so that passengers with similar travel patterns are grouped together. This yields three major groups of travelers and the relationship between travel patterns and station distribution. The method classifies subway travel behavior effectively, yields classification results with distinct, easily observed characteristics, and can be widely applied to passenger travel behavior analysis in intelligent transportation.
Brief description of the drawings
Fig. 1 is a flow chart of the passenger travel behavior analysis method based on subway card-swiping data according to the invention.
Fig. 2 shows the passenger flow of the different passenger classes at each station.
Fig. 3 shows the passenger flow at each station for random trips.
Fig. 4 shows the passenger flow at each station for fixed trips.
Fig. 5 shows the passenger flow at each station for the first class of abnormal trips.
Fig. 6 shows the passenger flow at each station for the second class of abnormal trips.
Fig. 7 shows the passenger flow at each station for the third class of abnormal trips.
Detailed description of the invention
As shown in Fig. 1, the passenger travel behavior analysis method based on subway card-swiping data comprises the following steps:
(1) data preprocessing: the raw data are merged and organized to obtain passenger trip records, each of which comprises the entry station, the entry card-swipe time, the exit station, and the exit card-swipe time;
(2) feature extraction: based on the trip records, the entry times of each passenger are clustered to obtain the passenger's number of fixed-trip days, and the passenger's travel features are extracted;
(3) passenger clustering: passengers are clustered according to their travel features to obtain a clustering result, and the clustering result is analyzed.
Working on subway card-swiping data, the invention extracts the travel behavior features of each passenger and determines them by clustering, then clusters the passengers' features so that passengers with similar travel patterns are grouped together. This yields three major groups of travelers and the relationship between travel patterns and station distribution. The method classifies subway travel behavior effectively, yields classification results with distinct, easily observed characteristics, and can be widely applied to passenger travel behavior analysis in intelligent transportation.
Preferably, step (1) comprises:
(1.1) unifying the station numbering;
(1.2) removing staff card data and invalid data;
(1.3) merging the card-swipe records of each trip.
Preferably, step (2) uses the DBSCAN clustering method to cluster the trip records of each passenger: records with the same card number and the same entry and exit stations are clustered by entry time, with a radius of one hour.
Preferably, step (3) uses the K-means clustering algorithm to cluster the passengers.
Preferably, the K-means clustering algorithm comprises the following steps:
(3.1) select k vectors as the initial cluster centers $\{c_1, c_2, c_3, \ldots, c_k\}$;
(3.2) compute the distance from each vector to be classified to every cluster center, and assign each vector to a class by the minimum-distance principle, using the Euclidean distance between vectors:
$$dis(x, c_k) = \sqrt{\sum_{i=1}^{m} \left(x_i - c_i^k\right)^2} \qquad (1)$$
where $c_i^k$ denotes the $i$-th feature value of cluster center $c_k$;
(3.3) after the assignment, recompute each class center using formula (2), the mean of all vectors in the class, where $v$ is the number of vectors in the $k$-th class and $c_{ji}$ is the $i$-th feature value of the $j$-th vector in that class:
$$c_k = \left( \frac{1}{v}\sum_{j=1}^{v} c_{j1},\ \frac{1}{v}\sum_{j=1}^{v} c_{j2},\ \frac{1}{v}\sum_{j=1}^{v} c_{j3},\ \ldots,\ \frac{1}{v}\sum_{j=1}^{v} c_{jm} \right) \qquad (2)$$
(3.4) if any recomputed class center has changed, return to step (3.2) and iterate again until no class center changes.
The present invention will be described in more detail below.
1. Data preprocessing
Because the raw data are disordered and contain a large amount of useless information, such as records whose entry and exit stations are identical and the card-swipe records of subway staff, the raw data were merged and organized and these records were removed to ease the subsequent analysis. Data preprocessing comprises the following:
1. Unify the station numbering
The original card-swipe records contain only the line number and station code of the entry or exit. To make different stations easy to identify, the station numbering was unified first: following the unified numbering scheme of the Beijing urban rail transit command center, the line number and station code were merged, and the numbering of transfer stations was unified. Every record contains an entry number and an exit number, and each station number uniquely identifies its station.
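By way of illustration only, the merging of line number and station code might be sketched in Python as follows. The encoding (line number times 100 plus station code) and the alias table for transfer stations are hypothetical; the text does not disclose the actual numbering scheme of the command center.

```python
# Hypothetical alias table: a transfer station appears once per line it serves,
# so every alias is mapped onto one canonical ID. The entries are made up.
TRANSFER_ALIASES = {
    1305: 205,  # (line 13, code 5) treated as the same station as (line 2, code 5)
}


def unified_station_id(line_no: int, station_code: int) -> int:
    """Merge a line number and a station code into one station ID.

    Assumed encoding: line_no * 100 + station_code (not the real scheme).
    """
    raw = line_no * 100 + station_code
    # Collapse the aliases of a transfer station onto its canonical ID.
    return TRANSFER_ALIASES.get(raw, raw)
```

With such a mapping, the two platforms of a transfer station resolve to one ID, so each station number identifies exactly one station.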
2. Remove staff card data and invalid data
The raw data contain a large number of card-swipe records of subway staff. Because staff swipe in and out of stations frequently, these records have no research value; since a field in the raw data clearly indicates the card type, the records belonging to subway staff were deleted. The raw data were also processed further to delete records whose entry station and exit station are identical. Such records cannot reflect real travel and are therefore invalid data.
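A minimal sketch of this filtering, assuming each record is a dictionary with the hypothetical field names `card_type`, `entry_station` and `exit_station`, and that staff cards carry the hypothetical type value `"staff"` (the real field names and card-type codes are not given in the text):

```python
STAFF_CARD_TYPE = "staff"  # hypothetical value; the real card-type codes are not given


def remove_invalid_records(records):
    """Drop staff-card records and records that enter and exit at the same station."""
    kept = []
    for rec in records:
        if rec["card_type"] == STAFF_CARD_TYPE:
            continue  # staff swipes carry no information about passenger travel
        if rec["entry_station"] == rec["exit_station"]:
            continue  # same station in and out: not a real trip
        kept.append(rec)
    return kept
```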
3. Merge the card-swipe records of each trip
The raw data do not record each passenger's trip as such; they simply record the station number and card-swipe time of each entry or exit. The entry record and the exit record were therefore merged into a single record, and taking the difference between the entry time and the exit time gives the duration of the passenger's trip and distinguishes the entry record from the exit record. After processing, every trip record contains the entry station, the entry card-swipe time, the exit station, and the exit card-swipe time.
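The merging of entry and exit swipes can be sketched as below. The tuple layout `(card_id, kind, station, time)` with `kind` equal to `"in"` or `"out"` is an assumption made for the example, as is the rule of pairing an exit with the most recent unmatched entry of the same card; the text does not say how unmatched swipes were handled.

```python
from datetime import datetime


def merge_swipes(swipes):
    """Pair each exit swipe with the latest unmatched entry swipe of the same card.

    `swipes` is a list of (card_id, kind, station, time) tuples, where kind is
    "in" or "out" (a hypothetical layout). Exit swipes without a preceding
    entry swipe are dropped.
    """
    trips = []
    pending = {}  # card_id -> (station, time) of an entry not yet matched
    for card_id, kind, station, time in sorted(swipes, key=lambda s: (s[0], s[3])):
        if kind == "in":
            pending[card_id] = (station, time)
        elif card_id in pending:
            entry_station, entry_time = pending.pop(card_id)
            trips.append({
                "card_id": card_id,
                "entry_station": entry_station,
                "entry_time": entry_time,
                "exit_station": station,
                "exit_time": time,
                # Trip duration is the difference between exit and entry times.
                "duration_s": (time - entry_time).total_seconds(),
            })
    return trips
```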
2. Travel times of a single passenger
After the processing above, the resulting data contain all the information about passengers' trips. In practice, nearly everyone in Beijing holds one municipal transport smart card, so each IC card number can be taken to correspond to exactly one passenger. On this basis, each IC card was analyzed separately: all trip records were grouped by card number, and the records under a card number were treated as that passenger's trips for the whole month. For each passenger, the travel pattern was examined first, that is, the time and stations of each day's trips during the month; trips with the same entry and exit stations whose daily travel times differ by no more than one hour are regarded as fixed trips. Because the number of fixed trips contained in a passenger's records is not known in advance, the invention uses the DBSCAN clustering method to cluster each passenger's trip records: records with the same card number and the same entry and exit stations are clustered by entry time, with a radius of one hour.
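The grouping that precedes clustering can be sketched as follows, assuming trip records are dictionaries with the hypothetical keys `card_id`, `entry_station` and `exit_station`. Each resulting group holds the records of one card for one station pair, which is the unit that is later clustered by entry time.

```python
from collections import defaultdict


def group_by_card_and_stations(trips):
    """Group trip records by (card number, entry station, exit station)."""
    groups = defaultdict(list)
    for trip in trips:
        key = (trip["card_id"], trip["entry_station"], trip["exit_station"])
        groups[key].append(trip)
    return dict(groups)
```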
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a representative density-based clustering algorithm. It defines a cluster as the maximal set of density-connected points, can partition regions of sufficiently high density into clusters, and can find clusters of arbitrary shape in a noisy spatial database. The aim of DBSCAN is to filter out low-density regions and find the dense sample points. Unlike the convex clusters produced by traditional hierarchical and partitioning clustering, this algorithm can find clusters of arbitrary shape, and it has the following advantages over traditional algorithms:
(1) the number of clusters to be formed need not be known in advance, and noise points can be identified;
(2) there is no bias towards a particular cluster shape;
(3) a noise-filtering parameter can be supplied when needed.
The entry times in the records of a single card number with the same entry and exit stations are clustered with DBSCAN as follows:
(1) convert the entry time to int64 and remove the date information;
(2) use the numeric value of the time as the clustering input, with a clustering radius of one hour and a minimum cluster sample size of 1;
(3) in the clustering result, the mean of the entry times in a cluster is the average entry time, the mean of the trip durations is the travel time, and the number of records in the cluster is the number of trip days.
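The three steps above can be sketched without a clustering library. With a minimum sample size of 1 every point is a core point, so on one-dimensional data DBSCAN reduces to sorting the values and cutting wherever the gap between neighbors exceeds the radius; the sketch uses that equivalent procedure rather than a general DBSCAN implementation. The function names and the `(entry_time, exit_time)` pair layout are assumptions, and wrap-around at midnight is ignored.

```python
from datetime import datetime

EPS_SECONDS = 3600  # clustering radius of one hour, as stated in the text


def seconds_of_day(t: datetime) -> int:
    """Step (1): drop the date and keep the time of day as an integer."""
    return t.hour * 3600 + t.minute * 60 + t.second


def summarize(cluster):
    """Step (3): average entry time, average duration, and number of trip days."""
    n = len(cluster)
    return {
        "mean_entry_s": sum(seconds_of_day(e) for e, _ in cluster) / n,
        "mean_duration_s": sum((x - e).total_seconds() for e, x in cluster) / n,
        "trip_days": n,  # assumes at most one trip per day falls in a cluster
    }


def cluster_entry_times(trips, eps=EPS_SECONDS):
    """Step (2): cluster (entry_time, exit_time) pairs by entry time of day.

    Equivalent to DBSCAN with min_samples = 1 on one-dimensional input.
    """
    ordered = sorted(trips, key=lambda tr: seconds_of_day(tr[0]))
    clusters, current = [], []
    for trip in ordered:
        gap_exceeded = (
            current
            and seconds_of_day(trip[0]) - seconds_of_day(current[-1][0]) > eps
        )
        if gap_exceeded:
            clusters.append(current)
            current = []
        current.append(trip)
    if current:
        clusters.append(current)
    return [summarize(c) for c in clusters]
```

The largest `trip_days` value among the returned clusters then corresponds to the passenger's number of fixed-trip days for that station pair.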
The invention needs to find the travel pattern of each person (that is, each ID). Similar trips necessarily lie close together and form a high-density region, so DBSCAN is well suited to the task. The trips of a single ID were clustered and the cluster containing the most trip records was found; the result is shown in Table 1. DBSCAN clusters the records with similar entry times together well.
Table 1
With DBSCAN, the size of the largest cluster can be found for each passenger and taken as the passenger's number of fixed-trip days. This is an important attribute that is used in the subsequent classification of passengers.
The clustering yields several trip types for each passenger, and for each type the average entry time and the trip count are recorded. Because a passenger will generally not travel several times from the same entry station to the same exit station within one hour, the number of trips a person makes within a clustered time period is that person's number of fixed-trip days for that period.
3. Passenger travel analysis based on K-means clustering
3.1 Introduction to the K-means clustering algorithm
K-means is a clustering algorithm first proposed by MacQueen in 1967. Given a number of classes specified by the user, K-means clusters the data by seeking the partition that minimizes the squared-error function, iteratively computing the cluster centers and assigning objects to different clusters. The core idea of K-means is to divide a set of n vectors (objects) into k clusters such that vectors within a cluster are highly similar while similarity between clusters is low.
Let the set of vectors to be classified be $\{x_1, x_2, x_3, \ldots, x_n\}$, and let $dis(x, c_k)$ denote the Euclidean distance between vector $x$ and vector $c_k$, where $x$ is a vector to be classified, $c_k$ is a cluster-center vector, and $m$ is the number of features of a vector.
The basic procedure of the K-means algorithm is as follows:
1. Select k vectors as the initial cluster centers $\{c_1, c_2, c_3, \ldots, c_k\}$;
2. Compute the distance from each vector to be classified to every cluster center, and assign each vector to a class by the minimum-distance principle, using the Euclidean distance between vectors:
$$dis(x, c_k) = \sqrt{\sum_{i=1}^{m} \left(x_i - c_i^k\right)^2} \qquad (1)$$
where $c_i^k$ denotes the $i$-th feature value of cluster center $c_k$; the superscript $k$ indexes the cluster center and is not an exponent.
3. After the assignment, recompute each class center, that is, use formula (2) to compute the mean of all vectors in each class, where $v$ is the number of vectors in the $k$-th class and $c_{ji}$ is the $i$-th feature value of the $j$-th vector in that class:
$$c_k = \left( \frac{1}{v}\sum_{j=1}^{v} c_{j1},\ \frac{1}{v}\sum_{j=1}^{v} c_{j2},\ \frac{1}{v}\sum_{j=1}^{v} c_{j3},\ \ldots,\ \frac{1}{v}\sum_{j=1}^{v} c_{jm} \right) \qquad (2)$$
4. If any recomputed class center has changed, return to step 2 and iterate again until no class center changes.
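The four steps can be sketched in plain Python as follows. The random choice of initial centers, the `seed` and `max_iter` parameters, and the rule of keeping the old center when a class becomes empty are assumptions of this sketch; the text does not specify them.

```python
import math
import random


def euclidean(x, c):
    """Formula (1): Euclidean distance between a vector and a cluster center."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))


def kmeans(vectors, k, seed=0, max_iter=100):
    """Return (labels, centers) for a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    # Step 1: pick k of the input vectors as initial centers (assumed: at random).
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = []
    for _ in range(max_iter):
        # Step 2: assign every vector to its nearest center.
        labels = [
            min(range(k), key=lambda j: euclidean(x, centers[j]))
            for x in vectors
        ]
        # Step 3: formula (2), each center becomes the mean of its class.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(vectors, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # assumed handling of an empty class
        # Step 4: stop once no center has moved.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

For example, `kmeans(features, 5)` would cluster passenger feature vectors into the five classes used later in the text, where `features` is a hypothetical list of per-passenger vectors.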
3.2 Feature extraction
After the processing in Sections 1 and 2, every passenger has several trip records and fixed-trip records, and extracting the features of each passenger yields the vector to be clustered. The features that shape the distribution of a passenger's travel are taken to be mainly the single-trip travel time and the trip count. The single-trip travel time is affected both by the distance between the entry and exit stations and by the time the passenger lingers inside the station, so the invention computes each passenger's average travel time per day, maximum single-trip travel time, and minimum single-trip travel time. The trip count is divided into the total number of trips in the month, the average number of trips per day, the maximum number of trips in a day, and the minimum number of trips in a day. In addition, from the passenger's fixed-trip records, the maximum and minimum numbers of fixed-trip days are computed.
Feature extraction for each passenger yields the following attributes (see Table 2): average travel time per day, maximum single-trip travel time, minimum single-trip travel time, total number of trips in the month, average number of trips per day, maximum number of trips in a day, minimum number of trips in a day, maximum number of fixed-trip days, and minimum number of fixed-trip days.
Table 2
Here $n$ is the number of trips, $d$ is the number of days in the month, $t_i$ is a single-trip travel time, $m_i$ is the number of trips in a day, and $f_i$ is a number of fixed-trip days, with $i = 1, 2, \ldots, q$, meaning that a given passenger may have $q$ fixed trips.
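A possible way to assemble the nine attributes into one vector is sketched below. The formulas of Table 2 are not reproduced in the text, so the exact definitions are assumptions: the average travel time is taken over all trips, the average trips per day divides by the number of days in the month, the daily maximum and minimum are taken over days with at least one trip, and a passenger with no fixed trip gets 0 for both fixed-trip attributes.

```python
from collections import Counter


def passenger_features(durations_s, trip_dates, fixed_trip_days, days_in_month=30):
    """Build the nine-element feature vector of one passenger.

    durations_s     -- duration of every trip in seconds (t_i)
    trip_dates      -- calendar date of every trip, one entry per trip
    fixed_trip_days -- day counts of the passenger's fixed trips (f_i), may be empty
    """
    n = len(durations_s)
    per_day = Counter(trip_dates)  # m_i: number of trips on each active day
    return [
        sum(durations_s) / n,              # average travel time
        max(durations_s),                  # maximum single-trip travel time
        min(durations_s),                  # minimum single-trip travel time
        n,                                 # total trips in the month
        n / days_in_month,                 # average trips per day
        max(per_day.values()),             # maximum trips in a day
        min(per_day.values()),             # minimum trips in a day (active days only)
        max(fixed_trip_days, default=0),   # maximum fixed-trip days
        min(fixed_trip_days, default=0),   # minimum fixed-trip days
    ]
```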
Considering passengers' travel patterns, everyone should fall into one of the following groups: those whose travel is fairly random, those whose travel is fairly fixed, and those whose travel pattern differs from ordinary people's, that is, abnormal travel. Abnormal travel may show up in many ways; three are mainly considered here: an abnormal trip frequency, an abnormal single-trip travel time, and few fixed trips. After repeated comparative experiments, the number of trip classes was therefore set to five, and the K-means algorithm was used for clustering.
The card-swiping data for June 2013 used in the invention cover 11450928 passengers in total. Python was used for the data analysis and clustering; the final result after K-means clustering is shown in Table 3:
Table 3
From these results, subway travelers can be roughly divided into three major groups: random trips, fixed trips, and abnormal trips. The first class shows no regularity in either time or trip count and has few trips overall, so it is regarded as random travel. The second class has many fixed trips, which make up a high proportion of all trips, so it is fixed travel. The remaining three classes each differ from ordinary travel in time or trip count and are regarded as abnormal travel.
To analyze further how the passenger classes obtained with K-means are distributed in space, a visualization method was adopted: the entry and exit passenger flow of each class at each station was displayed on a Gaode (Amap) map, and the stations were ranked by passenger flow. Observation and analysis show that the different passenger classes have distinct spatial distributions. The invention uses HTML5 and CSS for the visualization and calls the Gaode map service.
The ten stations with the largest entry and exit flow were taken from the passengers' trip stations. Fig. 2 shows the flow of each passenger class at these ten stations; because the three abnormal classes differ greatly in passenger numbers, they are analyzed together. The figure shows that fixed-trip passengers and random-trip passengers differ markedly at some stations, such as Xi'erqi, Beijing South Railway Station, Beijing West Railway Station, Beijing Railway Station, and Tiantongyuan: random-trip passengers travel more through transfer stations, while at stations near residential and commercial areas such as Xi'erqi and Tiantongyuan the flow of fixed-trip passengers is clearly higher than that of random-trip passengers.
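Ranking stations by combined entry and exit flow can be sketched as below, assuming trip records are dictionaries with the hypothetical keys `entry_station` and `exit_station`. Applying the function to the trips of one passenger class at a time gives the per-class flows of the kind compared in Fig. 2.

```python
from collections import Counter


def busiest_stations(trips, top_n=10):
    """Return the top_n stations by total entries plus exits, largest first."""
    flow = Counter()
    for trip in trips:
        flow[trip["entry_station"]] += 1
        flow[trip["exit_station"]] += 1
    return flow.most_common(top_n)
```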
The five classes obtained by clustering faithfully reflect the three major groups of subway travelers: random trips, fixed trips, and abnormal trips. Each class is analyzed in detail below. The entry and exit stations of each class are shown on a map, with circle size representing passenger flow; red denotes entries, purple denotes exits, and the overlap is the part where the numbers are equal. In addition, the 15 stations with the largest flow are ranked, with orange denoting the number of entries and blue the number of exits.
(1) Random trips
The monthly trip count of this group is not fixed but is generally small, and the travel times are scattered, with essentially no fixed trips and no obvious regularity. Fig. 3 shows that the stations used are scattered, concentrated mainly within the Third Ring Road, at the railway stations and at transfer stations with heavy passenger flow, all of which are important rail transit hubs in Beijing. The spatial distribution of random-trip passengers therefore matches the travel pattern of Beijing urban rail transit.
(2) Fixed trips
As shown in Fig. 4, the stations where this group is most numerous are concentrated in residential areas, commercial districts, transfer stations, and office clusters, such as Guomao, Xi'erqi, and Tiantongyuan. These stations are spread across the area within the Sixth Ring Road. Xi'erqi, which has the largest flow, is one of the most congested stations in Beijing in the morning and evening peaks; it is the transfer station between Line 13 and the Changping Line and is surrounded by high-tech parks, so many office workers enter and leave through it, making it the station with the largest fixed-trip flow. Guomao, second only to Xi'erqi in flow, lies in Beijing's CBD, an area where commercial activity is concentrated and the major financial institutions are clustered; Guomao is also the transfer station between Line 1 and Line 10. In addition, Tiantongyuan, Huilongguan, and similar stations serve large residential areas of Beijing and are also frequently used by office workers. Fixed trips are thus made mainly by office workers, who enter and leave at the major residential and office areas.
(3) Abnormal trips
The first class is characterized by few trips, long travel times, and few people. These passengers are an unusual minority among subway travelers: they make few trips per month and each trip is long, so these are long-distance trips made by a small number of people. As shown in Fig. 5, their stations are concentrated at the major railway stations, such as Beijing Railway Station, Beijing West Railway Station, Beijing South Railway Station, and Beijing North Railway Station, and they are mostly people leaving or arriving in Beijing by train. At the railway stations the number of entries exceeds the number of exits, which suggests that, among people traveling to and from Beijing, those arriving are more likely to take the subway. Other stations with high flow include Dongzhimen, Jishuitan, Qianmen, and Beijing Zoo, around which lie some of Beijing's sights and well-known hospitals. The figure also shows that this group is concentrated within the Fourth Ring Road, yet its single-trip travel times are all long, so most of these people are probably visitors to the capital from elsewhere whose activities are mainly sightseeing and seeking medical treatment.
As shown in Fig. 6, the second abnormal class has few fixed trips, many trips, and long single-trip travel times: frequent, long-distance, non-fixed travel. The trips of this group are spread from the Second Ring Road to the Fifth Ring Road, and the entry and exit stations are far apart, which is why the single-trip travel time is long. The stations with the largest flow are concentrated at several large transfer stations, which to some extent also reflects the overall distribution of subway passenger flow. Residential areas such as Longze and Tiantongyuan also appear; these stations lie near the Sixth Ring Road, far from the other stations in the urban area.
As shown in Fig. 7, the third abnormal class has many trips overall, short travel times, and essentially no fixed trips: frequent, short-distance, non-fixed travel. The trips of this group are concentrated mainly on Line 1, Line 2, and Line 10 and generally fall within the Fourth Ring Road, so they are essentially short-distance trips and reflect the short-distance travelers concentrated in the city center.
The above is only a preferred embodiment of the invention and does not limit the invention in any form. Any simple modification, equivalent change, or adaptation of the above embodiment made according to the technical essence of the invention still falls within the scope of protection of the technical solution of the invention.

Claims (5)

1. A method for analyzing passenger travel behavior based on subway card-swiping data, characterized by comprising the following steps:
(1) data preprocessing: the raw data are merged and organized to obtain passenger trip records, each of which comprises the entry station, the entry swipe time, the exit station, and the exit swipe time of the trip;
(2) feature extraction: based on the passenger trip records, the entry times of each passenger are clustered to obtain the number of fixed-trip days of each passenger, and passenger trip features are extracted;
(3) passenger clustering: the passengers are clustered according to the passenger trip features to obtain a passenger clustering result, and the passenger clustering result is analyzed.
2. The method for analyzing passenger travel behavior based on subway card-swiping data according to claim 1, characterized in that step (1) comprises:
(1.1) unifying station numbers;
(1.2) removing staff work-card data and junk data;
(1.3) merging the card-swiping records of each trip.
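The three preprocessing sub-steps can be illustrated with a minimal Python sketch. It is not the patented implementation: the raw record layout `(card_id, station, timestamp, kind)`, the `station_alias` mapping, the `staff_cards` set, and the rule that an exit without a preceding entry counts as junk are all assumptions made for illustration.

```python
from datetime import datetime

def merge_swipes(swipes, station_alias=None, staff_cards=frozenset()):
    """Pair each entry swipe with the next exit swipe of the same card.

    swipes: iterable of (card_id, station, timestamp, kind), kind is "in" or "out"
            (hypothetical layout, not taken from the patent).
    Returns a list of (card_id, in_station, in_time, out_station, out_time).
    """
    station_alias = station_alias or {}
    trips = []
    open_entry = {}  # card_id -> (station, time) of an entry not yet matched
    # Process swipes per card in chronological order.
    for card, station, ts, kind in sorted(swipes, key=lambda s: (s[0], s[2])):
        if card in staff_cards:
            continue  # (1.2) drop staff work-card data
        station = station_alias.get(station, station)  # (1.1) unify station numbers
        if kind == "in":
            # A newer entry replaces an older unmatched one (assumed junk rule).
            open_entry[card] = (station, ts)
        elif kind == "out" and card in open_entry:
            in_station, in_ts = open_entry.pop(card)
            trips.append((card, in_station, in_ts, station, ts))  # (1.3) merged trip
        # An exit with no preceding entry is treated as junk and skipped.
    return trips
```

Each returned tuple carries the four fields that claim 1 requires of a trip record: entry station, entry swipe time, exit station, and exit swipe time.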
3. The method for analyzing passenger travel behavior based on subway card-swiping data according to claim 2, characterized in that step (2) uses the DBSCAN clustering method to cluster the multiple trip records of each passenger: the entry times of records with the same card number and the same entry and exit stations are clustered, with the radius set to one hour.
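The entry-time clustering in claim 3 can be sketched as a one-dimensional DBSCAN over entry times expressed in minutes after midnight, with the radius `eps` set to 60 minutes. The claim fixes only the radius; the minimum cluster size `min_pts=3`, the helper names, and the rule that a day counts as a fixed-trip day when its trip falls in any cluster are assumptions of this sketch. Wrap-around at midnight is ignored.

```python
def dbscan_1d(values, eps=60.0, min_pts=3):
    """Label each value with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    labels = [None] * n

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:
                queue.extend(nb)  # j is a core point, so keep expanding
    return labels


def fixed_trip_days(entries, eps=60.0, min_pts=3):
    """Count distinct days whose trip lies in a cluster.

    entries: list of (day, minutes_after_midnight) for one card number and one
             entry/exit station pair (hypothetical layout).
    """
    labels = dbscan_1d([minute for _, minute in entries], eps, min_pts)
    return len({day for (day, _), lab in zip(entries, labels) if lab != -1})
```

A passenger who enters the same station around 08:00 on most days would form one dense cluster, while an isolated afternoon trip would be labeled noise and not counted.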
4. The method for analyzing passenger travel behavior based on subway card-swiping data according to claim 3, characterized in that step (3) uses the K-means clustering algorithm to cluster the passengers.
5. The method for analyzing passenger travel behavior based on subway card-swiping data according to claim 4, characterized in that the K-means clustering algorithm comprises the following steps:
(3.1) select k vectors as the initial cluster centers $\{c_1, c_2, c_3, \ldots, c_k\}$;
(3.2) compute the distance from each vector to be classified to every cluster center and assign each vector to a class by the minimum-distance principle, using the Euclidean distance between vectors:
$$dis(x, c_k) = \sqrt{\sum_{i=1}^{m} \left(x_i - c_i^k\right)^2} \tag{1}$$
where $c_i^k$ denotes the i-th feature value of cluster center $c_k$;
(3.3) after the assignment, recompute the center of each class, using formula (2) to compute the mean of all vectors in each class, where v is the number of vectors in the k-th class and $c_{ji}$ is the i-th feature value of the j-th vector in that class:
$$c_k = \left(\frac{1}{v}\sum_{j=1}^{v} c_{j1},\ \frac{1}{v}\sum_{j=1}^{v} c_{j2},\ \frac{1}{v}\sum_{j=1}^{v} c_{j3},\ \ldots,\ \frac{1}{v}\sum_{j=1}^{v} c_{jm}\right) \tag{2}$$
(3.4) if the recomputed class centers have changed, return to step (3.2) and iterate again, until the class centers no longer change.
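Steps (3.1) to (3.4) can be sketched in plain Python as follows. The claim does not say how the k initial vectors are chosen, so taking the first k input vectors is an assumption, as are the `max_iter` safeguard and the choice to keep the old center when a class becomes empty.

```python
import math

def euclidean(x, c):
    """Formula (1): Euclidean distance between vector x and center c."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))

def kmeans(vectors, k, max_iter=100):
    """Return (centers, labels) following steps (3.1)-(3.4)."""
    # (3.1) initial centers; assumption: simply the first k vectors
    centers = [tuple(v) for v in vectors[:k]]
    labels = []
    for _ in range(max_iter):  # max_iter is a hypothetical safety limit
        # (3.2) assign every vector to its nearest center
        labels = [min(range(k), key=lambda c: euclidean(x, centers[c]))
                  for x in vectors]
        # (3.3) recompute each center as the feature-wise mean, formula (2)
        new_centers = []
        for c in range(k):
            members = [x for x, lab in zip(vectors, labels) if lab == c]
            if members:
                v = len(members)
                new_centers.append(tuple(sum(col) / v for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # assumption: keep an empty class's center
        # (3.4) stop once no center has changed
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

In the context of the claims, each input vector would be one passenger's trip-feature vector, and the returned labels would give the passenger classes that are then analyzed.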

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610037044.0A CN105718946A (en) 2016-01-20 2016-01-20 Passenger going-out behavior analysis method based on subway card-swiping data

Publications (1)

Publication Number Publication Date
CN105718946A true CN105718946A (en) 2016-06-29

Family

ID=56147265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610037044.0A Pending CN105718946A (en) 2016-01-20 2016-01-20 Passenger going-out behavior analysis method based on subway card-swiping data

Country Status (1)

Country Link
CN (1) CN105718946A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006029022A2 (en) * 2004-09-03 2006-03-16 Arbitron Inc. Out-of-home advertising inventory ratings methods and systems
CN102156732A (en) * 2011-04-11 2011-08-17 北京工业大学 Bus IC card data stop matching method based on characteristic stop
CN103699601A (en) * 2013-12-12 2014-04-02 深圳先进技术研究院 Temporal-spatial data mining-based metro passenger classification method
CN104750800A (en) * 2014-11-13 2015-07-01 安徽四创电子股份有限公司 Motor vehicle clustering method based on travel time characteristic
CN104766473A (en) * 2015-02-09 2015-07-08 北京工业大学 Traffic trip feature extraction method based on multi-mode public transport data matching

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294679A (en) * 2016-08-08 2017-01-04 大连理工大学 A kind of method for visualizing carrying out website cluster based on subway data
CN106529711A (en) * 2016-11-02 2017-03-22 东软集团股份有限公司 Method and apparatus for predicting user behavior
CN106529711B (en) * 2016-11-02 2020-06-19 东软集团股份有限公司 User behavior prediction method and device
CN108230670A (en) * 2016-12-22 2018-06-29 株式会社日立制作所 Predict the method and apparatus for giving the moving body number of place appearance in given time period
CN106897876A (en) * 2017-02-28 2017-06-27 北京小米移动软件有限公司 terminal payment processing method and device
CN106897876B (en) * 2017-02-28 2021-07-23 小米数字科技有限公司 Terminal payment processing method and device
CN107590239A (en) * 2017-09-11 2018-01-16 东南大学 A kind of method for radius of being plugged into based on IC-card data determination subway station public bicycles
CN107590239B (en) * 2017-09-11 2020-08-11 东南大学 Method for measuring connection radius of public bicycle at subway station based on IC card data
CN107886189A (en) * 2017-10-19 2018-04-06 东南大学 A kind of method that route travel time deduction is carried out based on subway brushing card data
CN107886189B (en) * 2017-10-19 2021-06-15 东南大学 Method for deducing path travel time based on subway card swiping data
CN107818415B (en) * 2017-10-31 2021-07-09 东南大学 General recognition method based on subway card swiping data
CN107818415A (en) * 2017-10-31 2018-03-20 东南大学 A kind of recognition methods of attending a school by taking daily trips based on subway brushing card data
CN107730717A (en) * 2017-10-31 2018-02-23 华中科技大学 A kind of suspicious card identification method of public transport of feature based extraction
CN107943920A (en) * 2017-11-21 2018-04-20 东南大学 A kind of trip crowd recognition method based on subway brushing card data
CN107832779A (en) * 2017-12-11 2018-03-23 北方工业大学 Track station classification system
CN109508815A (en) * 2018-10-19 2019-03-22 东南大学 Activity space Measures Analysis method of attending a school by taking daily trips based on subway IC card data
CN109508815B (en) * 2018-10-19 2021-08-10 东南大学 General activity spatial measure analysis method based on subway IC card data
CN110020666A (en) * 2019-02-21 2019-07-16 华南理工大学 A kind of public transport advertisement placement method and system based on passenger behavior mode
CN110162520A (en) * 2019-04-23 2019-08-23 中国科学院深圳先进技术研究院 Friend recommendation method and system towards Metro Passenger
CN110097138A (en) * 2019-05-11 2019-08-06 北京京投亿雅捷交通科技有限公司 A kind of gauze passenger representation data library application system and method
CN110533483A (en) * 2019-09-05 2019-12-03 中国联合网络通信集团有限公司 A kind of occupant classification method and system based on trip characteristics
CN110921446B (en) * 2019-12-10 2022-04-12 佳格科技(浙江)股份有限公司 Equipment attribute acquisition system
CN110921446A (en) * 2019-12-10 2020-03-27 猫岐智能科技(上海)有限公司 Equipment attribute acquisition system
CN113128282A (en) * 2019-12-31 2021-07-16 深圳云天励飞技术有限公司 Crowd category dividing method and device and terminal
CN111241162A (en) * 2020-01-16 2020-06-05 同济大学 Method for analyzing travel behaviors of passengers under high-speed railway network formation condition and storage medium
CN111833229A (en) * 2020-03-28 2020-10-27 东南大学 Travel behavior space-time analysis method and device based on subway dependency
CN111741051A (en) * 2020-04-14 2020-10-02 腾讯科技(深圳)有限公司 Method and device for determining full load rate of vehicle, storage medium and electronic device
CN113837508A (en) * 2020-06-08 2021-12-24 香港理工大学深圳研究院 Old people space analysis method and device, terminal device and storage medium
CN113837508B (en) * 2020-06-08 2023-11-17 香港理工大学深圳研究院 Old people space analysis method and device, terminal equipment and storage medium
CN112988855A (en) * 2021-05-24 2021-06-18 中国矿业大学(北京) Subway passenger analysis method and system based on data mining
CN114331234A (en) * 2022-03-16 2022-04-12 北京交通大学 Rail transit passenger flow prediction method and system based on passenger travel information
CN114331234B (en) * 2022-03-16 2022-07-12 北京交通大学 Rail transit passenger flow prediction method and system based on passenger travel information

Similar Documents

Publication Publication Date Title
CN105718946A (en) Passenger going-out behavior analysis method based on subway card-swiping data
CN110245981B (en) Crowd type identification method based on mobile phone signaling data
Hussain et al. Transit OD matrix estimation using smartcard data: Recent developments and future research challenges
Li et al. Transportation mode identification with GPS trajectory data and GIS information
CN106600960A (en) Traffic travel origin and destination identification method based on space-time clustering analysis algorithm
CN105788260A (en) Public transportation passenger OD calculation method based on intelligent public transportation system data
CN110753307B (en) Method for acquiring mobile phone signaling track data with label based on resident survey data
CN107194525A (en) A kind of down town appraisal procedure based on mobile phone signaling
CN107729938B (en) Rail station classification method based on bus connection radiation zone characteristics
CN107656987A (en) A kind of subway station function method for digging based on LDA models
CN114139251B (en) Integral layout method for land ports of border regions
Tonev et al. Different approaches to defining metropolitan areas (Case study: cities of Brno and Ostrava, Czech Republic).
CN111598333A (en) Passenger flow data prediction method and device
Mei et al. Identifying commuters based on random forest of smartcard data
Cheng et al. Mining customized bus demand spots based on smart card data: A case study of the Beijing public transit system
Li et al. Classifications of stations in urban rail transit based on the two-step cluster
CN114723596A (en) Urban functional area identification method based on multi-source traffic travel data and theme model
Jiao et al. Understanding the land use function of station areas based on spatiotemporal similarity in rail transit ridership: A case study in Shanghai, China
Wang et al. Relationship between urban road traffic characteristics and road grade based on a time series clustering model: a case study in Nanjing, China
CN108681741A (en) Based on the subway of IC card and resident's survey data commuting crowd's information fusion method
CN106781467B (en) A kind of bus passenger based on collaborative filtering is swiped the card site information extracting method
CN116227791B (en) Visual analysis method for exploring dynamic division of urban functional areas based on semantic fusion model
Zhou et al. Big data for intrametropolitan human movement studies A case study of bus commuters based on smart card data
Qin et al. Travel trajectories analysis based on call detail record data
CN110399919A (en) A kind of sparse track data interpolation reconstruction method of mankind's trip

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160629
