CN107657006B - Public bicycle IC card and subway IC card matching method based on time-space characteristics - Google Patents

Info

Publication number
CN107657006B
Authority
CN
China
Prior art keywords
card
station
subway
public bicycle
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710865835.7A
Other languages
Chinese (zh)
Other versions
CN107657006A (en)
Inventor
季彦婕
马新卫
金雨川
杨名远
刘攀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Application filed by Southeast University
Priority to CN201710865835.7A
Publication of CN107657006A
Application granted
Publication of CN107657006B
Active (current legal status)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Devices For Checking Fares Or Tickets At Control Points (AREA)

Abstract

The invention discloses a method for matching public bicycle IC cards with subway IC cards based on space-time characteristics, comprising the following steps: 1) acquiring the raw data of the public bicycle IC cards and subway IC cards, and extracting the valid data information from it; 2) selecting the public bicycle stations adjacent to each subway station to form subway-public bicycle station pairs; 3) preprocessing the public bicycle IC card data and the subway IC card data according to the station pairs; 4) associating, pair by pair, the card numbers observed under the different transfer modes according to the transfer time interval, constructing a card number pair database, and eliminating erroneous data from the database; 5) sorting the remaining card number pairs in ascending order and selecting the card number pairs that meet the specified characteristics to complete the matching. The invention matches public bicycle IC cards with subway IC cards accurately and efficiently, and lays a solid foundation for research on public bicycles as a connection to the subway.

Description

Public bicycle IC card and subway IC card matching method based on time-space characteristics
Technical Field
The invention relates to a traffic data fusion method, in particular to a matching method of a public bicycle IC card and a subway IC card.
Background
The subway is a public transportation mode with high capacity and high efficiency, and it plays an important role in many large cities. However, subway construction requires a large amount of capital and subway stations cannot cover every travel demand point in a city, so the subway alone cannot solve the "last kilometer" problem of a trip. Transferring between the subway and public bicycles uses the door-to-door nature of the public bicycle to widen the range from which travelers can reach a subway station or their destination, uses the subway to avoid interference from ground traffic, and greatly improves the reliability of individual travel time.
However, current research on transfers between the subway and public bicycles relies on a narrow set of methods, chiefly questionnaire surveys. Such surveys consume a large amount of manpower and material resources and cannot provide samples with high precision and a long time span. Data mining of IC cards has likewise been limited to a single data source, and few studies combine the two sets of swipe data for quantitative analysis. How to fuse public bicycle IC card data with subway IC card data, that is, how to match the public bicycle IC card and the subway IC card used by the same traveler, has become the main bottleneck in studying subway-public bicycle transfer behavior.
Disclosure of Invention
The purpose of the invention is as follows: the invention provides a public bicycle IC card and subway IC card matching method based on space-time characteristics.
The technical scheme is as follows: a public bicycle IC card and subway IC card matching method based on space-time characteristics comprises the following steps:
(1) acquiring original data of a public bicycle IC card and a subway IC card, and extracting effective data information from the original data;
(2) selecting public bicycle stations next to the subway station to form a subway-public bicycle station pair;
(3) preprocessing data of the public bicycle IC card and the subway IC card according to the subway-public bicycle station pair;
(4) associating, pair by pair, the card numbers observed under the different transfer modes according to the transfer time interval, constructing a card number pair database, and eliminating erroneous data from the database;
(5) sorting the remaining card number pairs in ascending order, and selecting the card number pairs that meet the specified characteristics to complete the matching.
The valid information of the public bicycle IC card in step (1) includes: swipe date, card number, borrowing time, borrowing station number, borrowing station longitude and latitude, returning time, returning station number, returning station longitude and latitude, and user age. The valid information of the subway IC card includes: swipe date, card number, entry time, entry station number, entry station longitude and latitude, exit time, exit station number, exit station longitude and latitude, and card type.
In step (2), the public bicycle stations within a certain range of the subway station entrance are selected to form a subway-public bicycle station pair; if one subway station corresponds to several public bicycle stations, the IC card data of those public bicycle stations are treated as a whole.
The preprocessing of the data in step (3) comprises the following steps: (3.1) screening out interference data, which includes swipe records with missing items, swipe records with logical errors, public bicycle swipe records with a usage time of less than 2 minutes, and subway swipe records with an interval between entry and exit of less than 5 minutes; (3.2) eliminating data unrelated to the station pairs: according to the selected subway-public bicycle station pairs, public bicycle records whose borrowing station and returning station both fall outside the station pairs, and subway records whose entry station and exit station both fall outside the station pairs, are removed; and (3.3) storing the data by class: the public bicycle IC card data are stored by date, by station and by borrowing or returning type, and the subway IC card data are stored by date, by station and by entry or exit type.
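The screening rules of step (3.1) can be illustrated with a minimal Python sketch. The 2-minute and 5-minute thresholds come from the text; the function names, the argument layout and the `%Y-%m-%d %H:%M:%S` timestamp format are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

MIN_BIKE_USE = timedelta(minutes=2)     # threshold stated in step (3.1)
MIN_SUBWAY_TRIP = timedelta(minutes=5)  # threshold stated in step (3.1)


def parse(date, clock):
    """Combine a swipe date and a clock time; return None if missing or malformed."""
    if not date or not clock:
        return None
    try:
        # Assumed timestamp layout, e.g. "2016-3-9 18:56:36".
        return datetime.strptime(f"{date} {clock}", "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None


def is_valid_trip(date, start_clock, end_clock, min_duration):
    """Return True if the record survives the interference screening."""
    start, end = parse(date, start_clock), parse(date, end_clock)
    if start is None or end is None:
        return False  # record with a missing item
    if end < start:
        return False  # logical error: the trip ends before it starts
    return end - start >= min_duration  # too-short trips are discarded
```

A public bicycle record would be checked with `is_valid_trip(date, borrow_time, return_time, MIN_BIKE_USE)` and a subway record with `is_valid_trip(date, entry_time, exit_time, MIN_SUBWAY_TRIP)`.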
The transfer modes in step (4) are of two kinds. In exit-and-borrow, the passenger leaves the subway station, rents a public bicycle at a public bicycle station adjacent to the subway station, and rides away. In return-and-enter, the passenger rides a public bicycle to a public bicycle station adjacent to the subway station, returns the bicycle, and enters the subway station to take the subway.
The card number pair database is constructed as follows:
(4.1) judging the association: for each station pair, reading the borrowing time of each public bicycle IC card record and calculating the interval between that borrowing time and the exit time of every subway IC card within a certain period; if the calculated interval is within the specified transfer time interval, the two cards are considered successfully associated. Likewise, reading the returning time of each public bicycle IC card record and calculating the interval between that returning time and the entry time of every subway IC card within a certain period; if the calculated interval is within the specified transfer time interval, the two cards are considered successfully associated;
(4.2) composing card number pairs: the character string formed by a successfully associated public bicycle IC card number and subway IC card number is used as a card number pair;
(4.3) constructing the database: the public bicycle IC cards of all station pairs are processed according to steps (4.1) to (4.2) to construct the card number pair database.
Eliminating erroneous data from the card number pair database includes deleting the card number pairs that show a contradiction between age and card type, a contradiction in time, or a contradiction in space.
Selecting the card number pairs that meet the specified characteristics in step (5) specifically comprises: counting the occurrences of each card number pair in the ascending-sorted data and, for each public bicycle IC card, selecting its highest-frequency card number pair, provided that this highest frequency is reached with only one subway IC card. If the selected card number pair consists of two identical card numbers, the match is correct; otherwise the match is incorrect.
Beneficial effects: at present, research on subway-public bicycle transfers mainly uses questionnaires or IC card data as its data source. Questionnaires consume a large amount of manpower and material resources and cannot provide samples with high precision and a long time span. A few scholars start from IC card data, but most are limited to mining the data of the subway or the public bicycle alone, and few combine the two sets of swipe data for quantitative analysis. Based on IC card data, the invention divides transfer behavior into two modes, exit-and-borrow and return-and-enter, and for the first time matches public bicycle non-one-card IC cards with subway IC cards according to the space-time characteristics of subway-public bicycle transfer behavior. The IC cards matched by this method provide a large sample with high accuracy, require no large-scale questionnaire survey, and lay a solid foundation for research on subway-public bicycle transfers.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of subway data classification according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of public bicycle data classification according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the classification of public bicycle one-card data and non-one-card data according to an embodiment of the present invention;
FIG. 5 is a flow chart of the process for deleting contradictory card number pairs according to an embodiment of the present invention;
FIG. 6 is a flow chart of the processing of database M according to an embodiment of the present invention;
FIG. 7 is a flow chart of the processing of database N according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the drawings. In this embodiment, the IC card data are provided by the Nanjing public bicycle company and the Nanjing subway company. The method is described with reference to FIG. 1, taking the IC card swipe data from March 9, 2016 to March 29, 2016 as an example.
1. Extracting valid data information
In the raw data, a complete public bicycle swipe record contains 16 fields: swipe date, card number, public bicycle number, borrowing time, borrowing station number, borrowing station name, borrowing station longitude, borrowing station latitude, borrowing dock number, returning time, returning station number, returning station name, returning station longitude, returning station latitude, returning dock number, and user age. The borrowing and returning station numbers correspond one-to-one with the station names. The valid IC card data information required by the invention is extracted; its structure is shown in Table 1:
Table 1 Public bicycle IC card valid information structure
Swipe date | Card number | Age | Borrowing time | Borrowing station number | Returning time | Returning station number
2016-3-9 | 990196668813 | 31 | 18:56:36 | 12181 | 19:03:05 | 12181
2016-3-9 | NJHX00168935 | 47 | 15:47:03 | 15003 | 16:06:45 | 15073
2016-3-9 | NJHX00168935 | 47 | 14:33:32 | 12004 | 15:16:09 | 15003
2016-3-9 | NJHX00168935 | 47 | 8:39:37 | 12149 | 8:39:37 | 12004
Table 1 (continued) Public bicycle IC card valid information structure
Figure BDA0001416080050000041
The card number column in Table 1 divides public bicycle IC cards into one-card and non-one-card types. A public bicycle one-card IC card can be used both to take the subway and to borrow a public bicycle, whereas a public bicycle non-one-card IC card can only be used to rent a public bicycle. A card number starting with the digit 9 is a one-card IC card, and a card number starting with the letter N is a non-one-card IC card. Since the data structures of the two types are identical, the following description takes only the matching of the public bicycle one-card IC card with the subway IC card as an example.
In the raw data, a complete subway swipe record contains 11 fields: swipe date, IC card number, card type, exit station number, exit station longitude, exit station latitude, entry station number, entry station longitude, entry station latitude, exit time, and entry time. The card types are: 101 (anonymous card 1), 102 (anonymous card 2), 002 (student card: available to students under 18, half price), 003 (senior citizen card: available to seniors over 70, free rides), 030 (senior citizen card: available to seniors aged 60 to 70, half price), and 012 (disabled person card). The valid IC card data information required by the invention is extracted; its structure is shown in Table 2:
Table 2 Subway IC card valid information structure
Figure BDA0001416080050000042
Table 2 Subway IC card valid information structure
Figure BDA0001416080050000043
Figure BDA0001416080050000051
2. Form a site pair
The straight-line distance between each public bicycle station and each subway station is calculated from their longitude and latitude. The public bicycle stations within a certain range of a subway station entrance are selected, and each forms a station pair with the corresponding subway station. If one subway station corresponds to several public bicycle stations, the IC card data of those public bicycle stations are treated as a whole. In this embodiment, 53 subway stations on Lines 1, 2, 3 and 10 of the Nanjing subway are selected, and the public bicycle stations within 300 meters of their entrances are chosen, 92 public bicycle stations in total. The public bicycle stations corresponding to the same subway station are grouped together, finally giving 53 station pairs.
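The station-pair selection can be sketched as follows. The text only says that a straight-line distance is computed from longitude and latitude; the haversine great-circle formula is swapped in here as one common approximation, and the Earth radius, function names and dictionary layout are assumptions. The 300-meter radius is the value used in the embodiment.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def build_station_pairs(subway_stations, bike_stations, radius_m=300):
    """Map each subway station id to the bicycle station ids within radius_m.

    Both arguments are dicts of the hypothetical form {station_id: (lat, lon)}.
    Subway stations with no nearby bicycle station are left out.
    """
    pairs = {}
    for sid, (slat, slon) in subway_stations.items():
        nearby = [bid for bid, (blat, blon) in bike_stations.items()
                  if haversine_m(slat, slon, blat, blon) <= radius_m]
        if nearby:
            pairs[sid] = nearby  # several bicycle stations are treated as a whole
    return pairs
```

Grouping all nearby bicycle stations under one subway station key mirrors the rule that several public bicycle stations belonging to the same subway station form a single station pair.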
3. Data pre-processing
Preprocessing the valid data of the public bicycle IC cards and subway IC cards removes useless data, eliminates interference, and improves the efficiency of data mining and the accuracy of recognition. The preprocessing comprises the following steps:
1) Swipe records with missing items, such as records lacking a swipe time, are screened out. Swipe records with logical errors, such as a returning time earlier than the borrowing time or an exit time earlier than the entry time, are screened out. Public bicycle swipe records with a usage time of less than 2 minutes are screened out, since no riding is considered to have taken place. Subway swipe records with an interval between entry and exit of less than 5 minutes are screened out (the shortest travel time between adjacent subway stations is 3 minutes, plus 1 minute each for entering and exiting), since no subway trip is considered to have taken place.
2) The subway card data are stored by date. According to the selected subway-public bicycle station pairs, the subway IC card data of each day are divided into a station-pair-related database Ai and a station-pair-unrelated database Bi, as shown in FIG. 2, where i denotes the date; for example, i = 9 denotes the swipe data of March 9. A record is placed in database Ai if at least one of its entry and exit stations belongs to the 53 selected subway stations; in database Bi, neither the entry station nor the exit station belongs to the 53 selected subway stations. The data structures of databases Ai and Bi are shown in Tables 3 and 4, respectively.
Table 3 Data structure of database Ai
Swipe date | Exit time | Card number | Card type | Exit station | Entry time | Entry station
2016-3-9 | 17:44:49 | 990776073544 | 101 | 1 | 17:22:03 | 22
2016-3-9 | 17:50:38 | 990500124234 | 101 | 1 | 17:18:12 | 10
2016-3-9 | 17:50:48 | 992174726881 | 2 | 1 | 17:12:35 | 12
Note: the subway stations numbered 1, 22, 10 and 12 belong to the 53 selected subway stations
Table 4 Data structure of database Bi
Swipe date | Exit time | Card number | Card type | Exit station | Entry time | Entry station
2016-3-9 | 17:49:26 | 970674366062 | 101 | 84 | 16:55:34 | 77
2016-3-9 | 17:49:48 | 970472643805 | 101 | 86 | 17:06:30 | 62
2016-3-9 | 17:49:33 | 996295168418 | 1 | 84 | 17:26:48 | 92
Note: none of the subway stations numbered 84, 77, 86, 62, 84, 92 belongs to the 53 selected subway stations
The public bicycle data are likewise stored by date. As shown in FIG. 3, the public bicycle data of each day are divided into a station-pair-related database Ci and a station-pair-unrelated database Di, where i denotes the date; for example, i = 9 denotes the swipe data of March 9. A record is placed in database Ci if at least one of its borrowing and returning stations belongs to the 92 selected public bicycle stations; in database Di, neither the borrowing station nor the returning station belongs to the 92 selected public bicycle stations. The data structures of databases Ci and Di are shown in Tables 5 and 6, respectively.
Table 5 Data structure of database Ci
Swipe date | Card number | Age | Borrowing time | Borrowing station number | Returning time | Returning station number
2016-3-9 | 996060557824 | 61 | 18:09:21 | 12017 | 18:14:55 | 12014
2016-3-9 | 970002313423 | 32 | 14:58:48 | 12017 | 15:40:47 | 12036
2016-3-9 | 990005468324 | 42 | 8:58:05 | 12017 | 9:06:09 | 12083
Note: the public bicycle station numbered 12017 belongs to the 92 selected public bicycle stations
Table 6 Data structure of database Di
Swipe date | Card number | Age | Borrowing time | Borrowing station number | Returning time | Returning station number
2016-3-9 | 996060553855 | 58 | 8:14:53 | 32065 | 8:24:22 | 32053
2016-3-9 | 990261887945 | 44 | 8:24:29 | 19081 | 8:41:33 | 20062
2016-3-9 | 990163883943 | 27 | 7:43:40 | 19072 | 7:46:09 | 19035
Note: the public bicycle stations numbered 32065, 32053, 19081, 20062, 19072, 19035 do not belong to the 92 selected public bicycle stations
3) The station-pair-related database Ai in the subway data is further divided by station and by entry or exit type into databases E(i, j, k), where i denotes the date, j denotes the subway station number, and k denotes the entry or exit type: k = 1 for entry and k = 2 for exit. Table 7 shows part of the swipe records in E(9, 1, 1), that is, the entry records at subway station No. 1 on March 9.
Table 7 Subway IC card entry swipe data at station No. 1 on March 9
Figure BDA0001416080050000061
Note: the subway station with the number of 1 is one of the picked 53 subway stations
Table 8 shows part of the swipe records in database E(9, 1, 2), that is, the exit records at subway station No. 1 on March 9.
Table 8 Subway IC card exit swipe data at station No. 1 on March 9
Figure BDA0001416080050000071
Note: the subway station with the number of 1 is one of the picked 53 subway stations
In the station-pair-related public bicycle database Ci, all public bicycle one-card IC card data are placed in database Fi and all public bicycle non-one-card IC card data are placed in database Gi, as shown in FIG. 4.
The one-card data Fi are further divided by station and by borrowing or returning type into databases H(i, j', k'), where i denotes the date, j' denotes the public bicycle station number, and k' denotes the borrowing or returning type: k' = 1 for borrowing and k' = 2 for returning. Table 9 shows part of the swipe records in H(9, 12017, 1), that is, the borrowing records at public bicycle station No. 12017 on March 9.
Table 9 Borrowing data at public bicycle station No. 12017 on March 9
Swipe date | Card number | Age | Borrowing time | Borrowing station number | Returning time | Returning station number
2016-3-9 | 997169396161 | 37 | 14:58:13 | 12017 | 15:38:50 | 12028
2016-3-9 | 990772429246 | 45 | 18:26:27 | 12017 | 18:42:19 | 12033
2016-3-9 | 990172167495 | 44 | 7:33:45 | 12017 | 7:42:57 | 12019
Note: public bicycle station No. 12017 is one of the 92 selected public bicycle stations
Table 10 shows part of the swipe records in database H(9, 12017, 2), that is, the returning records at public bicycle station No. 12017 on March 9.
Table 10 Returning data at public bicycle station No. 12017 on March 9
Swipe date | Card number | Age | Borrowing time | Borrowing station number | Returning time | Returning station number
2016-3-9 | 976072923219 | 48 | 7:45:31 | 12062 | 8:06:41 | 12017
2016-3-9 | 990163796422 | 52 | 7:33:00 | 12062 | 8:04:07 | 12017
2016-3-9 | 990774898034 | 28 | 7:46:52 | 12016 | 8:04:26 | 12017
Note: public bicycle station No. 12017 is one of the 92 selected public bicycle stations
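The classified storage described in this section can be sketched with a dictionary keyed by (date, station, type), which plays the role of the databases E(i, j, k). The record field names (`date`, `entry_station`, `exit_station`) are hypothetical; the same pattern would apply to the public bicycle databases H(i, j', k') with borrowing and returning stations.

```python
from collections import defaultdict

ENTRY, EXIT = 1, 2  # k = 1 for entry, k = 2 for exit, as defined in the text


def partition_subway(records, selected_stations):
    """Group station-pair-related subway records by (date, station, k).

    A record whose entry and exit stations are both outside selected_stations
    (the content of database Bi) is not stored under any key.
    """
    E = defaultdict(list)
    for rec in records:
        if rec["entry_station"] in selected_stations:
            E[(rec["date"], rec["entry_station"], ENTRY)].append(rec)
        if rec["exit_station"] in selected_stations:
            E[(rec["date"], rec["exit_station"], EXIT)].append(rec)
    return E
```

With the rows of Table 3 as input, the key `("2016-3-9", 1, EXIT)` would collect the three exit records at station No. 1, which is the content the text assigns to E(9, 1, 2).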
4. Building a database of card number pairs
There are two transfer modes. In exit-and-borrow, the passenger leaves the subway station, rents a public bicycle at a public bicycle station adjacent to the subway station, and rides away. In return-and-enter, the passenger rides a public bicycle to a public bicycle station adjacent to the subway station, returns the bicycle, and enters the subway station to take the subway. According to the correspondence between the subway station and the public bicycle stations in each station pair, the subway exit data are associated with the public bicycle borrowing data, and the public bicycle returning data with the subway entry data, for the same station pair on the same day. The swipe data of every day are processed in this way. The specific association method is as follows. In this embodiment, 10 minutes is chosen as the transfer time interval for both return-and-enter and exit-and-borrow. For each station pair, the borrowing time of each public bicycle IC card record is read and the interval between it and each subway IC card exit time on the same day is calculated; if the interval is within 10 minutes, the two cards are considered successfully associated, and the character string formed by the public bicycle card number and the subway card number is called a card number pair. The public bicycle returning data are associated with the subway entry data in the same way. The public bicycle one-card IC cards of all station pairs are processed as above to construct the card number pair database I, which contains all card number pairs from March 9 to March 29, a total of 2143229 card number pair records.
The data structure of the card number pair database I is shown in Table 11 below.
Table 11 Card number pair data structure
Figure BDA0001416080050000081
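The association rule can be sketched as a nested comparison of swipe times within one station pair and one day. The 10-minute window is the value of the embodiment. Two points are assumptions: the sketch requires the second swipe of a transfer to come no earlier than the first (borrowing after exiting, entering after returning), which the text does not state explicitly, and it represents a card number pair as a tuple rather than a concatenated string. The function and mode names are hypothetical.

```python
from datetime import datetime, timedelta

TRANSFER_WINDOW = timedelta(minutes=10)  # transfer time interval of the embodiment


def associate(subway_events, bike_events, mode, window=TRANSFER_WINDOW):
    """Return (bike_card, subway_card) pairs for one station pair on one day.

    subway_events and bike_events are lists of (card_number, datetime).
    mode is "exit_borrow" (subway exit followed by bicycle borrowing) or
    "return_entry" (bicycle returning followed by subway entry).
    """
    pairs = []
    for bike_card, bike_time in bike_events:
        for subway_card, subway_time in subway_events:
            if mode == "exit_borrow":
                gap = bike_time - subway_time
            else:
                gap = subway_time - bike_time
            # Assumption: the transfer runs forward in time and lasts at most `window`.
            if timedelta(0) <= gap <= window:
                pairs.append((bike_card, subway_card))
    return pairs
```

Every bicycle swipe is compared with every subway swipe, so one bicycle card can be paired with many unrelated subway cards that passed through the station in the same window. This is why the database I is large and why the later elimination and frequency steps are needed.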
5. Rejecting erroneous data in a database of card number pairs
Card number pairs that show a contradiction between age and card type, a contradiction in time, or a contradiction in space are deleted from the card number pair database; the process is shown in FIG. 5.
1) Contradiction between age and card type
The invention assumes that an IC card is used only by the person who applied for it. Therefore, when the age and the card type of the two IC cards in a card number pair contradict each other, the card number pair record is screened out; 240828 card number pairs are deleted in total. In the example of Table 12, the subway card type is 2, a card that can only be issued to users aged 12 to 17, while the age of the public bicycle IC card holder is 24, so the card number pair record is deleted.
Table 12 Associated records with a card type and age contradiction
Figure BDA0001416080050000091
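A minimal sketch of the age and card type check follows. The card type codes are those listed earlier in the description; the exact age bounds (in particular whether the limits are inclusive, and the 60 to 70 range for type 030) are assumptions, as is the treatment of anonymous and disabled-person cards as carrying no age constraint.

```python
# Age bounds per subway card type; None means no upper bound.
AGE_RANGES = {
    "002": (12, 17),    # student card; 12-17 follows the Table 12 example
    "003": (70, None),  # senior card, free rides; inclusive bound is assumed
    "030": (60, 70),    # senior card, half price; bounds are assumed
}


def is_age_consistent(card_type, age):
    """Return True if the bicycle user's age is compatible with the subway card type."""
    code = str(card_type).zfill(3)  # the tables show "2" for type 002
    bounds = AGE_RANGES.get(code)
    if bounds is None:
        return True  # e.g. anonymous cards 101 and 102 carry no age constraint here
    low, high = bounds
    return age >= low and (high is None or age <= high)
```

For the example of Table 12, `is_age_consistent(2, 24)` is false, so the card number pair would be deleted.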
2) Time contradiction
From the public bicycle card number in a card number pair, all swipe records of that public bicycle IC card can be found, and all the periods during which the user was using a public bicycle can be derived from the swipe times. Similarly, all the periods during which the subway user was in the subway can be derived from the swipe times in the subway IC card records. If any public bicycle usage period intersects any subway usage period, the card number pair shows a time contradiction and the two cards do not belong to the same person; 609678 card number pairs are deleted in total. In the example of Table 13, the first card number pair shows that the subway card holder was in the subway from 15:12:34 to 15:36:03, while the public bicycle IC card shows that its holder was using a public bicycle from 15:16:22 to 15:36:03. The holders of the two cards are therefore not the same person, and the card number pair record is deleted.
Table 13 Associated records with a time contradiction
Figure BDA0001416080050000092
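The time contradiction test reduces to an interval intersection check, sketched below. Treating two periods that merely touch at an endpoint as non-overlapping is an assumption (a transfer can start at the instant the previous trip ends); the function names and the date used in the usage example are hypothetical.

```python
from datetime import datetime


def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if the open intervals (a_start, a_end) and (b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def has_time_conflict(bike_trips, subway_trips):
    """True if any bicycle usage period intersects any subway usage period.

    Each trip is a (start, end) tuple of datetime objects.
    """
    return any(
        intervals_overlap(b_start, b_end, s_start, s_end)
        for b_start, b_end in bike_trips
        for s_start, s_end in subway_trips
    )
```

For the first pair of Table 13, a subway trip from 15:12:34 to 15:36:03 and a bicycle trip from 15:16:22 to 15:36:03 on the same day intersect, so `has_time_conflict` reports a contradiction.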
3) Spatial contradiction
The travel records of the bicycle card and the subway card in a card number pair on the same day are found, and each bicycle record is compared with each subway record. The time sequence of the two records determines whether they form an exit-and-borrow or a return-and-enter transfer. For exit-and-borrow, the distance between the subway exit station and the bicycle borrowing station is calculated from the station coordinates, together with the time difference between exiting and borrowing. For return-and-enter, the distance between the bicycle returning station and the subway entry station is calculated from the station coordinates, together with the time difference between returning and entering. The maximum speed of ground travel in Nanjing is taken as 40 km/h. If the maximum reachable distance, obtained by multiplying this speed by the time difference, is smaller than the distance between the public bicycle station and the subway station calculated from the station coordinates, a space contradiction occurs and the two cards do not belong to the same person; 126210 card number pairs are deleted in total. In the example of Table 14, the first card number pair shows a transfer between the subway and a public bicycle in which the actual distance between the two stations is 5977 meters. Multiplying the maximum ground travel speed in Nanjing, 40 km/h, by the transfer time of 51 seconds gives the maximum distance the cardholder could cover within the transfer time, about 567 meters. Since the actual distance is greater than this calculated distance, a space contradiction occurs, the holders of the two cards are not the same person, and the card number pair record is deleted.
Table 14 Associated records with a space contradiction
Figure BDA0001416080050000101
The database obtained after the above steps is denoted L.
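The spatial-contradiction test from step 3) can be sketched as follows. The 40 km/h limit and the example values (5977 meters, 51 seconds) come from the text. The use of the haversine great-circle formula to turn station coordinates into a distance is an assumption, since the text does not say how the distance is computed, and all function names are hypothetical.

```python
import math

MAX_GROUND_SPEED_KMH = 40.0  # maximum ground travel speed stated in the text
EARTH_RADIUS_M = 6371000.0   # mean Earth radius, an assumption of this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def max_reachable_m(seconds, speed_kmh=MAX_GROUND_SPEED_KMH):
    """Farthest distance in meters reachable in `seconds` at `speed_kmh`."""
    return speed_kmh * 1000.0 / 3600.0 * seconds


def has_space_conflict(station_distance_m, transfer_seconds):
    """True if the stations are farther apart than the holder could travel."""
    return max_reachable_m(transfer_seconds) < station_distance_m


# First card number pair of Table 14: 5977 m apart, 51 s between the swipes.
example_conflict = has_space_conflict(5977.0, 51.0)
```

At 40 km/h a cardholder covers roughly 566.7 meters in 51 seconds, far less than 5977 meters, so the example pair is flagged.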
6. Selecting the card number pair meeting the specified characteristics to complete matching
A new column is added to Table 14 to count the number of occurrences of each card number pair, and the records are then sorted in ascending order of card number pair; the results are shown in Table 15. The database obtained after this step is denoted M.
Table 15 Number of occurrences of each card number pair (new column)
Figure BDA0001416080050000102
In database M, if one public bicycle card forms card number pairs with several subway cards, the pair with the highest frequency is selected and placed in database N, and the lower-frequency pairs in this one-to-many situation are deleted; 939060 card number pair records are deleted in total. The deleted pairs include 21 correct card number pairs, mainly because those users use public bicycles only occasionally, while their bicycle cards also happen to form card number pairs with other subway cards. The number of correct card number pairs deleted is negligible. The processing flow for database M is shown in Fig. 6.
Next, as shown in Fig. 7, the type of each highest-frequency card number pair in database N is determined. Two situations occur: either the public bicycle card forms the highest-frequency pair with a single subway card, or the public bicycle card forms pairs with several subway cards that tie for the highest frequency. The former is called a one-to-one card number pair. One-to-one card number pairs are placed in the highest-frequency one-to-one card number pair database O, and the remaining data are deleted. In database O, if a card number pair consists of two identical card numbers, the pair is matched correctly; otherwise it is matched incorrectly. This completes the matching of public bicycle one-card IC cards with subway IC cards.
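The selection logic of step 6 (count occurrences, keep the highest-frequency pair per public bicycle card, and retain only pairs whose highest frequency is held by a single subway card) can be sketched as below. This is an illustration under assumptions: the function name and the in-memory list of tuples are hypothetical, and databases M, N, and O are collapsed into a single pass.

```python
from collections import Counter, defaultdict


def select_one_to_one(pairs):
    """Return {bike_card: (subway_card, count)} for one-to-one top pairs.

    `pairs` holds one (bike_card, subway_card) tuple per successful
    association, so repeated tuples represent repeated co-occurrences.
    """
    counts = Counter(pairs)  # occurrences of each card number pair

    # Group the candidate subway cards under each public bicycle card.
    candidates = defaultdict(list)
    for (bike, subway), n in counts.items():
        candidates[bike].append((subway, n))

    selected = {}
    for bike, subway_counts in candidates.items():
        top = max(n for _, n in subway_counts)
        winners = [subway for subway, n in subway_counts if n == top]
        # A tie for the highest frequency is not one-to-one and is dropped.
        if len(winners) == 1:
            selected[bike] = (winners[0], top)
    return selected


# Hypothetical card numbers, for illustration only.
example_pairs = (
    [("B1", "S1")] * 3 + [("B1", "S2")] + [("B2", "S3"), ("B2", "S4")]
)
example_selected = select_one_to_one(example_pairs)
```

In this example, card B1 is kept with S1 because that pair occurs most often, while B2 is dropped because S3 and S4 tie for the highest frequency.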
7. Calculating matching accuracy and applying matching method
The matching accuracy is the number of consistent card number pairs divided by the total number of card number pairs in the database. The matching accuracy in database O is calculated to be 86.95%, which shows that the method can match public bicycle IC cards with subway IC cards accurately. The matching method developed on public bicycle one-card IC cards is then applied to matching public bicycle non-one-card IC cards with subway IC cards, which matches 16440 correct card number pairs involving 5217 IC cards. The matching results show that the method matches public bicycle IC card swipe data with subway IC card swipe data accurately and efficiently, and that it is suitable for wide application.
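The accuracy calculation can be illustrated with a short sketch. It assumes, as described for database O, that a pair is consistent when its public bicycle card number and subway card number are identical; the function name and the sample card numbers are hypothetical.

```python
def matching_accuracy(pairs):
    """Share of (bike_card, subway_card) pairs whose two numbers are equal."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("the card number pair database is empty")
    consistent = sum(1 for bike, subway in pairs if bike == subway)
    return consistent / len(pairs)


# Hypothetical database with three consistent pairs out of four.
sample_accuracy = matching_accuracy(
    [("A", "A"), ("B", "B"), ("C", "D"), ("E", "E")]
)
```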

Claims (7)

1. A public bicycle IC card and subway IC card matching method based on space-time characteristics is characterized by comprising the following steps:
(1) acquiring original data of a public bicycle IC card and a subway IC card, and extracting effective data information from the original data;
(2) selecting public bicycle stations next to the subway station to form a subway-public bicycle station pair;
(3) preprocessing data of the public bicycle IC card and the subway IC card according to the subway-public bicycle station pair;
(4) associating card number pairs under different transfer modes according to the transfer time interval, constructing a card number pair database, and eliminating erroneous data from the database, wherein there are two transfer modes: exiting the station and borrowing a bicycle, in which a passenger leaves the subway station, rents a public bicycle at a public bicycle station next to the subway station, and rides away; and returning a bicycle and entering the station, in which a passenger rides a public bicycle to a public bicycle station next to the subway station, returns the bicycle, and enters the subway station to take the subway; the process of constructing the card number pair database is as follows:
(4.1) judging the association: based on the station pairs, reading the borrowing time of each public bicycle IC card record and calculating the time interval between that borrowing time and the exit times of all subway IC cards within a certain period, and if the calculated time interval is within the specified transfer time interval, considering the association of the two cards successful; reading the returning time of each public bicycle IC card record and calculating the time interval between that returning time and the entry times of all subway IC cards within a certain period, and if the calculated time interval is within the specified transfer time interval, considering the association of the two cards successful;
(4.2) composing card number pair: using a character string formed by successfully associated public bicycle IC card numbers and subway IC card numbers as a card number pair;
(4.3) constructing a database: processing the public bicycle IC cards of all the station pairs in the steps (4.1) - (4.2) to construct a card number pair database;
(5) arranging the remaining card number pairs in ascending order, and picking out the card number pairs that meet the specified characteristics to complete the matching.
2. The space-time characteristic-based public bicycle IC card and subway IC card matching method as claimed in claim 1, wherein said public bicycle IC card valid information in step (1) includes: the card swiping date, the card number, the borrowing time, the borrowing station number, the borrowing station longitude and latitude, the returning time, the returning station number, the returning station longitude and latitude, and the age of the user; the subway IC card valid information includes: the card swiping date, the card number, the entry time, the entry station number, the entry station longitude and latitude, the exit time, the exit station number, the exit station longitude and latitude, and the card type.
3. The space-time characteristic-based public bicycle IC card and subway IC card matching method according to claim 1, wherein in said step (2), public bicycle stations within a certain range from the subway station port are selected to form a subway-public bicycle station pair; and if one subway station corresponds to a plurality of public bicycle stations, taking the IC card data of the public bicycle stations as a whole.
4. The space-time characteristic-based public bicycle IC card and subway IC card matching method as claimed in claim 1, wherein said preprocessing of data in step (3) comprises:
(3.1) screening out interference data, wherein the interference data comprise swipe records with incomplete or missing fields, swipe records with logical errors, public bicycle swipe records with a usage time of less than 2 minutes, and subway swipe records with an interval between entry and exit of less than 5 minutes;
(3.2) eliminating station-irrelevant data: according to the selected subway-public bicycle station pairs, eliminating data for public bicycle stations whose borrowing station and returning station do not belong to any station pair, and for subway stations whose entry station and exit station do not belong to any station pair;
(3.3) storing the data by category: storing the public bicycle IC card data by date, by station, and by borrowing or returning type, and storing the subway IC card data by date, by station, and by entry or exit type.
5. The space-time characteristic-based public bicycle IC card and subway IC card matching method according to claim 1, wherein in said step (4), eliminating erroneous data in the card number pair database comprises deleting card number pairs that show a contradiction between age and card type information, a time contradiction, or a spatial contradiction.
6. The space-time characteristic-based public bicycle IC card and subway IC card matching method as claimed in claim 1, wherein said picking out card number pairs meeting specified characteristics in step (5) comprises: counting the occurrence times of the same card number pair based on the card number pair data after ascending arrangement, and selecting the card number pair which has the highest frequency corresponding to the public bicycle IC card and is formed by only one subway IC card, wherein if the card number pair consists of two completely same card numbers, the card number pair is matched correctly, and otherwise, the card number pair is matched incorrectly.
7. The space-time characteristic-based public bicycle IC card and subway IC card matching method according to claim 6, further comprising calculating the matching accuracy, wherein the matching accuracy is the number of consistent card number pairs divided by the total number of card number pairs in the card number pair database.
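The association judgment of step (4.1) in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the function names are hypothetical, records are assumed to be already restricted to one station pair and one day, and the transfer time interval is passed in as a parameter because its value is not given in this section.

```python
from datetime import datetime, timedelta


def within_window(earlier, later, min_gap, max_gap):
    """True if `later` follows `earlier` by a gap inside [min_gap, max_gap]."""
    return min_gap <= (later - earlier) <= max_gap


def associate_exit_borrow(subway_exits, bike_borrows, min_gap, max_gap):
    """Pair cards where a bicycle is borrowed shortly after a subway exit."""
    return [
        (bike_card, subway_card)
        for subway_card, exit_time in subway_exits
        for bike_card, borrow_time in bike_borrows
        if within_window(exit_time, borrow_time, min_gap, max_gap)
    ]


def associate_return_enter(bike_returns, subway_entries, min_gap, max_gap):
    """Pair cards where a subway entry shortly follows a bicycle return."""
    return [
        (bike_card, subway_card)
        for bike_card, return_time in bike_returns
        for subway_card, entry_time in subway_entries
        if within_window(return_time, entry_time, min_gap, max_gap)
    ]


def clock(text):
    """Parse an HH:MM:SS time (the date is omitted in this sketch)."""
    return datetime.strptime(text, "%H:%M:%S")


# Hypothetical records and a hypothetical 0 to 10 minute transfer window.
window = (timedelta(0), timedelta(minutes=10))
exits = [("S1", clock("08:00:00")), ("S2", clock("08:30:00"))]
borrows = [("B1", clock("08:03:00"))]
exit_borrow_pairs = associate_exit_borrow(exits, borrows, *window)
```

Each returned tuple is ordered (public bicycle card, subway card), so the results of both transfer modes can be pooled into one card number pair database as in steps (4.2) and (4.3).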
CN201710865835.7A 2017-09-22 2017-09-22 Public bicycle IC card and subway IC card matching method based on time-space characteristics Active CN107657006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710865835.7A CN107657006B (en) 2017-09-22 2017-09-22 Public bicycle IC card and subway IC card matching method based on time-space characteristics

Publications (2)

Publication Number Publication Date
CN107657006A CN107657006A (en) 2018-02-02
CN107657006B true CN107657006B (en) 2020-12-11

Family

ID=61130875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710865835.7A Active CN107657006B (en) 2017-09-22 2017-09-22 Public bicycle IC card and subway IC card matching method based on time-space characteristics

Country Status (1)

Country Link
CN (1) CN107657006B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664553A (en) * 2018-04-03 2018-10-16 东南大学 A kind of subway and public bicycles brushing card data fusion method
CN108681741B (en) * 2018-04-08 2021-11-12 东南大学 Subway commuting crowd information fusion method based on IC card and resident survey data
CN109828991B (en) * 2018-12-03 2021-10-19 深圳市北斗智能科技有限公司 Query ordering method, device, equipment and storage medium under multi-space-time condition
CN115359592B (en) * 2022-10-21 2023-02-14 深圳市城市交通规划设计研究中心股份有限公司 Traffic card number real-name matching method, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007073115A1 (en) * 2005-12-23 2007-06-28 Eb Co., Ltd. Method and apparatus for confirming real-time location using transportation card
CN101789175A (en) * 2010-01-08 2010-07-28 北京工业大学 Public transportation multi-route static coordination and dispatching method
CN103699601A (en) * 2013-12-12 2014-04-02 深圳先进技术研究院 Temporal-spatial data mining-based metro passenger classification method
CN105335795A (en) * 2015-10-23 2016-02-17 东南大学 Metro-bus transfer problem automatic diagnosis method based on IC card data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于IC卡数据的地铁与常规公交换乘时间分析;蒋敏;《中国优秀硕士学位论文全文数据库 工程科技II辑》;20170615;C034-7 *

Also Published As

Publication number Publication date
CN107657006A (en) 2018-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant