CN111651529A - Airport aviation passenger classification identification method based on mobile phone signaling data - Google Patents

Info

Publication number
CN111651529A
Authority
CN
China
Prior art keywords
airport
mobile phone
signaling
province
passengers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010456844.2A
Other languages
Chinese (zh)
Inventor
刘劲松
姚海芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Normal University
Original Assignee
Hebei Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Normal University
Priority to CN202010456844.2A
Publication of CN111651529A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477 Temporal data queries
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/20 Services signaling; Auxiliary data signalling, i.e. transmitting data via a non-traffic channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/18 Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
    • H04W8/20 Transfer of user or subscriber data

Abstract

The invention discloses a method for classifying and identifying airport aviation passengers based on mobile phone signaling data. The method comprises the following steps: first, mobile phone signaling data are extracted from base stations around a target airport to generate an airport mobile phone user list; the same-day signaling of each mobile phone user in the list is then traced back to obtain a comprehensive data set; the comprehensive data set is preprocessed and split into three data sets; the mobile equipment identification codes in the airport mobile phone user list are classified according to the three data sets and preset aviation passenger identification rules; finally, the number of mobile phone users in each class is counted to obtain the passenger-origin distribution characteristics of aviation passengers at the target airport. Using mobile phone signaling data, the method can automatically identify the various classes of aviation passengers at an airport, such as arriving passengers, departing passengers and transit passengers, and compile passenger-origin distribution information for each class, thereby laying a foundation for subsequent study of the travel behavior of each class of aviation passenger and for characterizing the spatio-temporal pattern of the airport hinterland.

Description

Airport aviation passenger classification identification method based on mobile phone signaling data
Technical Field
The invention relates to the field of big data calculation, in particular to an airport aviation passenger classification and identification method based on mobile phone signaling data.
Background
Demand for air transport is growing rapidly, and global airport throughput keeps increasing. In 2018, scheduled flights worldwide carried 4.4 billion passengers, 6.9 percent more than in 2017, an increase of 284 million passengers. From 2013 to 2018, the number of civil aviation passengers in China rose from 354 million to 612 million, an average annual growth rate of 11.56% over the five years. The rapid growth of air passenger transport leads to a rapid growth of airport ground traffic and poses great challenges for ground traffic control around airports.
To keep improving the travel experience of aviation passengers, scholars in China and abroad have paid great attention to research on their travel behavior. Socio-economic attributes of aviation passengers and travel information such as departure time, travel duration and travel frequency have mainly been obtained through telephone interviews, mail surveys, field surveys and the like, and the factors driving the travel behavior of aviation passengers have been examined empirically. However, travel behavior information acquired in this way suffers from small samples, poor timeliness and discontinuous behavior trajectories.
Mobile phone signaling data offer rich spatio-temporal information, high temporal resolution and low acquisition cost, and therefore provide an abundant data source for studying the travel behavior of aviation passengers. The present application studies the travel behavior of airport aviation passengers on the basis of mobile phone signaling data.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for classifying and identifying airport aviation passengers based on mobile phone signaling data. Using mobile phone signaling data, the method can automatically identify the various classes of aviation passengers at an airport, such as arriving passengers, departing passengers and transit passengers, and compile passenger-origin distribution information for each class, thereby laying a foundation for subsequent study of the travel behavior of each class of aviation passenger and for characterizing the spatio-temporal pattern of the airport hinterland.
In order to solve the technical problems, the invention adopts the technical scheme that: an airport aviation passenger classification and identification method based on mobile phone signaling data comprises the following steps:
Step 1, selecting base stations near a target airport and, taking the day as the unit, extracting all mobile phone signaling data captured by each base station between 0:00 and 24:00 on each of at least seven days, then storing the mobile equipment identification codes found in the signaling data without repetition to generate an airport mobile phone user list.
The longer the extraction period, the more data are extracted and the more comprehensive and reliable the information they contain. The extraction period is set to seven days here; in practical applications it can be adjusted freely according to need and the required accuracy.
To minimize redundancy, a mobile equipment identification code that appears several times on the same day is treated as the same user and stored only once.
Step 2, tracing back the mobile phone user corresponding to each mobile equipment identification code in the airport mobile phone user list and extracting all signaling data generated by that user within the province on the same day, wherein the signaling data are in a known format and contain, in addition to the mobile equipment identification code, fields such as the signaling occurrence time, longitude and latitude; each mobile equipment identification code corresponds to a unique mobile phone user, and all signaling data of that user form a user record; the user records of the mobile phone users corresponding to all mobile equipment identification codes in the airport mobile phone user list form a comprehensive data set, that is, the comprehensive data set consists of a group of user records, each containing all the signaling of one mobile phone user in one day;
Step 3, preprocessing the comprehensive data set by deleting repeated signaling data and invalid signaling data from the user records, thereby reducing data redundancy and improving processing efficiency;
Step 4, splitting the comprehensive data set obtained in step 3 into a target airport data set A, an other airport data set B and a peripheral data set O;
the target airport data set A is the data set formed by the user records of mobile phone users at the base stations near the target airport; a mobile phone user with signaling data in this data set is identified as a passenger who has been present at the target airport, and the times at which the user's signaling appears in this data set are the key reference points for determining the passenger's class in the present application;
the other airport data set B is the data set formed by the user records of mobile phone users at the base stations near airports in the province other than the target airport; in the present application, a mobile phone user with signaling data in this data set is identified as an aviation passenger departing from or arriving at another airport in the province;
the peripheral data set O is the data set consisting of all signaling in the comprehensive data set that belongs to neither the target airport data set A nor the other airport data set B; a mobile phone user with signaling data in this data set is identified as a user located in areas of the province other than airports, such as urban, suburban and rural areas.
Step 5, classifying the mobile equipment identification codes in the airport mobile phone user list of step 1 one by one, according to the data sets and preset aviation passenger identification rules, into eight preset classes of passengers;
Step 6, counting the number of mobile phone users in each class of aviation passenger according to the classification result of step 5, and taking these counts as the number of passengers of each class at the selected target airport, thereby obtaining the passenger-origin distribution characteristics of aviation passengers at the target airport.
In step 1, the selected mobile phone signaling base stations are located within 2 kilometers of the target airport, that is, within a circular region centered on the airport with a radius of 2 kilometers; this is one of the ranges used in the present application, and it can be adjusted according to actual needs.
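The 2-kilometer selection described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say how distance is measured, so a great-circle (haversine) distance is assumed, and the station identifiers, coordinates and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed distance model: haversine)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def stations_near_airport(stations, airport_lat, airport_lon, radius_km=2.0):
    """Return IDs of base stations within radius_km of the airport centre."""
    return [sid for sid, (lat, lon) in stations.items()
            if haversine_km(lat, lon, airport_lat, airport_lon) <= radius_km]

# Hypothetical coordinates for illustration only (not taken from the source).
stations = {"bs_near": (38.285, 114.700), "bs_far": (38.045, 114.510)}
nearby = stations_near_airport(stations, 38.281, 114.697)
```

The radius is a parameter because the text notes that the 2-kilometer range can be adjusted.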
The repeated signaling in step 3 refers to signaling in a user record whose time, longitude and latitude are completely identical, and the invalid signaling refers to signaling in a user record whose time, longitude and latitude fields are null. Such signaling increases data redundancy and reduces processing efficiency, and is therefore deleted.
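A minimal Python sketch of this preprocessing is given below. The field names `time`, `lon` and `lat` are assumptions, and a signaling is treated as invalid when any of the three fields is null; the source does not say whether one null field is enough, so that threshold is also an assumption.

```python
def preprocess(user_record):
    """Drop invalid and repeated signaling from one user's record.

    Each signaling is a dict with 'time', 'lon' and 'lat' keys (assumed names).
    """
    seen = set()
    cleaned = []
    for s in user_record:
        key = (s.get("time"), s.get("lon"), s.get("lat"))
        if any(v is None for v in key):
            continue  # invalid: a required field is null (assumed criterion)
        if key in seen:
            continue  # repeated: same time, longitude and latitude
        seen.add(key)
        cleaned.append(s)
    return cleaned

# Hypothetical record for illustration.
record = [
    {"time": 100, "lon": 114.70, "lat": 38.28},
    {"time": 100, "lon": 114.70, "lat": 38.28},   # repeated
    {"time": None, "lon": 114.70, "lat": 38.28},  # invalid
    {"time": 160, "lon": 114.52, "lat": 38.05},
]
cleaned = preprocess(record)
```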
The eight classes of passengers in step 5 are specifically: arriving passengers from outside the province, arriving passengers from inside the province, departing passengers to outside the province, departing passengers to inside the province, transit passengers flying from outside the province to outside the province, transit passengers flying from outside the province to inside the province, transit passengers flying from inside the province to outside the province, and other passengers.
In step 5, the eight classes of passenger identification rules are specifically:
For a mobile phone user 1, determine the occurrence time t_A1 of the user's first signaling and the occurrence time t_A2 of the user's last signaling in the target airport data set A. The last signaling of mobile phone user 1 before t_A1 is named the pre-signaling, and the first signaling of mobile phone user 1 after t_A2 is named the post-signaling:
(1) if no pre-signaling exists in the other airport data set B or the peripheral data set O, and no post-signaling exists either, the aviation passenger corresponding to mobile phone user 1 is determined to be a "transit passenger flying from outside the province to outside the province";
(2) if no pre-signaling exists in the other airport data set B or the peripheral data set O, but the post-signaling appears in the other airport data set B, the aviation passenger corresponding to mobile phone user 1 is determined to be a "transit passenger flying from outside the province to inside the province";
(3) if no pre-signaling exists in the other airport data set B or the peripheral data set O, but the post-signaling appears in the peripheral data set O, the aviation passenger corresponding to mobile phone user 1 is determined to be an "arriving passenger from outside the province";
(4) if the pre-signaling appears in the peripheral data set O, but no post-signaling exists in the other airport data set B or the peripheral data set O, the aviation passenger corresponding to mobile phone user 1 is determined to be a "departing passenger to outside the province";
(5) if the pre-signaling appears in the peripheral data set O and the post-signaling appears in the other airport data set B, the aviation passenger corresponding to mobile phone user 1 is determined to be a "departing passenger to inside the province";
(6) if the pre-signaling appears in the other airport data set B, but no post-signaling exists in the other airport data set B or the peripheral data set O, the aviation passenger corresponding to mobile phone user 1 is determined to be a "transit passenger flying from inside the province to outside the province";
(7) if the pre-signaling appears in the other airport data set B and the post-signaling appears in the peripheral data set O, the aviation passenger corresponding to mobile phone user 1 is determined to be an "arriving passenger from inside the province";
(8) passengers who fall into none of the seven classes above are determined to be "other passengers".
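The eight rules above reduce to a lookup on the data sets that contain the pre-signaling and the post-signaling. The Python sketch below illustrates this under assumptions: each signaling is represented as a hypothetical `(time, zone)` pair with zone `"A"`, `"B"` or `"O"`, `None` stands for "no such signaling", and the short class labels are paraphrases of the class names in the text.

```python
# (zone of pre-signaling, zone of post-signaling) -> passenger class
RULES = {
    (None, None): "transit, outside to outside the province",   # rule (1)
    (None, "B"): "transit, outside to inside the province",     # rule (2)
    (None, "O"): "arriving from outside the province",          # rule (3)
    ("O", None): "departing to outside the province",           # rule (4)
    ("O", "B"): "departing to inside the province",             # rule (5)
    ("B", None): "transit, inside to outside the province",     # rule (6)
    ("B", "O"): "arriving from inside the province",            # rule (7)
}

def classify(labelled_record):
    """Classify one user from a list of (time, zone) pairs for a single day."""
    rec = sorted(labelled_record)
    a_times = [t for t, z in rec if z == "A"]
    if not a_times:
        return "other"  # the user never appeared at the target airport
    t_a1, t_a2 = a_times[0], a_times[-1]
    before = [z for t, z in rec if t < t_a1]
    after = [z for t, z in rec if t > t_a2]
    pre = before[-1] if before else None   # last signaling before t_A1
    post = after[0] if after else None     # first signaling after t_A2
    return RULES.get((pre, post), "other")  # rule (8): everything else

example = classify([(1, "O"), (2, "A"), (3, "A")])
```

A dictionary keeps the seven positive rules in one place, so any combination not listed, such as a pre-signaling and post-signaling both in O, falls through to "other".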
A preferred additional step lies between step 1 and step 2: deleting from the airport mobile phone user list obtained in step 1 every mobile equipment identification code that appears on 3 or more consecutive days within the 7 days.
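This filter can be sketched in Python as below. The source does not state the purpose of the step; it presumably removes people who are at the airport every day, such as staff or nearby residents, but that reading is an assumption. The input layout (one set of identification codes per consecutive day) and the function names are hypothetical.

```python
def frequent_visitors(daily_ids, min_run=3):
    """Return IDs seen on at least min_run consecutive days.

    daily_ids: list of sets of device IDs, one set per consecutive day.
    """
    run = {}        # current consecutive-day streak per ID
    flagged = set()
    for ids in daily_ids:
        # IDs absent today are dropped, which resets their streak to zero.
        run = {d: run.get(d, 0) + 1 for d in ids}
        flagged.update(d for d, n in run.items() if n >= min_run)
    return flagged

def filter_user_list(daily_ids, min_run=3):
    """Airport user list with frequent visitors removed."""
    all_ids = set().union(*daily_ids)
    return all_ids - frequent_visitors(daily_ids, min_run)

# Hypothetical seven-day example.
week = [{"a", "b"}, {"a"}, {"a", "c"}, {"c"}, {"b"}, {"c"}, {"b"}]
kept = filter_user_list(week)
```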
The technical effect obtained by adopting the technical scheme is as follows:
Determining the number of aviation passengers in each class and the distribution of their origins helps airports and airlines to locate potential markets and optimize resource allocation. Because passenger numbers and passenger-origin information are private data of each airline, the availability of such data is limited. With the rapid growth of air passenger transport, airport ground traffic is growing rapidly, which poses great challenges for ground traffic control around airports. To keep improving the travel experience of aviation passengers, scholars in China and abroad have paid great attention to research on their travel behavior. Travel behavior information acquired by traditional means such as telephone interviews, mail surveys and field surveys suffers from small samples, poor timeliness and discontinuous behavior trajectories. The invention establishes aviation passenger classification and identification rules based on mobile phone signaling data and uses them to classify and identify airport aviation passengers, laying a solid foundation for subsequently using mobile phone signaling data to analyze the travel characteristics of each class of aviation passenger, reveal their behavior patterns, optimize the allocation of public resources between an airport and its hinterland, and improve the travel experience of passengers.
Based on mobile phone signaling data, the invention can quickly extract the various classes of aviation passengers at an airport and compile passenger-origin information for each class. The invention thus provides a new way of obtaining comprehensive information on the number of aviation passengers in each class and on their origins.
Drawings
FIG. 1 is a flowchart of a method of example 1 of the present invention.
FIG. 2 is a flowchart of a method of embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Example 1
As shown in fig. 1, the method for classifying and identifying airport aviation passengers based on mobile phone signaling data includes the following steps:
Step 1, selecting base stations near a target airport and, taking the day as the unit, extracting all mobile phone signaling data captured by each base station between 0:00 and 24:00 on each of at least seven days, then storing the mobile equipment identification codes found in the signaling data without repetition to generate an airport mobile phone user list.
The longer the extraction period, the more data are extracted and the more comprehensive and reliable the information they contain. The extraction period is set to seven days here; in practical applications it can be adjusted freely according to need and the required accuracy. To minimize redundancy, a mobile equipment identification code that appears several times on the same day is treated as the same user and stored only once.
The selected mobile phone signaling base stations are located within 2 kilometers of the target airport, that is, within a circular region centered on the airport with a radius of 2 kilometers. This is one of the ranges used in the present application, and it can be adjusted according to actual needs. There are normally multiple base stations within this range, and the signaling of all of them is extracted. The present application identifies aviation passengers through mobile phone signaling, which is made possible by the current prevalence of mobile phones.
Step 2, tracing back the mobile phone user corresponding to each mobile equipment identification code in the airport mobile phone user list and extracting all signaling data generated by that user within the province on the same day, wherein the signaling data are in a known format and contain, in addition to the mobile equipment identification code, fields such as the signaling occurrence time, longitude and latitude.
Each mobile equipment identification code corresponds to a unique mobile phone user, and all signaling data of that user form a user record. The user records of all mobile phone users in the airport mobile phone user list form a comprehensive data set; that is, the comprehensive data set consists of a group of user records, each containing all the signaling of one mobile phone user in one day. Hereinafter, a mobile equipment identification code is referred to as a mobile phone user.
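The construction of the comprehensive data set can be sketched as a grouping of same-day signaling by identification code. In the Python sketch below, the field names `device_id` and `time` and the function name are assumptions introduced for illustration.

```python
from collections import defaultdict

def build_user_records(signaling, airport_user_ids):
    """Group one day's province-wide signaling into per-user records.

    Only users present in the airport mobile phone user list are kept;
    each record is sorted by signaling occurrence time.
    """
    wanted = set(airport_user_ids)
    records = defaultdict(list)
    for s in signaling:
        if s["device_id"] in wanted:
            records[s["device_id"]].append(s)
    for rec in records.values():
        rec.sort(key=lambda s: s["time"])
    return dict(records)

# Hypothetical signaling for illustration.
day_signaling = [
    {"device_id": "u1", "time": 5, "lon": 114.70, "lat": 38.28},
    {"device_id": "u2", "time": 3, "lon": 114.50, "lat": 38.04},
    {"device_id": "u1", "time": 1, "lon": 114.52, "lat": 38.05},
]
comprehensive = build_user_records(day_signaling, ["u1"])
```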
Mobile phone signaling is a well-known kind of signal in the communications field. It contains a good deal of key information that can be extracted by a program, such as the mobile equipment identification code of the phone. Each mobile phone has a unique mobile equipment identification code, which is why the present method extracts this code to count airport passengers.
The signaling also contains the time at which it occurred and the longitude and latitude of its location; in the present application, the occurrence time, longitude and latitude of the signaling are all important parameters.
Step 3, preprocessing the comprehensive data set by deleting repeated signaling data and invalid signaling data from the user records, thereby reducing data redundancy and improving processing efficiency.
Repeated signaling data are signaling in a user record whose occurrence time, longitude and latitude are completely identical, and invalid signaling data are signaling in a user record whose occurrence time, longitude and latitude fields are null. Such signaling increases data redundancy and reduces processing efficiency, and is therefore deleted.
Step 4, splitting the comprehensive data set obtained in step 3 into a target airport data set A, an other airport data set B and a peripheral data set O;
the target airport data set A is the data set formed by the user records of mobile phone users at the base stations near the target airport; a mobile phone user with signaling data in this data set is identified as a passenger who has been present at the target airport, and the times at which the user's signaling appears in this data set are the key reference points for determining the passenger's class in the present application;
the other airport data set B is the data set formed by the user records of mobile phone users at the base stations near airports in the province other than the target airport; in the present application, a mobile phone user with signaling data in this data set is identified as an aviation passenger departing from or arriving at another airport in the province;
the peripheral data set O is the data set consisting of all signaling in the comprehensive data set that belongs to neither the target airport data set A nor the other airport data set B; a mobile phone user with signaling data in this data set is identified as a user located in areas of the province other than airports, such as urban, suburban and rural areas.
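The split into A, B and O can be sketched as follows. The sketch assumes that each signaling carries a hypothetical `station_id` field naming the base station that captured it, and that the sets of base stations near the target airport and near the other airports in the province have already been selected; the source itself describes signaling only in terms of time, longitude and latitude.

```python
def split_record(user_record, target_stations, other_airport_stations):
    """Split one user's signaling into data sets A, B and O."""
    a, b, o = [], [], []
    for s in user_record:
        if s["station_id"] in target_stations:
            a.append(s)          # captured near the target airport
        elif s["station_id"] in other_airport_stations:
            b.append(s)          # captured near another airport in the province
        else:
            o.append(s)          # everywhere else in the province
    return a, b, o

# Hypothetical station sets and record for illustration.
target = {"bs_t1", "bs_t2"}
others = {"bs_x1"}
record = [
    {"station_id": "bs_c9", "time": 1},
    {"station_id": "bs_t1", "time": 2},
    {"station_id": "bs_x1", "time": 9},
]
set_a, set_b, set_o = split_record(record, target, others)
```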
Step 5, classifying the mobile equipment identification codes in the airport mobile phone user list of step 1 one by one, according to the data sets and preset aviation passenger identification rules, into eight preset classes of passengers. Classifying the mobile equipment identification codes amounts to classifying the mobile phone users, that is, the aviation passengers, who appeared at the airport.
The eight classes of passengers are specifically: arriving passengers from outside the province, arriving passengers from inside the province, departing passengers to outside the province, departing passengers to inside the province, transit passengers flying from outside the province to outside the province, transit passengers flying from outside the province to inside the province, transit passengers flying from inside the province to outside the province, and other passengers.
Owing to data-acquisition constraints, the experimental results of the present application are based on airports within a single province. The names of the aviation passenger classes therefore refer to "inside the province" and "outside the province". The selected target airport is an airport in the province, and the other airports are the remaining airports in the province.
The eight classes of passenger identification rules are specifically:
For a mobile phone user 1, determine the occurrence time t_A1 of the user's first signaling and the occurrence time t_A2 of the user's last signaling in the target airport data set A. The last signaling of mobile phone user 1 before t_A1 is named the pre-signaling, and the first signaling of mobile phone user 1 after t_A2 is named the post-signaling:
(1) if no pre-signaling exists in the other airport data set B or the peripheral data set O and no post-signaling exists either, mobile phone user 1 did not appear anywhere else in the province before entering the target airport and did not appear anywhere else in the province after leaving it; that is, the user appeared only at the target airport, traveling from outside the province to outside the province with the target airport serving only as a transit point. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be a "transit passenger flying from outside the province to outside the province";
(2) if no pre-signaling exists in the other airport data set B or the peripheral data set O, but the post-signaling appears in the other airport data set B, mobile phone user 1 did not appear anywhere else in the province before entering the target airport and first appeared at another airport in the province after leaving it. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be a "transit passenger flying from outside the province to inside the province";
(3) if no pre-signaling exists in the other airport data set B or the peripheral data set O, but the post-signaling appears in the peripheral data set O, mobile phone user 1 did not appear anywhere else in the province before entering the target airport and first appeared in a non-airport area of the province after leaving it. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be an "arriving passenger from outside the province";
(4) if the pre-signaling appears in the peripheral data set O, but no post-signaling exists in the other airport data set B or the peripheral data set O, mobile phone user 1 was in a non-airport area of the province before entering the target airport and left the province directly after leaving it. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be a "departing passenger to outside the province";
(5) if the pre-signaling appears in the peripheral data set O and the post-signaling appears in the other airport data set B, mobile phone user 1 was in a non-airport area of the province before entering the target airport and first appeared at another airport in the province after leaving it. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be a "departing passenger to inside the province";
(6) if the pre-signaling appears in the other airport data set B, but no post-signaling exists in the other airport data set B or the peripheral data set O, mobile phone user 1 was at another airport in the province before entering the target airport and left the province directly after leaving it. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be a "transit passenger flying from inside the province to outside the province";
(7) if the pre-signaling appears in the other airport data set B and the post-signaling appears in the peripheral data set O, mobile phone user 1 was at another airport in the province before entering the target airport and first appeared in a non-airport area of the province after leaving it. The aviation passenger corresponding to mobile phone user 1 is therefore determined to be an "arriving passenger from inside the province";
(8) if the passenger belongs to none of the seven classes above, the passenger corresponding to mobile phone user 1 is determined to be an "other passenger".
After the category has been determined, a category parameter is attached to each mobile equipment identification code (that is, to each mobile phone user), which simplifies the subsequent counting of users. Alternatively, each mobile equipment identification code can be assigned to a list for its aviation passenger class, which also allows the numbers to be counted. The specific implementation is not limited. Table 1 lists the identification rules for the eight classes of aviation passengers in the present application.
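TABLE 1 Airport aviation passenger classification and discrimination rules
Class of aviation passenger | Data set containing the pre-signaling | Data set containing the post-signaling
Transit passenger flying from outside the province to outside the province | * | *
Transit passenger flying from outside the province to inside the province | * | B
Arriving passenger from outside the province | * | O
Departing passenger to outside the province | O | *
Departing passenger to inside the province | O | B
Transit passenger flying from inside the province to outside the province | B | *
Arriving passenger from inside the province | B | O
Other passengers | any other combination | any other combination
*: indicates that no such signaling exists in the comprehensive data set.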
Step 6, counting the number of mobile phone users in each class of aviation passenger according to the classification result of step 5, and taking these counts as the number of passengers of each class at the selected target airport, thereby obtaining the passenger-origin distribution characteristics of the target airport.
Meanwhile, passenger origins are tallied using the home province and city information carried in each aviation passenger's mobile phone signaling, giving the main passenger source markets of the target airport.
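The counting in step 6 and the origin tally can be sketched with `collections.Counter`. The mapping from identification code to class label, the mapping from identification code to home region, and the function names are all hypothetical; the source only says that the home province and city are carried in the signaling.

```python
from collections import Counter

def count_by_class(classified):
    """classified: dict mapping device ID -> passenger class label."""
    return Counter(classified.values())

def count_by_origin(classified, home_region, passenger_class):
    """Tally home regions for the users of one passenger class."""
    return Counter(home_region[d] for d, c in classified.items()
                   if c == passenger_class and d in home_region)

# Hypothetical classification result and home regions for illustration.
classified = {"u1": "arriving", "u2": "departing", "u3": "arriving"}
home_region = {"u1": "Beijing", "u2": "Hebei", "u3": "Beijing"}
class_counts = count_by_class(classified)
arrival_origins = count_by_origin(classified, home_region, "arriving")
```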
Through the six steps, the classification and identification of airport aviation passengers and the distribution statistics of passenger sources and places can be realized.
To verify the accuracy of the method of the present application, the identification precision is used as the measure. The identification precision is calculated as follows:
Figure BDA0002509677440000102
wherein p is the identification precision, p' is the number of the aviation passengers obtained by identification, and poThe actual number of air passengers at the target airport.
After the number of aviation passengers in each class has been counted by the method, the market share of the mobile phone operator is used as the sample expansion coefficient to expand the identification result. The expanded result is then compared with the aviation passenger data actually recorded by the target airport; if the identification precision is higher than 70%, the identification result is considered credible.
Take Shijiazhuang Zhengding International Airport (Shijiazhuang airport for short) in Shijiazhuang, Hebei Province, as an example. In October 2019, Shijiazhuang airport served 3 destination cities within the province: Chengde, Zhangjiakou and Qinhuangdao. During the 2019 National Day holiday (October 1 to 7), Shijiazhuang airport actually carried 240,000 passengers. Taking October 1 to 7, 2019 (1 week in total) as the time range, the China Unicom Hebei mobile phone signaling data at and near Shijiazhuang airport were extracted, the mobile phone users were traced back, and all mobile phone signaling data of these users within Hebei Province on the same day were extracted. Applying the airport aviation passenger classification and identification method, a total of 56,911 aviation passengers were identified at Shijiazhuang airport in this period: 21,408 passengers entering the port from outside the province, 1,000 passengers entering the port from inside the province, 29,269 passengers leaving the port to outside the province, 872 passengers leaving the port to inside the province, 3,793 transit passengers flying from outside the province to outside the province, 259 transit passengers flying from outside the province to inside the province, and 310 transit passengers flying from inside the province to outside the province. Expanding the identification result by China Unicom's average national market share of 19.56% gives about 291,000 aviation passengers at Shijiazhuang airport, and the precision of the identification result is about 78.8%.
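The sample expansion and precision check can be sketched as follows. The precision formula appears only as an image in the source, so the form used here, p = 1 - |p' - p_o| / p_o, is an assumption; it was chosen because it reproduces the reported 78.8% when the actual passenger count is read as 240,000. The function names are hypothetical.

```python
def expand_sample(identified: int, market_share: float) -> float:
    """Scale the identified count up by the operator's market share (0 < share <= 1)."""
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    return identified / market_share


def identification_precision(estimated: float, actual: float) -> float:
    """Assumed form of the precision: 1 minus the relative error of the estimate."""
    return 1 - abs(estimated - actual) / actual


# Figures from the Shijiazhuang airport example above.
estimated = expand_sample(56911, 0.1956)               # roughly 291,000 passengers
precision = identification_precision(estimated, 240000)
print(f"{estimated:.0f} passengers, precision {precision:.1%}")
```

Under this assumed form the result clears the 70% credibility threshold stated in the description.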
In terms of passenger source distribution, at the provincial scale the main source is Hebei Province, accounting for 44.1%, followed by Beijing at 8.7%, with other provinces accounting for smaller shares. At the city scale, the main source is Shijiazhuang at 27.7%, followed by Beijing at 8.7% and Baoding in 3rd place at 5.5%; Shanghai and Chenchentai city each account for 3%, Heishi city accounts for 2.2%, and other cities account for smaller shares. These distribution characteristics are consistent with common knowledge.
Example 2
Compared with example 1, this example adds a preferred step to the method.
The mobile device identification codes that appear on 3 or more consecutive days are deleted from the airport mobile phone user list obtained in step 1. As a rule of thumb, if a mobile device identification code appears at the airport on 3 consecutive days, the user of the device is most likely a resident living near the airport or an airport employee; such a person is not an aviation passenger, so the user is deleted from the airport mobile phone user list. With this step, 49,962 aviation passengers were identified at Shijiazhuang airport in the period: 18,585 passengers entering the port from outside the province, 974 passengers entering the port from inside the province, 26,350 passengers leaving the port to outside the province, 856 passengers leaving the port to inside the province, 2,661 transit passengers flying from outside the province to outside the province, 242 transit passengers flying from outside the province to inside the province, and 294 transit passengers flying from inside the province to outside the province. Expanding the identification result by China Unicom's average national market share of 19.56% gives about 255,000 aviation passengers at Shijiazhuang airport, and the precision of the identification result is about 93.6%, a clear improvement over example 1.
The examples show that the airport aviation passenger classification and identification method based on mobile phone signaling data can identify the various classes of aviation passengers at an airport and can produce the number of passengers in each class as well as the passenger source statistics of the target airport by province and city. The identification precision is over 75%, so the method performs well and can be applied in actual production.
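The preferred filtering step could be implemented along the following lines. This is a sketch under assumptions: the input is taken to be a mapping from each mobile device identification code to the calendar dates on which it was captured by the airport base stations, and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Iterable


def max_consecutive_days(days: Iterable[date]) -> int:
    """Length of the longest run of consecutive calendar days in `days`."""
    best = run = 0
    prev = None
    for d in sorted(set(days)):
        if prev is not None and d - prev == timedelta(days=1):
            run += 1
        else:
            run = 1
        best = max(best, run)
        prev = d
    return best


def drop_residents(sightings: Dict[str, Iterable[date]], min_run: int = 3) -> Dict[str, Iterable[date]]:
    """Keep only devices whose longest consecutive-day run is shorter than `min_run`."""
    return {dev: days for dev, days in sightings.items()
            if max_consecutive_days(days) < min_run}
```

A device seen on October 1, 3 and 5 is kept, because its appearances are not on consecutive days, while a device seen on October 1, 2 and 3 is removed.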
The signaling data used in the invention are obtained from the operator through legal channels. The signaling data contain no sensitive information and do not involve user privacy, so the use of the data is reasonable and legal.
Because the number of airline passengers and the passenger source information are private data of each airline, their availability is limited. The method of the invention can quickly identify the various classes of aviation passengers at an airport, including arriving, departing and transit passengers, and can count the number of passengers in each class and the passenger source distribution characteristics, which compensates for the limited availability of such data.
Because of the limitations of data acquisition, the experimental results of the present application are based on airports within one province. However, the method of the present application is not limited to a single province and can be generalized to a larger range provided that sufficient data are available.

Claims (6)

1. An airport aviation passenger classification and identification method based on mobile phone signaling data is characterized by comprising the following steps:
step 1: selecting base stations near a target airport, taking 'day' as a unit, extracting all mobile phone signaling data captured by each base station for 0-24 hours in at least seven days, and storing mobile equipment identification codes in the signaling data without repetition to generate an airport mobile phone user list;
step 2: tracing back the mobile phone user corresponding to each mobile equipment identification code in the airport mobile phone user table, extracting all signaling data of each mobile phone user appearing in the province on the same day, wherein all signaling data under each mobile phone user form a user record of the user, and the user records of all mobile phone users form a comprehensive data set;
step 3: carrying out data preprocessing on the comprehensive data set, and deleting repeated signaling data and invalid signaling data in the user record;
step 4: splitting the comprehensive data set obtained in the step 3 into a target airport data set A, other airport data sets B and a peripheral data set O; the target airport data set A refers to a data set formed by user records of mobile phone users in each base station near a target airport; the other airport data set B refers to a data set formed by user records of mobile phone users in base stations nearby other airports except the target airport in the provincial region range; the peripheral data set O refers to a data set formed by all user records except the target airport data set A and the other airport data sets B in the comprehensive data set;
step 5: classifying the mobile equipment identification codes in the airport mobile phone user list in the step 1 one by one according to the data set and a preset aviation passenger identification rule, and classifying the mobile equipment identification codes into eight preset classes of passengers;
step 6: counting the number of mobile phone users contained in each type of aviation passenger according to the classification result in the step 5, and taking the number as the number of each type of passengers in the aviation passenger of the selected target airport to obtain the passenger source and place distribution characteristics of the target airport.
2. The method for classifying and identifying airport aviation passengers based on mobile phone signaling data as claimed in claim 1, wherein the base stations near the target airport in step 1 are located within 2 km of the selected airport.
3. The method for classifying and identifying airport aviation passengers based on mobile phone signaling data as claimed in claim 1, wherein said repeated signaling data in step 3 is signaling with completely consistent occurrence time, longitude and latitude in user record, and said invalid signaling data is signaling with null occurrence time, longitude and latitude fields in user record.
4. The method for classifying and identifying airport aviation passengers based on mobile phone signaling data as claimed in claim 1, wherein the eight classes of passengers in the step 5 are specifically: passengers entering the port from outside the province, passengers entering the port from inside the province, passengers leaving the port to outside the province, passengers leaving the port to inside the province, transit passengers flying from outside the province to outside the province, transit passengers flying from outside the province to inside the province, transit passengers flying from inside the province to outside the province, and others.
5. The method for classifying and identifying airport aviation passengers based on mobile phone signaling data as claimed in claim 4, wherein in step 5, the identification rules of eight classes of passengers are specifically:
determining, for the mobile phone user 1 corresponding to the mobile equipment identification code, the occurrence time t_A1 of the first signaling and the occurrence time t_A2 of the last signaling in the target airport data set A; the last signaling of the mobile phone user 1 before t_A1 is named the pre-signaling, and the first signaling of the mobile phone user 1 after t_A2 is named the post-signaling:
(1) if no pre-signaling exists in the other airport data set B or the peripheral data set O, and no post-signaling exists in the other airport data set B or the peripheral data set O, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "transit passenger flying from outside the province to outside the province";
(2) if no pre-signaling exists in the other airport data set B or the peripheral data set O, but the post-signaling appears in the other airport data set B, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "transit passenger flying from outside the province to inside the province";
(3) if no pre-signaling exists in the other airport data set B or the peripheral data set O, but the post-signaling appears in the peripheral data set O, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "passenger entering the port from outside the province";
(4) if the pre-signaling appears in the peripheral data set O, but no post-signaling exists in the other airport data set B or the peripheral data set O, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "passenger leaving the port to outside the province";
(5) if the pre-signaling appears in the peripheral data set O and the post-signaling appears in the other airport data set B, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "passenger leaving the port to inside the province";
(6) if the pre-signaling appears in the other airport data set B, but no post-signaling exists in the other airport data set B or the peripheral data set O, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "transit passenger flying from inside the province to outside the province";
(7) if the pre-signaling appears in the other airport data set B and the post-signaling appears in the peripheral data set O, the aviation passenger corresponding to the mobile phone user 1 is judged to be a "passenger entering the port from inside the province";
(8) mobile phone users that do not belong to any of the seven classes of aviation passengers above are judged to be "other".
6. The method for classifying and identifying airport aviation passengers based on mobile phone signaling data as claimed in claim 1, characterized in that there is a preferred step between step 1 and step 2: deleting, from the airport mobile phone user list obtained in step 1, the mobile equipment identification codes that appear on 3 or more consecutive days.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010456844.2A CN111651529A (en) 2020-05-26 2020-05-26 Airport aviation passenger classification identification method based on mobile phone signaling data

Publications (1)

Publication Number Publication Date
CN111651529A true CN111651529A (en) 2020-09-11

Family

ID=72344759

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010456844.2A Pending CN111651529A (en) 2020-05-26 2020-05-26 Airport aviation passenger classification identification method based on mobile phone signaling data

Country Status (1)

Country Link
CN (1) CN111651529A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112822639A (en) * 2020-12-18 2021-05-18 河北师范大学 Method for demarcating airport abdominal area of passengers entering and exiting port based on mobile phone signaling
CN114139251A (en) * 2021-11-14 2022-03-04 深圳市规划国土发展研究中心 Integral layout method for land ports of border regions
CN114154393A (en) * 2021-10-15 2022-03-08 广州市交通规划研究院 Method for predicting passenger throughput of target ground airport group based on abdominal theory
CN114302333A (en) * 2021-12-27 2022-04-08 中国电信股份有限公司 User identification method and device, electronic equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1540997A (en) * 2003-04-25 2004-10-27 济南开拓科技有限公司 Method for issuing sort message in subzone and dedicated device
CN108734129A (en) * 2018-05-21 2018-11-02 上海应用技术大学 mobile phone and vehicle location analysis method and system
CN109410568A (en) * 2018-09-18 2019-03-01 广东中标数据科技股份有限公司 The get-off stop estimation method and system of rule are drawn a portrait and changed to based on user
CN110020980A (en) * 2019-04-08 2019-07-16 江苏号百信息服务有限公司 Airport based on mobile phone signaling data identifies and objective feelings analysis method to hair passenger

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
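姚海芳; 冯天楠; 刘劲松: "Research on the classification and identification of airport aviation passengers based on mobile phone signaling data: a case study of Shijiazhuang Zhengding International Airport" *
姚海芳; 路紫; 刘劲松: "Identification of the distribution characteristics of aviation passengers at Shijiazhuang Zhengding International Airport: a study based on mobile phone signaling data" *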

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112822639A (en) * 2020-12-18 2021-05-18 河北师范大学 Method for demarcating airport abdominal area of passengers entering and exiting port based on mobile phone signaling
CN114154393A (en) * 2021-10-15 2022-03-08 广州市交通规划研究院 Method for predicting passenger throughput of target ground airport group based on abdominal theory
CN114154393B (en) * 2021-10-15 2024-04-12 广州市交通规划研究院 Target airport group passenger throughput prediction method based on abdomen ground theory
CN114139251A (en) * 2021-11-14 2022-03-04 深圳市规划国土发展研究中心 Integral layout method for land ports of border regions
CN114302333A (en) * 2021-12-27 2022-04-08 中国电信股份有限公司 User identification method and device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20200911)