CN112541012A - Urban job-housing balance population statistics method based on mobile communication big data and entropy value - Google Patents

Urban job-housing balance population statistics method based on mobile communication big data and entropy value

Info

Publication number
CN112541012A
Authority
CN
China
Prior art keywords
imsi
base station
time
distance
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010002299.XA
Other languages
Chinese (zh)
Inventor
杨占军
朱明珠
于海薇
潘志宏
刘增礼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beiling Rongxin Datalnfo Science and Technology Ltd
Original Assignee
Beiling Rongxin Datalnfo Science and Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beiling Rongxin Datalnfo Science and Technology Ltd filed Critical Beiling Rongxin Datalnfo Science and Technology Ltd
Priority to CN202010002299.XA priority Critical patent/CN112541012A/en
Publication of CN112541012A publication Critical patent/CN112541012A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/18 Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Tourism & Hospitality (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Educational Administration (AREA)
  • Remote Sensing (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Primary Health Care (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides an urban job-housing balance population statistics method based on mobile communication big data and entropy. Using telecom operator signaling data, the method applies an entropy calculation to identify, for each IMSI in a given city, the base station of its daytime workplace and the base station of its nighttime residence, and computes the distance between the two; this distance is the commuting distance of the IMSI. The method then determines from this distance whether the commute falls within a reasonable range. Finally, the number of IMSIs whose commuting distance is within the reasonable range is taken as the numerator and the total resident population of the city for the month as the denominator, which gives the percentage of the city's population in job-housing balance for that month.

Description

Urban job-housing balance population statistics method based on mobile communication big data and entropy value
Technical Field
The invention relates to the technical field of mobile communication, and in particular to a method for counting the urban job-housing balance population using mobile communication big data and entropy calculation.
Background
Job-housing balance directly affects the healthy operation of urban transport and social harmony. Job-housing balance is a term from urban planning; its basic meaning is that most residents within a given area can work close to where they live. Commuting can then be done on foot, by bicycle, or by other non-motorized means; even when motor vehicles are used, travel distance and travel time stay short and within a reasonable range, which helps reduce the use of motor vehicles, particularly cars, and so reduces traffic congestion and air pollution.
Given current requirements for regulating urban population, statistical departments need to use big data to monitor population data dynamically, improve the existing population monitoring system, explore the relationships among industrial regulation, functional layout, and population development, regularly track where relocated population moves, and give timely warning of trends in regional population change. As mobile communication has grown in scale and the technology has developed, large volumes of communication data, particularly trajectory data, can now be stored, which makes it possible to estimate urban population size and flows through analysis of mobile communication big data. Compared with traditional research methods, statistics and population monitoring based on big data offer higher reliability and accuracy.
Disclosure of Invention
The invention aims to provide an urban job-housing balance population statistics method based on mobile communication big data and entropy calculation.
To achieve this purpose, the invention adopts the following technical scheme:
An urban job-housing balance population statistics method based on mobile communication big data and entropy, characterized by:
(1) data acquisition: using telecom operator signaling data, acquiring and storing the position of the base station at which each IMSI in a designated city is located and the times at which it enters and leaves the base station, to obtain movement trajectory data for each IMSI;
(2) data preprocessing: interpolating missing base station entry and exit signaling; if, within a statistical time period, an IMSI has an entry time for a base station but no exit time, or an exit time but no entry time, the missing value is interpolated, the interpolation time points being the start time and the end time of the statistical time period;
(3) workplace data screening: for one IMSI at a time, reading all base stations visited between 9:00 a.m. and 4:00 p.m.; according to the trajectory data obtained in the data acquisition step, listing the base stations visited by that IMSI in all working time periods in a statistical table, counting the stay time at each base station, and then calculating the entropy value of the IMSI at each base station;
(4) residence data screening: for one IMSI at a time, reading all base stations visited between 8:00 p.m. and 5:00 a.m.; according to the trajectory data obtained in the data acquisition step, listing the base stations visited by that IMSI in all residence time periods in a statistical table, counting the stay time at each base station, and then calculating the entropy value of the IMSI at each base station;
(5) distance calculation: selecting the base station with the largest entropy value for the IMSI between 9:00 a.m. and 4:00 p.m. as the workplace; selecting the base station with the largest entropy value for the IMSI between 8:00 p.m. and 5:00 a.m. as the residence, and calculating the distance between the residence and the workplace from the longitude and latitude of the corresponding base stations; repeating this calculation throughout one month and averaging the distances; if the average distance is smaller than the average commuting distance of the city, the commuting distance of the IMSI is considered to be within a reasonable range, that is, the IMSI belongs to the job-housing balance population;
(6) job-housing balance population statistics: taking the number of IMSIs whose commuting distance is within the reasonable range as the numerator and the total resident population of the city for the month as the denominator, which gives the percentage of the city's population in job-housing balance.
The invention uses the movement trajectory data of mobile users to count the time each user stays at the base stations visited, and infers the user's workplace and residence from those stay times. The data source is reliable, the decision method is simple, and the inferred results are highly accurate, which provides useful support for population statistics and monitoring based on communication big data.
Detailed Description
The method of the invention is illustrated below by means of a specific example.
The invention can be used to count the job-housing balance population of any city. The following example is a statistical analysis of the job-housing balance population of Beijing. Researchers generally hold that the mean or median commuting distance of a city should serve as the job-housing balance distance, so this embodiment takes the average commuting distance in Beijing, 17.4 km, as the reasonable commuting distance, and counts people whose commuting distance is below it as the job-housing balance population.
The invention is implemented as follows:
(1) Data acquisition: the data used in this embodiment is mobile phone signaling data for Beijing from July 1 to July 31, 2019. Using telecom operator signaling data, the position of the base station at which each IMSI is located and the times at which it enters and leaves the base station are acquired and stored, giving movement trajectory data for each IMSI. The data comes from the signaling data of a mobile operator and includes: the subscriber identifier IMSI (International Mobile Subscriber Identity); the location area code lac, which identifies different location areas; the base station number ci, which is combined with the location area code (lac) to identify a cell covered by the network; and the times at which the IMSI enters and leaves the base station.
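The record layout described in step (1) could be modeled as follows. This is only an illustrative sketch: the class name `SignalingRecord`, its field names, and the helper `build_trajectories` are assumptions introduced here, not part of the patent or of any operator's data format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class SignalingRecord:
    """One stay of an IMSI at a base station (hypothetical layout)."""
    imsi: str
    lac: int                          # location area code
    ci: int                           # base station (cell) number
    enter_time: Optional[datetime]    # None if the entry signaling is missing
    leave_time: Optional[datetime]    # None if the exit signaling is missing

    @property
    def station(self) -> Tuple[int, int]:
        # lac and ci together identify one cell
        return (self.lac, self.ci)


def build_trajectories(
    records: Iterable[SignalingRecord],
) -> Dict[str, List[SignalingRecord]]:
    """Group records by IMSI and order each group by time."""
    by_imsi: Dict[str, List[SignalingRecord]] = defaultdict(list)
    for rec in records:
        by_imsi[rec.imsi].append(rec)
    for recs in by_imsi.values():
        # Fall back to leave_time when the entry time is missing.
        recs.sort(key=lambda r: r.enter_time or r.leave_time or datetime.min)
    return dict(by_imsi)
```

Keeping `lac` and `ci` as separate fields and exposing the pair as `station` mirrors the text's statement that the two are combined to identify a cell.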
(2) Data preprocessing: missing base station entry and exit signaling is interpolated. To ensure the integrity of the data, if within a statistical time period a user has an entry time for a sector but no exit time, or an exit time but no entry time, the missing value must be interpolated; the interpolation time points are the start time and the end time of the statistical time period.
For example, user A enters sector X at 23:00:00 on July 1, leaves sector X at 07:00:00 on July 2, enters sector Y at 23:00:00 on July 2, and leaves sector Y at 07:00:00 on July 3. When the records of user A for July 2 are collected, the time of entering sector X and the time of leaving sector Y are missing, so the entry into sector X is interpolated as 00:00:00 on July 2 and the exit from sector Y as 23:59:59 on July 2.
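A minimal sketch of this interpolation in Python, assuming the statistical time period is one calendar day and that a missing timestamp is represented as `None`. The function name `fill_missing_times` is hypothetical.

```python
from datetime import date, datetime, time
from typing import Optional, Tuple


def fill_missing_times(
    enter: Optional[datetime],
    leave: Optional[datetime],
    day: date,
) -> Tuple[datetime, datetime]:
    """Replace a missing entry time with the start of the day (00:00:00)
    and a missing exit time with the end of the day (23:59:59)."""
    period_start = datetime.combine(day, time(0, 0, 0))
    period_end = datetime.combine(day, time(23, 59, 59))
    return (
        enter if enter is not None else period_start,
        leave if leave is not None else period_end,
    )


# User A on July 2, 2019: the entry into sector X and the exit from sector Y
# fall outside the day, so they are missing from that day's records.
day = date(2019, 7, 2)
sector_x = fill_missing_times(None, datetime(2019, 7, 2, 7, 0, 0), day)
sector_y = fill_missing_times(datetime(2019, 7, 2, 23, 0, 0), None, day)
```

With these inputs, sector X covers 00:00:00 to 07:00:00 and sector Y covers 23:00:00 to 23:59:59 on July 2, matching the worked example in the text.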
(3) Workplace data screening: for one IMSI at a time, all lac-ci visited between 9:00 a.m. and 4:00 p.m. are read; the period from 9:00 a.m. to 4:00 p.m. is chosen to approximate working hours. According to the trajectory data obtained in the data acquisition step, the lac-ci visited by that IMSI in all working time periods are listed in a statistical table, the stay time at each lac-ci is counted, and the entropy value of the IMSI at each base station is then calculated. The calculation is as follows:
the entropy value of IMSI i at base station j is:
[Formula image BDA0002353934450000031 not reproduced in the text]
where [symbol image BDA0002353934450000032 not reproduced in the text] is the time IMSI i stays at base station j.
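The stay-time count for the working window can be sketched as below. Because the entropy formula survives only as an image reference, the sketch does not attempt to reproduce it: `pick_station` accepts any scoring function of the dwell-time share and, as a loudly labeled stand-in, defaults to the share itself. The function names and the `(station, enter, leave)` tuple layout are assumptions.

```python
from collections import defaultdict
from datetime import date, datetime, time
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

Visit = Tuple[Hashable, datetime, datetime]  # (station id, enter, leave)


def dwell_seconds_in_window(
    visits: Iterable[Visit],
    day: date,
    start: time = time(9, 0),
    end: time = time(16, 0),
) -> Dict[Hashable, float]:
    """Sum, per station, the seconds of each visit that fall inside the window."""
    w_start = datetime.combine(day, start)
    w_end = datetime.combine(day, end)
    totals: Dict[Hashable, float] = defaultdict(float)
    for station, enter, leave in visits:
        overlap = (min(leave, w_end) - max(enter, w_start)).total_seconds()
        if overlap > 0:
            totals[station] += overlap
    return dict(totals)


def pick_station(
    totals: Dict[Hashable, float],
    score: Optional[Callable[[float], float]] = None,
) -> Optional[Hashable]:
    """Return the station with the highest score.

    ASSUMPTION: the patent's entropy formula is not available, so the
    default score is simply the share of dwell time at the station.
    """
    if not totals:
        return None
    if score is None:
        score = lambda share: share  # placeholder, not the patent's formula
    total = sum(totals.values())
    return max(totals, key=lambda s: score(totals[s] / total))


day = date(2019, 7, 2)
visits = [
    ("1-10", datetime(2019, 7, 2, 8, 0), datetime(2019, 7, 2, 12, 0)),
    ("1-20", datetime(2019, 7, 2, 12, 0), datetime(2019, 7, 2, 17, 0)),
]
totals = dwell_seconds_in_window(visits, day)
workplace = pick_station(totals)
```

Here the first visit contributes three hours inside the window and the second four hours, so the stand-in score selects station "1-20".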
(4) Residence data screening: for one IMSI at a time, all lac-ci visited between 8:00 p.m. and 5:00 a.m. are read; the period from 8:00 p.m. to 5:00 a.m. is chosen to approximate off-duty rest time. According to the trajectory data obtained in the data acquisition step, the lac-ci visited by that IMSI in all residence time periods are listed in a statistical table, the stay time at each lac-ci is counted, and the entropy value of the IMSI at each base station is then calculated. The calculation is as follows:
the entropy value of IMSI i at base station j is:
[Formula image BDA0002353934450000033 not reproduced in the text]
where [symbol image BDA0002353934450000034 not reproduced in the text] is the time IMSI i stays at base station j.
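Unlike the working window, the residence window runs past midnight. One way to handle this, sketched here under the assumption that a night is attributed to the calendar day on which it starts, is to push the window end to the following day whenever it is not later than the start. The helper names are hypothetical.

```python
from datetime import date, datetime, time, timedelta
from typing import Tuple


def window_bounds(day: date, start: time, end: time) -> Tuple[datetime, datetime]:
    """Return concrete datetimes for a daily window.

    If the end is not later than the start (for example 20:00 to 05:00),
    the window is taken to end on the following day.
    """
    w_start = datetime.combine(day, start)
    w_end = datetime.combine(day, end)
    if w_end <= w_start:
        w_end += timedelta(days=1)
    return w_start, w_end


def overlap_seconds(
    enter: datetime, leave: datetime, w_start: datetime, w_end: datetime
) -> float:
    """Seconds of the stay [enter, leave] that fall inside the window."""
    return max(0.0, (min(leave, w_end) - max(enter, w_start)).total_seconds())


# Night of July 1, 2019: a stay from 23:00 to 07:00 the next morning.
night = window_bounds(date(2019, 7, 1), time(20, 0), time(5, 0))
stay = overlap_seconds(
    datetime(2019, 7, 1, 23, 0), datetime(2019, 7, 2, 7, 0), *night
)
```

Only the part of the stay between 23:00 and 05:00 counts toward the residence window, which is six hours in this example.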
(5) Distance calculation: the base station with the largest entropy value for the IMSI between 9:00 a.m. and 4:00 p.m. is selected as the workplace; the base station with the largest entropy value for the IMSI between 8:00 p.m. and 5:00 a.m. is selected as the residence, and the distance between the residence and the workplace is calculated from the longitude and latitude of the corresponding base stations. This calculation is repeated throughout one month and the distances are averaged; if the average distance is less than the average commuting distance in Beijing, the commuting distance of the IMSI is considered to be within a reasonable range, that is, the IMSI belongs to the job-housing balance population.
The distance Dis is calculated as follows:
Dis=R*acos(sinpi(y1/180)*sinpi(y2/180)+cospi(y1/180)*cospi(y2/180)*cospi((x1-x2)/180));
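where R represents the radius of the earth; x1 and x2 represent the longitudes of the two lac-ci, and y1 and y2 represent their latitudes, all in degrees. Here sinpi(x) and cospi(x) denote sin(πx) and cos(πx), so dividing each angle by 180 converts it from degrees to radians.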
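The formula is the spherical law of cosines for great-circle distance. A direct Python transcription might look like the following sketch; the value 6371 km for R and the clamping of the cosine term are assumptions added here, since the patent does not state a radius or discuss rounding.

```python
import math

EARTH_RADIUS_KM = 6371.0  # ASSUMPTION: mean earth radius; the patent only says "R"


def station_distance_km(
    x1: float, y1: float, x2: float, y2: float,
    radius_km: float = EARTH_RADIUS_KM,
) -> float:
    """Great-circle distance between two points given as
    (longitude x, latitude y) in degrees, using the spherical law of cosines."""
    lat1, lat2 = math.radians(y1), math.radians(y2)
    dlon = math.radians(x1 - x2)
    cos_angle = (
        math.sin(lat1) * math.sin(lat2)
        + math.cos(lat1) * math.cos(lat2) * math.cos(dlon)
    )
    # Rounding can push the value slightly outside [-1, 1], where acos fails.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius_km * math.acos(cos_angle)


# One degree of longitude along the equator.
one_degree = station_distance_km(0.0, 0.0, 1.0, 0.0)
```

For two base stations that are very close together, the arccosine form loses precision; that is a property of the formula itself, and the sketch keeps the formula as stated rather than substituting another one.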
(6) Job-housing balance population statistics: the number of IMSIs whose commuting distance is within the reasonable range is taken as the numerator and the total resident population of Beijing for the month as the denominator, which gives the percentage of the population of Beijing in job-housing balance.
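Steps (5) and (6) together reduce to a monthly average per IMSI, a threshold test, and a ratio. The sketch below assumes the per-day residence-to-workplace distances have already been computed; the function names and the input dictionary are hypothetical, while the 17.4 km threshold is the Beijing value given in this embodiment.

```python
from statistics import mean
from typing import Dict, Sequence

BEIJING_AVG_COMMUTE_KM = 17.4  # reasonable commuting distance used in the embodiment


def is_balanced(
    distances_km: Sequence[float],
    threshold_km: float = BEIJING_AVG_COMMUTE_KM,
) -> bool:
    """True if the monthly mean commuting distance is below the threshold."""
    return len(distances_km) > 0 and mean(distances_km) < threshold_km


def balanced_share_percent(
    distances_by_imsi: Dict[str, Sequence[float]],
    resident_population: int,
    threshold_km: float = BEIJING_AVG_COMMUTE_KM,
) -> float:
    """Percentage of the resident population whose commute is within range."""
    if resident_population <= 0:
        raise ValueError("resident_population must be positive")
    balanced = sum(
        1 for d in distances_by_imsi.values() if is_balanced(d, threshold_km)
    )
    return 100.0 * balanced / resident_population


# Toy input: three IMSIs with daily distances, in a population of four.
sample = {"a": [5.0, 6.0], "b": [20.0, 30.0], "c": [17.4, 17.4]}
share = balanced_share_percent(sample, resident_population=4)
```

The denominator is the resident population rather than the number of observed IMSIs, as the text specifies, so an IMSI with no usable distances simply does not count toward the numerator.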

Claims (2)

1. An urban job-housing balance population statistics method based on mobile communication big data and entropy, characterized by:
(1) data acquisition: using telecom operator signaling data, acquiring and storing the position of the base station at which each IMSI in a designated city is located and the times at which it enters and leaves the base station, to obtain movement trajectory data for each IMSI;
(2) data preprocessing: interpolating missing base station entry and exit signaling; if, within a statistical time period, an IMSI has an entry time for a base station but no exit time, or an exit time but no entry time, the missing value is interpolated, the interpolation time points being the start time and the end time of the statistical time period;
(3) workplace data screening: for one IMSI at a time, reading all base stations visited between 9:00 a.m. and 4:00 p.m.; according to the trajectory data obtained in the data acquisition step, listing the base stations visited by that IMSI in all working time periods in a statistical table, counting the stay time at each base station, and then calculating the entropy value of the IMSI at each base station;
(4) residence data screening: for one IMSI at a time, reading all base stations visited between 8:00 p.m. and 5:00 a.m.; according to the trajectory data obtained in the data acquisition step, listing the base stations visited by that IMSI in all residence time periods in a statistical table, counting the stay time at each base station, and then calculating the entropy value of the IMSI at each base station;
(5) distance calculation: selecting the base station with the largest entropy value for the IMSI between 9:00 a.m. and 4:00 p.m. as the workplace; selecting the base station with the largest entropy value for the IMSI between 8:00 p.m. and 5:00 a.m. as the residence, and calculating the distance between the residence and the workplace from the longitude and latitude of the corresponding base stations; repeating this calculation throughout one month and averaging the distances; if the average distance is smaller than the average commuting distance of the city, the commuting distance of the IMSI is considered to be within a reasonable range, that is, the IMSI belongs to the job-housing balance population;
(6) job-housing balance population statistics: taking the number of IMSIs whose commuting distance is within the reasonable range as the numerator and the total resident population of the city for the month as the denominator, which gives the percentage of the city's population in job-housing balance.
2. The urban job-housing balance population statistics method based on mobile communication big data and entropy of claim 1, wherein the distance Dis between two base stations is calculated according to the following formula:
Dis=R*acos(sinpi(y1/180)*sinpi(y2/180)+cospi(y1/180)*cospi(y2/180)*cospi((x1-x2)/180)),
in the formula, R represents the radius of the earth; x1 and x2 respectively represent the longitude of the positions of the two base stations, and y1 and y2 respectively represent the latitude of the positions of the two base stations.
CN202010002299.XA 2020-01-02 2020-01-02 Urban job-housing balance population statistics method based on mobile communication big data and entropy value Withdrawn CN112541012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010002299.XA CN112541012A (en) 2020-01-02 2020-01-02 Urban job-housing balance population statistics method based on mobile communication big data and entropy value

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010002299.XA CN112541012A (en) 2020-01-02 2020-01-02 Urban job-housing balance population statistics method based on mobile communication big data and entropy value

Publications (1)

Publication Number Publication Date
CN112541012A true CN112541012A (en) 2021-03-23

Family

ID=75013318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010002299.XA Withdrawn CN112541012A (en) 2020-01-02 2020-01-02 Urban job-housing balance population statistics method based on mobile communication big data and entropy value

Country Status (1)

Country Link
CN (1) CN112541012A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115034524A (en) * 2022-08-11 2022-09-09 北京融信数联科技有限公司 Method, system and storage medium for predicting working population based on mobile phone signaling
CN116128128A (en) * 2023-01-17 2023-05-16 北京融信数联科技有限公司 Urban job-living balance prediction method, system and medium based on intelligent agent map

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017133627A1 (en) * 2016-02-03 2017-08-10 中兴通讯股份有限公司 User commuter track management method, device and system
CN110472775A (en) * 2019-07-26 2019-11-19 广州大学 A kind of series case suspect's foothold prediction technique
CN110473132A (en) * 2019-08-27 2019-11-19 上海云砥信息科技有限公司 Balance evaluation method is lived in a kind of region duty based on mobile data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210323