CN111417075A - User workplace identification method based on mobile communication big data - Google Patents

User workplace identification method based on mobile communication big data

Info

Publication number
CN111417075A
Authority
CN
China
Prior art keywords
time
base station
working
data
imsi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811555580.5A
Other languages
Chinese (zh)
Other versions
CN111417075B (en)
Inventor
杨占军
朱明珠
贺炎俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beiling Rongxin Datalnfo Science and Technology Ltd
Original Assignee
Beiling Rongxin Datalnfo Science and Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beiling Rongxin Datalnfo Science and Technology Ltd
Priority to CN201811555580.5A
Publication of CN111417075A
Application granted
Publication of CN111417075B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/08 Testing, supervising or monitoring using real traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/20 Services signaling; Auxiliary data signalling, i.e. transmitting data via a non-traffic channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a user workplace identification method based on mobile communication big data. Signaling data of a telecom operator are used to collect and store the position of the base station where each IMSI is located and the times at which it enters and leaves that base station. One IMSI is read at a time, the longitude and latitude of each base station are looked up, and the distance between every two base stations is calculated. The raw signaling data are converted, in time order, into distances along the movement trajectory; the dwell time of the IMSI at each base station is determined from the distance information; and the workplace is predicted from the dwell times on working days and on rest days.

Description

User workplace identification method based on mobile communication big data
Technical Field
The invention relates to the technical field of mobile communication, in particular to a method for identifying a work place by utilizing mobile communication big data.
Background
Urban population regulation is in growing demand, and as mobile communication networks expand and the technology matures, large-scale communication data, trajectory data in particular, can now be stored. This makes it possible to estimate the size and movement of an urban population through analysis of mobile communication big data. Statistical departments need to use big data to monitor population data dynamically, improve existing population monitoring systems, explore the relationship between industrial regulation, functional layout and population development, regularly track where relocated populations move, and give timely warning of regional population trends. Compared with traditional research methods, population statistics and monitoring based on big data are more reliable and accurate.
The jobs-housing balance directly affects the healthy operation of the transport system and social harmony. Jobs-housing balance is a term from urban planning. In professional and academic usage it means that, within a given area, the number of workers among the residents roughly equals the number of jobs, so most residents can work close to home; commuting can be done on foot, by bicycle or by other non-motorized means; and even where motor vehicles are used, travel distance and time stay within a reasonable range, which helps reduce the use of motor vehicles, cars in particular, and thereby reduces traffic congestion and air pollution.
Statistics on where residents work are an important prerequisite for jobs-housing balance analysis, and an effective workplace prediction method reduces statistical error and improves accuracy.
Disclosure of Invention
The invention aims to provide a user work place identification method based on mobile communication big data, which is used for mining implicit work place information through track information of mobile users and providing powerful data support for city planning management.
In order to achieve the purpose, the invention adopts the following technical scheme:
A user workplace identification method based on mobile communication big data, characterized by the following steps:
(1) Data acquisition: using signaling data of a telecom operator, acquire and store the position of the base station where each IMSI is located and the times at which it enters and leaves that base station, to obtain the movement trajectory data of each IMSI;
(2) Data preprocessing: interpolate missing base-station entry and exit signaling; if, within a statistical time period, a user has an entry time for a base station but no exit time, or an exit time but no entry time, interpolate the missing data, the interpolated time points being the start time and the end time of the statistical time period;
(3) Data screening: read one IMSI at a time together with all base stations it visited between 9:00 and 18:00; according to the trajectory data obtained in the data acquisition step, list the base stations visited in all working time periods for that IMSI in a statistical table, and count the dwell time at each base station;
(4) Distance calculation: select the three base stations with the longest IMSI dwell time between 9:00 and 18:00 as seed workplaces, and, from the longitude and latitude of each base station, calculate the distance from the base station of each seed workplace to every other base station in the statistical table; if the distance between two base stations is less than 1000 meters, merge them into one seed workplace and add their dwell times together;
(5) Candidate workplace determination: calculate the distance between each seed workplace and the other seed workplaces; if the distance between two seed workplaces is less than 1000 meters, merge them into one and add their dwell times together; mark each seed workplace with a dwell time of more than 5 hours as a candidate workplace;
(6) Workplace prediction: determine the candidate workplaces for every day of a month by the method of step (5), count for each candidate workplace the number of working-day visits and the number of rest-day visits in the month, and calculate the ratio α of the two:
α = (number of working-day visits to the candidate workplace + K) / (number of rest-day visits to the candidate workplace + K),
where K is a constant greater than 0 that prevents the denominator from being 0;
the candidate workplace with the largest α over the month's data is taken as the workplace of the IMSI.
The invention uses a mobile user's movement trajectory data to count the user's dwell time at each visited base station and predicts the user's workplace from the distances between the visited base stations. The data source is reliable, the decision method is simple and the prediction accuracy is high, which provides sound support for population statistics and monitoring based on communication big data.
Detailed Description
The specific implementation mode of the invention is as follows:
(1) Data acquisition: using signaling data of a telecom operator, acquire and store the position of the base station where each IMSI is located and the times at which it enters and leaves that base station. The data used by the invention come from the signaling data of a mobile operator and include: the subscriber identification number IMSI (International Mobile Subscriber Identity); the location area code lac, which identifies different location areas; the base station number ci, which is combined with the location area code (lac) to identify a cell covered by the network; and the time at which the IMSI enters the base station and the time at which it leaves.
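As an illustration of the fields listed above, one signaling record could be modeled as follows. This is only a sketch: the class name `SignalingRecord`, the field types and the helper properties are assumptions made for the example and are not defined by the patent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SignalingRecord:
    """One stay of an IMSI at a cell (hypothetical layout, for illustration)."""
    imsi: str             # subscriber identity
    lac: int              # location area code
    ci: int               # base station (cell) number
    enter_time: datetime  # time the IMSI enters the cell
    leave_time: datetime  # time the IMSI leaves the cell

    @property
    def cell_key(self) -> str:
        # lac and ci together identify a cell in the network
        return f"{self.lac}-{self.ci}"

    @property
    def dwell_seconds(self) -> float:
        # Length of the stay in seconds
        return (self.leave_time - self.enter_time).total_seconds()
```

Keying every later step on the combined lac and ci value keeps cells from different location areas apart even when their ci numbers coincide.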
(2) Data preprocessing: interpolate missing base-station entry and exit signaling. To ensure data integrity, if within a statistical time period a user has an entry time for a sector but no exit time, or an exit time but no entry time, the missing data must be interpolated; the interpolated time points are the start time and the end time of the statistical time period.
For example, user A enters sector X at 23:00:00 on May 1 and leaves it at 7:00:00 on May 2, then enters sector Y at 23:00:00 on May 2 and leaves it at 7:00:00 on May 3. When user A's data for May 2 are collected, the time of entering sector X and the time of leaving sector Y are missing. The time of entering sector X is therefore interpolated as 00:00:00 on May 2, and the time of leaving sector Y as 23:59:59 on May 2.
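The interpolation rule can be sketched in a few lines. The function name `fill_missing_times`, the use of `None` for a missing timestamp and the year 2018 in the usage lines are assumptions made for this example; the source gives only the month and day.

```python
from datetime import datetime
from typing import Optional, Tuple


def fill_missing_times(
    enter: Optional[datetime],
    leave: Optional[datetime],
    period_start: datetime,
    period_end: datetime,
) -> Tuple[datetime, datetime]:
    """Fill a missing entry time with the period start and a missing
    exit time with the period end; known times are returned unchanged."""
    filled_enter = enter if enter is not None else period_start
    filled_leave = leave if leave is not None else period_end
    return filled_enter, filled_leave


# Statistical period: May 2 (the year is an assumption for the example)
start = datetime(2018, 5, 2, 0, 0, 0)
end = datetime(2018, 5, 2, 23, 59, 59)

# Sector X: only the exit at 07:00:00 was seen on May 2
sector_x = fill_missing_times(None, datetime(2018, 5, 2, 7, 0, 0), start, end)
# Sector Y: only the entry at 23:00:00 was seen on May 2
sector_y = fill_missing_times(datetime(2018, 5, 2, 23, 0, 0), None, start, end)
```

With these inputs, sector X gets an entry time of 00:00:00 and sector Y an exit time of 23:59:59 on May 2, matching the worked example in the text.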
(3) Data screening: read one IMSI at a time together with all lac-ci cells it accessed between 9:00 and 18:00; this window is chosen to approximate working hours. According to the trajectory data obtained in the data acquisition step, list the lac-ci cells accessed in all working time periods for that IMSI in a statistical table, and count the dwell time at each lac-ci.
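A minimal sketch of the screening step for one IMSI and one day is given below. The function name `working_hours_dwell` and the tuple layout of a visit are hypothetical. Clipping each stay to the 9:00 to 18:00 window is an assumption of this sketch; the text only says that cells visited in that window are listed and their dwell times counted.

```python
from collections import defaultdict
from datetime import date, datetime, time


def working_hours_dwell(visits, day, start=time(9, 0), end=time(18, 0)):
    """Sum the dwell time in seconds per cell inside the working window.

    visits: iterable of (cell_key, enter_time, leave_time) for one IMSI.
    day:    the calendar day being processed.
    """
    window_start = datetime.combine(day, start)
    window_end = datetime.combine(day, end)
    dwell = defaultdict(float)
    for cell, enter, leave in visits:
        # Keep only the part of the stay that overlaps the window
        overlap_start = max(enter, window_start)
        overlap_end = min(leave, window_end)
        if overlap_end > overlap_start:
            dwell[cell] += (overlap_end - overlap_start).total_seconds()
    return dict(dwell)
```

The returned dictionary plays the role of the statistical table: one row per visited cell with its accumulated dwell time.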
(4) Distance calculation: find the longitude and latitude of each base station and calculate the distance between every two base stations. Select the three lac-ci cells with the longest IMSI dwell time between 9:00 and 18:00 as seed workplaces and calculate the distance from each seed workplace to every other lac-ci in the statistical table; if the distance Dis is less than 1000 m, the two are grouped into one class, that is, their dwell times are added together.
The distance is calculated as follows:
Dis=R*acos(sinpi(y1/180)*sinpi(y2/180)+cospi(y1/180)*cospi(y2/180)*cospi((x1-x2)/180));
where R is the radius of the earth; x1 and x2 are the longitudes of the two lac-ci cells and y1 and y2 are their latitudes, all in degrees; sinpi(t) and cospi(t) denote sin(πt) and cos(πt).
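The formula above is the spherical law of cosines and can be written directly in Python. The function name `cell_distance_m`, the mean earth radius of 6371000 m and the clamping of the cosine to [-1, 1] are assumptions added for this sketch; the source does not give a value for R.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean earth radius in meters


def cell_distance_m(x1, y1, x2, y2, radius=EARTH_RADIUS_M):
    """Distance between two cells, with x = longitude and y = latitude in degrees.

    Implements Dis = R * acos(sin(y1) sin(y2) + cos(y1) cos(y2) cos(x1 - x2))
    with all angles converted from degrees to radians.
    """
    lat1, lat2 = math.radians(y1), math.radians(y2)
    dlon = math.radians(x1 - x2)
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(dlon))
    # Rounding can push the value slightly outside acos's domain
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)
```

For cells only a few hundred meters apart the arccosine form loses some precision, but the error stays far below the 1000 m merge threshold used in the method.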
(5) Candidate workplace determination: next, calculate the distances between the seed workplaces; if two seed workplaces are less than 1000 meters apart, group them into one class. Seed workplaces with a dwell time of more than 5 hours are marked as candidate workplaces.
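Steps (4) and (5) together can be sketched as below for one IMSI and one day. The function names, the dictionary inputs and the earth radius are assumptions of this example. The source also does not say what to do when a cell lies within 1000 m of more than one seed; this sketch assigns it to the seed with the longest dwell time, which is an assumption.

```python
import math

MERGE_RADIUS_M = 1000.0     # merge threshold from the text
MIN_DWELL_S = 5 * 3600      # candidate threshold from the text (5 hours)
EARTH_RADIUS_M = 6371000.0  # assumed mean earth radius in meters


def distance_m(a, b):
    """Spherical law of cosines between two (longitude, latitude) points in degrees."""
    (x1, y1), (x2, y2) = a, b
    p1, p2 = math.radians(y1), math.radians(y2)
    c = (math.sin(p1) * math.sin(p2)
         + math.cos(p1) * math.cos(p2) * math.cos(math.radians(x1 - x2)))
    return EARTH_RADIUS_M * math.acos(max(-1.0, min(1.0, c)))


def candidate_workplaces(dwell, coords, n_seeds=3):
    """dwell: {cell: seconds}, coords: {cell: (lon, lat)}.
    Returns {seed cell: merged dwell seconds} for seeds above 5 hours."""
    ranked = sorted(dwell, key=dwell.get, reverse=True)
    seeds = ranked[:n_seeds]
    merged = {s: dwell[s] for s in seeds}
    # Step (4): absorb each remaining cell into the first seed within 1000 m
    for cell in ranked[n_seeds:]:
        for s in seeds:
            if distance_m(coords[cell], coords[s]) < MERGE_RADIUS_M:
                merged[s] += dwell[cell]
                break
    # Step (5): merge seeds that lie within 1000 m of an earlier seed
    kept = []
    for s in seeds:
        for k in kept:
            if distance_m(coords[s], coords[k]) < MERGE_RADIUS_M:
                merged[k] += merged.pop(s)
                break
        else:
            kept.append(s)
    return {s: t for s, t in merged.items() if t > MIN_DWELL_S}
```

Because seeds are processed in descending order of dwell time, a merged group is always labeled by its longest-dwell cell.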
(6) Workplace prediction: using the statistical tables, determine the candidate workplaces for each day of the month, separately for working days (Monday through Friday) and rest days (Saturday); count the number of working-day visits and the number of rest-day visits to each candidate workplace in the month, and calculate the ratio α of the two:
α = (number of working-day visits to the candidate workplace + K) / (number of rest-day visits to the candidate workplace + K)
K is a very small positive value that prevents the denominator from being 0.
The candidate workplace with the largest α over the month's data is predicted to be the workplace of the IMSI.
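The ratio and the final selection can be sketched as follows. The function names, the input layout and the default K = 1.0 are assumptions chosen for the example; the source only requires K to be greater than 0.

```python
def alpha(working_day_visits, rest_day_visits, k=1.0):
    """Ratio of working-day visits to rest-day visits, smoothed by K > 0."""
    if k <= 0:
        raise ValueError("K must be greater than 0")
    return (working_day_visits + k) / (rest_day_visits + k)


def predict_workplace(monthly_counts, k=1.0):
    """monthly_counts: {candidate: (working_day_visits, rest_day_visits)}.
    Returns the candidate with the largest alpha, or None if there is none."""
    if not monthly_counts:
        return None
    return max(monthly_counts, key=lambda c: alpha(*monthly_counts[c], k=k))


# Hypothetical month: "office" is visited mostly on working days,
# "home" on working days and rest days alike.
counts = {"office": (20, 1), "home": (22, 8)}
workplace = predict_workplace(counts)
```

A location that is a candidate on working days but rarely on rest days gets a large α, which is what separates a workplace from a home that happens to be occupied during the day.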

Claims (2)

1. A user workplace identification method based on mobile communication big data, characterized by the following steps:
(1) Data acquisition: using signaling data of a telecom operator, acquire and store the position of the base station where each IMSI is located and the times at which it enters and leaves that base station, to obtain the movement trajectory data of each IMSI;
(2) Data preprocessing: interpolate missing base-station entry and exit signaling; if, within a statistical time period, a user has an entry time for a base station but no exit time, or an exit time but no entry time, interpolate the missing data, the interpolated time points being the start time and the end time of the statistical time period;
(3) Data screening: read one IMSI at a time together with all base stations it visited between 9:00 and 18:00; according to the trajectory data obtained in the data acquisition step, list the base stations visited in all working time periods for that IMSI in a statistical table, and count the dwell time at each base station;
(4) Distance calculation: select the three base stations with the longest IMSI dwell time between 9:00 and 18:00 as seed workplaces, and, from the longitude and latitude of each base station, calculate the distance from the base station of each seed workplace to every other base station in the statistical table; if the distance between two base stations is less than 1000 meters, merge them into one seed workplace and add their dwell times together;
(5) Candidate workplace determination: calculate the distance between each seed workplace and the other seed workplaces; if the distance between two seed workplaces is less than 1000 meters, merge them into one and add their dwell times together; mark each seed workplace with a dwell time of more than 5 hours as a candidate workplace;
(6) Workplace prediction: determine the candidate workplaces for every day of a month by the method of step (5), count for each candidate workplace the number of working-day visits and the number of rest-day visits in the month, and calculate the ratio α of the two:
α = (number of working-day visits to the candidate workplace + K) / (number of rest-day visits to the candidate workplace + K),
where K is a constant greater than 0 that prevents the denominator from being 0;
the candidate workplace with the largest α over the month's data is taken as the workplace of the IMSI.
2. The user workplace identification method based on mobile communication big data according to claim 1, wherein the distance Dis between two base stations is calculated according to the following formula:
Dis=R*acos(sinpi(y1/180)*sinpi(y2/180)+cospi(y1/180)*cospi(y2/180)*cospi((x1-x2)/180)),
where R is the radius of the earth; x1 and x2 are the longitudes of the two base stations, and y1 and y2 are their latitudes.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811555580.5A CN111417075B (en) 2018-12-18 2018-12-18 User workplace identification method based on mobile communication big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811555580.5A CN111417075B (en) 2018-12-18 2018-12-18 User workplace identification method based on mobile communication big data

Publications (2)

Publication Number Publication Date
CN111417075A (en) 2020-07-14
CN111417075B (en) 2023-06-06

Family

ID=71493922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811555580.5A Active CN111417075B (en) 2018-12-18 2018-12-18 User workplace identification method based on mobile communication big data

Country Status (1)

Country Link
CN (1) CN111417075B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102075850A (en) * 2009-11-19 2011-05-25 中国移动通信集团吉林有限公司 Method and device for determining occupational information of mobile subscriber
CN105142106A (en) * 2015-07-29 2015-12-09 西南交通大学 Traveler home-work location identification and trip chain depicting method based on mobile phone signaling data
CN105682025A (en) * 2016-01-05 2016-06-15 重庆邮电大学 User residing location identification method based on mobile signaling data
CN105989226A (en) * 2015-02-12 2016-10-05 中兴通讯股份有限公司 Method and apparatus for analyzing track of user
CN106570184A (en) * 2016-11-11 2017-04-19 同济大学 Method of extracting recreation-dwelling connection data set from mobile-phone signaling data
CN106792514A (en) * 2016-11-30 2017-05-31 南京华苏科技有限公司 User's duty residence analysis method based on signaling data
CN107770721A (en) * 2017-10-10 2018-03-06 东南大学 A kind of tourist communications passenger flow big data method for visualizing
CN108055645A (en) * 2018-01-19 2018-05-18 深圳技术大学(筹) A kind of path identification method and system based on mobile phone signaling data
CN108650632A (en) * 2018-04-28 2018-10-12 广州市交通规划研究院 It is a kind of based on duty live correspondence and when space kernel clustering stationary point judgment method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
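唐小勇 et al.: "A commuting OD training method based on mobile phone signaling", 《交通运输系统工程与信息》 *
陈欢: "Tracking survey of personal travel characteristics based on mobile phone signaling data", 《交通与运输(学术版)》 *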

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114037239A (en) * 2021-10-29 2022-02-11 南京大学 Potential model employment reachability analysis method based on multi-source big data
CN115086878A (en) * 2022-08-02 2022-09-20 北京融信数联科技有限公司 User action track obtaining method, system and storage medium based on mobile phone signaling
CN115086878B (en) * 2022-08-02 2023-04-28 北京融信数联科技有限公司 Method, system and storage medium for obtaining user action track based on mobile phone signaling
CN117336683A (en) * 2023-12-01 2024-01-02 北京航空航天大学 Method and system for identifying typical stay of large-scale personnel based on signaling data
CN117336683B (en) * 2023-12-01 2024-02-13 北京航空航天大学 Method and system for identifying typical stay of large-scale personnel based on signaling data

Also Published As

Publication number Publication date
CN111417075B (en) 2023-06-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant