CN109495856B - Mobile phone user type marking method based on big data - Google Patents

Publication number
CN109495856B
Authority
CN
China
Prior art keywords
base station
user
data
antenna
service signaling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811550202.8A
Other languages
Chinese (zh)
Other versions
CN109495856A (en)
Inventor
周俊蓉
蓝良姬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Fangwei Technology Co ltd
Original Assignee
Chengdu Fangwei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Fangwei Technology Co ltd
Priority to CN201811550202.8A
Publication of CN109495856A
Application granted
Publication of CN109495856B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/20 Services signaling; Auxiliary data signalling, i.e. transmitting data via a non-traffic channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/20 Services signaling; Auxiliary data signalling, i.e. transmitting data via a non-traffic channel
    • H04W4/203 Services signaling; Auxiliary data signalling, i.e. transmitting data via a non-traffic channel for converged personal network application service interworking, e.g. OMA converged personal network services [CPNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W64/00 Locating users or terminals or network equipment for network management purposes, e.g. mobility management

Abstract

The invention relates to the technical field of data analysis, and in particular to a big-data-based method for marking mobile phone user types. Each day, the method acquires the previous day's mobile service signaling data from the communications operator, combines it with the geographic entity boundaries provided by a map service provider, and determines the specific geographic entity in which the mobile phone user resides during each stay, forming the user's positioning track chain for that day. Combined with the historical positioning track chains, the method infers the user's workplace, place of residence, frequently visited entertainment venues and similar information, and forms user feature labels. This makes fuller use of the value of mobile service signaling data and is highly practical.

Description

Mobile phone user type marking method based on big data
Technical Field
The invention relates to the technical field of data analysis, in particular to a mobile phone user type marking method based on big data.
Background
In modern life the mobile phone brings great convenience: ordering takeout or hailing a taxi can be done from a single phone. The mobile phone is arguably one of the greatest inventions of modern times, because it has rapidly increased people's communication with each other and with the world and lowered the barriers to communication between people. People keep their phones within reach all day, from getting up in the morning to going to sleep at night, and when going out the phone is as important as the keys, so the phone is with its owner practically 24 hours a day. When a mobile phone is connected to the communication network it connects to base stations in different places, and a phone is connected to only one base station at any moment. However, most existing methods that analyze a mobile phone user's location from mobile service signaling data locate the user inaccurately and unintelligently, cannot infer the user's personal circumstances from the service signaling data, and do not make full use of the existing mobile service signaling.
Disclosure of Invention
The invention provides a big-data-based method for marking mobile phone user types, which solves the prior-art problem that the personal circumstances of a mobile phone user cannot be analyzed.
The technical scheme adopted by the invention is as follows:
A mobile phone user type marking method based on big data comprises the following steps:
S1, acquiring base station engineering parameters and mobile service signaling data provided by a communications operator, and a set of actual position coordinate points of spatial blocks provided by a map service provider;
S2, forming a geographic entity feature fingerprint from the base station engineering parameters and the set of actual position coordinate points of the spatial blocks;
S3, aggregating the service signaling data according to their temporal and spatial relationships, and determining the service signaling track data features of the user. A service signaling record refers to only one base station at a given moment. However, while a user stays at one location, base station handovers may occur for various reasons, so several consecutive service signaling records of the user may all correspond to the same location; the user's service signaling therefore needs to be aggregated according to temporal and spatial relationships;
S4, positioning the mobile phone user in each time interval according to the aggregated service signaling track data features, and determining the specific geographic entity in which the user is located in each time interval;
S5, generating the user's daily positioning track chain in time order from the user's position at each moment, and marking feature labels for the user by combining the user's historical positioning track chains, wherein the label content comprises occupation, residence and workplace.
Preferably, in step S2, forming the geographic entity feature fingerprint comprises:
S201, calculating the coverage surface of each base station from the base station engineering parameters;
S202, calculating, by a GIS spatial computation engine, the intersection area S between the coverage range of the geographic entity and the coverage surface of the base station; the coverage range of the geographic entity is the closed area formed by connecting adjacent actual position coordinate points of the geographic entity, as provided by the map service provider, pairwise;
S203, calculating the coverage area Sb of the base station from the base station engineering parameters;
S204, calculating the spatial relationship coefficient α between the geographic entity and the base station from the base station coverage area Sb and the intersection area S: α = S ÷ Sb;
S205, outputting the relationship between a geographic entity and the base stations covering it:
{B,{Lc1,α},{Lc2,α},{Lc3,α}…{Lcn,α}} (1)
wherein B is a geographic entity, Lc is a base station number, and α is the spatial relationship coefficient between B and that base station.
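As an illustration of steps S202 to S204, the following is a minimal sketch of the coefficient α = S ÷ Sb. The patent only states that a GIS spatial computation engine is used; this sketch swaps in a plain Sutherland-Hodgman polygon clip instead. It assumes that coordinates are already projected to a planar system in meters and that the base station coverage polygon is convex. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # planar (x, y) in meters; an assumption of this sketch


def signed_area(poly: List[Point]) -> float:
    """Shoelace formula; positive when the vertices run counter-clockwise."""
    total = 0.0
    for i, (x1, y1) in enumerate(poly):
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clip_to_convex(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Sutherland-Hodgman: clip `subject` against a convex `clip` polygon."""
    if signed_area(clip) < 0:  # normalize the clip polygon to counter-clockwise
        clip = clip[::-1]

    def inside(p: Point, a: Point, b: Point) -> bool:
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def crossing(p: Point, q: Point, a: Point, b: Point) -> Point:
        # Intersection of segment p-q with the infinite line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        if not current:
            break
        prev = current[-1]
        for cur in current:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(crossing(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(crossing(prev, cur, a, b))
            prev = cur
    return output


def spatial_coefficient(entity: List[Point], station: List[Point]) -> float:
    """alpha = S / Sb: intersection area over base station coverage area."""
    s = abs(signed_area(clip_to_convex(entity, station)))
    sb = abs(signed_area(station))
    return s / sb


# A 2 x 2 building footprint overlapping one quarter of a 2 x 2 coverage square.
entity = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
station = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]
print(spatial_coefficient(entity, station))  # 0.25
```

The convexity assumption holds for the eight-point omnidirectional polygon and for directional sectors with a lobe angle of at most 180 degrees. A production system would more likely delegate this computation to a GIS library.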
Preferably, in step S3, determining the service signaling track data features of the user comprises the following steps:
S301, sorting the user's service signaling records by time of occurrence, and merging records when consecutive service signaling records switch back and forth between base stations;
for example, for a sequence base station A -> … -> base station A, if the interval between the two occurrences of base station A does not exceed 2 hours, and every other base station appearing between the two occurrences is no more than 1 km from base station A, the records are merged;
S302, merging service signaling records whose time interval is within 1 minute;
because the service signaling is collected from multiple data sources whose clocks may differ slightly, service signaling records within 1 minute of each other are merged;
S303, executing steps S301 and S302 iteratively until no further merging is possible;
S304, dividing the merged records into several time intervals by their start and end times, each time interval containing several records; correcting erroneous data by finding the base station with the longest occurrence time in each time interval and removing the records in that interval whose base station is more than 1 km from it;
S305, learning from historical data: storing the records processed in step S304 in a database, matching them for similarity against the historical records, and merging similar historical records into the time interval;
S306, calculating the occurrence frequency W of each base station appearing in the same time period over the last month;
S307, outputting the merged record:
{U,Ts,Te,{Lc1,W1},{Lc2,W2},{Lc3,W3}…{Lcn,Wn}} (2)
wherein U is the user identifier, Ts is the start time of the time interval, Te is the end time of the time interval, Lcn is a base station cell identifier, and Wn is the occurrence frequency of that base station cell over the last month.
Preferably, in step S305, a historical record is also merged into the time interval if its time interval has a similarity greater than 80% with the current time interval, both fall on working days or both on non-working days, and the base stations in the historical record are less than 1 km, by longitude and latitude, from all base stations in the current time interval. The time-interval similarity is computed from the square of the number of minutes common to the two time intervals, divided by a term based on the minute lengths of the two intervals.
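The back-and-forth merge of step S301 can be sketched as follows. This is only an illustration of that single step (the 1-minute merge, error correction and history matching are not covered). It assumes that timestamps are integer minutes, that base station positions are planar coordinates in meters, and that the intermediate base stations are the ones lying between the two occurrences of base station A. All names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Record = Tuple[int, str]     # (timestamp in minutes, base station id)
Stay = Tuple[str, int, int]  # (base station id, start minute, end minute)

MAX_GAP_MIN = 120    # "does not exceed 2 hours" in S301
MAX_DIST_M = 1000.0  # "does not exceed 1 km" in S301


def merge_ping_pong(records: List[Record],
                    positions: Dict[str, Tuple[float, float]]) -> List[Stay]:
    """Collapse A -> ... -> A runs into a single stay at A."""
    ordered = sorted(records)  # S301: sort by time of occurrence
    stays: List[Stay] = []
    i = 0
    while i < len(ordered):
        start, anchor = ordered[i]
        end, last = start, i
        k = i + 1
        # Scan forward while we are still within 2 hours of the last sighting of A.
        while k < len(ordered) and ordered[k][0] - end <= MAX_GAP_MIN:
            t, station = ordered[k]
            if station == anchor:
                end, last = t, k  # A reappears: extend the stay up to here
            elif math.dist(positions[station], positions[anchor]) > MAX_DIST_M:
                break             # a far-away station ends the candidate run
            k += 1
        stays.append((anchor, start, end))
        i = last + 1              # resume after the last merged record
    return stays


positions = {"A": (0.0, 0.0), "B": (500.0, 0.0), "C": (5000.0, 0.0)}
records = [(0, "A"), (10, "B"), (20, "A"), (30, "C")]
print(merge_ping_pong(records, positions))  # [('A', 0, 20), ('C', 30, 30)]
```

In the example, base station B lies 500 m from A and is absorbed into the stay at A, while C lies 5 km away and starts a new stay.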
As a preferred aspect of the foregoing technical solution, in step S4, determining the specific geographic entity in which the user is located in each time interval comprises:
correlating formula (1) with formula (2) according to equation (3) to obtain the likelihood P that the user is located in a given geographic entity during the time period, equation (3) being:
P{u,b}=∑W*α (3)
forming, for each user and each time period, a data set of the likelihoods for each geographic entity:
{U,Ts,Te,{B1,P1},{B2,P2},{B3,P3}…{Bn,Pn}} (4)
wherein the geographic entity with the largest P is the user's resident position in that time period.
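Equation (3) and the selection of the largest P can be illustrated with a short sketch. The dictionary layouts below are assumptions chosen to mirror formulas (1) and (2); the function name is hypothetical.

```python
from typing import Dict, Tuple


def locate(fingerprints: Dict[str, Dict[str, float]],
           weights: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Return the most likely geographic entity and every P = sum(W * alpha).

    fingerprints: entity id -> {base station id: alpha}, as in formula (1)
    weights:      base station id -> W for one time interval, as in formula (2)
    """
    scores = {
        entity: sum(w * alphas.get(cell, 0.0) for cell, w in weights.items())
        for entity, alphas in fingerprints.items()
    }
    best = max(scores, key=scores.get)  # the entity with the largest P
    return best, scores


fingerprints = {
    "B1": {"Lc1": 0.5, "Lc2": 0.25},
    "B2": {"Lc2": 0.5},
}
weights = {"Lc1": 4, "Lc2": 2}
print(locate(fingerprints, weights))  # ('B1', {'B1': 2.5, 'B2': 1.0})
```

A base station that does not cover an entity simply contributes zero to that entity's score.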
Preferably, the base station engineering parameters comprise a region area code, a base station identification code, a network type, an antenna azimuth angle, a base station coverage type, and the longitude and latitude coordinates of the base station antenna position; the mobile service signaling data comprise time, user number and base station number.
Preferably, the base station coverage types comprise indoor and non-indoor; the antenna types comprise omnidirectional and directional antennas; the coverage radius R of an indoor base station is a fixed value, 400 meters by default; the coverage radius R of a non-indoor base station is the average distance from the longitude and latitude coordinates of the base station antenna to the three nearest non-indoor base stations, multiplied by a specific coefficient, which is 1.6;
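The coverage radius rule can be sketched as below, reading the non-indoor case as "1.6 times the mean distance to the three nearest non-indoor base stations". Positions are assumed to be planar coordinates in meters rather than raw longitude and latitude, and the function name is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

INDOOR_RADIUS_M = 400.0  # default fixed radius for indoor base stations
COEFFICIENT = 1.6        # the "specific coefficient" for non-indoor base stations


def coverage_radius(station: Point, other_outdoor: List[Point], indoor: bool) -> float:
    """Coverage radius R in meters for one base station."""
    if indoor:
        return INDOOR_RADIUS_M
    if not other_outdoor:
        raise ValueError("need at least one other non-indoor base station")
    # Mean distance to the (up to) three nearest non-indoor base stations.
    nearest = sorted(math.dist(station, o) for o in other_outdoor)[:3]
    return COEFFICIENT * sum(nearest) / len(nearest)


others = [(300.0, 0.0), (0.0, 400.0), (500.0, 0.0), (0.0, 2000.0)]
print(coverage_radius((0.0, 0.0), others, indoor=False))  # about 640 m
print(coverage_radius((0.0, 0.0), others, indoor=True))   # 400.0
```

Falling back to fewer than three neighbors when fewer exist is a choice of this sketch, not something the text specifies.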
Preferably, in the above technical solution, the coverage surface of an omnidirectional antenna base station is calculated as follows: taking the longitude and latitude of the antenna as the center point, extend outward by the coverage radius R of the base station at every 45 degrees to obtain eight coordinate points, and connect each pair of adjacent coordinate points with straight lines to form a closed base station coverage area, which is the coverage surface of the omnidirectional antenna base station.
Preferably, in the above technical solution, the coverage surface of a directional antenna base station is calculated as follows: taking the longitude and latitude of the antenna as the center point, extend outward by the coverage radius R of the base station at the angles A, A + H/6, A + H/3, A + H/2, A - H/6, A - H/3 and A - H/2 to obtain seven coordinate points; connect each pair of adjacent coordinate points with straight lines, and connect the two end points to the antenna's longitude and latitude point, to form a closed base station coverage area, which is the coverage surface of the directional antenna base station. Angle A is the antenna azimuth angle and angle H is the horizontal lobe angle. The horizontal lobe angle is 180 degrees if the base station has two or fewer directional antennas, and 120 degrees otherwise.
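The two coverage polygons can be generated as in the sketch below. It assumes planar coordinates in meters with x pointing east and y pointing north, and azimuths measured clockwise from north. It also reads the seven directional angles as symmetric about the azimuth, from A - H/2 to A + H/2, which is the only reading that yields seven distinct points. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def offset(center: Point, radius: float, bearing_deg: float) -> Point:
    """Point at `radius` from `center` along a bearing measured clockwise from north."""
    rad = math.radians(bearing_deg)
    return (center[0] + radius * math.sin(rad), center[1] + radius * math.cos(rad))


def omni_coverage(center: Point, radius: float) -> List[Point]:
    """Eight points, one every 45 degrees, forming a closed octagon."""
    return [offset(center, radius, 45.0 * i) for i in range(8)]


def horizontal_lobe_angle(directional_antennas: int) -> float:
    """H is 180 degrees for two or fewer directional antennas, otherwise 120."""
    return 180.0 if directional_antennas <= 2 else 120.0


def directional_coverage(center: Point, radius: float,
                         azimuth: float, lobe: float) -> List[Point]:
    """Antenna position plus seven arc points from A - H/2 to A + H/2."""
    fractions = (-1 / 2, -1 / 3, -1 / 6, 0.0, 1 / 6, 1 / 3, 1 / 2)
    arc = [offset(center, radius, azimuth + lobe * f) for f in fractions]
    return [center] + arc  # the two arc ends close back to the antenna position


octagon = omni_coverage((0.0, 0.0), 100.0)
sector = directional_coverage((0.0, 0.0), 100.0, azimuth=90.0,
                              lobe=horizontal_lobe_angle(3))
print(len(octagon), len(sector))  # 8 8
```

Either polygon can then be passed to an area or intersection routine to obtain Sb and S.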
Preferably, in step S5, marking feature labels for the user comprises:
S501, counting, from the user's historical positioning track chains, the user's residence frequency within the same month, residence start time period, residence end time period, average residence duration, number of working days with residence, number of non-working days with residence, and number of days of residence at locations of the same type;
S502, marking a behavior label for each residence behavior of the user according to the statistics of S501, the behavior labels comprising residence and work;
S503, marking feature labels for the user with an unsupervised cluster analysis method, based on the behavior labels combined with the geographic entity type. The geographic entity type is provided by the map service provider.
The beneficial effects of the invention are as follows:
Each day, the invention acquires the previous day's mobile service signaling data from the communications operator, combines it with the geographic entity boundaries provided by a map service provider, and determines the specific geographic entity in which the mobile phone user resides during each stay, forming the user's positioning track chain for that day. Combined with the historical positioning track chains, it infers the user's workplace, place of residence, frequently visited entertainment venues and similar information, and forms user feature labels. This makes fuller use of the value of mobile service signaling data and is highly practical.
Drawings
FIG. 1 is an example of the definition criteria for user feature labels according to embodiment 1 of the present invention;
FIG. 2 is an example of user feature label content according to embodiment 1 of the present invention.
Detailed Description
The present invention will be described in detail below.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present invention, it should be noted that the terms "upper", "vertical", "inside", "outside", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, or orientations or positional relationships that are conventionally arranged when the products of the present invention are used, or orientations or positional relationships that are conventionally understood by those skilled in the art, and are used for convenience of description and simplification of description, but do not indicate or imply that the devices or elements that are referred to must have specific orientations, be constructed in specific orientations, and be operated, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should also be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "mounted," and "connected" are to be construed broadly, e.g., as meaning fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1:
This embodiment provides a big-data-based method for marking mobile phone user types.
A mobile phone user type marking method based on big data comprises the following steps:
S1, acquiring base station engineering parameters and mobile service signaling data provided by a communications operator, and a set of actual position coordinate points of spatial blocks provided by a map service provider;
S2, forming a geographic entity feature fingerprint from the base station engineering parameters and the set of actual position coordinate points of the spatial blocks;
S3, aggregating the service signaling data according to their temporal and spatial relationships, and determining the service signaling track data features of the user. A service signaling record refers to only one base station at a given moment. However, while a user stays at one location, base station handovers may occur for various reasons, so several consecutive service signaling records of the user may all correspond to the same location; the user's service signaling therefore needs to be aggregated according to temporal and spatial relationships;
S4, positioning the mobile phone user in each time interval according to the aggregated service signaling track data features, and determining the specific geographic entity in which the user is located in each time interval;
S5, generating the user's daily positioning track chain in time order from the user's position at each moment, and marking feature labels for the user by combining the user's historical positioning track chains, wherein the feature labels comprise occupation, residence, workplace, business and entertainment venues, dual-city resident, and business traveler or tourist.
In step S2, forming the geographic entity feature fingerprint comprises:
S201, calculating the coverage surface of each base station from the base station engineering parameters;
S202, calculating, by a GIS spatial computation engine, the intersection area S between the coverage range of the geographic entity and the coverage surface of the base station; the coverage range of the geographic entity is the closed area formed by connecting adjacent actual position coordinate points of the geographic entity, as provided by the map service provider, pairwise;
S203, calculating the coverage area Sb of the base station from the base station engineering parameters;
S204, calculating the spatial relationship coefficient α between the geographic entity and the base station from the base station coverage area Sb and the intersection area S: α = S ÷ Sb;
S205, outputting the relationship between a geographic entity and the base stations covering it:
{B,{Lc1,α},{Lc2,α},{Lc3,α}…{Lcn,α}} (1)
wherein B is a geographic entity, Lc is a base station number, and α is the spatial relationship coefficient between B and that base station.
In step S3, determining the service signaling track data features of the user comprises the following steps:
S301, sorting the user's service signaling records by time of occurrence, and merging records when consecutive service signaling records switch back and forth between base stations;
for example, for a sequence base station A -> … -> base station A, if the interval between the two occurrences of base station A does not exceed 2 hours, and every other base station appearing between the two occurrences is no more than 1 km from base station A, the records are merged;
S302, merging service signaling records whose time interval is within 1 minute;
because the service signaling is collected from multiple data sources whose clocks may differ slightly, service signaling records within 1 minute of each other are merged;
S303, executing steps S301 and S302 iteratively until no further merging is possible;
S304, dividing the merged records into several time intervals by their start and end times, each time interval containing several records; correcting erroneous data by finding the base station with the longest occurrence time in each time interval and removing the records in that interval whose base station is more than 1 km from it;
S305, learning from historical data: storing the records processed in step S304 in a database, matching them for similarity against the historical records, and merging similar historical records into the time interval;
S306, calculating the occurrence frequency W of each base station appearing in the same time period over the last month;
S307, outputting the merged record:
{U,Ts,Te,{Lc1,W1},{Lc2,W2},{Lc3,W3}…{Lcn,Wn}} (2)
wherein U is the user identifier, Ts is the start time of the time interval, Te is the end time of the time interval, Lcn is a base station cell identifier, and Wn is the occurrence frequency of that base station cell over the last month.
In S305, a historical record is also merged into the time interval if its time interval has a similarity greater than 80% with the current time interval, both fall on working days or both on non-working days, and the base stations in the historical record are less than 1 km, by longitude and latitude, from all base stations in the current time interval. The time-interval similarity is computed from the square of the number of minutes common to the two time intervals, divided by a term based on the minute lengths of the two intervals.
In step S4, determining the specific geographic entity in which the user is located in each time interval comprises:
correlating formula (1) with formula (2) according to equation (3) to obtain the likelihood P that the user is located in a given geographic entity during the time period, equation (3) being:
P{u,b}=∑W*α (3)
forming, for each user and each time period, a data set of the likelihoods for each geographic entity:
{U,Ts,Te,{B1,P1},{B2,P2},{B3,P3}…{Bn,Pn}} (4)
wherein the geographic entity with the largest P is the user's resident position in that time period.
The base station engineering parameters comprise a region area code, a base station identification code, a network type, an antenna azimuth angle, a base station coverage type, and the longitude and latitude coordinates of the base station antenna position; the mobile service signaling data comprise time, user number and base station number.
The base station coverage types comprise indoor and non-indoor; the antenna types comprise omnidirectional and directional antennas; the coverage radius R of an indoor base station is a fixed value, 400 meters by default; the coverage radius R of a non-indoor base station is the average distance from the longitude and latitude coordinates of the base station antenna to the three nearest non-indoor base stations, multiplied by a specific coefficient, which is 1.6.
The coverage surface of an omnidirectional antenna base station is calculated as follows: taking the longitude and latitude of the antenna as the center point, extend outward by the coverage radius R of the base station at every 45 degrees to obtain eight coordinate points, and connect each pair of adjacent coordinate points with straight lines to form a closed base station coverage area, which is the coverage surface of the omnidirectional antenna base station.
The coverage surface of a directional antenna base station is calculated as follows: taking the longitude and latitude of the antenna as the center point, extend outward by the coverage radius R of the base station at the angles A, A + H/6, A + H/3, A + H/2, A - H/6, A - H/3 and A - H/2 to obtain seven coordinate points; connect each pair of adjacent coordinate points with straight lines, and connect the two end points to the antenna's longitude and latitude point, to form a closed base station coverage area, which is the coverage surface of the directional antenna base station. Angle A is the antenna azimuth angle and angle H is the horizontal lobe angle. The horizontal lobe angle is 180 degrees if the base station has two or fewer directional antennas, and 120 degrees otherwise.
In step S5, marking feature labels for the user comprises:
S501, counting, from the user's historical positioning track chains, the user's residence frequency within the same month, residence start time period, residence end time period, average residence duration, number of working days with residence, number of non-working days with residence, and number of days of residence at locations of the same type;
S502, marking a behavior label for each residence behavior of the user according to the statistics of S501, the behavior labels comprising residence, work, business and entertainment, and passing through;
S503, marking feature labels for the user with an unsupervised cluster analysis method, based on the behavior labels combined with the geographic entity type. The geographic entity type is provided by the map service provider.
As shown in FIG. 1, an example of the definition criteria for user feature labels is:
Residence: the user's average daily residence time exceeds 5 hours, and 80% of the residence end times fall between 7:00 and 9:00;
Workplace: the user's average daily residence time exceeds 3 hours, and 80% of the residence start times fall between 8:00 and 10:00 or between 13:00 and 14:00;
Business and entertainment venue: the user's average residence time exceeds 2 hours, 80% of the stays fall on non-working days, and the geographic entity type is leisure and entertainment, shopping, or a tourist site;
Dual-city resident: the user has two fixed places of residence, one on working days and one on non-working days;
Business traveler or tourist: the user stays at an airport or a train station, after which the track is lost for more than 1 day.
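The first three criteria above can be written as simple rules, as in the sketch below. This is a rule-based illustration of the FIG. 1 thresholds only; it does not implement the unsupervised clustering of step S503. The record layout, the entity type names, the inclusive hour bounds, the "at least 80%" reading and the fallback label "passing" are all assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stay:
    start_hour: float      # hour of day at which the stay began
    end_hour: float        # hour of day at which the stay ended
    duration_hours: float  # length of the stay
    workday: bool          # True if the stay fell on a working day


LEISURE_TYPES = {"leisure_entertainment", "shopping", "tourist_site"}  # hypothetical names


def share(flags: List[bool]) -> float:
    """Fraction of True values; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0


def behavior_label(stays: List[Stay], entity_type: str) -> str:
    """Label one user's stays at one geographic entity using the FIG. 1 thresholds."""
    if not stays:
        return "passing"
    avg = sum(s.duration_hours for s in stays) / len(stays)
    ends_in_morning = share([7 <= s.end_hour <= 9 for s in stays])
    starts_at_work = share(
        [8 <= s.start_hour <= 10 or 13 <= s.start_hour <= 14 for s in stays]
    )
    on_days_off = share([not s.workday for s in stays])

    if avg > 5 and ends_in_morning >= 0.8:
        return "residence"
    if avg > 3 and starts_at_work >= 0.8:
        return "work"
    if avg > 2 and on_days_off >= 0.8 and entity_type in LEISURE_TYPES:
        return "business_entertainment"
    return "passing"


nights = [Stay(21, 8, 11, True) for _ in range(5)]
office = [Stay(9, 18, 9, True) for _ in range(5)]
mall = [Stay(15, 18, 3, False) for _ in range(4)]
print(behavior_label(nights, "residential"))  # residence
print(behavior_label(office, "office"))       # work
print(behavior_label(mall, "shopping"))       # business_entertainment
```

The dual-city and traveler labels depend on comparing several entities or on gaps in the track chain, so they are left out of this per-entity sketch.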
As shown in FIG. 2, an example of user feature label content is:
1. User name: a 35687416;
2. Age: 25;
3. Sex: male;
4. Occupation: company employee;
5. Working-day residence: Building A;
6. Workplace: Building B;
7. Non-working-day residence: Building C;
8. Business and entertainment venues: Building D, Building E, Building F and Site G.
Example 2:
the embodiment provides a space-time big data analysis system supporting the invention, which comprises a calculation layer and a service layer, wherein:
the calculation layer is used for calculating a track chain of each mobile phone user every day according to base station engineering parameters, mobile service signaling data and a coordinate point set of an actual position of a space block, wherein the base station engineering parameters, the mobile service signaling data and the coordinate point set are provided by a map service provider, and labeling is carried out on each mobile phone user;
and the service layer extracts different data in the calculation layer according to different business requirements, and obtains corresponding business model data after counting the extracted data.
The tag content includes occupation, work and residence attributes of the mobile phone user.
The base station engineering parameters comprise a regional area code, a base station identification code, a network type, an antenna azimuth angle, a base station coverage type, a base station antenna position longitude coordinate and a base station antenna position latitude coordinate; the mobile service signaling data comprises time, user numbers and base station numbers.
The base station coverage types are indoor and non-indoor; the antenna types are omnidirectional and directional. The coverage radius R of an indoor base station is a fixed value, 400 meters by default. The coverage radius R of a non-indoor base station is the average distance from the longitude and latitude coordinates of the base station antenna to the three nearest non-indoor base stations, multiplied by a specific coefficient, which is 1.6.
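The coverage radius rule can be illustrated with a short Python sketch. It assumes great-circle (haversine) distance between antenna coordinates and averages over however many non-indoor neighbours are available when there are fewer than three; the dictionary keys `lon`, `lat` and `indoor` and the function names are hypothetical.

```python
import math

INDOOR_RADIUS_M = 400.0   # default fixed radius for indoor base stations
COEFFICIENT = 1.6         # the specific coefficient stated in the text
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def coverage_radius_m(station, all_stations):
    """Coverage radius R of `station` given the list of all base stations."""
    if station["indoor"]:
        return INDOOR_RADIUS_M
    # Distances to every other non-indoor base station, nearest first.
    dists = sorted(
        haversine_m(station["lon"], station["lat"], o["lon"], o["lat"])
        for o in all_stations
        if o is not station and not o["indoor"]
    )
    nearest = dists[:3]
    if not nearest:
        raise ValueError("no non-indoor neighbour available")
    return COEFFICIENT * sum(nearest) / len(nearest)
```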
The service layer converts the obtained business model data into one or more of an API, an SDK and visual components for third-party software to call.
Both the computing layer and the service layer are provided with system detection modules, which detect whether each module in the system is operating normally and issue alarm information if the operating state of the system is abnormal.
The computing layer includes:
the track library, which stores the daily track chain of each mobile phone user;
the population library, which stores the labels of each mobile phone user;
the basic database, which stores the acquired base station engineering parameters, mobile service signaling data and the set of spatial block actual position coordinate points provided by a map service provider;
and the model library, which stores the algorithm modules that derive the track library and the population library from the content of the basic database.
The service layer comprises:
the business DB, which stores data read from the track library and the population library of the computing layer according to different business requirements;
the third-party data access/acquisition module, which receives business data input by a third party or actively collects third-party business data;
and the business service module, which compiles statistics on the data stored in the business DB according to business needs to obtain the corresponding business model data.
The third-party service data is actively collected by using a web crawler to read the required information from search engines.
The service layer also comprises a user management module for user registration and user authority management. The user management module has data connections to a user library and an operation and maintenance library; the user library stores registered user information, and the operation and maintenance library stores system operation data and operation logs.
The service layer also comprises a charging module, which charges users and manages their balances according to their consumption. After a user recharges, the charging module records the new balance. When the user accesses data in the computing layer, the fee is calculated according to the population count, the geographic area range, the geographic precision, the service usage duration, the label usage type and the depth of use of the track data involved in the access; the fee is deducted from the balance in real time, and the remaining balance is displayed.
Example 3:
This embodiment provides a method for marking the type of a mobile phone user based on big data.
A mobile phone user type marking method based on big data comprises the following steps:
S1, acquiring base station engineering parameters and mobile service signaling data provided by a communication operator, and a set of spatial block actual position coordinate points provided by a map service provider;
S2, forming a geographic entity characteristic fingerprint through the base station engineering parameters and the spatial block actual position coordinate point set;
S3, aggregating the service signaling data according to the time and space relation, and determining the service signaling track data characteristics of the user. Each service signaling record refers to only one base station at a time. However, while a user stays at one location, base station switching may occur due to various factors, so multiple consecutive service signaling records of the user may all correspond to the same location; the service signaling of the user therefore needs to be aggregated according to the time and space relation;
S4, positioning the mobile phone user in each time interval according to the aggregated service signaling track data characteristics, and judging the specific geographic entity where the user is located in each time interval;
S5, generating a daily positioning track chain of the user in time order from the positioning of the user at each moment, and marking a characteristic label for the user by combining the historical positioning track chain of the user, wherein the characteristic label comprises occupation, residence, workplace, business and entertainment places, dual-city residents and business travelers.
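The fingerprint of step S2 rests on the spatial relationship coefficient α = S ÷ Sb, where S is the area shared by a geographic entity and a base station's coverage surface and Sb is the coverage area of the base station. The document computes S with a GIS spatial computation engine; the sketch below swaps in a plain Sutherland-Hodgman polygon clip instead. It assumes planar coordinates (for example meters in a local projection) and a convex, counter-clockwise coverage polygon. All function names are hypothetical.

```python
def polygon_area(poly):
    """Shoelace area of a simple polygon given as a list of (x, y) tuples."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_to_convex(subject, clip):
    """Sutherland-Hodgman: clip `subject` by a convex, counter-clockwise `clip`."""
    def inside(p, a, b):
        # True if p lies on the left of (or on) the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of line p-q with line a-b (p and q are on opposite sides).
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        points, output = output, []
        if not points:
            break
        prev = points[-1]
        for cur in points:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output


def spatial_coefficient(entity_poly, coverage_poly):
    """alpha = S / Sb for one geographic entity and one base station."""
    overlap = clip_to_convex(entity_poly, coverage_poly)
    s = polygon_area(overlap) if len(overlap) >= 3 else 0.0
    return s / polygon_area(coverage_poly)


# Usage: a 2 x 2 coverage square and an entity square overlapping one quarter of it.
coverage = [(0, 0), (2, 0), (2, 2), (0, 2)]
entity = [(1, 1), (3, 1), (3, 3), (1, 3)]
alpha = spatial_coefficient(entity, coverage)  # 0.25
```

A production system would more likely delegate the intersection to a GIS library, which also handles non-convex coverage shapes.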
The present invention is not limited to the above-described alternative embodiments, and anyone may derive products in various other forms in light of the present invention. The above detailed description should not be taken as limiting the scope of the invention, which is defined in the claims; the description may be used to interpret the claims.

Claims (8)

1. A mobile phone user type marking method based on big data is characterized by comprising the following steps:
S1, acquiring base station engineering parameters and mobile service signaling data provided by a communication operator, and a set of spatial block actual position coordinate points provided by a map service provider;
S2, forming a geographic entity characteristic fingerprint through the base station engineering parameters and the spatial block actual position coordinate point set;
S3, aggregating the service signaling data according to the time and space relation, and determining the service signaling track data characteristics of the user;
S4, positioning the mobile phone user in each time interval according to the aggregated service signaling track data characteristics, and judging the specific geographic entity where the user is located in each time interval;
S5, generating a daily positioning track chain of the user in time order from the positioning of the user at each moment, and marking a feature label for the user by combining the historical positioning track chain of the user, wherein the content of the feature label comprises occupation, residence and workplace;
in step S2, the step of forming the geographic entity feature fingerprint includes:
S201, calculating the coverage surface of the base station according to the base station engineering parameters;
S202, according to the coverage range of the geographic entity and the coverage surface of the base station, calculating the intersection area S covered by both the geographic entity and the base station through a GIS spatial computation engine;
S203, calculating the coverage area Sb of the base station according to the base station engineering parameters;
S204, calculating the spatial relationship coefficient α of the geographic entity and the base station from the coverage area Sb of the base station and the intersection area S, wherein the calculation equation is: α = S ÷ Sb;
S205, outputting the relationship between a geographic entity and the base stations covering the geographic entity:
{B, {Lc1, α}, {Lc2, α}, {Lc3, α}, …, {Lcn, α}} (1)
wherein B is a geographic entity, and Lc is a base station number;
in step S3, the step of determining the service signaling track data characteristics of the user includes the following steps:
S301, sorting the user's service signaling records by occurrence time, and merging two service signaling records if consecutive records switch back and forth repeatedly;
S302, merging the service signaling data with a time interval of 1 minute;
S303, iteratively executing step S301 and step S302 until no further merging is possible;
S304, correcting error data: finding the base station with the longest occurrence time in each time interval, and eliminating the records in that time interval whose base station is more than 1 km away from it;
S305, learning from historical data: storing the records processed in step S304 into a database, performing similarity matching against the historical records, and merging similar historical records into the time interval;
S306, calculating the frequency W with which each base station occurred in the same time period over the last month;
S307, outputting the merged record:
{U, Ts, Te, {Lc1, W1}, {Lc2, W2}, {Lc3, W3}, …, {Lcn, Wn}} (2)
wherein U is a user identifier, Ts is the start time of the time interval, Te is the end time of the time interval, Lcn is a base station cell identifier, and Wn is the occurrence frequency of that base station cell in the last month.
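Steps S301 to S303 and the output record (2) can be illustrated with a simplified Python sketch. It is not the claimed procedure: it omits the error correction of S304 and the history matching of S305, it reads "a time interval of 1 minute" as "at most 60 seconds apart", and it treats an A, B, A pattern inside a hypothetical 10-minute window as repeated switching. The names `aggregate`, `to_record`, `pingpong_window_s` and `monthly_freq` are assumptions made for illustration.

```python
def aggregate(records, max_gap_s=60, pingpong_window_s=600):
    """Merge (timestamp_s, cell_id) records into time intervals.

    Each interval is a dict {"ts": start, "te": end, "cells": set of cell ids}.
    """
    # S301: sort by occurrence time; every record starts as its own interval.
    segs = [{"ts": t, "te": t, "cells": {c}} for t, c in sorted(records)]
    changed = True
    while changed:  # S303: repeat until nothing can be merged
        changed = False
        for i in range(len(segs) - 1):
            span = 0
            if segs[i + 1]["ts"] - segs[i]["te"] <= max_gap_s:
                span = 2  # S302: neighbours close in time
            elif (
                i + 2 < len(segs)
                and segs[i]["cells"] & segs[i + 2]["cells"]
                and segs[i + 2]["ts"] - segs[i]["te"] <= pingpong_window_s
            ):
                span = 3  # S301: switched away and back again (A, B, A)
            if span:
                group = segs[i:i + span]
                segs[i:i + span] = [{
                    "ts": group[0]["ts"],
                    "te": group[-1]["te"],
                    "cells": set().union(*(g["cells"] for g in group)),
                }]
                changed = True
                break
    return segs


def to_record(user, seg, monthly_freq):
    """S307: build record (2); `monthly_freq` maps cell id -> W from S306."""
    return {
        "U": user,
        "Ts": seg["ts"],
        "Te": seg["te"],
        "cells": {lc: monthly_freq.get(lc, 0) for lc in sorted(seg["cells"])},
    }


# Usage: two stays, the first with a handover between cells A and B.
segments = aggregate([(0, "A"), (30, "B"), (50, "A"), (4000, "C"), (4030, "C")])
record = to_record("u1", segments[0], {"A": 12, "B": 3})
```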
2. The method according to claim 1, wherein in S305, if a history record has a similarity greater than 80% with the time interval, falls on the same type of day (both working days or both non-working days), and the longitude and latitude of the base stations in the history record are less than 1 km from the longitude and latitude of all the base stations in the current time interval, the history record is also incorporated into the time interval.
3. The big-data-based mobile phone user type tagging method according to claim 1, wherein in the step S4, the step of judging the specific geographic entity where the user is located at each time interval comprises:
performing a correlation calculation on formula (1) and formula (2) according to equation (3) to obtain the likelihood P that the user is located in each geographic entity in the time period, wherein equation (3) is:
P{u,b} = ∑ W × α (3)
forming a data set of the likelihood of each user being within each geographic entity in each time period:
{U, Ts, Te, {B1, P1}, {B2, P2}, {B3, P3}, …, {Bn, Pn}} (4)
wherein the geographic entity with the largest P is the resident position of the user in the time period.
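Equation (3) combines relation (1) and record (2): for each geographic entity, the frequency W of every base station seen in the time interval is weighted by that base station's coefficient α for the entity, and the products are summed. A minimal Python sketch follows, assuming relation (1) is held as a nested dictionary and record (2) as a cell-to-frequency mapping; both layouts and the function names are hypothetical.

```python
def likelihoods(entity_cells, interval_cells):
    """Equation (3): P{u,b} = sum of W * alpha over the shared base stations.

    entity_cells:   {B: {Lc: alpha}}  relation (1) for every geographic entity
    interval_cells: {Lc: W}           base stations of one record (2)
    """
    scores = {}
    for entity, cells in entity_cells.items():
        p = sum(w * cells[lc] for lc, w in interval_cells.items() if lc in cells)
        if p > 0:
            scores[entity] = p
    return scores


def resident_entity(scores):
    """Data set (4): the entity with the largest P is the resident position."""
    return max(scores, key=scores.get) if scores else None


# Usage with two entities and two base stations.
fingerprints = {"B1": {"Lc1": 0.5, "Lc2": 0.25}, "B2": {"Lc2": 0.5}}
scores = likelihoods(fingerprints, {"Lc1": 4, "Lc2": 2})
best = resident_entity(scores)  # "B1": 4*0.5 + 2*0.25 = 2.5 beats 2*0.5 = 1.0
```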
4. The big data-based mobile phone user type marking method according to claim 1, wherein: the base station engineering parameters comprise an antenna type, an antenna azimuth angle, a base station coverage type, a base station antenna position longitude coordinate and a base station antenna position latitude coordinate; the mobile service signaling data comprises time, user numbers and base station numbers.
5. The big data-based mobile phone user type marking method according to claim 4, wherein: the base station coverage types are indoor and non-indoor; the antenna types are omnidirectional and directional; the coverage radius R of an indoor base station is a fixed value; the coverage radius R of a non-indoor base station is the average distance from the longitude and latitude coordinates of the base station antenna to the three nearest non-indoor base stations, multiplied by a specific coefficient.
6. The big data-based mobile phone user type marking method according to claim 5, wherein: the method for calculating the coverage surface of an omnidirectional antenna base station comprises the following steps: taking the longitude and latitude of the antenna as the central point, extending outwards by the coverage radius R of the base station every 45 degrees to obtain eight coordinate points, and connecting every two adjacent coordinate points with straight lines to form a closed base station coverage area, namely the coverage surface of the omnidirectional antenna base station.
7. The big data-based mobile phone user type marking method according to claim 6, wherein: the method for calculating the coverage surface of a directional antenna base station comprises the following steps: taking the longitude and latitude of the antenna as the central point, extending outwards by the coverage radius R of the base station at the angles A, A + H/6, A + H/3, A + H/2, A - H/6, A - H/3 and A - H/2 to obtain seven coordinate points, connecting every two adjacent coordinate points with straight lines, and connecting the two coordinate points at the two ends with the longitude and latitude point of the antenna to form a closed base station coverage area, namely the coverage surface of the directional antenna base station; the angle A is the antenna azimuth angle, and the angle H is the horizontal lobe angle.
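The two coverage surface constructions can be sketched in Python. The sketch makes several assumptions that the document does not state: bearings are measured clockwise from north, the seven directional bearings are symmetric about the azimuth (from A - H/2 to A + H/2), and meters are converted to degrees with a flat-earth approximation that is only reasonable over short distances. The function names and the constant `M_PER_DEG` are hypothetical.

```python
import math

M_PER_DEG = 111_320.0  # approximate meters per degree of latitude (assumption)


def offset(lon, lat, bearing_deg, dist_m):
    """Point `dist_m` meters from (lon, lat) along a bearing clockwise from north."""
    theta = math.radians(bearing_deg)
    dx = dist_m * math.sin(theta)  # east component
    dy = dist_m * math.cos(theta)  # north component
    return (
        lon + dx / (M_PER_DEG * math.cos(math.radians(lat))),
        lat + dy / M_PER_DEG,
    )


def omni_coverage(lon, lat, radius_m):
    """Octagon: one vertex every 45 degrees around the antenna."""
    return [offset(lon, lat, k * 45.0, radius_m) for k in range(8)]


def directional_coverage(lon, lat, radius_m, azimuth_deg, lobe_deg):
    """Fan: the antenna point plus seven vertices spanning the horizontal lobe."""
    a, h = azimuth_deg, lobe_deg
    bearings = [a - h / 2, a - h / 3, a - h / 6, a, a + h / 6, a + h / 3, a + h / 2]
    arc = [offset(lon, lat, b, radius_m) for b in bearings]
    # The two end points of the arc are joined to the antenna to close the polygon.
    return [(lon, lat)] + arc


# Usage: a 400 m omnidirectional cell and a 65 degree sector pointing east.
octagon = omni_coverage(104.06, 30.67, 400.0)
sector = directional_coverage(104.06, 30.67, 400.0, 90.0, 65.0)
```

Either polygon can then serve as the base station coverage surface when computing the intersection area S and the coverage area Sb.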
8. The big data-based mobile phone user type tagging method according to claim 1, wherein in the step S5, the method for tagging features for the user is as follows:
S501, according to the historical positioning track chain of the user, counting, within the same month, the user's residence frequency, residence start time period, residence end time period, average residence time, number of residence days on working days, number of residence days on non-working days, and number of residence days at positions of the same type;
S502, marking a behavior label for each residence behavior of the user according to the data counted in S501, wherein the behavior label comprises residence and work;
and S503, marking a feature label for the user according to the behavior label in combination with the type of the geographic entity, using an unsupervised cluster analysis method.
CN201811550202.8A 2018-12-18 2018-12-18 Mobile phone user type marking method based on big data Active CN109495856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811550202.8A CN109495856B (en) 2018-12-18 2018-12-18 Mobile phone user type marking method based on big data

Publications (2)

Publication Number Publication Date
CN109495856A CN109495856A (en) 2019-03-19
CN109495856B true CN109495856B (en) 2021-08-10

Family

ID=65710656

Country Status (1)

Country Link
CN (1) CN109495856B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363583B (en) * 2019-07-02 2023-08-04 北京淇瑀信息科技有限公司 Method and device for creating user consumption behavior label based on position information and electronic equipment
CN110516017B (en) * 2019-08-02 2022-05-20 Oppo广东移动通信有限公司 Location information processing method and device based on terminal equipment, electronic equipment and storage medium
CN110545522B (en) * 2019-08-13 2021-06-01 广州瀚信通信科技股份有限公司 User position and functional area identification method based on mobile big data
CN114173276B (en) * 2020-09-09 2023-08-01 中国移动通信集团广东有限公司 User positioning method and device
CN113382402A (en) * 2021-05-31 2021-09-10 中科苏州智能计算技术研究院 Population characteristic analysis method based on universal base station and application thereof
CN115086878B (en) * 2022-08-02 2023-04-28 北京融信数联科技有限公司 Method, system and storage medium for obtaining user action track based on mobile phone signaling
CN116980833B (en) * 2023-09-22 2024-01-23 北京融信数联科技有限公司 Regional population age group identification method and system based on signaling data
CN116992267B (en) * 2023-09-28 2024-01-23 北京融信数联科技有限公司 Regional population gender identification method and system based on signaling data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105205155A (en) * 2015-09-25 2015-12-30 珠海世纪鼎利科技股份有限公司 Big data criminal accomplice screening system and method
CN105682025A (en) * 2016-01-05 2016-06-15 重庆邮电大学 User residing location identification method based on mobile signaling data
CN105674995A (en) * 2015-12-31 2016-06-15 百度在线网络技术(北京)有限公司 Method for acquiring commuting route based on user's travel locus, and apparatus thereof
CN106772506A (en) * 2016-12-22 2017-05-31 天绘北斗(深圳)科技有限公司 A kind of implementation method based on the positioning of big-dipper satellite short message group
CN106878951A (en) * 2017-02-28 2017-06-20 上海讯飞瑞元信息技术有限公司 User trajectory analysis method and system
CN107655490A (en) * 2017-08-29 2018-02-02 重庆邮电大学 Hotspot path based on mobile subscriber track segmentation and most hot search finds method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10134049B2 (en) * 2014-11-20 2018-11-20 At&T Intellectual Property I, L.P. Customer service based upon in-store field-of-view and analytics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant