CN110659320A - Analysis method and analysis device for occupational distribution and readable storage medium - Google Patents

Info

Publication number
CN110659320A
Authority
CN
China
Prior art keywords
access
user
information
time period
geographic grid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910824917.6A
Other languages
Chinese (zh)
Other versions
CN110659320B (en)
Inventor
王江飞
刘波
苏颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Enyike (beijing) Data Technology Co Ltd
Original Assignee
Enyike (beijing) Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Enyike (beijing) Data Technology Co Ltd
Priority to CN201910824917.6A
Publication of CN110659320A
Application granted
Publication of CN110659320B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

The application provides an analysis method, an analysis device and a readable storage medium for occupational distribution, that is, the distribution of users' workplaces and residences. A plurality of access data of an area to be analyzed within a preset time period are acquired from each platform database, and user identification information, access position information and access time information are identified from the access data; according to each user identity, the access position information and the access time information are aggregated to obtain a plurality of access positions and a plurality of access times corresponding to the user identities representing the same user; the position point corresponding to each access position is matched to the corresponding geographic grid; and the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity are determined based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity. The diversity and accuracy of the analyzed data can thereby be ensured, and the accuracy of the workplace and residence analysis result can be improved.

Description

Analysis method and analysis device for occupational distribution and readable storage medium
Technical Field
The application relates to the technical field of big data processing, and in particular to an analysis method and an analysis device for occupational distribution, and a readable storage medium storing instructions that can be read by an electronic device.
Background
The term "occupational areas" is the joint term for a user's workplace and residence. Analyzing their distribution helps provide decision support for city planning. In the marketing field, stores can be sited according to users' workplaces and residences, enabling nearby and precisely targeted marketing and bringing convenience to both marketers and users.
In the traditional method for determining occupational distribution, a server acquires communication signaling data of smart devices, where each piece of signaling data corresponds to a unique device serial number, and locates each device by summarizing the data sharing the same device serial number, thereby determining the distribution of users' workplaces and residences. However, when a device is located through communication signaling, positioning deviation or positioning drift makes the determined workplace and residence positions imprecise, so the resulting distribution is likely to be biased.
Disclosure of Invention
In view of the above, an object of the present application is to provide an analysis method, an analysis device and a readable storage medium for occupational distribution, which obtain the statistics and distribution of workplaces and residences through multi-platform data analysis, thereby improving the accuracy and precision of the analysis result and reducing the probability of deviation in the resulting distribution.
The embodiment of the application provides an analysis method for occupational distribution, which comprises the following steps:
acquiring a plurality of access data of an area to be analyzed within a preset time period from each platform database;
identifying user identification information, access position information and access time information from the plurality of access data;
aggregating the access position information and the access time information according to each user identity indicated by the user identification information, to obtain a plurality of access positions and a plurality of access times corresponding to the user identities representing the same user;
matching the position point corresponding to each access position to the corresponding geographic grid, based on the geographic area represented by each geographic grid in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user;
and determining the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity, based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity.
Further, identifying the user identification information, the access location information, and the access time information from the plurality of access data comprises:
carrying out format normalization processing on the plurality of access data;
and identifying user identity identification information, access position information and access time information from the access data after format normalization processing.
Further, performing format normalization processing on the plurality of access data includes:
and storing data representing the same type of information in the plurality of access data into a database to be analyzed according to a preset data format.
Further, the user identification information includes: a computer serial number and a handheld intelligent device serial number;
after the format normalization processing is performed on the plurality of access data, before the user identification information, the access position information and the access time information are identified from the access data after the format normalization processing, the analysis method further includes:
acquiring user reservation information from the access data after format normalization processing;
and identifying the computer serial number and the handheld intelligent device serial number that the user reservation information shows to belong to the same user, and treating the two serial numbers as the same user identity representing that user.
Further, determining the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity, based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity, comprises:
acquiring a preset working time period and a preset rest time period;
for each user identity, determining, from the plurality of access times corresponding to the user identity, a plurality of first access times within the working time period and a plurality of second access times within the rest time period, and determining, from the plurality of access positions corresponding to the user identity, a first access position of the user identity at each first access time and a second access position of the user identity at each second access time;
determining a first target geographic grid and a second target geographic grid from the geographic grids of the matched geographic grid distribution diagram, based on the position point corresponding to each first access position and the position point corresponding to each second access position respectively, and determining that the actual geographic area corresponding to the first target geographic grid is the workplace of the user identity and that the actual geographic area corresponding to the second target geographic grid is the residence of the user identity;
wherein the number of first access positions falling in the first target geographic grid is greater than the number of first access positions falling in any other geographic grid of the matched geographic grid distribution diagram, and the number of second access positions falling in the second target geographic grid is greater than the number of second access positions falling in any other geographic grid of the matched geographic grid distribution diagram.
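The selection rule for the two target grids can be sketched as a simple count comparison. The following Python fragment is only an illustration under stated assumptions: the function name `target_grid`, the grid identifiers such as `"g12"`, and the choice to return `None` on a tie are hypothetical and are not specified by the application.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def target_grid(grid_ids: Iterable[Hashable]) -> Optional[Hashable]:
    """Return the grid hit strictly more often than every other grid.

    Returns None when there are no hits or when the highest count is tied
    (tie handling is an assumption; the text does not specify it).
    """
    ranked = Counter(grid_ids).most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical grids hit by one user's first (working-period) access
# positions and second (rest-period) access positions.
first_hits = ["g12", "g12", "g40", "g12"]
second_hits = ["g77", "g77", "g12"]

workplace_grid = target_grid(first_hits)    # first target geographic grid
residence_grid = target_grid(second_hits)   # second target geographic grid
```

Requiring a strict majority over every other grid mirrors the "greater than" wording above; a production system would likely also need a minimum number of observations, which is outside what the text states.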
An embodiment of the present application further provides an analysis device for occupational distribution, the device comprising:
the acquisition module is used for acquiring a plurality of access data of the area to be analyzed in a preset time period from each platform database;
the identification module is used for identifying user identity identification information, access position information and access time information from the plurality of access data;
the processing module is used for carrying out aggregation processing on the access position information and the access time information by taking each user identity indicated by the user identity information as a basis to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user;
the matching module is used for matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user;
and the determining module is used for determining the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity, based on the preset working time period, the preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity.
Further, the identification module comprises:
the normalization processing unit is used for carrying out format normalization processing on the plurality of access data;
and the identification unit is used for identifying the user identity identification information, the access position information and the access time information from the access data after the format normalization processing.
Further, the normalization processing unit is further configured to store data representing the same type of information from the plurality of access data in a database to be analyzed according to a preset data format.
Further, the identification module further comprises:
the first acquisition unit is used for acquiring user reservation information from the access data after format normalization processing;
and the first determining unit is used for identifying the computer serial number and the handheld intelligent device serial number that the user reservation information shows to belong to the same user, and treating the two serial numbers as the same user identity representing that user.
Further, the determining module comprises:
the second acquisition unit is used for acquiring a preset working time period and a preset rest time period;
the second determining unit is used for determining, for each user identity, a plurality of first access times within the working time period and a plurality of second access times within the rest time period from the plurality of access times corresponding to the user identity, and for determining, from the plurality of access positions corresponding to the user identity, a first access position of the user identity at each first access time and a second access position of the user identity at each second access time;
the third determining unit is used for determining a first target geographic grid and a second target geographic grid from the geographic grids of the matched geographic grid distribution diagram, based on the position point corresponding to each first access position and the position point corresponding to each second access position respectively, and for determining that the actual geographic area corresponding to the first target geographic grid is the workplace of the user identity and that the actual geographic area corresponding to the second target geographic grid is the residence of the user identity;
wherein the number of first access positions falling in the first target geographic grid is greater than the number of first access positions falling in any other geographic grid of the matched geographic grid distribution diagram, and the number of second access positions falling in the second target geographic grid is greater than the number of second access positions falling in any other geographic grid of the matched geographic grid distribution diagram.
An embodiment of the present application further provides an electronic device, including a processor, a memory and a bus. The memory stores machine-readable instructions executable by the processor; when the electronic device is operating, the processor and the memory communicate over the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the analysis method for occupational distribution described above.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the analysis method for occupational distribution described above.
According to the analysis method, the analysis device and the readable storage medium for occupational distribution provided by the embodiments of the application, a plurality of access data of an area to be analyzed within a preset time period are acquired from each platform database; user identification information, access position information and access time information are identified from the plurality of access data; the access position information and the access time information are aggregated according to each user identity indicated by the user identification information, to obtain a plurality of access positions and a plurality of access times corresponding to the user identities representing the same user; the position point corresponding to each access position is matched to the corresponding geographic grid, based on the geographic area represented by each geographic grid in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user; and the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity are determined based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity.
Compared with the prior-art method of determining users' workplaces and residences through communication signaling data, the present application acquires a plurality of access data of the area to be analyzed within a preset time period from multiple platform databases. The data sources are broad, the data types are rich and the collection frequency is high; access data from different platforms can be unified, and the plurality of access positions and access times corresponding to the user identities of the same user can be summarized to obtain the user's workplace and residence. This ensures the diversity and accuracy of the analyzed data, improves the accuracy of the workplace and residence analysis result, and reduces the probability of deviation in the resulting distribution.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a flowchart of an analysis method for occupational distribution according to an embodiment of the present application;
Fig. 2 is a flowchart of another analysis method for occupational distribution according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an analysis device for occupational distribution according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of the identification module shown in Fig. 3;
Fig. 5 is a schematic structural diagram of the determining module shown in Fig. 3;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. Every other embodiment that can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present application falls within the protection scope of the present application.
In the prior art, most methods obtain the communication signaling data exchanged between a mobile phone and base stations within a certain time period, collect the signaling data sharing the same mobile phone serial number, identify the position information associated with that serial number, and thereby determine the user's workplace and residence.
Based on this, an embodiment of the application provides an analysis method for occupational distribution, so as to improve the accuracy with which users' workplaces and residences are determined and to reduce deviation in the resulting distribution.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for analyzing occupational distribution according to an embodiment of the present disclosure. As shown in fig. 1, the method for analyzing occupational distribution provided in the embodiment of the present application includes:
S101, acquiring a plurality of access data of an area to be analyzed within a preset time period from each platform database.
Generally, when a user logs in to an application for the first time, the application requests access to the user's position information, and once the user confirms, the platform database records the login location of the device. While the user uses the application, the platform database records access data such as access time, position and device serial number according to the information fed back by the client. Alternatively, the platform obtains visitors' browsing data by monitoring, in real time, the browsing records of the resources it has delivered, for example the exposure time, exposure location and device serial number of an advertisement. Access data are then acquired from each platform database according to the set time range and used as the basic data for subsequent processing.
The platform may be an application platform for social networking, news, video playback, traffic navigation, shopping and the like, or a particular client website or client applet; the access data may be user access records or video/audio exposure information; the preset time period refers to a certain past time range, which may be measured in years, quarters or months.
In this way, access data are acquired from each platform so that more accurate workplace and residence positions can be obtained subsequently.
S102, identifying user identity identification information, access position information and access time information from the plurality of access data.
The acquired platform access data usually contain several types of information, such as access time, access content and number of accesses. The data required for determining users' workplaces and residences are screened out from the access data of each platform database, and the user identification information, access position information and access time information are identified from them.
S103, carrying out aggregation processing on the access position information and the access time information according to each user identity indicated by the user identity information to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user.
Each platform database records the browsing information of its visitors, and a user may log in to one application during one period and to other applications during another. In this step, the access position information and access time information associated with the user identities that represent the same user are gathered from all platform databases, yielding that user's access information across every application platform.
The same user may correspond to a plurality of user identities.
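The aggregation in S103 amounts to grouping records from all platforms by user identity. The sketch below is a minimal illustration, assuming each record has already been reduced to a hypothetical `(user_id, access_time, position)` tuple with ISO-format time strings; these names and the record layout are assumptions made for the example only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Position = Tuple[float, float]       # (longitude, latitude)
Record = Tuple[str, str, Position]   # (user_id, access_time, position)


def aggregate_by_user(records: Iterable[Record]) -> Dict[str, List[Tuple[str, Position]]]:
    """Group access times and positions by user identity, ordered by time."""
    grouped: Dict[str, List[Tuple[str, Position]]] = defaultdict(list)
    for user_id, access_time, position in records:
        grouped[user_id].append((access_time, position))
    return {uid: sorted(visits) for uid, visits in grouped.items()}


# Hypothetical records pulled from two different platform databases.
platform_a = [("u1", "2019-09-02T10:05:00", (116.397, 39.908))]
platform_b = [
    ("u1", "2019-09-02T09:30:00", (116.398, 39.909)),
    ("u2", "2019-09-02T23:10:00", (116.310, 39.990)),
]

visits = aggregate_by_user(platform_a + platform_b)
```

After this step each user identity maps to its full list of (time, position) pairs regardless of which platform recorded them.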
S104, matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user.
Here, the geographic grid distribution diagram may be obtained by dividing a national map into grids whose side length is a preset length; each grid corresponds to an actual geographic area and covers a range of longitude and latitude. The access positions corresponding to each user are matched against the geographic grids to obtain the actual geographic areas in which the user was located.
Specifically, the national map can be divided into a plurality of square grids with a preset side length, and each grid is labeled according to the longitude and latitude range it covers, giving each grid a unique identifier; an index is established between each grid identifier and a specific geographic area; and when the position point corresponding to an access position is input, the corresponding grid identifier is calculated from the position point and the corresponding geographic area is then looked up through the index.
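One way to compute a grid identifier from a position point is to floor-divide the longitude and latitude by the cell size. The sketch below assumes, purely for illustration, that the preset side length is expressed in degrees (`cell_deg`) and that the identifier is a `row_col` string; the application does not state the unit or the identifier format, and the area name in the index is made up.

```python
import math
from typing import Dict, Optional


def grid_id(lon: float, lat: float, cell_deg: float = 0.01) -> str:
    """Return the identifier of the square cell containing (lon, lat)."""
    col = math.floor((lon + 180.0) / cell_deg)  # column counted from 180 W
    row = math.floor((lat + 90.0) / cell_deg)   # row counted from 90 S
    return f"{row}_{col}"


def lookup_area(lon: float, lat: float, index: Dict[str, str],
                cell_deg: float = 0.01) -> Optional[str]:
    """Map a position point to its geographic area via the grid index."""
    return index.get(grid_id(lon, lat, cell_deg))


# Hypothetical index from grid identifier to a named geographic area.
grid_index = {grid_id(116.397, 39.908): "example area A"}

area = lookup_area(116.3975, 39.9085, grid_index)  # nearby point, same cell
```

A fixed cell size in degrees does not give equal ground distances at all latitudes; a grid defined in a projected coordinate system would avoid that, at the cost of an extra projection step.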
S105, determining the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity, based on the preset working time period, the preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity.
In this step, given the obtained user identities, geographic areas and access times, the user's workplace and residence are determined from the geographic areas in which the user identity appears during the working time period and the rest time period. Specifically, the preset working time period is 9:00 to 16:00 on working days, and the geographic area that frequently appears during this period is the user's workplace; the preset rest time period is 22:00 on a working day to 5:00 the next day, and the geographic area that frequently appears during this period is the user's residence. The working and rest time periods can be adjusted for particular occupations; for example, a taxi driver often works at night, in which case suitable time periods can be set to determine the driver's workplace and residence.
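Sorting access times into the two preset periods can be sketched as follows. The example uses the 9:00 to 16:00 and 22:00 to 5:00 values from the text; treating Monday to Friday as working days, treating the period ends as exclusive, and the function name `classify_access_time` are assumptions made for the sketch (public holidays, for instance, are ignored).

```python
from datetime import date, datetime, time, timedelta
from typing import Optional

WORK_START, WORK_END = time(9, 0), time(16, 0)
REST_START, REST_END = time(22, 0), time(5, 0)


def is_working_day(d: date) -> bool:
    """Assumption: Monday to Friday are working days."""
    return d.weekday() < 5


def classify_access_time(ts: datetime) -> Optional[str]:
    """Return 'work', 'rest', or None for an access time."""
    t = ts.time()
    if is_working_day(ts.date()) and WORK_START <= t < WORK_END:
        return "work"
    # Rest period part 1: 22:00 to midnight on a working day.
    if t >= REST_START and is_working_day(ts.date()):
        return "rest"
    # Rest period part 2: midnight to 5:00 on the day after a working day.
    if t < REST_END and is_working_day(ts.date() - timedelta(days=1)):
        return "rest"
    return None


monday_morning = classify_access_time(datetime(2019, 9, 2, 10, 0))
monday_night = classify_access_time(datetime(2019, 9, 2, 23, 0))
saturday_early = classify_access_time(datetime(2019, 9, 7, 3, 0))
```

Access times classified as "work" supply the first access positions and those classified as "rest" supply the second access positions; unclassified times are left out of both counts. For night-shift occupations the four constants would be replaced by suitable values.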
According to the analysis method for occupational distribution provided by this embodiment, a plurality of access data of an area to be analyzed within a preset time period are acquired from each platform database; user identification information, access position information and access time information are identified from the plurality of access data; the access position information and the access time information are aggregated according to each user identity indicated by the user identification information, to obtain a plurality of access positions and a plurality of access times corresponding to the user identities representing the same user; the position point corresponding to each access position is matched to the corresponding geographic grid, based on the geographic area represented by each geographic grid in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user; and the workplace and residence, within the area to be analyzed, of the user corresponding to each user identity are determined based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity.
Compared with the prior-art method of determining users' workplaces and residences through communication signaling data, the present application acquires a plurality of access data of the area to be analyzed within a preset time period from multiple platform databases. The data sources are broad, the data types are rich and the collection frequency is high; access data from different platforms can be unified, and the plurality of access positions and access times corresponding to the user identities of the same user can be summarized to obtain the user's workplace and residence. This ensures the diversity and accuracy of the analyzed data, improves the accuracy of the workplace and residence analysis result, and reduces the probability of deviation in the resulting distribution.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for resolving occupational distribution according to another embodiment of the present application. As shown in fig. 2, the method for analyzing occupational distribution provided in the embodiment of the present application includes:
S201, acquiring a plurality of access data of the area to be analyzed within a preset time period from each platform database.
For the description of S201, reference may be made to the description of S101; the same technical effect can be achieved, and details are not repeated here.
S202, carrying out format normalization processing on the plurality of access data.
In this step, since the access data come from different platform databases and each platform stores its access data in a different format, the data must be converted to a uniform format before processing.
S203, identifying user identity identification information, access position information and access time information from the access data after format normalization processing.
Further, performing format normalization processing on the plurality of access data includes:
and storing data representing the same type of information in the plurality of access data into a database to be analyzed according to a preset data format.
First, data representing the same kind of information are extracted from the plurality of access data; after data cleaning and conversion, they are loaded into the database to be analyzed in the preset data format. Data cleaning identifies and corrects erroneous or invalid data; conversion outputs the data in the preset data format.
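The extract, clean and convert sequence can be illustrated with two made-up source formats. Everything specific in the sketch is an assumption: the field names of the hypothetical platforms A and B, the unified row layout, the use of UTC, and the simplification that invalid rows are dropped rather than corrected.

```python
from datetime import datetime, timezone
from typing import Dict, List


def from_platform_a(rec: Dict) -> Dict:
    """Hypothetical platform A: epoch seconds and separate lat/lng fields."""
    return {
        "user_id": str(rec["uid"]),
        "time": datetime.fromtimestamp(rec["ts"], tz=timezone.utc),
        "lon": float(rec["lng"]),
        "lat": float(rec["lat"]),
    }


def from_platform_b(rec: Dict) -> Dict:
    """Hypothetical platform B: text timestamp (assumed UTC), 'lon,lat' string."""
    lon, lat = (float(x) for x in rec["pos"].split(","))
    when = datetime.strptime(rec["when"], "%Y-%m-%d %H:%M:%S")
    return {
        "user_id": str(rec["device"]),
        "time": when.replace(tzinfo=timezone.utc),
        "lon": lon,
        "lat": lat,
    }


def is_valid(row: Dict) -> bool:
    """Cleaning rule for this sketch: non-empty id and in-range coordinates."""
    return bool(row["user_id"]) and -180 <= row["lon"] <= 180 and -90 <= row["lat"] <= 90


raw_a = [{"uid": "PHONE-042", "ts": 1567418400, "lat": 39.908, "lng": 116.397}]
raw_b = [
    {"device": "PC-001", "when": "2019-09-02 10:00:00", "pos": "116.397,39.908"},
    {"device": "PC-002", "when": "2019-09-02 11:00:00", "pos": "999,39.9"},  # bad longitude
]

rows: List[Dict] = [from_platform_a(r) for r in raw_a] + [from_platform_b(r) for r in raw_b]
database_to_analyze = [row for row in rows if is_valid(row)]
```

Once every platform has its own small converter, the later steps only ever see the unified row layout.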
Further, the user identification information includes: a computer serial number and a handheld intelligent device serial number;
after the format normalization processing is performed on the plurality of access data, and before the user identification information, the access position information and the access time information are identified from the access data after format normalization processing, the analysis method further includes:
acquiring user reservation information from the access data after format normalization processing;
identifying the computer serial number and the handheld intelligent device serial number that the user reservation information indicates belong to the same user, and determining the computer serial number and the handheld intelligent device serial number as the same user identity representing the same user.
In this step, a user who accesses the same data platform from different systems usually logs in with the same account. Device serial numbers associated with the same reservation information are therefore identified, and the device serial numbers belonging to the different systems are determined to be the same user identity representing the same user.
The user reservation information may be the user's account, or other information registered by the user, such as a mobile phone number or an e-mail address.
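As a rough illustration of this step, the following Python sketch maps device serial numbers that share the same reservation information to one user identity. The function name, the input layout, and the generated identity labels (`user-1`, `user-2`, ...) are hypothetical and chosen only for the example.

```python
def unify_identities(records):
    """Map each device serial number to one user identity.

    `records` is an iterable of (serial_number, reserved_info) pairs, where
    reserved_info is an account, phone number, or e-mail address
    (a hypothetical layout used only for this sketch).
    """
    identity_of_info = {}    # reserved info -> user identity
    identity_of_serial = {}  # device serial number -> user identity
    for serial, info in records:
        # Reuse the identity already assigned to this reserved info, if any.
        identity = identity_of_info.setdefault(info, f"user-{len(identity_of_info) + 1}")
        identity_of_serial[serial] = identity
    return identity_of_serial
```

The sketch assumes each device is linked to exactly one piece of reservation information. If one user could appear under several pieces of reservation information (for example, both an account and a phone number), a merging structure such as union-find would be needed instead.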
S204, performing aggregation processing on the access position information and the access time information according to each user identity indicated by the user identification information, to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user.
S205, matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the multiple access positions corresponding to each user.
S206, determining the occupational area of the user corresponding to each user identity in the area to be analyzed based on the preset working time period, the preset rest time period, the matched geographic grid distribution diagram and the multiple access times corresponding to each user identity.
The descriptions of S204 to S206 may refer to the descriptions of S103 to S105, and the same technical effects can be achieved, which are not described in detail.
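The aggregation in S204 can be pictured as a group-by on user identity. The Python sketch below assumes that normalized records are dictionaries with the hypothetical keys `user_id`, `lat`, `lon` and `time`; these names are not taken from the application.

```python
from collections import defaultdict

def aggregate_by_user(records):
    """Group (access position, access time) pairs under each user identity."""
    visits = defaultdict(list)
    for rec in records:
        position = (rec["lat"], rec["lon"])
        # Keep position and time together so each access time can later be
        # traced back to the position at which it was recorded.
        visits[rec["user_id"]].append((position, rec["time"]))
    return dict(visits)
```

Storing each position together with its time is a design choice for the sketch: the later step that selects positions by working or rest time period needs that pairing.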
Further, determining the occupational area of the user corresponding to each user identity in the area to be analyzed based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and the plurality of access times corresponding to each user identity includes:
acquiring a preset working time period and a preset rest time period;
for each user identity, determining a plurality of first access times in the working time period and a plurality of second access times in the rest time period from a plurality of access times corresponding to the user identity, and determining a first access position of the user identity at each first access time and a second access position of the user identity at each second access time from a plurality of access positions corresponding to the user identity;
respectively determining a first target geographic grid and a second target geographic grid from the geographic grids of the matched geographic grid distribution diagram based on the position point corresponding to each first access position and the position point corresponding to each second access position, and determining that the actual geographic area corresponding to the first target geographic grid is the working place of the user identity and the actual geographic area corresponding to the second target geographic grid is the residence place of the user identity;
wherein the number of first access positions falling in the first target geographic grid is greater than the number of first access positions falling in any other geographic grid of the matched geographic grid distribution diagram, and the number of second access positions falling in the second target geographic grid is greater than the number of second access positions falling in any other geographic grid of the matched geographic grid distribution diagram.
In this step, a working time period and a rest time period are set first. For each user identity, the access times falling within the working time period and within the rest time period are screened out from the access times obtained for that user identity, and the access position of the user at each of those access times is retained.
The position point corresponding to an access position may be expressed as a longitude and latitude. Specifically, each grid in the geographic grid distribution diagram covers a range of longitudes and latitudes and corresponds to an actual geographic area, so the grid to which a position point belongs is found from the longitude and latitude of that position point.
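One simple way to match a position point to a grid is sketched below in Python. It assumes a regular grid of square cells measured in degrees, with a known south-west origin; the application does not specify the grid geometry, so `origin_lat`, `origin_lon` and `cell_deg` are hypothetical parameters.

```python
import math

def grid_cell(lat, lon, origin_lat, origin_lon, cell_deg):
    """Return the (row, col) index of the grid cell containing a point.

    Rows count northward and columns count eastward from the origin.
    """
    row = math.floor((lat - origin_lat) / cell_deg)
    col = math.floor((lon - origin_lon) / cell_deg)
    return row, col
```

For example, with the origin at (0.0, 0.0) and 0.5-degree cells, `grid_cell(1.25, 0.75, 0.0, 0.0, 0.5)` returns `(2, 1)`. Other grid schemes, such as geohash cells or irregular administrative polygons, would need a different lookup.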
The target geographic grid containing the largest number of position points matched in the working time period is determined as the working place of the user identity, and the target geographic grid containing the largest number of position points matched in the rest time period is determined as the residence of the user identity.
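The selection of the working place and residence can be sketched as a count per grid within each time period. In the Python sketch below, the period boundaries (09:00 to 18:00 and 21:00 to 07:00) are assumed values for illustration only, grid cells are represented by hypothetical `(row, col)` tuples, and access times are assumed to be in the user's local time.

```python
from collections import Counter
from datetime import datetime, time

WORK_PERIOD = (time(9, 0), time(18, 0))  # assumed preset working time period
REST_PERIOD = (time(21, 0), time(7, 0))  # assumed preset rest time period (crosses midnight)

def in_period(t, period):
    """Return True if time-of-day t lies in [start, end), allowing wrap-around."""
    start, end = period
    if start <= end:
        return start <= t < end
    return t >= start or t < end  # period crosses midnight

def work_and_home(visits):
    """Pick the most visited grid per period for one user identity.

    `visits` is an iterable of (grid_cell, datetime) pairs.
    Returns (working_place_cell, residence_cell); either may be None.
    """
    work, rest = Counter(), Counter()
    for cell, when in visits:
        if in_period(when.time(), WORK_PERIOD):
            work[cell] += 1
        elif in_period(when.time(), REST_PERIOD):
            rest[cell] += 1

    def top(counter):
        return counter.most_common(1)[0][0] if counter else None

    return top(work), top(rest)
```

Note that the text requires the target grid to hold more position points than every other grid. This sketch does not enforce that: if two grids tie, `most_common` simply returns the one counted first, so a production version would need an explicit tie rule.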
According to the analysis method for the occupational distribution, a plurality of access data of an area to be analyzed in a preset time period are obtained from each platform database; identifying user identification information, access position information and access time information from the plurality of access data; performing aggregation processing on the access position information and the access time information by taking each user identity indicated by the user identity information as a basis to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user; matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user; and determining the occupational areas of the users corresponding to the user identification marks in the area to be analyzed based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and a plurality of access times corresponding to the user identification marks.
Compared with prior-art methods that determine users' occupational areas (work and residence places) from communication signaling data, this analysis method obtains a plurality of access data of the area to be analyzed within a preset time period from each platform database. The data sources are broad, the data types are rich, and the collection frequency is high. The access data of different platforms can be unified, and the plurality of access positions and access times corresponding to each user identity representing the same user are aggregated to obtain the user's work and residence places. This helps ensure the diversity and accuracy of the analyzed data, improves the accuracy of the occupational distribution analysis result, and reduces the probability of deviation in the resulting distribution.
Referring to fig. 3, fig. 4 and fig. 5, fig. 3 is a schematic structural diagram of an analysis device for occupational distribution according to an embodiment of the present application, fig. 4 is a schematic structural diagram of the identification module shown in fig. 3, and fig. 5 is a schematic structural diagram of the determination module shown in fig. 3. As shown in fig. 3, the analysis device 300 for occupational distribution includes:
an obtaining module 310, configured to obtain, from each platform database, multiple access data of an area to be analyzed within a preset time period;
an identifying module 320, configured to identify user identification information, access location information, and access time information from the plurality of access data;
a processing module 330, configured to aggregate the access location information and the access time information according to each user identity indicated by the user identity information, to obtain multiple access locations and multiple access times corresponding to each user identity representing the same user;
a matching module 340, configured to match, based on the geographic area represented by each geographic grid in the obtained geographic grid distribution map and the multiple access positions corresponding to each user, a position point corresponding to each access position into a corresponding geographic grid;
a determining module 350, configured to determine, based on a preset working time period, a preset rest time period, the matched geographic grid distribution map, and a plurality of access times corresponding to each user identity, the occupational area of the user corresponding to each user identity in the area to be analyzed.
Further, as shown in fig. 4, the identification module 320 includes:
a normalization processing unit 321 configured to perform format normalization processing on the plurality of access data;
an identifying unit 322, configured to identify the user identification information, the access location information, and the access time information from the access data after the format normalization processing.
Further, the normalization processing unit 321 is further configured to store, in a database to be analyzed, data representing the same type of information from the multiple pieces of access data according to a preset data format.
Further, the identification module 320 further includes:
a first obtaining unit 323, configured to obtain user reservation information from the access data after format normalization processing;
a first determining unit 324, configured to identify the computer serial number and the handheld smart device serial number that the user reservation information indicates belong to the same user, and determine the computer serial number and the handheld smart device serial number as the same user identity representing the same user.
Further, as shown in fig. 5, the determining module 350 includes:
a second obtaining unit 351, configured to obtain a preset working time period and a preset rest time period;
a second determining unit 352, configured to determine, for each user identifier, a plurality of first access times located in the working time period and a plurality of second access times located in the rest time period from the plurality of access times corresponding to the user identifier, and determine, from the plurality of access positions corresponding to the user identifier, a first access position of the user identifier at each first access time and a second access position of the user identifier at each second access time;
a third determining unit 353, configured to determine, based on a location point corresponding to each first access location and a location point corresponding to each second access location, a first target geographic grid and a second target geographic grid from geographic grids of the matched geographic grid distribution map, respectively, and determine that an actual geographic area corresponding to the first target geographic grid is a work place of the user identifier, and an actual geographic area corresponding to the second target geographic grid is a residence place of the user identifier;
wherein the number of first access positions falling in the first target geographic grid is greater than the number of first access positions falling in any other geographic grid of the matched geographic grid distribution map, and the number of second access positions falling in the second target geographic grid is greater than the number of second access positions falling in any other geographic grid of the matched geographic grid distribution map.
According to the analysis device for the occupational distribution, a plurality of access data of an area to be analyzed in a preset time period are obtained from each platform database; identifying user identification information, access position information and access time information from the plurality of access data; performing aggregation processing on the access position information and the access time information by taking each user identity indicated by the user identity information as a basis to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user; matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user; and determining the occupational areas of the users corresponding to the user identification marks in the area to be analyzed based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and a plurality of access times corresponding to the user identification marks.
Compared with prior-art methods that determine users' occupational areas (work and residence places) from communication signaling data, this analysis device obtains a plurality of access data of the area to be analyzed within a preset time period from each platform database. The data sources are broad, the data types are rich, and the collection frequency is high. The access data of different platforms can be unified, and the plurality of access positions and access times corresponding to each user identity representing the same user are aggregated to obtain the user's work and residence places. This helps ensure the diversity and accuracy of the analyzed data, improves the accuracy of the occupational distribution analysis result, and reduces the probability of deviation in the resulting distribution.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 6, the electronic device 600 includes a processor 610, a memory 620, and a bus 630.
The memory 620 stores machine-readable instructions executable by the processor 610, when the electronic device 600 runs, the processor 610 communicates with the memory 620 through the bus 630, and when the machine-readable instructions are executed by the processor 610, the steps of the analysis method of the occupational distribution in the method embodiments shown in fig. 1 and fig. 2 may be performed.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the analysis method for occupational distribution in the method embodiments shown in fig. 1 and fig. 2 may be executed.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. An analysis method for occupational distribution, the analysis method comprising:
acquiring a plurality of access data of an area to be analyzed within a preset time period from each platform database;
identifying user identification information, access position information and access time information from the plurality of access data;
performing aggregation processing on the access position information and the access time information by taking each user identity indicated by the user identity information as a basis to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user;
matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user;
and determining the occupational areas of the users corresponding to the user identification marks in the area to be analyzed based on a preset working time period, a preset rest time period, the matched geographic grid distribution diagram and a plurality of access times corresponding to the user identification marks.
2. The analysis method for occupational distribution of claim 1, wherein identifying user identification information, access location information, and access time information from the plurality of access data comprises:
carrying out format normalization processing on the plurality of access data;
and identifying user identity identification information, access position information and access time information from the access data after format normalization processing.
3. The analysis method for occupational distribution of claim 2, wherein performing format normalization processing on the plurality of access data comprises:
and storing data representing the same type of information in the plurality of access data into a database to be analyzed according to a preset data format.
4. The analysis method for occupational distribution of claim 2, wherein the user identification information comprises: a computer serial number and a handheld intelligent device serial number;
after the format normalization processing is performed on the plurality of access data, and before the user identification information, the access position information and the access time information are identified from the access data after format normalization processing, the analysis method further includes:
acquiring user reservation information from the access data after format normalization processing;
and identifying the computer serial number and the handheld intelligent device serial number that the user reservation information indicates belong to the same user, and determining the computer serial number and the handheld intelligent device serial number as the same user identity representing the same user.
5. The analysis method for occupational distribution according to claim 1, wherein determining the occupational area of the user corresponding to each user identity in the area to be analyzed based on the preset working time period, the preset rest time period, the matched geographic grid distribution map and the plurality of access times corresponding to each user identity comprises:
acquiring a preset working time period and a preset rest time period;
for each user identity, determining a plurality of first access times in the working time period and a plurality of second access times in the rest time period from a plurality of access times corresponding to the user identity, and determining a first access position of the user identity at each first access time and a second access position of the user identity at each second access time from a plurality of access positions corresponding to the user identity;
respectively determining a first target geographic grid and a second target geographic grid from the geographic grids of the matched geographic grid distribution diagram based on the position point corresponding to each first access position and the position point corresponding to each second access position, and determining that the actual geographic area corresponding to the first target geographic grid is the working place of the user identity and the actual geographic area corresponding to the second target geographic grid is the residence place of the user identity;
wherein the number of first access positions falling in the first target geographic grid is greater than the number of first access positions falling in any other geographic grid of the matched geographic grid distribution diagram, and the number of second access positions falling in the second target geographic grid is greater than the number of second access positions falling in any other geographic grid of the matched geographic grid distribution diagram.
6. An analysis device for occupational distribution, the analysis device comprising:
the acquisition module is used for acquiring a plurality of access data of the area to be analyzed in a preset time period from each platform database;
the identification module is used for identifying user identity identification information, access position information and access time information from the plurality of access data;
the processing module is used for carrying out aggregation processing on the access position information and the access time information by taking each user identity indicated by the user identity information as a basis to obtain a plurality of access positions and a plurality of access times corresponding to each user identity representing the same user;
the matching module is used for matching position points corresponding to each access position to the corresponding geographic grids based on the geographic areas represented by the geographic grids in the obtained geographic grid distribution diagram and the plurality of access positions corresponding to each user;
and the determining module is used for determining the occupational area of the user corresponding to each user identity in the area to be analyzed based on the preset working time period, the preset rest time period, the matched geographic grid distribution diagram and a plurality of access times corresponding to each user identity.
7. The analysis device for occupational distribution according to claim 6, wherein the identification module comprises:
the normalization processing unit is used for carrying out format normalization processing on the plurality of access data;
and the identification unit is used for identifying the user identity identification information, the access position information and the access time information from the access data after the format normalization processing.
8. The analysis device for occupational distribution according to claim 7, wherein the normalization processing unit is further configured to store data representing the same type of information from the plurality of access data in the database to be analyzed according to a preset data format.
9. The analysis device for occupational distribution of claim 7, wherein the identification module further comprises:
the first acquisition unit is used for acquiring user reservation information from the access data after format normalization processing;
and the first determining unit is used for identifying the computer serial number and the handheld intelligent device serial number that the user reservation information indicates belong to the same user, and determining the computer serial number and the handheld intelligent device serial number as the same user identity representing the same user.
10. The analysis device for occupational distribution of claim 6, wherein the determination module comprises:
the second acquisition unit is used for acquiring a preset working time period and a preset rest time period;
a second determining unit, configured to determine, for each user identifier, a plurality of first access times located in the working time period and a plurality of second access times located in the rest time period from a plurality of access times corresponding to the user identifier, and a first access position of the user identifier at each first access time and a second access position of the user identifier at each second access time from a plurality of access positions corresponding to the user identifier;
a third determining unit, configured to determine, based on a location point corresponding to each first access location and a location point corresponding to each second access location, a first target geographic grid and a second target geographic grid from geographic grids of the matched geographic grid distribution map, respectively, and determine that an actual geographic area corresponding to the first target geographic grid is a work place of the user identifier, and an actual geographic area corresponding to the second target geographic grid is a residence place of the user identifier;
wherein the number of first access positions falling in the first target geographic grid is greater than the number of first access positions falling in any other geographic grid of the matched geographic grid distribution diagram, and the number of second access positions falling in the second target geographic grid is greater than the number of second access positions falling in any other geographic grid of the matched geographic grid distribution diagram.
11. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is running, the machine-readable instructions, when executed by the processor, performing the steps of the analysis method for occupational distribution of any one of claims 1 to 5.
12. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when executed by a processor, performs the steps of the analysis method for occupational distribution as claimed in any one of claims 1 to 5.
CN201910824917.6A 2019-09-02 2019-09-02 Analysis method and analysis device for occupational distribution and readable storage medium Active CN110659320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910824917.6A CN110659320B (en) 2019-09-02 2019-09-02 Analysis method and analysis device for occupational distribution and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910824917.6A CN110659320B (en) 2019-09-02 2019-09-02 Analysis method and analysis device for occupational distribution and readable storage medium

Publications (2)

Publication Number Publication Date
CN110659320A true CN110659320A (en) 2020-01-07
CN110659320B CN110659320B (en) 2022-08-09

Family

ID=69036708

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910824917.6A Active CN110659320B (en) 2019-09-02 2019-09-02 Analysis method and analysis device for occupational distribution and readable storage medium

Country Status (1)

Country Link
CN (1) CN110659320B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428197A (en) * 2020-03-18 2020-07-17 北京城市象限科技有限公司 Data processing method, device and equipment
CN113268679A (en) * 2021-04-19 2021-08-17 宁波市测绘和遥感技术研究院 Visual processing method based on internet big data

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102356390A (en) * 2009-03-16 2012-02-15 微软公司 Flexible logging, such as for a web server
CN103136959A (en) * 2011-11-25 2013-06-05 北京中交兴路信息科技有限公司 Method for aggregating and displaying mobile target information in mobile target monitoring
CN103154928A (en) * 2010-06-24 2013-06-12 奥比融移动有限公司 Network server arrangement for processing non-parametric, multi-dimensional, spatial and temporal human behavior or technical observations measured pervasively, and related method for the same
US20130227023A1 (en) * 2005-10-26 2013-08-29 Cortica, Ltd. System and method for profiling users interest based on multimedia content analysis
CN104902438A (en) * 2015-05-04 2015-09-09 林茜茜 Statistical method and system for analyzing passenger flow characteristic information on the basis of mobile communication terminal
CN106161553A (en) * 2015-04-16 2016-11-23 腾讯科技(深圳)有限公司 Community application information-pushing method and system
CN106547894A (en) * 2016-11-03 2017-03-29 浙江夏农信息技术有限公司 The system and method that location tags are lived in duty is excavated based on mobile communication signaling big data
CN106792514A (en) * 2016-11-30 2017-05-31 南京华苏科技有限公司 User's duty residence analysis method based on signaling data
CN107092638A (en) * 2012-06-22 2017-08-25 谷歌公司 The method and computing device of coherent element information are provided based on position from map history
US20170286845A1 (en) * 2016-04-01 2017-10-05 International Business Machines Corporation Automatic extraction of user mobility behaviors and interaction preferences using spatio-temporal data
CN107527313A (en) * 2016-06-20 2017-12-29 同济大学 User Activity mode division and attribute estimation method
CN108170741A (en) * 2017-12-18 2018-06-15 北京中交兴路信息科技有限公司 A kind of geographic position data and the matching process and device of administrative region

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130227023A1 (en) * 2005-10-26 2013-08-29 Cortica, Ltd. System and method for profiling users interest based on multimedia content analysis
CN102356390A (en) * 2009-03-16 2012-02-15 微软公司 Flexible logging, such as for a web server
CN103154928A (en) * 2010-06-24 2013-06-12 奥比融移动有限公司 Network server arrangement for processing non-parametric, multi-dimensional, spatial and temporal human behavior or technical observations measured pervasively, and related method for the same
CN103136959A (en) * 2011-11-25 2013-06-05 北京中交兴路信息科技有限公司 Method for aggregating and displaying mobile target information in mobile target monitoring
CN107092638A (en) * 2012-06-22 2017-08-25 谷歌公司 The method and computing device of coherent element information are provided based on position from map history
CN106161553A (en) * 2015-04-16 2016-11-23 腾讯科技(深圳)有限公司 Community application information-pushing method and system
CN104902438A (en) * 2015-05-04 2015-09-09 林茜茜 Statistical method and system for analyzing passenger flow characteristic information on the basis of mobile communication terminal
US20170286845A1 (en) * 2016-04-01 2017-10-05 International Business Machines Corporation Automatic extraction of user mobility behaviors and interaction preferences using spatio-temporal data
CN107527313A (en) * 2016-06-20 2017-12-29 同济大学 User Activity mode division and attribute estimation method
CN106547894A (en) * 2016-11-03 2017-03-29 浙江夏农信息技术有限公司 The system and method that location tags are lived in duty is excavated based on mobile communication signaling big data
CN106792514A (en) * 2016-11-30 2017-05-31 南京华苏科技有限公司 User's duty residence analysis method based on signaling data
CN108170741A (en) * 2017-12-18 2018-06-15 北京中交兴路信息科技有限公司 A kind of geographic position data and the matching process and device of administrative region

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
S ZENG et al.: "Predictability and Prediction of Human Mobility Based on Application-Collected Location Data", 《IEEE INTERNATIONAL CONFERENCE ON MOBILE AD HOC & SENSOR SYSTEMS》 *
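Mao Feng: "Research on residents' commuting behavior and urban job-housing spatial characteristics based on multi-source trajectory data mining", 《China Doctoral and Master's Dissertations Full-text Database (Doctoral), Basic Sciences》 *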

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428197A (en) * 2020-03-18 2020-07-17 北京城市象限科技有限公司 Data processing method, device and equipment
CN111428197B (en) * 2020-03-18 2024-02-09 北京城市象限科技有限公司 Data processing method, device and equipment
CN113268679A (en) * 2021-04-19 2021-08-17 宁波市测绘和遥感技术研究院 Visual processing method based on internet big data

Also Published As

Publication number Publication date
CN110659320B (en) 2022-08-09

Similar Documents

Publication Publication Date Title
CN103220376B (en) Method for positioning IP (Internet Protocol) by position data of mobile terminal
CN107330459B (en) Data processing method and device and electronic equipment
CN109635857B (en) Human-vehicle track monitoring and analyzing method, device, equipment and storage medium
EP1738524B1 (en) Method and system for generating a population representative of a set of users of a communication network
CN111212383B (en) Method, device, server and medium for determining number of regional permanent population
CN110659320B (en) Analysis method and analysis device for occupational distribution and readable storage medium
CN104699835A (en) Method and device used for determining webpages including POI (point of interest) data
KR101783721B1 (en) Group targeting system and group targeting method using range ip
CN108427679B (en) People stream distribution processing method and equipment thereof
WO2015084584A2 (en) Method and system for collecting resource access information
EP2495696A1 (en) Management server, population information calculation management server, zero population distribution area management method, and population information calculation method
CN111177289A (en) Method and system for extracting and checking related information of data space of multi-source network
CN111447292B (en) IPv6 geographical position positioning method, device, equipment and storage medium
CN110012436B (en) User position determination method, device, equipment and computer readable storage medium
CN103177189A (en) Public source position check-in data quality analysis method
CN114780556A (en) Method and device for determining update frequency of map
CN105069079B (en) Method and device for screening POI (Point of interest) data
CN109729123B (en) Method and device for monitoring advertisement delivery region
CN110990651B (en) Address data processing method and device, electronic equipment and computer readable medium
CN111143639B (en) User intimacy calculation method, device, equipment and medium
Woods et al. Exploring methods for mapping seasonal population changes using mobile phone data
CN111092764A (en) Real-time dynamic intimacy relationship analysis method and system
CN110943989A (en) Equipment identification method and device, electronic equipment and readable storage medium
CN111126653A (en) User position prediction method, device and storage medium
CN111797181B (en) Positioning method, device, control equipment and storage medium for user location

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant