CN111209487B - User data analysis method, server, and computer-readable storage medium - Google Patents

User data analysis method, server, and computer-readable storage medium

Info

Publication number
CN111209487B
Authority
CN
China
Prior art keywords
consumption
user
community
score
place
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010008906.3A
Other languages
Chinese (zh)
Other versions
CN111209487A (en
Inventor
喻宁
史良洵
陈克炎
朱园丽
朱艳乔
陈皓云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd, Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010008906.3A priority Critical patent/CN111209487B/en
Publication of CN111209487A publication Critical patent/CN111209487A/en
Application granted granted Critical
Publication of CN111209487B publication Critical patent/CN111209487B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Abstract

The invention relates to data analysis technology and discloses a user data analysis method comprising the following steps: collecting LBS track data of a user within a preset time period; obtaining POI data within a preset range; matching the LBS track data of the user with the POI data to determine the consumption places where the user appears; acquiring consumption information corresponding to each consumption place; grading each consumption place according to the consumption information and a preset rule; taking consumption places with the same grade score as a community seed set, and classifying the users with a CNM (Clauset-Newman-Moore) community discovery model according to the grade scores of the consumption places corresponding to each user; and establishing a consumption capability label for each user according to the grade score of the community to which the user belongs. The invention also provides a server and a computer-readable storage medium. The invention can effectively judge the consumption capability of a user without acquiring information such as the user's consumption records or consumption behavior data.

Description

User data analysis method, server, and computer-readable storage medium
Technical Field
The present invention relates to the field of data analysis technologies, and in particular, to a user data analysis method, a server, and a computer-readable storage medium.
Background
At present, the affluence and consumption capability of a user are usually determined by surveying the user with questionnaires or by querying bank transaction records, credit card consumption records, and the like. The consumption capability of a user can also be judged from online consumption behavior data. However, an enterprise that holds no consumption data cannot judge the consumption capability of its users in these ways.
Therefore, how to judge the consumption capability of a user without acquiring information such as consumption records or consumption behavior data has become a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, the present invention provides a user data analysis method, a server and a computer readable storage medium to solve the above technical problems.
First, in order to achieve the above object, the present invention provides a user data analysis method, including:
collecting LBS track data of a user in a preset time period;
POI data in a preset range are obtained;
matching the LBS track data and the POI data of the user, and analyzing the consumption place of the user;
acquiring consumption information corresponding to the consumption place;
grading each consumption place according to the consumption information and a preset rule;
taking the consumption places with the same grade score as a community seed set, and classifying the users by using a CNM (Clauset-Newman-Moore) community discovery model according to the grade score of the consumption place corresponding to each user; and
establishing a consumption capability label for each user according to the grade score of the community to which the user belongs.
Optionally, the consumption information includes a place type and a consumption level, and in the step of obtaining the consumption information corresponding to the consumption place, the consumption information of each consumption place is obtained from the POI data and the public opinion data.
Optionally, the step of performing a grade score for each consumption location according to the consumption information and a preset rule includes:
presetting the weight of each scoring item;
scoring each scoring item of each consumption place according to a preset rule;
calculating the total score of the consumption place according to the score and the weight of each scoring item;
and obtaining the grade score of the place according to the total score and a preset score interval corresponding to each grade.
Optionally, the step of performing a grade score for each consumption location according to the consumption information and preset rules further includes:
and grading each consumption place by using a logistic regression model, wherein data output by the model is a grading result of the consumption place.
Optionally, the step of classifying the users by using the CNM community discovery model according to the grade score of the consumption location corresponding to each user includes:
and classifying all the users into one of the communities by using the CNM community discovery model according to the grade score of the consumption place corresponding to each user, the occurrence frequency and the residence time of the user in each consumption place.
Optionally, in the step of classifying the users by using a CNM community discovery model according to the grade scores of the consumption places corresponding to the users, each user is classified into a community with the highest frequency of occurrence and the longest residence time, where in the CNM community discovery model, the weights of the frequency of occurrence and the residence time are 50% respectively.
Optionally, the step of establishing a consumption capability tag for each user according to the rating score of the community to which the user belongs includes:
after each user is classified into a community, the grade score of a consumption place corresponding to each community is used as the consumption capacity grade score of each user in the community, and a consumption capacity label corresponding to the consumption capacity grade score is established for the user.
Optionally, after obtaining the consumption capability grade score of each user, the method further comprises:
for each user, combining the consumption capability grade scores from a plurality of time periods to obtain a comprehensive grade score, and establishing for the user a consumption capability label corresponding to the comprehensive grade score.
In addition, in order to achieve the above object, the present invention further provides a server, which includes a memory and a processor, wherein the memory stores a user data analysis system capable of running on the processor, and the user data analysis system implements the steps of the user data analysis method when being executed by the processor.
Further, to achieve the above object, the present invention also provides a computer readable storage medium storing a user data analysis system, which is executable by at least one processor to cause the at least one processor to perform the steps of the user data analysis method as described above.
Compared with the prior art, the user data analysis method, the server and the computer-readable storage medium provided by the invention can analyze the consumption places where the users appear through the LBS track data and the POI data of the users, then score according to the consumption information of the consumption places, and classify the users by using the CNM community discovery model according to the appearance frequency and residence time of the users in each consumption place, thereby judging the consumption capacity of the users. The invention can analyze and obtain the consumption capability of the user without acquiring the information of the consumption record or the consumption behavior data and the like of the user, and provides a scheme for effectively judging the consumption capability of the user for enterprises with LBS track data but without the user consumption data (consumption record or consumption behavior data) in the industry.
Drawings
FIG. 1 is a schematic diagram of an alternative hardware architecture for a server according to the present invention;
FIG. 2 is a schematic diagram of program modules of a first embodiment of a user data analysis system according to the present invention;
FIG. 3 is a schematic diagram of program modules of a second embodiment of a user data analysis system according to the present invention;
FIG. 4 is a flowchart illustrating a first embodiment of a user data analysis method according to the present invention;
FIG. 5 is a flowchart illustrating a second embodiment of a user data analysis method according to the present invention;
The implementation, functional features, and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Fig. 1 is a schematic diagram of an alternative hardware architecture of the server 2 according to the present invention.
In this embodiment, the server 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus. It is noted that fig. 1 only shows the server 2 with components 11-13, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The server 2 may be a rack server, a blade server, a tower server, or a cabinet server, and the server 2 may be an independent server or a server cluster formed by a plurality of servers.
The memory 11 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 11 may be an internal storage unit of the server 2, such as a hard disk or a memory of the server 2. In other embodiments, the memory 11 may also be an external storage device of the server 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card provided on the server 2. Of course, the memory 11 may also comprise both an internal storage unit of the server 2 and an external storage device thereof. In this embodiment, the memory 11 is generally used for storing the operating system installed in the server 2 and various types of application software, such as the program code of the user data analysis system 200. Furthermore, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the server 2. In this embodiment, the processor 12 is configured to run the program codes stored in the memory 11 or process data, for example, run the user data analysis system 200.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is generally used for establishing communication connection between the server 2 and other electronic devices.
The hardware structure and functions of the related devices of the present invention have been described in detail so far. Various embodiments of the present invention will be presented based on the above description.
First, the present invention provides a user data analysis system 200.
Referring to fig. 2, a program module diagram of a first embodiment of a user data analysis system 200 according to the present invention is shown.
In this embodiment, the user data analysis system 200 includes a series of computer program instructions stored on the memory 11, which when executed by the processor 12, may implement the user data analysis operations of the embodiments of the present invention. In some embodiments, the user data analysis system 200 may be divided into one or more modules based on the particular operations implemented by the portions of the computer program instructions. For example, in fig. 2, the user data analysis system 200 may be divided into a collection module 201, an obtaining module 202, a matching module 203, a scoring module 204, a classification module 205, and an establishing module 206. Wherein:
The collection module 201 is configured to collect LBS track data of a user within a preset time period.
Specifically, a Location Based Service (LBS) uses various positioning technologies to obtain the current location of a positioning device and provides information resources and basic services to that device through the mobile internet. The mobile terminal corresponding to the user obtains the user's geographic coordinates over a wireless communication network (or a satellite positioning system), and this information is integrated with other information in a spatial database to provide the user with the desired location-related value-added services. Common LBS positioning technologies currently include the Global Positioning System, WiFi positioning, IP address positioning, and triangulation. The main characteristics of LBS are wide coverage, high positioning accuracy, and simple operation.
The information specifically included in the LBS track data is that the device (generally, a mobile terminal) corresponding to the user appears at a certain location (latitude and longitude) at a certain time point, that is, the information includes time and location (latitude and longitude).
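As an illustration of the record described above, the following minimal sketch models one LBS observation as a time plus a latitude/longitude pair and filters a track to a preset time period. The class name `TrackPoint`, the field names, and the sample coordinates are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrackPoint:
    """One LBS observation: a device seen at a location at a point in time."""
    device_id: str       # hypothetical identifier of the user's mobile terminal
    timestamp: datetime  # time at which the device was observed
    latitude: float      # degrees
    longitude: float     # degrees


def in_period(points, start, end):
    """Keep only the points whose timestamp falls in [start, end)."""
    return [p for p in points if start <= p.timestamp < end]


# Illustrative data only.
points = [
    TrackPoint("001", datetime(2020, 1, 6, 18, 5), 22.5431, 114.0579),
    TrackPoint("001", datetime(2020, 1, 14, 7, 20), 22.5440, 114.0590),
]
# A preset time period of one week.
week = in_period(points, datetime(2020, 1, 6), datetime(2020, 1, 13))
```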
The preset time period may be one week, one month, etc. All LBS track data of each user within the preset time period is collected.
The obtaining module 202 is configured to obtain Point of Interest (POI) data within a preset range.
Specifically, in a geographic information system, a POI may be a house, a shop, a mailbox, a bus station, and the like. Each POI contains information such as a name, a category, and coordinates, where each category corresponds to the code and name of the corresponding business. In short, the POI data shows which place each location (latitude and longitude) corresponds to. Typically, the POI data includes point-of-interest location information, point-of-interest categories, point-of-interest maps, and the like.
The preset range may be an area where the user is located or a geographical range covered by the LBS track data of the user.
The matching module 203 is configured to match the LBS track data of the user with the POI data, and analyze a consumption location where the user appears.
Specifically, which positions belong to the consumption places can be known according to the POI data, and then the LBS track data of the user is matched with the POI data, so that the consumption places corresponding to the LBS track data, namely which consumption places the user appears at, can be analyzed.
In this embodiment, the device number of the mobile terminal, the times at which the user enters and leaves each consumption place, and the consumption place where the user appears may be recorded in association with one another. For example: user A / device number 001: Monday 18:05-19:32, restaurant B; Tuesday 07:20-07:25, convenience store C; Saturday 10:30-11:50, mall D; Saturday 11:56-12:47, restaurant E; Sunday 15:19-16:24, pet hospital F.
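The patent does not specify how a track point is matched to a POI. One plausible approach, sketched below under that assumption, is to pick the nearest consumption-place POI within a distance threshold using the haversine great-circle distance. The 50 m threshold, the dictionary keys, and the coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_poi(lat, lon, pois, max_distance_m=50.0):
    """Return the nearest consumption-place POI within max_distance_m, or None."""
    best, best_d = None, max_distance_m
    for poi in pois:
        if not poi["is_consumption_place"]:
            continue  # e.g. a bus station is a POI but not a consumption place
        d = haversine_m(lat, lon, poi["lat"], poi["lon"])
        if d <= best_d:
            best, best_d = poi, d
    return best


# Illustrative POI data only.
pois = [
    {"name": "restaurant B", "lat": 22.5431, "lon": 114.0579, "is_consumption_place": True},
    {"name": "bus station", "lat": 22.5432, "lon": 114.0580, "is_consumption_place": False},
    {"name": "mall D", "lat": 22.5500, "lon": 114.0700, "is_consumption_place": True},
]
hit = match_poi(22.54312, 114.05792, pois)   # a few metres from restaurant B
miss = match_poi(22.6000, 114.2000, pois)    # far from every POI
```

Consecutive matched points at the same place can then be merged into a single visit with an entry and exit time, which yields records like the example above.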
The obtaining module 202 is further configured to obtain consumption information corresponding to the consumption location.
Specifically, the consumption information may include a place type (e.g., a mall, a convenience store, a hotel, a restaurant, a hospital, a pet hospital, etc.), a consumption level (e.g., a per-person consumption amount), and the like. In this embodiment, the consumption information of each consumption place may be acquired from data sources such as the POI data and the public opinion data.
The scoring module 204 is configured to grade each consumption place according to the consumption information and a preset rule.
Specifically, weights of various scoring items (for example, site types and consumption levels in the consumption information) are preset, each scoring item of each consumption site is scored according to a preset rule, and then a total score of the site is calculated according to the score and the weight of each scoring item. And finally, grading the place according to the total score and a preset score interval corresponding to each grade.
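The weighted scoring just described can be sketched as follows. All weights, item scores, the per-person amount rule, and the grade intervals are hypothetical placeholders, since the patent leaves the preset rule open.

```python
import bisect

# Hypothetical weights for the two scoring items named in the text.
WEIGHTS = {"place_type": 0.4, "consumption_level": 0.6}

# Hypothetical item scores (0-100) per place type.
PLACE_TYPE_SCORE = {"convenience store": 20, "restaurant": 50, "mall": 60, "pet hospital": 70}

# Hypothetical score intervals: grade g covers totals in [10*g, 10*(g+1)); a total of 100 is grade 10.
GRADE_BOUNDS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


def consumption_level_score(per_person_amount):
    """Map a per-person consumption amount to a 0-100 item score (hypothetical rule)."""
    return min(100.0, per_person_amount / 5.0)


def grade_place(place_type, per_person_amount):
    """Return (total score, grade) for one consumption place."""
    total = (WEIGHTS["place_type"] * PLACE_TYPE_SCORE[place_type]
             + WEIGHTS["consumption_level"] * consumption_level_score(per_person_amount))
    # Count how many interval bounds the total has reached; that count is the grade.
    grade = bisect.bisect_right(GRADE_BOUNDS, total)
    return total, grade


restaurant_total, restaurant_grade = grade_place("restaurant", 275)
store_total, store_grade = grade_place("convenience store", 125)
```

With these placeholder values the restaurant totals about 53 and falls in the grade 5 interval, while the convenience store totals about 23 and falls in grade 2.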
Alternatively, a logistic regression model may be used to grade each consumption place, and the data output by the model is the grade score of that consumption place.
In this embodiment, the grade score of each consumption place obtained in the manner described above may range from 0 to 10.
For example, convenience store C has a grade score of 2, restaurant B and restaurant E each have a grade score of 5, mall D has a grade score of 6, and pet hospital F has a grade score of 7.
The classification module 205 is configured to use the consumption places with the same score as a community seed set, and classify the users by using a CNM community discovery model according to the grade score of the consumption place corresponding to each user.
Specifically, in the CNM community discovery model, consumption places with the same grade score are used as a community seed set (for example, restaurant B and restaurant E, both with a grade score of 5, belong to the same community), and all users are then classified into communities by the CNM community discovery model according to the grade scores of the consumption places corresponding to each user. The weight of a community edge is calculated from the user's frequency of occurrence and residence time in the community (a set of consumption places), each weighted at 50%. Note that the CNM community discovery model classifies every user into exactly one community.
That is, according to the above processing procedure, a plurality of consumption places where each user comes in and goes out, the frequency of occurrence and the residence time at each consumption place can be obtained, then, through the CNM community discovery model, the user is classified according to the frequency of occurrence and the residence time of each user at each consumption place, and finally, each user is classified into the community with the highest frequency of occurrence and the longest residence time.
For example, convenience store C belongs to the second community (grade score of 2), restaurant B and restaurant E belong to the fifth community (grade score of 5), mall D belongs to the sixth community (grade score of 6), and pet hospital F belongs to the seventh community (grade score of 7). Considering both frequency of occurrence and residence time, user A visits the consumption places of the fifth community most often and for the longest time, so user A is classified into the fifth community.
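The sketch below illustrates only the final assignment rule from this example: each community is weighted by the user's share of visits and share of residence time, 50% each, and the user joins the community with the largest weight. It is a simplified stand-in and does not implement the CNM (Clauset-Newman-Moore) greedy modularity algorithm itself. Normalizing by shares is an assumption, and the dwell minutes are derived from the visit example given earlier for user A.

```python
from collections import defaultdict

# Grade score of each consumption place (the community it seeds).
PLACE_GRADE = {"convenience store C": 2, "restaurant B": 5, "restaurant E": 5,
               "mall D": 6, "pet hospital F": 7}

# (place, residence time in minutes) for user A.
VISITS_A = [("restaurant B", 87), ("convenience store C", 5), ("mall D", 80),
            ("restaurant E", 51), ("pet hospital F", 65)]


def assign_community(visits, place_grade, w_freq=0.5, w_dwell=0.5):
    """Return the grade of the community with the largest 50/50 weight, or None."""
    if not visits:
        return None
    count = defaultdict(int)
    minutes = defaultdict(float)
    for place, dwell in visits:
        grade = place_grade[place]
        count[grade] += 1
        minutes[grade] += dwell
    total_count = sum(count.values())
    total_minutes = sum(minutes.values())

    def edge_weight(grade):
        freq_share = count[grade] / total_count
        dwell_share = minutes[grade] / total_minutes if total_minutes else 0.0
        return w_freq * freq_share + w_dwell * dwell_share

    return max(count, key=edge_weight)


community_of_a = assign_community(VISITS_A, PLACE_GRADE)
```

For user A the fifth community holds 2 of 5 visits and 138 of 288 minutes, which gives it the largest weight, in agreement with the example.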
The establishing module 206 is configured to establish a consumption capability label for each user according to the rating score of the community to which the user belongs.
Specifically, after each user is classified into one of the communities, the grade score of the consumption places corresponding to each community is used as the consumption capability grade score of each user in that community, and a corresponding consumption capability label is established for every user. For example, the grade score of 5 of the fifth community is used as the consumption capability grade score of user A, and the corresponding consumption capability label is established for user A.
The user data analysis system provided by this embodiment may analyze the consumption places where the user appears through the LBS track data and the POI data of the user, score according to consumption information of the consumption places, and classify the user by using the CNM community discovery model according to the frequency and residence time of the user appearing in each consumption place, thereby determining the consumption capability of the user. According to the embodiment, the consumption capacity of the user can be analyzed without acquiring information such as consumption records or consumption behavior data of the user, and a scheme for effectively judging the consumption capacity of the user is provided for enterprises with LBS track data but without user consumption data (consumption records or consumption behavior data) in the industry.
Fig. 3 is a schematic diagram of the program modules of a second embodiment of the user data analysis system 200 according to the present invention. In this embodiment, in addition to the collection module 201, the obtaining module 202, the matching module 203, the scoring module 204, the classification module 205, and the establishing module 206 of the first embodiment, the user data analysis system 200 further includes a synthesis module 207.
The synthesis module 207 is configured to synthesize the rating scores of the multiple time periods for each user to obtain a comprehensive rating score, which is used as a consumption capability label of the user.
Specifically, LBS track data of the user may be collected over multiple time periods (for example, if the preset time period is one week, one month of data, i.e. four weeks, is collected to obtain four sets of LBS track data). A consumption capability grade score is obtained for each time period by the process described above (yielding four grade scores), and the grade scores of all time periods are averaged to give the final comprehensive grade score, which is used as the consumption capability label of the user.
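A minimal sketch of this averaging step follows. Rounding the average to a whole grade is an assumption, since the text only says that an average value is taken; the function name `composite_grade` is hypothetical.

```python
from statistics import mean


def composite_grade(period_grades):
    """Average the per-period grade scores and round to the nearest whole grade.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    if not period_grades:
        raise ValueError("at least one time period is required")
    return round(mean(period_grades))


# Four weekly grade scores for one user (illustrative values).
label = composite_grade([5, 6, 5, 5])
```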
The user data analysis system provided by this embodiment can synthesize multiple consumption ability level scores of the user obtained according to the LBS track data analysis in multiple time periods, and more accurately determine the consumption ability of the user.
In addition, the invention also provides a user data analysis method.
Fig. 4 is a schematic flow chart of a user data analysis method according to a first embodiment of the present invention. In this embodiment, the execution order of the steps in the flowchart shown in fig. 4 may be changed and some steps may be omitted according to different requirements.
The method comprises the following steps:
Step S400, collecting LBS track data of a user within a preset time period.
Specifically, a Location Based Service (LBS) uses various positioning technologies to obtain the current location of a positioning device and provides information resources and basic services to that device through the mobile internet. The mobile terminal corresponding to the user obtains the user's geographic coordinates over a wireless communication network (or a satellite positioning system), and this information is integrated with other information in a spatial database to provide the user with the desired location-related value-added services. Common LBS positioning technologies currently include the Global Positioning System, WiFi positioning, IP address positioning, and triangulation. The main characteristics of LBS are wide coverage, high positioning accuracy, and simple operation.
The information specifically included in the LBS track data is that the device (generally, a mobile terminal) corresponding to the user appears at a certain location (latitude and longitude) at a certain time point, that is, the information includes time and location (latitude and longitude).
The preset time period may be one week, one month, etc. All LBS track data of each user within the preset time period is collected.
Step S402, acquiring POI data within a preset range.
Specifically, in a geographic information system, a POI may be a house, a shop, a mailbox, a bus station, and the like. Each POI contains information such as a name, a category, and coordinates, where each category corresponds to the code and name of the corresponding business. In short, the POI data shows which place each location (latitude and longitude) corresponds to. Typically, the POI data includes point-of-interest location information, point-of-interest categories, point-of-interest maps, and the like.
The preset range may be an area where the user is located or a geographical range covered by the LBS track data of the user.
Step S404, matching the LBS track data of the user with the POI data, and analyzing the consumption places where the user appears.
Specifically, which positions belong to the consumption places can be known according to the POI data, and then the LBS track data of the user is matched with the POI data, so that the consumption places corresponding to the LBS track data, namely which consumption places the user appears at, can be analyzed.
In this embodiment, the device number of the mobile terminal, the times at which the user enters and leaves each consumption place, and the consumption place where the user appears may be recorded in association with one another. For example: user A / device number 001: Monday 18:05-19:32, restaurant B; Tuesday 07:20-07:25, convenience store C; Saturday 10:30-11:50, mall D; Saturday 11:56-12:47, restaurant E; Sunday 15:19-16:24, pet hospital F.
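The frequency of occurrence and residence time used later in the classification step can be derived from such records. The sketch below parses the entry and exit times from the example and accumulates visits and minutes per place. The tuple layout and function names are hypothetical, and each stay is assumed to begin and end on the same day.

```python
from collections import defaultdict
from datetime import datetime

# (entry time, exit time, place) for device 001, taken from the example above.
RECORDS = [
    ("18:05", "19:32", "restaurant B"),
    ("07:20", "07:25", "convenience store C"),
    ("10:30", "11:50", "mall D"),
    ("11:56", "12:47", "restaurant E"),
    ("15:19", "16:24", "pet hospital F"),
]


def dwell_minutes(entry, exit_):
    """Residence time in minutes for a stay that does not cross midnight."""
    fmt = "%H:%M"
    delta = datetime.strptime(exit_, fmt) - datetime.strptime(entry, fmt)
    return delta.total_seconds() / 60


def summarize(records):
    """Accumulate the visit count and total residence time per consumption place."""
    stats = defaultdict(lambda: {"visits": 0, "minutes": 0.0})
    for entry, exit_, place in records:
        stats[place]["visits"] += 1
        stats[place]["minutes"] += dwell_minutes(entry, exit_)
    return dict(stats)


summary = summarize(RECORDS)
```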
Step S406, acquiring consumption information corresponding to the consumption place.
Specifically, the consumption information may include a place type (e.g., a mall, a convenience store, a hotel, a restaurant, a hospital, a pet hospital, etc.), a consumption level (e.g., a per-person consumption amount), and the like. In this embodiment, the consumption information of each consumption place may be acquired from data sources such as the POI data and the public opinion data.
Step S408, grading each consumption place according to the consumption information and a preset rule.
Specifically, weights of various scoring items (for example, site types and consumption levels in the consumption information) are preset, each scoring item of each consumption site is scored according to a preset rule, and then a total score of the site is calculated according to the score and the weight of each scoring item. And finally, grading the place according to the total score and a preset score interval corresponding to each grade.
Alternatively, a logistic regression model may be used to grade each consumption place, and the data output by the model is the grade score of that consumption place.
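The patent does not describe the features, the training data, or how the output of the logistic regression model maps to a grade. Purely as an illustration, the sketch below assumes a binary logistic model whose predicted probability is scaled to the 0-10 range. The coefficients are hypothetical and untrained, and the `is_high_end_type` feature is an invented example.

```python
import math

# Hypothetical, untrained coefficients; a real model would be fitted on labelled places.
INTERCEPT = -3.0
COEF_PER_PERSON_AMOUNT = 0.01  # per currency unit of per-person consumption
COEF_IS_HIGH_END_TYPE = 1.0    # 1 if the place type is treated as high-end, else 0


def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_grade(per_person_amount, is_high_end_type):
    """Return (probability, grade), with the grade scaled to 0-10 (assumed mapping)."""
    z = (INTERCEPT
         + COEF_PER_PERSON_AMOUNT * per_person_amount
         + COEF_IS_HIGH_END_TYPE * is_high_end_type)
    probability = sigmoid(z)
    return probability, round(10 * probability)


mid_probability, mid_grade = logistic_grade(300, 0)
low_probability, low_grade = logistic_grade(50, 0)
```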
In this embodiment, the rating scores for each consumer location may be divided into 0-10 in the manner described above.
For example, convenience store C has a rating of 2, restaurant B and restaurant E each have a rating of 5, mall C has a rating of 6, and pet hospital F has a rating of 7.
Step S410, taking the consumption places with the same score as a community seed set, and classifying the users by using a CNM (Clauset-Newman-Moore) community discovery model according to the grade score of the consumption place corresponding to each user.
Specifically, in the CNM community discovery model, the consumption places with the same score are used as a community seed set (for example, restaurant B and restaurant E, both with a grade score of 5, belong to the same community), and each user is then classified into one of the communities by the CNM community discovery model according to the grade scores of the consumption places corresponding to that user. The weight of a community edge is calculated from the user's frequency of occurrence and residence time in the community (the set of consumption places), each weighted at 50%. Note that the CNM community discovery model classifies every user into a community, and each user is classified into exactly one community.
That is, steps S400-S404 yield the consumption places each user visits, together with the user's frequency of occurrence and residence time at each consumption place. The CNM community discovery model then classifies each user according to that frequency and residence time, and each user is finally classified into the community in which the user appears most frequently and stays longest.
For example, convenience store C belongs to the second community (grade score of 2), restaurant B and restaurant E belong to the fifth community (grade score of 5), mall D belongs to the sixth community (grade score of 6), and pet hospital F belongs to the seventh community (grade score of 7). In terms of frequency of occurrence and residence time, user A visits the consumption places of the fifth community most frequently and for the longest time, so user A is classified into the fifth community.
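As an illustration of the 50%/50% weighting, the sketch below assigns a user to the community (grade score) with the highest combined share of visits and residence time. It is a simplification and not the CNM algorithm itself, which forms communities by greedy modularity optimization on a graph. The function name `assign_community`, the normalization by the user's own totals, and the tie-breaking rule are assumptions. The visit times are user A's example visits expressed as minutes since midnight.

```python
from collections import defaultdict


def assign_community(user_visits, place_grade, w_freq=0.5, w_dwell=0.5):
    """Pick the community with the highest weighted visit and dwell share.

    user_visits: list of (place, enter_minute, leave_minute).
    place_grade: dict mapping place name to its grade score; places with
    the same grade score form one community.
    Returns the grade score of the chosen community, or None if the user
    has no visit to a graded place. Ties go to the lower grade score.
    """
    freq = defaultdict(int)
    dwell = defaultdict(float)
    for place, enter, leave in user_visits:
        g = place_grade.get(place)
        if g is None:
            continue  # place was not graded: ignore the visit
        freq[g] += 1
        dwell[g] += leave - enter
    if not freq:
        return None

    total_freq = sum(freq.values())
    total_dwell = sum(dwell.values())

    def edge_weight(g):
        freq_share = freq[g] / total_freq
        dwell_share = dwell[g] / total_dwell if total_dwell else 0.0
        return w_freq * freq_share + w_dwell * dwell_share

    # max() keeps the first maximum, so sorting makes ties deterministic.
    return max(sorted(freq), key=edge_weight)


place_grade = {
    "convenience store C": 2,
    "restaurant B": 5,
    "restaurant E": 5,
    "mall D": 6,
    "pet hospital F": 7,
}
user_a_visits = [
    ("restaurant B", 1085, 1172),        # 18:05-19:32
    ("convenience store C", 440, 445),   # 07:20-07:25
    ("mall D", 630, 710),                # 10:30-11:50
    ("restaurant E", 716, 767),          # 11:56-12:47
    ("pet hospital F", 919, 984),        # 15:19-16:24
]
user_a_community = assign_community(user_a_visits, place_grade)
```

For user A, the fifth community accounts for 2 of 5 visits and 138 of 288 minutes, which gives it the largest combined weight, consistent with the example in the text.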
Step S412, establishing a consumption capability label for each user according to the grade score of the community to which the user belongs.
Specifically, after each user is classified into one of the communities, the grade score of the consumption places corresponding to each community is used as the consumption capability grade score of every user in that community, and a corresponding consumption capability label is established for each user. For example, the grade score of 5 of the fifth community is used as the consumption capability label of user A; that is, user A has a consumption capability grade score of 5.
The user data analysis method provided by this embodiment analyzes the consumption places where a user appears from the user's LBS track data and POI data, grades those places according to their consumption information, and classifies the users with the CNM (Clauset-Newman-Moore) community discovery model according to each user's frequency of occurrence and residence time at each consumption place, thereby judging the user's consumption capability. This embodiment can analyze a user's consumption capability without acquiring information such as the user's consumption records or consumption behavior data, and so provides an effective scheme for judging user consumption capability to enterprises that hold LBS track data but no user consumption data (consumption records or consumption behavior data).
Fig. 5 is a schematic flow chart of a user data analysis method according to a second embodiment of the present invention. In this embodiment, steps S500 to S510 of the user data analysis method are similar to steps S400 to S410 of the first embodiment, except that the method further includes steps S512 to S514.
The method comprises the following steps:
Step S500, collecting LBS track data of a user in a preset time period.
Specifically, LBS uses various positioning technologies to obtain the current location of a positioning device and provides information resources and basic services to that device over the mobile internet. The mobile terminal corresponding to the user obtains the user's geographic coordinates through a wireless communication network (or a satellite positioning system), and these coordinates are integrated with other information on the basis of a spatial database to provide the user with the desired location-related value-added services. Positioning technologies commonly used for LBS currently include the Global Positioning System (GPS), WiFi positioning, IP address positioning, and triangulation. The main characteristics of LBS are wide coverage, high positioning accuracy, and simple operation.
The LBS track data records that the device corresponding to the user (generally a mobile terminal) appeared at a certain location (latitude and longitude) at a certain time; that is, each record includes a time and a location (latitude and longitude).
The preset time period may be one week, one month, etc. All LBS track data of each user within the preset time period is collected.
Step S502, POI data in a preset range are obtained.
Specifically, in a geographic information system, a POI may be a house, a shop, a mailbox, a bus station, and the like. Each POI contains information such as a name, a category, and coordinates, where each category corresponds to the code and name of the corresponding business. In short, the POI data shows what place each location (latitude and longitude) corresponds to. Typically, the POI data includes point-of-interest location information, point-of-interest categories, point-of-interest maps, and the like.
The preset range may be an area where the user is located or a geographical range covered by the LBS track data of the user.
Step S504, matching LBS track data of the user with the POI data, and analyzing the consumption place where the user appears.
Specifically, the POI data indicates which locations are consumption places. Matching the LBS track data of the user against the POI data then identifies the consumption places corresponding to the track data, that is, the consumption places at which the user appears.
In this embodiment, the device number of the mobile terminal, the time (the time of entering and leaving each consumption place), and the consumption place where the user appears may be recorded together. For example: user A / device number 001: Monday 18:05-19:32, restaurant B; Tuesday 07:20-07:25, convenience store C; Saturday 10:30-11:50, mall D; Saturday 11:56-12:47, restaurant E; weekdays 15:19-16:24, pet hospital F.
Step S506, acquiring consumption information corresponding to the consumption place.
Specifically, the consumption information may include a place type (e.g., a mall, a convenience store, a hotel, a restaurant, a hospital, or a pet hospital), a consumption level (e.g., a per-person consumption amount), and the like. In this embodiment, the consumption information of each consumption place may be acquired from data sources such as the POI data and public comment data.
Step S508, grading each consumption place according to the consumption information and preset rules.
Specifically, a weight is preset for each scoring item (for example, the place type and the consumption level in the consumption information), each scoring item of each consumption place is scored according to a preset rule, and the total score of the place is then calculated from the score and the weight of each scoring item. Finally, the grade score of the place is obtained from the total score and the preset score interval corresponding to each grade.
Alternatively, a logistic regression model may be used to grade each consumption place, in which case the data output by the model is the grade score of each consumption place.
In this embodiment, the grade score of each consumption place obtained in the manner described above may range from 0 to 10.
For example, convenience store C has a grade score of 2, restaurant B and restaurant E each have a grade score of 5, mall D has a grade score of 6, and pet hospital F has a grade score of 7.
Step S510, taking the consumption places with the same score as a community seed set, and classifying the users by using a CNM (Clauset-Newman-Moore) community discovery model according to the grade score of the consumption place corresponding to each user.
Specifically, in the CNM community discovery model, the consumption places with the same score are used as a community seed set (for example, restaurant B and restaurant E, both with a grade score of 5, belong to the same community), and each user is then classified into one of the communities by the CNM community discovery model according to the grade scores of the consumption places corresponding to that user. The weight of a community edge is calculated from the user's frequency of occurrence and residence time in the community (the set of consumption places), each weighted at 50%. Note that the CNM community discovery model classifies every user into a community, and each user is classified into exactly one community.
That is, steps S500-S504 yield the consumption places each user visits, together with the user's frequency of occurrence and residence time at each consumption place. The CNM community discovery model then classifies each user according to that frequency and residence time, and each user is finally classified into the community in which the user appears most frequently and stays longest.
For example, convenience store C belongs to the second community (grade score of 2), restaurant B and restaurant E belong to the fifth community (grade score of 5), mall D belongs to the sixth community (grade score of 6), and pet hospital F belongs to the seventh community (grade score of 7). In terms of frequency of occurrence and residence time, user A visits the consumption places of the fifth community most frequently and for the longest time, so user A is classified into the fifth community.
Step S512, obtaining the consumption capability grade score of each user according to the grade score of the community to which the user belongs.
Specifically, after each user is classified into one of the communities, the grade score of the consumption places corresponding to each community is used as the consumption capability grade score of every user in that community, and a corresponding consumption capability label is established for each user. For example, the grade score of 5 of the fifth community is used as the consumption capability label of user A; that is, user A has a consumption capability grade score of 5.
Step S514, for each user, integrating the grade scoring results of a plurality of time periods to obtain an integrated grade score as the consumption capability label of the user.
Specifically, LBS track data of the user may be collected over multiple time periods (for example, if the preset time period is one week, one month of data, that is, four weeks, is collected, giving four sets of LBS track data). A consumption capability grade score is obtained for each time period according to the above steps (giving four grade scores), and the grade scores of all the time periods are averaged to obtain the final integrated grade score, which is used as the consumption capability label of the user.
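The integration step amounts to a plain average over the per-period grade scores, as in the sketch below. The function name `integrated_grade` and the four weekly scores are hypothetical, and whether the average is rounded to a whole grade is not stated in the source, so the value is left unrounded here.

```python
def integrated_grade(period_grades):
    """Average the consumption capability grade scores of several periods."""
    if not period_grades:
        raise ValueError("at least one period grade score is required")
    return sum(period_grades) / len(period_grades)


# Hypothetical grade scores for four consecutive weeks.
weekly_grades = [5, 5, 6, 5]
label = integrated_grade(weekly_grades)
```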
The user data analysis method provided by the embodiment can analyze the consumption capacity of the user without acquiring information such as consumption records or consumption behavior data of the user, and can integrate multiple consumption capacity grade scores of the user, which are obtained by analyzing the LBS track data in multiple time periods, so that the consumption capacity of the user can be more accurately judged.
The present invention also provides another embodiment, which is to provide a computer-readable storage medium storing a user data analysis program, the user data analysis program being executable by at least one processor to cause the at least one processor to perform the steps of the user data analysis method as described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (5)

1. A method for analyzing user data, the method comprising the steps of:
acquiring location-based service LBS track data of a user within a preset time period;
obtaining POI data in a preset range;
matching the LBS track data and the POI data of the user, and analyzing the consumption place of the user;
acquiring consumption information corresponding to the consumption place;
grading each consumption place according to the consumption information and a preset rule;
taking the consumption places with the same score as a community seed set, and classifying the users by using a CNM (Clauset-Newman-Moore) community discovery model according to the grade score of the consumption place corresponding to each user; and
establishing a consumption capacity label for each user according to the grade score of the community to which the user belongs;
the consumption information comprises a place type and consumption levels, the place type comprises one or more of a market, a convenience store, a hotel, a restaurant, a hospital and a pet hospital, the consumption levels comprise per-person consumption money, and in the step of acquiring the consumption information corresponding to the consumption places, the consumption information of each consumption place is acquired from the POI data and the public comment data;
the step of classifying the users by using the CNM community discovery model according to the grade scores of the consumption places corresponding to the users comprises the following steps:
according to the grade score of the consumption place corresponding to each user, the occurrence frequency and the residence time of the user in each consumption place, classifying all the users into one of the communities by using the CNM community discovery model;
in the step of classifying the users by using a CNM community discovery model according to the grade scores of the consumption places corresponding to the users, classifying each user into the community with the highest occurrence frequency and the longest residence time, wherein, in the CNM community discovery model, the occurrence frequency and the residence time each carry a weight of 50%;
the step of establishing a consumption capability label for each user according to the grade score of the community to which the user belongs comprises the following steps:
after each user is classified into a community, taking the grade score of a consumption place corresponding to each community as the consumption capability grade score of each user in the community, and establishing a consumption capability label corresponding to the consumption capability grade score for the user;
after obtaining the consumption capability grade score of each user, the method further comprises:
for each user, integrating the consumption capability grade scores of a plurality of time periods to obtain a comprehensive consumption capability grade score, and establishing a consumption capability label corresponding to the comprehensive consumption capability grade score for the user.
2. The user data analysis method of claim 1, wherein the step of rating each consumption location according to a preset rule based on the consumption information comprises:
presetting the weight of each scoring item;
scoring each scoring item of each consumption place according to a preset rule;
calculating the total score of the consumption place according to the score and the weight of each scoring item;
and obtaining the grade score of the place according to the total score and a preset score interval corresponding to each grade.
3. The user data analysis method of claim 2, wherein the step of rating each consumption location according to the consumption information by a preset rule further comprises:
and grading each consumption place by using a logistic regression model, wherein data output by the model is a grading result of the consumption place.
4. A server, characterized in that the server comprises a memory, a processor, the memory having stored thereon a user data analysis system executable on the processor, the user data analysis system when executed by the processor implementing the steps of the user data analysis method according to any one of claims 1-3.
5. A computer-readable storage medium storing a user data analysis system executable by at least one processor to cause the at least one processor to perform the steps of the user data analysis method according to any one of claims 1-3.
CN202010008906.3A 2020-01-02 2020-01-02 User data analysis method, server, and computer-readable storage medium Active CN111209487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010008906.3A CN111209487B (en) 2020-01-02 2020-01-02 User data analysis method, server, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010008906.3A CN111209487B (en) 2020-01-02 2020-01-02 User data analysis method, server, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN111209487A CN111209487A (en) 2020-05-29
CN111209487B true CN111209487B (en) 2020-10-27

Family

ID=70789718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010008906.3A Active CN111209487B (en) 2020-01-02 2020-01-02 User data analysis method, server, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN111209487B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573245A (en) * 2021-09-27 2021-10-29 广东邦盛北斗科技股份公司 Merchant consumption supply information issuing method, storage medium and issuing system
CN114741612B (en) * 2022-06-13 2022-09-02 北京融信数联科技有限公司 Consumption habit classification method, system and storage medium based on big data
CN116681507A (en) * 2023-05-18 2023-09-01 北京大也智慧数据科技服务有限公司 Payment index calculation method, device, storage medium and equipment

Citations (1)

Publication number Priority date Publication date Assignee Title
CN107798557A (en) * 2017-09-30 2018-03-13 平安科技(深圳)有限公司 Electronic installation, the service location based on LBS data recommend method and storage medium

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
KR101163827B1 (en) * 2009-05-15 2012-07-09 현대자동차주식회사 Apparatus and Method for Location Based Data Service
CN106779789A (en) * 2015-11-24 2017-05-31 亿阳信通股份有限公司 The commercial value analysis system and method in a kind of geographical position
CN109635190B (en) * 2018-11-28 2023-06-27 四川亨通网智科技有限公司 User feature mining method based on position and behavior joint analysis

Also Published As

Publication number Publication date
CN111209487A (en) 2020-05-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant