CN114741612B - Consumption habit classification method, system and storage medium based on big data - Google Patents

Publication number
CN114741612B
Authority
CN
China
Prior art keywords
information
grid
user
consumption
residence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210658815.3A
Other languages
Chinese (zh)
Other versions
CN114741612A (en)
Inventor
成立立
张广志
于笑博
刘增礼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beiling Rongxin Datalnfo Science and Technology Ltd
Original Assignee
Beiling Rongxin Datalnfo Science and Technology Ltd
Application filed by Beiling Rongxin Datalnfo Science and Technology Ltd
Priority to CN202210658815.3A
Publication of CN114741612A
Application granted
Publication of CN114741612B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/35 Services specially adapted for particular environments, situations or purposes for the management of goods or merchandise
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a big data based consumption habit classification method, system and storage medium. The method comprises the following steps: acquiring signaling big data path information, base station information, engineering parameter table information and point of interest (poi) data information, and obtaining the track information of a user from the signaling big data path information; sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage; obtaining the consumption interest value of the user through the consumption habit classification model, and determining from it the interest point type with the highest average interest of the user within a month; and taking that interest point type as the consumption habit classification of the user. By combining the consumption habit classification model, the base station information, the engineering parameter table information and the poi data information on the basis of signaling big data paths, the invention makes the classification of users' consumption habits more convenient and accurate.

Description

Consumption habit classification method, system and storage medium based on big data
Technical Field
The present application relates to the field of data analysis, and more particularly, to a consumption habit classification method, system and storage medium based on big data.
Background
With the progress of the times and the rapid development of society, everyday consumption has become increasingly intelligent. Payment has moved from cash to mobile phone payment and then to face-scan payment, making life more and more convenient, and these high-tech payment terminals also save consumers' consumption records. However, existing consumption habit classification methods need a great deal of time to count users' activity tracks and consumption places, and relying on users' own records or on following and photographing them easily leads to omissions in the statistics.
Accordingly, there are deficiencies in the art and improvements are needed.
Disclosure of Invention
In view of the foregoing problems, it is an object of the present invention to provide a method, a system, and a storage medium for classifying consumption habits based on big data, which can more conveniently and accurately classify consumption habits of users.
A first aspect of the invention provides a big data based consumption habit classification method, comprising the following steps:
acquiring signaling big data path information;
obtaining user track information according to the signaling big data path information;
acquiring base station information, engineering parameter table information and poi data information;
sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage;
obtaining a consumption interest value of the user through the consumption habit classification model, and obtaining the interest point type with the highest average interest of the user within a month according to the consumption interest value of the user;
and taking the interest point type with the highest average interest of the user within the month as the consumption habit classification of the user and displaying it at the terminal.
This scheme further comprises:
performing longitude and latitude conversion on the poi data to match the engineering parameters and the signaling data;
associating the poi data with the grids to obtain, for each grid, the number of interest points of each type and the total number of consumption interest points in the grid;
and sending the converted poi grid parameters to a preset database for storage.
This scheme further comprises:
matching the base station with the grid to obtain grid position information corresponding to the base station;
and sending the grid position information corresponding to the base station to a preset database for storage.
This scheme further comprises:
acquiring a dwell point in the user track and the time information at the dwell point;
judging whether the time of the user at the dwell point is lower than a first preset threshold; if so, the dwell point is an unreasonable consumption dwell; if not, the dwell point is a reasonable consumption dwell;
sending the unreasonable consumption dwell to a preset temporary track table for storage;
and sending the reasonable consumption dwell to a preset grid dwell information table for storage.
This scheme further comprises:
judging whether the dwell grid contains poi data, and if not, determining that the dwell grid is a non-consumption dwell grid;
and deleting the non-consumption dwell grid.
This scheme further comprises:
acquiring user workplace-residence table information;
obtaining the workplace and residence information of the user according to the user workplace-residence table information;
converting the user workplace and the user residence into grid codes;
and deleting the workplace grid and the residence grid from the grid dwell information table.
A second aspect of the invention provides a big data based consumption habit classification system, comprising a memory and a processor, wherein the memory stores a big data based consumption habit classification method program which, when executed by the processor, implements the following steps:
acquiring signaling big data path information;
obtaining user track information according to the signaling big data path information;
acquiring base station information, engineering parameter table information and poi data information;
sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage;
obtaining a consumption interest value of the user through the consumption habit classification model, and obtaining the interest point type with the highest average interest of the user within a month according to the consumption interest value of the user;
and taking the interest point type with the highest average interest of the user within the month as the consumption habit classification of the user and displaying it at the terminal.
This scheme further comprises:
performing longitude and latitude conversion on the poi data to match the engineering parameters and the signaling data;
associating the poi data with the grids to obtain, for each grid, the number of interest points of each type and the total number of consumption interest points in the grid;
and sending the converted poi grid parameters to a preset database for storage.
This scheme further comprises:
matching the base station with the grid to obtain grid position information corresponding to the base station;
and sending the grid position information corresponding to the base station to a preset database for storage.
This scheme further comprises:
acquiring a dwell point in the user track and the time information at the dwell point;
judging whether the time of the user at the dwell point is lower than a first preset threshold; if so, the dwell point is an unreasonable consumption dwell; if not, the dwell point is a reasonable consumption dwell;
sending the unreasonable consumption dwell to a preset temporary track table for storage;
and sending the reasonable consumption dwell to a preset grid dwell information table for storage.
This scheme further comprises:
judging whether the dwell grid contains poi data, and if not, determining that the dwell grid is a non-consumption dwell grid;
and deleting the non-consumption dwell grid.
This scheme further comprises:
acquiring user workplace-residence table information;
obtaining the workplace and residence information of the user according to the user workplace-residence table information;
converting the user workplace and the user residence into grid codes;
and deleting the workplace grid and the residence grid from the grid dwell information table.
A third aspect of the present invention provides a computer storage medium, in which a big data-based consumption habit classification method program is stored, and when the big data-based consumption habit classification method program is executed by a processor, the steps of the big data-based consumption habit classification method described in any one of the above are implemented.
The invention discloses a big data based consumption habit classification method, system and storage medium. The method comprises the following steps: acquiring signaling big data path information, base station information, engineering parameter table information and point of interest (poi) data information, and obtaining the track information of a user from the signaling big data path information; sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage; obtaining the consumption interest value of the user through the consumption habit classification model, and determining from it the interest point type with the highest average interest of the user within a month; and taking that interest point type as the consumption habit classification of the user. By combining the consumption habit classification model, the base station information, the engineering parameter table information and the poi data information on the basis of signaling big data paths, the invention makes the classification of users' consumption habits more convenient and accurate.
Drawings
FIG. 1 is a flow chart illustrating a big data based consumption habit classification method according to the present invention;
FIG. 2 illustrates a flow diagram of the poi data pre-processing of a big data-based consumption habit classification method of the present invention;
FIG. 3 is a flow chart of base station data preprocessing of a big data based consumption habit classification method according to the present invention;
FIG. 4 is a block diagram illustrating a big data based consumption habit classification system according to the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein and, therefore, the scope of the present invention is not limited by the specific embodiments disclosed below.
FIG. 1 is a flow chart of a big data based consumption habit classification method according to the invention.
As shown in FIG. 1, the invention discloses a big data based consumption habit classification method, comprising the following steps:
S102, acquiring signaling big data path information;
S104, obtaining user track information according to the signaling big data path information;
S106, acquiring base station information, engineering parameter table information and poi data information;
S108, sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage;
S110, obtaining a consumption interest value of the user through the consumption habit classification model, and obtaining the interest point type with the highest average interest of the user within a month according to the consumption interest value of the user;
and S112, taking the interest point type with the highest average interest of the user within the month as the consumption habit classification of the user and displaying it on the terminal.
It should be noted that the user track is the travel track of the user's activity route, recording the places the user reaches and how long the user dwells at each. The poi data is point of interest data: any meaningful, non-geographic point on the map, such as a shop, a gym or a gas station. The user track information, base station information, engineering parameter table information and poi data information are sent to a preset consumption habit classification model for storage. The consumption habit classification model is a grid model: the area to which the user track belongs is rasterized at a cell size determined by actual requirements, and the data are then assigned to the corresponding grid positions. For example, if the user track lies in Beijing, the map extent of Beijing is rasterized into 500 m cells. To count the consumption habits of the user, let p be the number of consumption interest points of a given type in a grid, Z the total number of interest points in the grid (the prosperity degree of the grid), and t the total time the user dwells in the grid. The interest value x of the user in that type of consumption is given by the following formula:
[formula for x: image DEST_PATH_IMAGE001, not reproduced in the text]
where n is the total number of valid dwell grids of the user on the day, $p_i$ is the number of consumption interest points of the corresponding type in the i-th grid where the user dwells on that day, $Z_i$ is the prosperity degree of that grid, i.e. the total number of interest points in the grid, and $t_i$ is the total time the user dwells in that grid. The consumption level of the user is denoted y, with the following formula:
[formula for y: image DEST_PATH_IMAGE005, not reproduced in the text]
The daily values of the current month are then summarized and averaged, and the interest point type with the highest average interest of the user within the month is taken as the consumption habit classification of the user.
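The daily-interest and monthly-averaging steps can be sketched in Python. The formula for x survives only as an image placeholder in this text, so the sketch assumes one plausible form built from the symbols defined above, x = sum over i = 1..n of (p_i / Z_i) * t_i. That form, the function names and the record layout are all assumptions for illustration, not the patent's confirmed formula, and the consumption level y is not modeled.

```python
from collections import defaultdict

def daily_interest(dwells, poi_type):
    """Interest value x for one POI type on one day.

    ASSUMED formula (the source formula image is unavailable):
        x = sum_i (p_i / Z_i) * t_i
    Each dwell is a dict: p = {type: count}, Z = total POIs, t = dwell time.
    """
    return sum(
        d["p"].get(poi_type, 0) / d["Z"] * d["t"]
        for d in dwells
        if d["Z"] > 0  # grids without POIs contribute nothing
    )

def monthly_habit(days):
    """Return the POI type with the highest mean daily interest, or None."""
    totals = defaultdict(float)
    for dwells in days:
        types = {t for d in dwells for t in d["p"]}
        for poi_type in types:
            totals[poi_type] += daily_interest(dwells, poi_type)
    if not totals:
        return None
    means = {k: v / len(days) for k, v in totals.items()}
    return max(means, key=means.get)
```

Under this assumed weighting, a long dwell in a grid dominated by one POI type raises that type's score more than a short dwell in a mixed grid.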
FIG. 2 shows a flow chart of the poi data preprocessing of a big data-based consumption habit classification method of the present invention.
As shown in fig. 2, the present invention discloses a consumption habit classification method based on big data, further comprising:
S202, performing longitude and latitude conversion on the poi data to match the engineering parameters and the signaling data;
S204, associating the poi data with the grids to obtain, for each grid, the number of interest points of each type and the total number of consumption interest points in the grid;
and S206, sending the converted poi grid parameters to a preset database for storage.
It should be noted that the data preprocessing includes preprocessing the poi data: longitude and latitude conversion is performed on the poi data to match the engineering parameters and the signaling data, and the poi data are associated with the grids so that each point of interest is shown in its corresponding grid. For example, after the map extent of Beijing is rasterized into 500 m cells, suppose grid A contains interest points such as restaurants, bars and gas stations, and user a passes through them in turn. Each of these interest points is then associated with grid A, and the engineering parameters and signaling data of user a at these interest points are sent to the database and displayed in grid A. The poi data are summarized with the grid as the unit, counting the number of consumption places of each type and the total number in each grid; the total number serves as the main basis for the prosperity degree of the grid. The preset grid is a grid in the consumption habit classification model.
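The per-grid summary described above (a count per POI type plus a total per grid) can be sketched as follows. The input layout, a list of `(grid_id, poi_type)` pairs produced by some earlier coordinate-to-grid step, and the function name are assumptions for illustration.

```python
from collections import Counter, defaultdict

def aggregate_pois(pois):
    """Summarize POIs per grid.

    pois: iterable of (grid_id, poi_type) pairs (hypothetical layout).
    Returns (per_type, totals): per_type[grid][type] is the count of that
    type in the grid, and totals[grid] is the grid's total POI count,
    which the text uses as the grid's prosperity degree.
    """
    per_type = defaultdict(Counter)
    for grid_id, poi_type in pois:
        per_type[grid_id][poi_type] += 1
    totals = {g: sum(c.values()) for g, c in per_type.items()}
    return per_type, totals
```

For instance, two restaurants and one bar in grid "A" give `per_type["A"]["restaurant"] == 2` and `totals["A"] == 3`.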
Fig. 3 is a flow chart illustrating base station data preprocessing of a big data-based consumption habit classification method according to the present invention.
As shown in fig. 3, the present invention discloses a consumption habit classification method based on big data, further comprising:
S302, matching the base station with the grid to obtain grid position information corresponding to the base station;
and S304, sending the grid position information corresponding to the base station to a preset database for storage.
The data preprocessing also includes preprocessing the base station data. The base station position information is obtained through technologies such as GPS, and the longitude and latitude of the base station position are extracted. These are compared against the preset rasterization size to determine the grid to which the base station position belongs, which yields the grid position information corresponding to the base station; the base station is labeled accordingly. For example, with WGS 1984 longitude and latitude coordinates in China, a difference of 1 second of longitude is 23.6 m and a difference of 1 second of latitude is 30.9 m. If the map extent is rasterized into 500 m cells, one grid spans 21.19 seconds of longitude (500 / 23.6) and 16.18 seconds of latitude (500 / 30.9). The grid position information corresponding to the base station is obtained from the longitude and latitude of the rasterization origin, the rasterization size and the longitude and latitude of the base station, and the grid position to which the base station belongs is imported into the database together with the base station information for storage.
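The arithmetic in this passage can be checked with a short sketch that converts the 500 m cell size into arc-seconds and assigns a base station to a grid cell. The metres-per-arc-second constants come from the text; the grid origin parameters, the (row, col) indexing and the function names are assumptions for illustration.

```python
import math

# Values quoted in the text for WGS 1984 coordinates in China.
M_PER_ARCSEC_LON = 23.6
M_PER_ARCSEC_LAT = 30.9

def grid_step_arcsec(cell_m=500.0):
    """Cell size expressed in arc-seconds of (longitude, latitude)."""
    return cell_m / M_PER_ARCSEC_LON, cell_m / M_PER_ARCSEC_LAT

def grid_index(lon_deg, lat_deg, origin_lon_deg, origin_lat_deg, cell_m=500.0):
    """(row, col) of the cell containing a point, counted from a
    hypothetical grid origin at the south-west corner of the map extent."""
    step_lon, step_lat = grid_step_arcsec(cell_m)
    col = math.floor((lon_deg - origin_lon_deg) * 3600 / step_lon)
    row = math.floor((lat_deg - origin_lat_deg) * 3600 / step_lat)
    return row, col
```

Rounded to two decimals, `grid_step_arcsec()` reproduces the 21.19 and 16.18 arc-second figures given above.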
According to the embodiment of the invention, the method further comprises the following steps:
acquiring a dwell point in the user track and the time information at the dwell point;
judging whether the time of the user at the dwell point is lower than a first preset threshold; if so, the dwell point is an unreasonable consumption dwell; if not, the dwell point is a reasonable consumption dwell;
sending the unreasonable consumption dwell to a preset temporary track table for storage;
and sending the reasonable consumption dwell to a preset grid dwell information table for storage.
It should be noted that whether a dwell point is a reasonable consumption dwell is judged from how long the user stays at it. For example, if the first preset threshold is 20 minutes, any dwell shorter than 20 minutes is an unreasonable consumption dwell. Internet of Things cards are screened out and dwell information below 20 minutes is removed; removing pass-through and short-stay information makes the reasonable consumption dwells more accurate. The reasonable consumption dwell points in the track are then merged by grid to form records, for example records of the dwell duration and similar information.
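A minimal sketch of this screening step, assuming each stop is a `(grid_id, minutes)` pair and using the 20-minute example threshold from the text. The function name and the two return structures, which stand in for the grid table and the temporary track table, are hypothetical.

```python
from collections import defaultdict

def split_dwells(dwells, min_minutes=20):
    """Separate short stops from reasonable consumption stops.

    dwells: list of (grid_id, minutes) pairs (hypothetical layout).
    Stops shorter than min_minutes go to the temporary track list;
    the rest are merged per grid by summing their durations.
    """
    grid_table = defaultdict(float)
    temp_track = []
    for grid_id, minutes in dwells:
        if minutes < min_minutes:
            temp_track.append((grid_id, minutes))
        else:
            grid_table[grid_id] += minutes
    return dict(grid_table), temp_track
```

A stop of exactly 20 minutes is kept, since the text only rejects stops lower than the threshold.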
According to the embodiment of the invention, the method further comprises the following steps:
judging whether the dwell grid contains poi data, and if not, determining that the dwell grid is a non-consumption dwell grid;
and deleting the non-consumption dwell grid.
It should be noted that when a dwell grid contains no poi data, there is no interest point at the place where the user dwells, and the user may be engaged in a non-consumption activity there, such as chatting or waiting. The dwell is therefore set as a non-consumption dwell, and the grid corresponding to it is set as a non-consumption dwell grid and deleted.
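This filter can be sketched as a single pass over the per-grid records, assuming a `{grid_id: minutes}` table of stops and a `{grid_id: total_poi_count}` lookup. Both layouts and the function name are assumptions for illustration.

```python
def drop_non_consumption(grid_table, poi_totals):
    """Keep only grids that contain at least one POI.

    grid_table: {grid_id: minutes} of recorded stops (hypothetical layout).
    poi_totals: {grid_id: total POI count}; a missing grid counts as zero.
    """
    return {g: m for g, m in grid_table.items() if poi_totals.get(g, 0) > 0}
```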
According to the embodiment of the invention, the method comprises the following steps:
acquiring user workplace-residence table information;
obtaining the workplace and residence information of the user according to the user workplace-residence table information;
converting the user workplace and the user residence into grid codes;
and deleting the workplace grid and the residence grid from the grid dwell information table.
The workplace and the residence of the user are converted into grid codes, and each user's workplace grid and residence grid are deleted from the grid dwell information table, which reduces the interference caused by long stays at the workplace and the residence.
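The exclusion of the workplace and home grids can be sketched in the same style, assuming both have already been converted to grid codes. The table layout and function name are hypothetical.

```python
def drop_work_and_home(grid_table, work_grid, home_grid):
    """Remove the user's workplace grid and home grid from the records.

    grid_table: {grid_id: minutes} of recorded stops (hypothetical layout).
    """
    excluded = {work_grid, home_grid}
    return {g: m for g, m in grid_table.items() if g not in excluded}
```

Without this step, the long hours spent at work and at home would dominate any duration-weighted interest score.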
According to the embodiment of the invention, the method comprises the following steps:
acquiring the interest points located on grid lines;
setting these interest points as shared interest points;
and determining the attribution of each shared interest point according to the grid to which the user track belongs.
It should be noted that when an interest point lies on a grid line, the grids on both sides of the grid line record the interest point, and its attribution is determined by the grid to which the user track belongs. For example, suppose interest point C lies on the grid line between grid A and grid B, so that the information of interest point C is recorded by both grid A and grid B. When the user enters interest point C from grid A and dwells there, the dwell information is recorded by grid A and not by grid B. If the user is not on the grid line, the dwell is recorded according to the grid in which the user is located. For example, suppose part of interest point C lies in grid A and the other part in grid B, and the information of interest point C is recorded by both grids; when the user dwells in the part belonging to grid B, the dwell information is recorded by grid B and not by grid A.
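One way to read this attribution rule is sketched below: a shared point of interest is registered in every adjacent grid, but a stop at it is credited only to the grid the user's track belongs to. The function name, arguments and table layout are assumptions for illustration.

```python
def record_shared_dwell(shared_poi_grids, user_grid, minutes, grid_table):
    """Credit a stop at a shared POI to the grid the user came from.

    shared_poi_grids: set of grids that register the shared POI.
    user_grid: grid to which the user's track belongs.
    grid_table: {grid_id: minutes}, updated in place and returned.
    """
    if user_grid not in shared_poi_grids:
        raise ValueError("user track does not belong to a grid sharing this POI")
    grid_table[user_grid] = grid_table.get(user_grid, 0) + minutes
    return grid_table
```

With POI C shared by grids "A" and "B", a 30-minute stop entered from "A" updates only the "A" record.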
FIG. 4 is a block diagram illustrating a big data based consumption habit classification system according to the present invention.
As shown in FIG. 4, a second aspect of the present invention provides a big data based consumption habit classification system 4, which includes a memory 41 and a processor 42, wherein the memory stores a big data based consumption habit classification method program which, when executed by the processor, implements the following steps:
acquiring signaling big data path information;
obtaining user track information according to the signaling big data path information;
acquiring base station information, engineering parameter table information and poi data information;
sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage;
obtaining a consumption interest value of the user through the consumption habit classification model, and obtaining the interest point type with the highest average interest of the user within a month according to the consumption interest value of the user;
and taking the interest point type with the highest average interest of the user within the month as the consumption habit classification of the user and displaying it at the terminal.
It should be noted that the user track is the travel track of the user's activity route, recording the places the user reaches and how long the user dwells at each. The poi data is point of interest data: any meaningful, non-geographic point on the map, such as a shop, a gym or a gas station. The user track information, base station information, engineering parameter table information and poi data information are sent to a preset consumption habit classification model for storage. The consumption habit classification model is a grid model: the area to which the user track belongs is rasterized at a cell size determined by actual requirements, and the data are then assigned to the corresponding grid positions. For example, if the user track lies in Beijing, the map extent of Beijing is rasterized into 500 m cells. To count the consumption habits of the user, let p be the number of consumption interest points of a given type in a grid, Z the total number of interest points in the grid (the prosperity degree of the grid), and t the total time the user dwells in the grid. The interest value x of the user in that type of consumption is given by the following formula:
[formula for x: image DEST_PATH_IMAGE001, not reproduced in the text]
where n is the total number of valid dwell grids of the user on the day, $p_i$ is the number of consumption interest points of the corresponding type in the i-th grid where the user dwells on that day, $Z_i$ is the prosperity degree of that grid, i.e. the total number of interest points in the grid, and $t_i$ is the total time the user dwells in that grid. The consumption level of the user is denoted y, with the following formula:
[formula for y: image DEST_PATH_IMAGE005, not reproduced in the text]
The daily values of the current month are then summarized and averaged, and the interest point type with the highest average interest of the user within the month is taken as the consumption habit classification of the user.
FIG. 2 shows a flow chart of the poi data preprocessing of a big data-based consumption habit classification method of the present invention.
As shown in fig. 2, the present invention discloses a consumption habit classification method based on big data, further comprising:
carrying out longitude and latitude conversion on the poi data to adapt to the industrial parameters and the signaling data;
establishing a relevant link between the poi data and the grid to obtain the number of various interest points in the corresponding grid and the total number information of all consumption interest points;
and transmitting the converted poi grid parameters to a preset database for storage.
It should be noted that the data preprocessing includes preprocessing the poi data: longitude and latitude conversion is performed on the poi data to adapt it to the engineering parameter data and the signaling data, and the poi data is associated with the grid so that each point is displayed in its corresponding grid. For example, after the map range of Beijing is rasterized into 500 m grids, suppose grid A contains interest points such as restaurants, bars and gas stations, and user A passes through these interest points in turn. Each of these interest points is then associated with grid A, and the engineering parameter and signaling data of user A at these interest points are sent to the database and displayed in grid A. The poi data is then summarized grid by grid, counting the number of each type of consumption place in the grid and the total number; the total number is the main basis for the prosperity of the grid.
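The per-grid summary described above can be sketched as follows. This is an illustrative sketch only: it assumes each interest point has already been mapped to a grid identifier, and the function name `summarize_pois_by_grid` and the `(grid_id, poi_type)` input layout are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_pois_by_grid(pois):
    """Count interest points per type and in total for every grid.

    pois: iterable of (grid_id, poi_type) pairs (hypothetical layout).
    Returns (per_type, totals): per_type[grid_id][poi_type] is the count of
    that type in the grid, totals[grid_id] is the grid's total count, used
    here as its prosperity.
    """
    per_type = defaultdict(Counter)
    for grid_id, poi_type in pois:
        per_type[grid_id][poi_type] += 1
    totals = {grid_id: sum(counts.values()) for grid_id, counts in per_type.items()}
    return per_type, totals


pois = [
    ("A", "restaurant"),
    ("A", "bar"),
    ("A", "restaurant"),
    ("B", "gas station"),
]
per_type, totals = summarize_pois_by_grid(pois)
print(dict(per_type["A"]), totals)
```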
Fig. 3 is a flow chart illustrating base station data preprocessing of a big data-based consumption habit classification method according to the present invention.
As shown in fig. 3, the present invention discloses a consumption habit classification method based on big data, further comprising:
matching the base station with the grid to obtain grid position information corresponding to the base station;
and sending the grid position information corresponding to the base station to a preset database for storage.
It should be noted that the data preprocessing also includes preprocessing the base station data. The base station location information is obtained through technologies such as GPS, and the longitude and latitude of the base station location are extracted; these are compared against the preset rasterization size to determine the grid to which the base station location belongs, giving the grid position information corresponding to the base station, which is then labeled. For example, using WGS1984 longitude and latitude coordinates in China (the figures correspond to roughly the latitude of Beijing), a longitude difference of 1 second is about 23.6 m and a latitude difference of 1 second is about 30.9 m. If the map range is rasterized into 500 m grids, one grid therefore spans 21.19 seconds of longitude (500 / 23.6) and 16.18 seconds of latitude (500 / 30.9). The grid position information corresponding to the base station is obtained from the longitude and latitude of the grid origin, the rasterization size and the longitude and latitude of the base station, and the grid position to which the base station belongs is imported into the database together with the base station information for storage.
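The grid arithmetic in this example can be checked with a short sketch. The metres-per-arc-second figures (23.6 and 30.9) and the 500 m grid size come from the text above; the function names, the choice of a south-west grid origin and the sample coordinates are assumptions made for illustration.

```python
import math

LON_M_PER_ARCSEC = 23.6  # metres per arc-second of longitude (from the text)
LAT_M_PER_ARCSEC = 30.9  # metres per arc-second of latitude (from the text)


def grid_span_arcsec(cell_m=500.0):
    """Return the (longitude, latitude) span of one grid in arc-seconds."""
    return cell_m / LON_M_PER_ARCSEC, cell_m / LAT_M_PER_ARCSEC


def grid_index(lon_deg, lat_deg, origin_lon_deg, origin_lat_deg, cell_m=500.0):
    """Return the (row, col) of the grid containing a base station.

    The origin is assumed to be the south-west corner of the rasterized map.
    """
    lon_span, lat_span = grid_span_arcsec(cell_m)
    col = math.floor((lon_deg - origin_lon_deg) * 3600.0 / lon_span)
    row = math.floor((lat_deg - origin_lat_deg) * 3600.0 / lat_span)
    return row, col


lon_span, lat_span = grid_span_arcsec()
print(round(lon_span, 2), round(lat_span, 2))
# A base station 0.01 degrees east and north of a hypothetical origin.
print(grid_index(116.01, 39.01, 116.0, 39.0))
```

Treating the arc-second lengths as constants is only reasonable over a city-sized area; over a wider latitude range the longitude figure would need to vary with latitude.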
According to the embodiment of the invention, the method further comprises the following steps:
obtaining a resident point in a user track and time information at the resident point;
judging whether the time the user spends at the residence point is lower than a first preset threshold; if so, recording the residence point as unreasonable consumption residence information; if not, recording the residence point as reasonable consumption residence information;
sending the unreasonable consumption residence to a preset temporary track table for storage;
and sending the reasonable consumption residence to a preset grid residence information table for storage.
It should be noted that whether a residence point is a reasonable consumption residence is judged from the length of time the user spends at the residence point. For example, if the first preset threshold is 20 minutes, any residence shorter than 20 minutes is an unreasonable consumption residence. Internet of Things cards are screened out and residence information shorter than 20 minutes is removed; removing pass-through information and short-term residence information makes the reasonable consumption residences more accurate. The reasonable consumption residence points in the track are then merged into grids to form records containing information such as the residence time.
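The threshold test and the two destination tables can be sketched as follows. This is a minimal sketch under stated assumptions: residence points are given as `(grid_id, minutes)` pairs, residence time is summed per grid when merging, and the names `split_stays`, `grid_stay_table` and `temp_track_table` are hypothetical stand-ins for the preset tables.

```python
def split_stays(stays, threshold_min=20):
    """Split residence points into reasonable and unreasonable consumption stays.

    stays: list of (grid_id, minutes) pairs (hypothetical layout).
    Stays shorter than threshold_min go to the temporary track table; the
    rest are merged per grid into the grid residence information table.
    """
    grid_stay_table = {}   # grid_id -> total reasonable residence minutes
    temp_track_table = []  # unreasonable (too short) stays, kept as-is
    for grid_id, minutes in stays:
        if minutes < threshold_min:
            temp_track_table.append((grid_id, minutes))
        else:
            grid_stay_table[grid_id] = grid_stay_table.get(grid_id, 0) + minutes
    return grid_stay_table, temp_track_table


stays = [("A", 5), ("A", 30), ("B", 20), ("A", 45)]
grid_stay_table, temp_track_table = split_stays(stays)
print(grid_stay_table, temp_track_table)
```

A stay of exactly 20 minutes is kept as reasonable, since the text only excludes stays lower than the threshold.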
According to the embodiment of the invention, the method further comprises the following steps:
judging whether the resident grid contains the poi data or not, and if not, obtaining that the resident grid is a non-consumption resident grid;
and deleting the non-consumption resident grid.
It should be noted that when the residence grid does not contain poi data, there is no interest point at the residence point, and the user may be engaged in a non-consumption activity there, such as chatting or waiting. The residence is then set as a non-consumption residence, and the grid corresponding to the residence is set as a non-consumption residence grid and deleted.
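As a rough sketch of this filter, assuming the grid residence table maps grid identifiers to residence minutes and a separate mapping gives the total number of interest points per grid (both layouts and the function name are hypothetical):

```python
def drop_non_consumption_grids(grid_stay_table, poi_totals):
    """Keep only residence grids that contain at least one interest point.

    grid_stay_table: dict grid_id -> residence minutes (hypothetical layout).
    poi_totals: dict grid_id -> total number of interest points in the grid.
    """
    return {
        grid_id: minutes
        for grid_id, minutes in grid_stay_table.items()
        if poi_totals.get(grid_id, 0) > 0
    }


kept = drop_non_consumption_grids({"A": 75, "B": 20, "C": 40}, {"A": 3, "B": 0})
print(kept)
```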
According to the embodiment of the invention, the method comprises the following steps:
acquiring user work-residence table information;
obtaining the workplace and residence information of the user from the user work-residence table information;
converting the user workplace and the user residence into a grid code;
and deleting the working grid and the residential grid in the grid resident information table.
The workplace and the residence of the user are converted into grid codes, and each user's work grid and residence grid are deleted from the grid residence information table, which reduces the interference caused by long-term residence at the workplace and the residence.
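A possible shape for this step, assuming the workplace and residence have already been converted into the same grid codes used as keys of the grid residence information table (the function name and table layout are hypothetical):

```python
def drop_work_and_home_grids(grid_stay_table, work_grid, home_grid):
    """Remove the user's work grid and residence grid from the stay table.

    grid_stay_table: dict grid_code -> residence minutes (hypothetical layout).
    work_grid, home_grid: grid codes of the user's workplace and residence.
    """
    excluded = {work_grid, home_grid}
    return {
        grid_code: minutes
        for grid_code, minutes in grid_stay_table.items()
        if grid_code not in excluded
    }


table = {"r12c7": 480, "r3c9": 600, "r5c5": 45}
print(drop_work_and_home_grids(table, work_grid="r12c7", home_grid="r3c9"))
```

The function returns a new table and leaves the input unchanged, so the unfiltered residence data stays available for other uses.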
According to the embodiment of the invention, the method comprises the following steps:
acquiring interest points on grid lines;
setting the interest points as sharing interest points;
and judging the attribution of the interest points according to the grids to which the user tracks belong.
It should be noted that when an interest point lies on a grid line, each grid adjoining that grid line records the interest point, and the attribution of a residence at the interest point is determined by the grid to which the user track belongs. For example, suppose interest point C lies on the grid line between grid A and grid B, so that part of interest point C is in grid A and the other part is in grid B; the information of interest point C is recorded by both grid A and grid B. When the user enters interest point C from grid A and resides there, the residence information is recorded by grid A and not by grid B; when the user resides in the part of interest point C within grid B, the residence is recorded by grid B and not by grid A. If the user is not on a grid line, the residence is recorded according to the grid to which the user belongs.
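The attribution rule for shared interest points can be sketched as follows. This is an illustration under assumptions: the mapping from an interest point to the set of grids that record it, the record layout and the function name are all hypothetical.

```python
def record_shared_poi_stay(poi_to_grids, poi_id, user_grid, minutes, records):
    """Record a stay at an interest point in the grid of the user's track only.

    poi_to_grids: dict poi_id -> set of grid ids that record the interest
    point (two grids when it lies on a grid line). Hypothetical layout.
    records: dict grid_id -> list of (poi_id, minutes), updated in place.
    """
    if user_grid not in poi_to_grids[poi_id]:
        raise ValueError("user grid does not record this interest point")
    # Only the grid the user's track belongs to records the stay.
    records.setdefault(user_grid, []).append((poi_id, minutes))
    return records


poi_to_grids = {"C": {"A", "B"}}  # interest point C lies on the A/B grid line
records = record_shared_poi_stay(poi_to_grids, "C", "A", 30, {})
print(records)
```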
A third aspect of the present invention provides a computer storage medium storing a big data-based consumption habit classification method program which, when executed by a processor, implements the steps of any one of the above big data-based consumption habit classification methods.
The invention discloses a consumption habit classification method, system and storage medium based on big data, wherein the method comprises the following steps: acquiring signaling big data path information, base station information, engineering parameter table information and point of interest (poi) data information, and obtaining the track information of a user from the signaling big data path information; sending the track information of the user, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage; obtaining a consumption interest value of the user through the consumption habit classification model, and obtaining the interest point type with the highest average interest of the user within a month according to the consumption interest value of the user; and taking the interest point type with the highest average interest of the user in the month as the consumption habit classification of the user. By combining the signaling big data path with the base station information, the engineering parameter table information and the poi data information in a consumption habit classification model, the invention makes the consumption habit classification of the user more convenient and accurate.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention or portions thereof contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media capable of storing program code.

Claims (5)

1. A big data-based consumption habit classification method is characterized by comprising the following steps:
acquiring signaling big data path information;
obtaining user track information according to the signaling big data path information;
acquiring base station information, engineering parameter table information and poi data information;
sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage;
obtaining a consumption interest value of the user through a consumption habit classification model, and obtaining an interest point type with the highest average interest of the user within a month according to the consumption interest value of the user;
taking the interest point type with the highest average interest of the user in the month as the consumption habit classification of the user and displaying the interest point type at the terminal;
performing longitude and latitude conversion on the poi data to adapt it to the engineering parameter data and the signaling data;
establishing a relevant link between the poi data and the grid to obtain the number of various interest points in the corresponding grid and the total number information of all consumption interest points in the grid;
the converted poi grid parameters are transmitted to a preset database to be stored;
matching the base station with the grid to obtain grid position information corresponding to the base station;
sending the grid position information corresponding to the base station to a preset database for storage;
acquiring a dwell point in a user track and time information at the dwell point;
judging whether the time of the user at the residence point is lower than a first preset threshold value or not, and if so, obtaining the residence information that the residence point is unreasonable consumption; if not, obtaining the residence point as reasonable consumption residence information;
sending the unreasonable consumption residence to a preset temporary track table for storage;
and sending the reasonable consumption residence to a preset grid residence information table for storage.
2. The big data-based consumption habit classification method according to claim 1, characterized by comprising:
judging whether the resident grid contains the poi data or not, and if not, obtaining that the resident grid is a non-consumption resident grid;
the non-consuming resident grid is deleted.
3. The big data-based consumption habit classification method according to claim 1, comprising:
acquiring user work-residence table information;
obtaining the workplace and residence information of the user from the user work-residence table information;
converting the user workplace and the user residence into a grid code;
and deleting the working grid and the residential grid in the grid resident information table.
4. The consumption habit classification system based on big data is characterized by comprising a memory and a processor, wherein the memory stores a consumption habit classification method program based on big data, and the consumption habit classification method program based on big data realizes the following steps when being executed by the processor:
acquiring signaling big data path information;
obtaining user track information according to the signaling big data path information;
acquiring base station information, engineering parameter table information and poi data information;
sending the user track information, the base station information, the engineering parameter table information and the poi data information to a preset consumption habit classification model for storage;
obtaining a consumption interest value of the user through a consumption habit classification model, and obtaining an interest point type with the highest average interest of the user within a month according to the consumption interest value of the user;
taking the interest point type with the highest average interest of the user in the month as the consumption habit classification of the user and displaying the interest point type at the terminal;
performing longitude and latitude conversion on the poi data to adapt it to the engineering parameter data and the signaling data;
establishing a relevant link between the poi data and the grid to obtain the number of various interest points in the corresponding grid and the total number information of all consumption interest points in the grid;
the converted poi grid parameters are transmitted to a preset database to be stored;
matching the base station with the grid to obtain grid position information corresponding to the base station;
sending the grid position information corresponding to the base station to a preset database for storage;
acquiring a dwell point in a user track and time information at the dwell point;
judging whether the time of the user at the residence point is lower than a first preset threshold value, if so, obtaining the unreasonable consumption residence information of the residence point; if not, obtaining the residence point as reasonable consumption residence information;
sending the unreasonable consumption residence to a preset temporary track table for storage;
and sending the reasonable consumption residence to a preset grid residence information table for storage.
5. A computer storage medium, characterized in that the computer storage medium stores a big data based consumption habit classification method program, and when the big data based consumption habit classification method program is executed by a processor, the steps of a big data based consumption habit classification method according to any one of claims 1 to 3 are implemented.
CN202210658815.3A 2022-06-13 2022-06-13 Consumption habit classification method, system and storage medium based on big data Active CN114741612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210658815.3A CN114741612B (en) 2022-06-13 2022-06-13 Consumption habit classification method, system and storage medium based on big data

Publications (2)

Publication Number Publication Date
CN114741612A CN114741612A (en) 2022-07-12
CN114741612B true CN114741612B (en) 2022-09-02

Family

ID=82287432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210658815.3A Active CN114741612B (en) 2022-06-13 2022-06-13 Consumption habit classification method, system and storage medium based on big data

Country Status (1)

Country Link
CN (1) CN114741612B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222540A (en) * 2022-07-14 2022-10-21 北京融信数联科技有限公司 Business circle consumption data analysis method, system and readable storage medium
CN115409434B (en) * 2022-11-02 2023-03-24 北京融信数联科技有限公司 Regional demographic method, system and storage medium based on signaling big data
CN116151891A (en) * 2023-04-18 2023-05-23 北京大也智慧数据科技服务有限公司 Message pushing method and device based on signaling data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100853379B1 (en) * 2007-07-20 2008-08-21 에스케이에너지 주식회사 Method for transforming based position image file and service server thereof
CN108898445A (en) * 2018-07-12 2018-11-27 智慧足迹数据科技有限公司 The analysis method and device of customer consumption ability
CN109409936A (en) * 2018-09-28 2019-03-01 深圳壹账通智能科技有限公司 Customer consumption portrait generation method, device, equipment and readable storage medium storing program for executing
CN112543427A (en) * 2020-12-01 2021-03-23 江苏欣网视讯软件技术有限公司 Method and system for analyzing and identifying urban traffic corridor based on signaling track and big data
CN112788524A (en) * 2020-12-28 2021-05-11 中国移动通信集团江苏有限公司 Object query method, device, equipment and storage medium
CN113806601A (en) * 2021-11-18 2021-12-17 中国测绘科学研究院 Peripheral interest point retrieval method and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10389828B2 (en) * 2017-04-28 2019-08-20 Sap Se Enhanced data collection and analysis facility
CN110955738B (en) * 2018-09-26 2023-10-20 北京融信数联科技有限公司 Figure portrayal describing method based on signaling data combined with scene information
CN110956188A (en) * 2018-09-26 2020-04-03 北京融信数联科技有限公司 Population behavior track digital coding method based on mobile communication signaling data
US10804707B2 (en) * 2018-10-18 2020-10-13 General Electric Company Systems and methods for dynamic management of wind turbines providing reactive power
CN111209261B (en) * 2020-01-02 2020-11-03 邑客得(上海)信息技术有限公司 User travel track extraction method and system based on signaling big data
CN111209487B (en) * 2020-01-02 2020-10-27 平安科技(深圳)有限公司 User data analysis method, server, and computer-readable storage medium
CN111597279B (en) * 2020-03-31 2023-07-25 平安科技(深圳)有限公司 Information prediction method based on deep learning and related equipment
CN111615054B (en) * 2020-05-25 2021-04-13 和智信(山东)大数据科技有限公司 Population analysis method and device
CN111582948B (en) * 2020-05-25 2023-04-18 北京航空航天大学 Individual behavior analysis method based on mobile phone signaling data and POI (Point of interest)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Regional commercial center identification based on POI big data in China;Gang Hou等;《Arabian Journal of Geosciences》;20210710;第1-14页 *
基于手机信令数据的城市商圈消费者洞察研究——以天津为例;王毅;《中国优秀硕士学位论文全文数据库经济与管理科学辑》;20220315(第3期);第J155-58页 *

Also Published As

Publication number Publication date
CN114741612A (en) 2022-07-12

Similar Documents

Publication Publication Date Title
CN114741612B (en) Consumption habit classification method, system and storage medium based on big data
CN111464950B (en) Method for extracting travel stop point by using mobile phone signaling data
CN111222744A (en) Method for determining built environment and rail passenger flow distribution relation based on signaling data
CN102013163A (en) Method for bus origin-destination (OD) investigation by using mobile phone base station data and operating vehicle global position system (GPS) data
KR20180101472A (en) Method and device for identifying the type of geographic location in which a user is located
US11966424B2 (en) Method and apparatus for dividing region, storage medium, and electronic device
CN105160173B (en) Safety evaluation method and device
CN107529135A (en) User Activity type identification method based on smart machine data
CN115409434B (en) Regional demographic method, system and storage medium based on signaling big data
US20120218150A1 (en) Management server, population information calculation management server, non-populated area management method, and population information calculation method
CN111479321A (en) Grid construction method and device, electronic equipment and storage medium
CN115866547A (en) Fixed area tourist counting method, system and storage medium based on signaling data
CN111600993A (en) Method and device for stroke reminding according to short message
CN115759640A (en) Public service information processing system and method for smart city
CN109978264B (en) Urban population distribution prediction method based on spatio-temporal information
CN110493363A (en) A kind of discrimination system and method for smart phone random MAC address
CN109166012B (en) Method and device for classifying users in travel reservation class and pushing information
CN111092764B (en) Real-time dynamic affinity relation analysis method and system
JP2012054921A (en) Mobile apparatus distribution calculation system and mobile apparatus distribution calculation method
CN113094388A (en) Method and related device for detecting user workplace and residence
CN116017333A (en) Population identification method, system and storage medium based on big data signaling processing
CN114626900A (en) Intelligent management system based on feature recognition and big data analysis
CN110428627B (en) Bus trip potential area identification method and system
CN117493981B (en) Tourist classification method and device and electronic equipment
CN112036940A (en) Data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant