CN114048391B - Interest activity recommendation method based on geographic grid - Google Patents
- Publication number
- CN114048391B (CN202210034325.6A)
- Authority
- CN
- China
- Prior art keywords
- user
- activity
- time
- preference
- grid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An interest activity recommendation method based on geographic grids comprises: dividing a research area into regular grids, establishing a personal interest grid area for each user from check-in frequency and preference deviation ratio parameters, and inferring the user's spatial activity preference; capturing the activity preferences of similar users with a non-negative tensor decomposition method and collaboratively establishing the user's temporal activity preference; and fusing the spatial and temporal activity preferences with a context-aware fusion method to jointly determine the interest activities recommended to the user. Based on the geographic grid and tensor decomposition, the model alleviates the sparsity of user check-in data, analyzes the user's interest activities quantitatively, improves the accuracy of interest activity recommendation, and makes the recommendation results meet users' personalized needs.
Description
Technical Field
The application belongs to the technical field of position recommendation, and particularly relates to an interest activity recommendation method based on a geographic grid.
Background
With the rapid development of Location-based Social networks (LBSNs) and mobile end devices, mining potential user personal preferences, activity tracks, and lifestyle patterns from accumulated mass user data and sign-in data becomes a core link for Location services. Location recommendation becomes an important technical means of this link. At present, due to the influence of problems such as data sparsity and cold start, the accuracy of recommendation results obtained by a recommendation algorithm for points of interest may be low. Also, in many cases, people do not usually need a very precise location. Therefore, the research on the interest activity recommendation algorithm and the application is carried forward, and the purpose is to better understand the movement behavior of the user and predict the activities possibly participated by the user, so that the personalized and intelligent service requirements of the user are met.
The check-in behavior of users presents specific spatio-temporal distribution patterns, and modeling users' spatio-temporal behavior from their historical check-in data in a location-based social network is challenging, mainly in the following respects. First, check-in data is usually high-dimensional and sparse: each record is a user-time-location-activity quadruple, and it is difficult to find regularities directly in sparse high-dimensional data. Second, check-in behavior on social media is driven by the users themselves; unlike continuously sampled activity data, check-ins are not sampled at equal intervals and vary in both space and time. Third, a user's check-in activity depends on the user's context, that is, it is generally affected by the user's location and the time. How to mine user activity preferences by combining the user's temporal and spatial contexts is therefore a technical problem that urgently needs to be solved in the prior art.
Disclosure of Invention
The invention aims to provide an interest activity recommendation method based on a geographic grid, which discovers user interest activities and improves recommendation performance. The spatial activity preference and the temporal activity preference of the user are modeled separately, using geographic grids and tensor decomposition, which reduces the complexity of the problem.
An interest activity recommendation method based on a geographic grid comprises the following steps:
constructing a user spatial activity preference model based on the geographic grid step S110:
dividing a city area into a plurality of geographic grids, mapping user check-in information to the grids, calculating the check-in frequency and category preference deviation ratio of a user in each grid, obtaining the interest grid set of the user, inferring the spatial activity preference distribution of the user at the current location $l$, and, using the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids and constructing a spatial success rate matrix;
a step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method:
constructing a user-time-category three-dimensional tensor from the user check-in records, in which each element represents the check-in count of user $u$ selecting activity $c$ in time period $t$; obtaining from the three-dimensional tensor, with a non-negative tensor decomposition algorithm, a recovered tensor that describes the temporal activity preference of the user; inferring the activity preference distribution of the user at the current time from the tensor decomposition result; and, based on the temporal activity preference distribution, calculating the recommendation success rate of the temporal activity preference model on the grids and constructing a time success rate matrix;
a temporal preference and spatial preference fusion step S130:
recommendation list generation substep S131: and deducing the user activity preference according to the success rate matrix, comparing element values in the space success rate matrix and the time success rate matrix, and selecting a model result with a higher value as a final recommendation result.
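The selection rule of substep S131 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `fuse_recommendation`, the representation of each model's output as a category-to-score dictionary, and the tie-break in favor of the spatial model are all hypothetical choices made for the example.

```python
def fuse_recommendation(m_space, m_time, t, g, space_scores, time_scores, k=5):
    """Return (chosen model, top-k categories) for time slot t and grid g.

    m_space / m_time: success rate matrices indexed as m[t][g].
    space_scores / time_scores: dicts mapping category -> preference score.
    Ties go to the spatial model (an arbitrary choice made for this sketch).
    """
    use_space = m_space[t][g] >= m_time[t][g]
    scores = space_scores if use_space else time_scores
    # Rank categories of the selected model by descending score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ("space" if use_space else "time"), ranked[:k]
```

The matrices are only read at a single cell, so any row-indexable structure (nested lists or a 2-D array accessed row first) works here.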
Optionally, the step S110 of constructing a user spatial activity preference model based on a geographic grid includes the following sub-steps:
user sign-in information mapping substep S111: dividing the urban area into a plurality of regular grids with the same size, mapping the sign-in information of the user to the grids, and obtaining the number attribute of the grids;
the user interest mesh acquisition sub-step S112:
calculating the check-in frequency $f_{u,g}$ and the category preference deviation ratio $\rho_{u,g}$ of the user in each grid, using $f_{u,g}$ and $\rho_{u,g}$ to characterize the user's preference on the grid, and screening with a frequency threshold $\theta_f$ and a preference deviation ratio threshold $\theta_\rho$ to obtain the interest grid set of the user,
wherein the check-in frequency $f_{u,g}$ is the number of check-ins of user $u$ in grid $g$ as a proportion of the user's total number of check-ins:
$$f_{u,g} = \frac{n_{u,g}}{\sum_{g' \in G_u} n_{u,g'}}$$
where $n_{u,g}$ represents the number of check-ins of user $u$ in grid $g$, and $G_u$ represents the set of grids visited by the user;
the preference deviation ratio $\rho_{u,g}$ measures the category preference of user $u$ in grid $g$; assuming that grid $g$ contains POIs of $N_g$ categories in total, of which the user has visited the $m$ categories $\mathcal{C}_{u,g} = \{c_1, \dots, c_m\}$, the calculation formula is as follows:
$$\rho_{u,g} = \frac{H_{\max} - H_{u,g}}{H_{\max}}, \qquad H_{u,g} = -\sum_{c \in \mathcal{C}_{u,g}} p_{u,g,c} \log p_{u,g,c}, \qquad H_{\max} = \log N_g$$
where $N_g$ represents the total number of POI categories present in grid $g$, $p_{u,g,c}$ indicates the proportion of the user's check-ins at category $c$ in the user's total number of check-ins in $g$, $H_{\max}$ represents the maximum entropy of the user's check-in categories, which assumes that the user is equally likely to check in at every category in $g$, and $\mathcal{C}_{u,g}$ represents the categories at which user $u$ has checked in within grid $g$;
after the check-in frequency and the preference deviation ratio of each grid are calculated, the frequency threshold $\theta_f$ and the preference deviation ratio threshold $\theta_\rho$ are applied to obtain the set $G_u^I$ of grids of interest to the user; the area formed by the grid set $G_u^I$ is the user's region of interest;
spatial activity preference calculation substep S113:
with the current location $l$ of the user known, spatial proximity is used to evaluate the influence of grid $g$ on $l$ through a weight function $w(l, g)$ (its expression is not preserved in this text),
where $d(l, g)$ indicates the distance between the current location $l$ of the user and the center point of grid $g$;
then the user's activity preferences for all categories at $l$ are calculated by a weighting method: assuming that user $u$ has $n$ interest grids $g_1, \dots, g_n$, the spatial activity preference of user $u$ at location $l$ is:
$$P_S(c \mid u, l) = \sum_{i=1}^{n} w(l, g_i)\, f_{u,g_i}(c), \qquad c \in \mathcal{C}$$
where $f_{u,g_i}(c)$ represents the check-in frequency of user $u$ for category $c$ in grid $g_i$, and $\mathcal{C}$ represents all activity categories; the spatial activity preference of user $u$ for $c$ at $l$ is the sum of the user's preferences for $c$ in the interest grids, each weighted by the geographic influence of the grid on $l$;
the spatial success rate matrix construction sub-step S114:
based on the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids, and constructing for each user $u$ a spatial success rate matrix $M_u^S$, in which each row represents a time slot $t$ and each column represents a grid $g$; the recommendation success rate of the spatial activity preference model on the grids is calculated from the check-in records with the spatial activity preference calculation method of substep S113.
Optionally, the substep S114 of constructing the spatial success rate matrix specifically includes:
building a spatial success rate matrix $M_u^S$ for each user and initializing $M_u^S$ with all elements set to 0; for any user $u$, taking the check-in records of $u$ from the verification data set in sequence, each record being denoted $(l, t, c, g)$, where $l$ represents the latitude and longitude, $t$ the time stamp, $c$ the activity category, and $g$ the grid in which the user is currently located; calculating, with the spatial activity preference calculation method of substep S113, the spatial preference scores of $u$ for all categories at the current location $l$, and sorting the categories by score in descending order to obtain the highest-scoring category $\hat{c}$; when $g$ belongs to the interest grid set $G_u^I$ and $\hat{c}$ equals $c$, increasing the corresponding element $M_u^S[t, g]$ by 1, where $t$ indexes the row and $g$ the column of $M_u^S$; and proceeding in the same way to calculate the spatial success rate matrices of all users.
Optionally, the step S120 of constructing the user time activity preference model by using a non-negative tensor decomposition method specifically includes:
three-dimensional tensor construction sub-step S121:
constructing a user-time-activity three-dimensional tensor from the user check-in records, expressed as $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$, in which element $x_{utc}$ represents the number of check-ins of user $u$ selecting activity $c$ in time period $t$;
user time activity preference acquisition substep S122:
obtaining the user-time-category preference values of the given tensor with a non-negative tensor decomposition method, the value of each element of the recovered tensor $\hat{\mathcal{X}}$ being calculated as follows:
$$\hat{x}_{utc} = \sum_{r=1}^{R} U_{ur}\, T_{tr}\, C_{cr}$$
where $\mathbf{U}$, $\mathbf{T}$ and $\mathbf{C}$ are the factor matrices of users, time and categories, of sizes $I \times R$, $J \times R$ and $K \times R$ respectively; $R$ is the latent space dimension, which controls the number of features involved in the decomposition; and $U_{ur}$, $T_{tr}$ and $C_{cr}$ are elements of $\mathbf{U}$, $\mathbf{T}$ and $\mathbf{C}$ respectively,
adding a non-negative constraint to a least square-based decomposition algorithm in a CP decomposition model to obtain a recovery tensor which is used for describing the time activity preference of a user;
activity preference inference substep S123:
inferring the activity preference of the user at the current time from the tensor decomposition result by normalizing the recovered tensor $\hat{\mathcal{X}}$ along the activity dimension:
for a given user $u$ and time $t$, the preference values of all categories are normalized to sum to 1, so that every element of the normalized recovered tensor lies in $[0, 1]$; the normalized element value is treated as the probability that user $u$ visits category $c$ at time $t$, and the temporal activity preference of user $u$ at time $t$ is expressed as:
$$P_T(c \mid u, t) = \frac{\hat{x}_{utc}}{\sum_{c' \in \mathcal{C}} \hat{x}_{utc'}}$$
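A minimal sketch of the normalization in substep S123, assuming the recovered tensor is a NumPy array indexed as `[user, time slot, category]`. The function name `temporal_preference` and the uniform fallback for an all-zero slice are assumptions of this example; the source does not say how such a slice is handled.

```python
import numpy as np

def temporal_preference(x_hat, u, t):
    """Normalize the category slice of the recovered tensor into probabilities."""
    row = np.clip(x_hat[u, t, :], 0.0, None)  # guard against tiny negative round-off
    total = row.sum()
    if total == 0:
        # Assumption of this sketch: no evidence at all -> uniform distribution.
        return np.full(row.shape, 1.0 / row.size)
    return row / total
```

Ranking the returned vector in descending order gives the category list that the time preference model would recommend for user `u` in slot `t`.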
time success rate matrix construction substep S124:
based on the temporal activity preference distribution, calculating the recommendation success rate of the temporal activity preference model on the grids, and constructing a time success rate matrix $M_u^T$, in which each row represents a time slot $t$ and each column represents a grid $g$; the recommendation success rate of the temporal activity preference model on the grids is calculated from the check-in records with the temporal activity preference calculation method of substep S123.
Optionally, the substep S124 of constructing the time success rate matrix specifically includes:
initializing the matrix $M_u^T$ with all elements set to 0; in the verification data set, for any user $u$, taking the check-in records of $u$ in sequence, each record being denoted $(l, t, c, g)$, where $l$ represents the latitude and longitude, $t$ the time stamp, $c$ the activity category, and $g$ the grid in which the user is currently located; calculating, with the temporal activity preference calculation method of substep S123, the time preference scores of $u$ for all categories at the current time $t$, and sorting the categories by score in descending order to obtain the highest-scoring category $\hat{c}$; when $g$ belongs to the interest grid set $G_u^I$ and the category $\hat{c}$ predicted by the time preference model equals $c$, increasing the corresponding element $M_u^T[t, g]$ by 1, where $t$ indexes the row and $g$ the column of $M_u^T$; and proceeding in the same way to calculate the time success rate matrices of all users.
Optionally, after the sub-step S131 of generating the recommendation list, there is further provided
Precision verification substep S132:
evaluating the performance of the recommendation model on the test data set, based on the recommendation list, using the precision $Precision@k$, which is calculated as follows:
$$Precision@k = \frac{1}{|D_{test}|} \sum_{(u,t,l,c) \in D_{test}} \left| \{c\} \cap Top_k(u,t,l) \right|$$
where $k$ indicates the length of the recommendation list, $(u,t,l,c)$ indicates a check-in record of a user in the test data set, $Top_k(u,t,l)$ represents the $k$ highest-scoring (Top-k) activities for user $u$ at time $t$ and location $l$, and $|D_{test}|$ represents the number of check-in records in the test data set.
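The precision check of substep S132 can be sketched as a hit rate over the test records. The names `precision_at_k` and `recommend`, and the `(u, t, l, c)` tuple layout, are hypothetical; `recommend` stands in for whatever fused model produces the ranked category list.

```python
def precision_at_k(test_records, recommend, k):
    """Fraction of test check-ins whose true category is in the top-k list.

    test_records: iterable of (u, t, l, c) tuples.
    recommend: callable (u, t, l, k) -> ranked list of categories.
    """
    records = list(test_records)
    if not records:
        return 0.0
    hits = sum(1 for (u, t, l, c) in records if c in recommend(u, t, l, k)[:k])
    return hits / len(records)
```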
Optionally, the method divides a user check-in data set, sorts the check-in history of each user according to check-in time, and divides the check-in history of each user into a training data set, a verification data set, and a test data set according to a certain proportion, wherein steps S111, S112, S113, S121, S122, and S123 use the training data set to construct a model, steps S114 and S124 use the verification data set to calculate a success rate, and steps S131 and S132 use the test data set to perform model verification.
Optionally, in the substep S121 of constructing the three-dimensional tensor, three dimensions of the three-dimensional tensor are a user dimension, a time dimension, and an activity dimension, respectively, where the user dimension represents each user as an independent dimension, the activity dimension is represented by a POI category, and the time dimension is represented by a time period divided according to a certain time interval.
Optionally, in the sub-step S122 of obtaining the time activity preference of the user, the tensor decomposition parameters include the latent space dimension, whose size affects the tensor decomposition time and the recommendation precision.
The invention further discloses a storage medium for storing computer executable instructions, which is characterized in that:
the computer-executable instructions, when executed by a processor, perform the above-described geographic grid-based point of interest activity recommendation method.
The invention considers the spatial and temporal characteristics of user check-in activities in a location-based social network separately, which reduces the complexity of the problem. Based on the geographic grid and tensor decomposition, the method alleviates the sparsity of user check-in data; it calculates the user's spatial and temporal activity distributions and determines the interest activities recommended to the user from the prediction probabilities of the two distributions; and it analyzes the user's regions of interest quantitatively, improving the accuracy of interest activity recommendation in location-based social networks so that the recommendation results meet users' personalized needs.
Drawings
FIG. 1 is a flowchart of a method for recommending interest activities based on a geographic grid according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for recommending interest activities based on a geographic grid according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the recommendation precision for different latent dimensions in the NCP (Nonnegative CANDECOMP/PARAFAC) model, in accordance with a specific embodiment of the present invention;
FIG. 4 is an example of a user time activity preference calculation based on the NCP decomposition model in accordance with a specific embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
The invention is characterized in that: the sign-in activity of the user is modeled by using the sign-in historical records of the user in the position social network, and a personalized city interest activity recommendation method is provided. The method is based on the geography grid, combines the geography preference and the time preference of the user to the city area, analyzes the time-space behavior of the user in the city sign-in, and recommends more appropriate interest activities to the target user.
Referring to fig. 1, a flowchart of a method for recommending interest activities based on a geographic grid is shown, and fig. 2, a specific flowchart of the recommendation method is shown.
An interest activity recommendation method based on a geographic grid comprises the following steps:
constructing a user spatial activity preference model based on the geographic grid step S110:
dividing a city area into a plurality of geographic grids, mapping user check-in information to the grids, calculating the check-in frequency and category preference deviation ratio of a user in each grid, obtaining the interest grid set of the user, inferring the spatial activity preference distribution of the user at the current location $l$, and, using the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids and constructing a spatial success rate matrix.
Specifically, the step S110 may include the following sub-steps:
user sign-in information mapping substep S111: the city area is divided into a plurality of regular grids with the same size, the sign-in information of the user is mapped to the grids, and the number attribute of the grids is obtained.
In particular, the maximum and minimum longitude and latitude values of the study area are obtained, the study area is divided at a fixed interval (in meters) into a plurality of regular grids of the same size, and all grids are numbered, with numbering starting from 0. For any check-in record of a user, the check-in position is matched to the grids to determine the grid in which the check-in point lies and its number $g$.
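One way to realize this mapping is sketched below. Several details are assumptions of the example rather than statements of the source: the cell size passed as `cell_m` (the source's interval value is not available), the approximate constant of 111,320 m per degree of latitude, the equirectangular conversion of meters to degrees, and the row-major numbering from the south-west corner. The class name `GridIndex` is hypothetical.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # approximate value; assumption of this sketch

class GridIndex:
    """Regular grid over a bounding box, cells numbered row-major from 0."""

    def __init__(self, min_lat, min_lon, max_lat, max_lon, cell_m):
        self.min_lat, self.min_lon = min_lat, min_lon
        self.max_lat, self.max_lon = max_lat, max_lon
        mid_lat = math.radians((min_lat + max_lat) / 2.0)
        # Convert the cell size from meters to degrees (equirectangular approximation).
        self.dlat = cell_m / METERS_PER_DEG_LAT
        self.dlon = cell_m / (METERS_PER_DEG_LAT * math.cos(mid_lat))
        self.n_rows = max(1, math.ceil((max_lat - min_lat) / self.dlat))
        self.n_cols = max(1, math.ceil((max_lon - min_lon) / self.dlon))

    def grid_id(self, lat, lon):
        if not (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon):
            raise ValueError("point lies outside the study area")
        # Points on the upper boundary are clamped into the last row/column.
        row = min(int((lat - self.min_lat) / self.dlat), self.n_rows - 1)
        col = min(int((lon - self.min_lon) / self.dlon), self.n_cols - 1)
        return row * self.n_cols + col
```

For a city-scale area the equirectangular approximation keeps cells close to square; a projected coordinate system would be the more careful choice for larger regions.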
The experiments use Foursquare data, ranging spatially from (40.54085247, -74.28476645) to (40.99833172, -73.6738252). A check-in record is represented as $(u, v, lat, lon, time, c)$, where $u$ represents the user number, $v$ the POI number, $lat$ the latitude, $lon$ the longitude, $time$ the check-in time, and $c$ the POI category. After the grid number attribute $g$ is added, a check-in record is represented as $(u, v, lat, lon, time, c, g)$.
Specifically, the Foursquare data records the check-ins of users in New York from April 2012 to February 2013. To alleviate the influence of data sparsity, users with fewer than 10 check-ins and points of interest visited fewer than 10 times are removed. After preprocessing, the Foursquare data set contains 1083 users, 38333 points of interest and 227428 check-ins in total. The points of interest fall into 9 major categories and 215 sub-categories, and the invention adopts the sub-categories as the activity categories.
The invention divides the user check-in data set as follows: the check-in history of each user is sorted by check-in time and split into a training data set, a verification data set and a test data set in a certain proportion, such as 8:1:1.
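A per-user chronological split might look like the sketch below. The record layout (dicts with `user` and `time` keys), the function name `split_checkins`, and the rounding rule (truncate the training and verification shares and give the remainder to the test set) are assumptions of this example.

```python
from collections import defaultdict

def split_checkins(records, ratios=(0.8, 0.1, 0.1)):
    """Split each user's time-ordered check-ins into train / validation / test."""
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec["user"]].append(rec)
    train, valid, test = [], [], []
    for recs in by_user.values():
        recs.sort(key=lambda r: r["time"])       # oldest check-ins first
        n_train = int(len(recs) * ratios[0])
        n_valid = int(len(recs) * ratios[1])
        train.extend(recs[:n_train])
        valid.extend(recs[n_train:n_train + n_valid])
        test.extend(recs[n_train + n_valid:])    # remainder goes to the test set
    return train, valid, test
```

Splitting by time rather than at random keeps the most recent check-ins of every user out of the training data, which matches the way the three sets are used in the later steps.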
The user interest mesh acquisition sub-step S112:
calculating the check-in frequency $f_{u,g}$ and the category preference deviation ratio $\rho_{u,g}$ of the user in each grid, using $f_{u,g}$ and $\rho_{u,g}$ to characterize the user's preference on the grid, and screening with a frequency threshold $\theta_f$ and a preference deviation ratio threshold $\theta_\rho$ to obtain the interest grid set of the user.
First, the check-in frequency of the user in a grid is a direct index of the grid's popularity: the more often an area is checked in, the more attractive its points of interest are to the user. Meanwhile, in a frequently visited area a user usually checks in at only a few categories rather than all of them, so the category distribution of the user's check-ins in a grid also carries preference information.
Thus, the present invention evaluates the user's preference in a grid mainly through the check-in frequency $f_{u,g}$ and the category preference deviation ratio $\rho_{u,g}$.
The check-in frequency $f_{u,g}$ is the number of check-ins of user $u$ in grid $g$ as a proportion of the user's total number of check-ins:
$$f_{u,g} = \frac{n_{u,g}}{\sum_{g' \in G_u} n_{u,g'}}$$
where $n_{u,g}$ represents the number of check-ins of user $u$ in grid $g$, and $G_u$ represents the set of grids visited by the user.
The preference deviation ratio $\rho_{u,g}$ measures the category preference of user $u$ in grid $g$. Assuming that grid $g$ contains POIs of $N_g$ categories in total, of which the user has visited the $m$ categories $\mathcal{C}_{u,g} = \{c_1, \dots, c_m\}$, $\rho_{u,g}$ measures the fractional difference between the entropy $H_{u,g}$ of the check-in category distribution of $u$ in $g$ and the maximum entropy of the category distribution:
$$\rho_{u,g} = \frac{H_{\max} - H_{u,g}}{H_{\max}}, \qquad H_{u,g} = -\sum_{c \in \mathcal{C}_{u,g}} p_{u,g,c} \log p_{u,g,c}, \qquad H_{\max} = \log N_g$$
where $N_g$ represents the total number of POI categories present in grid $g$, $p_{u,g,c}$ indicates the proportion of the user's check-ins at category $c$ in the user's total number of check-ins in $g$, $H_{\max}$ represents the maximum entropy of the user's check-in categories, which assumes that the user is equally likely to check in at every category in $g$, and $\mathcal{C}_{u,g}$ represents the categories at which user $u$ has checked in within grid $g$.
After the check-in frequency and the preference deviation ratio of each grid are calculated, the frequency threshold $\theta_f$ and the preference deviation ratio threshold $\theta_\rho$ are applied to obtain the set $G_u^I$ of grids of interest to the user; the area formed by the grid set $G_u^I$ is the user's region of interest.
The specific method for acquiring the user's region of interest is as follows. First, the number of check-ins of the user in all areas is counted and the set $G_u$ of grids in which the user has checked in is obtained. The grids $g$ visited by the user are then scanned in turn and the check-in frequency $f_{u,g}$ is calculated. If the grid is one that the user visits frequently (i.e., $f_{u,g} \ge \theta_f$), the preference deviation ratio $\rho_{u,g}$ is calculated; if $\rho_{u,g} \ge \theta_\rho$, $g$ is added to the interest grid set $G_u^I$ as an interest grid of the user. Traversing all check-in grids of the user in this way yields the user's interest grid set $G_u^I$. $\theta_f$ sets the required level of user activity on a grid, and $\theta_\rho$ the required degree of deviation of the user's activity preference in the grid. The larger $\theta_f$ and $\theta_\rho$ are, the smaller the set $G_u^I$.
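The screening procedure can be sketched in a few functions. This is an illustration under stated assumptions: the entropy-based ratio follows the definitions in the text (entropy of the user's category distribution compared with the maximum entropy over all categories in the grid), a grid with a single POI category is treated as maximal deviation to avoid dividing by zero, and all function and parameter names are hypothetical.

```python
import math
from collections import Counter

def checkin_frequency(grid_counts):
    """Share of the user's check-ins that fall in each visited grid."""
    total = sum(grid_counts.values())
    return {g: n / total for g, n in grid_counts.items()}

def preference_deviation_ratio(category_counts, n_categories_in_grid):
    """(H_max - H) / H_max for the user's category distribution in one grid."""
    if n_categories_in_grid <= 1:
        return 1.0  # assumption: a single-category grid counts as maximal deviation
    total = sum(category_counts.values())
    entropy = -sum((n / total) * math.log(n / total)
                   for n in category_counts.values() if n > 0)
    h_max = math.log(n_categories_in_grid)
    return (h_max - entropy) / h_max

def interest_grids(user_checkins, grid_category_totals, theta_f, theta_rho):
    """user_checkins: list of (grid_id, category) pairs for one user.
    grid_category_totals: grid_id -> number of POI categories in that grid."""
    freq = checkin_frequency(Counter(g for g, _ in user_checkins))
    selected = set()
    for g, f in freq.items():
        if f < theta_f:
            continue  # not visited often enough
        cats = Counter(c for gg, c in user_checkins if gg == g)
        if preference_deviation_ratio(cats, grid_category_totals[g]) >= theta_rho:
            selected.add(g)
    return selected
```

A ratio near 1 means the user's check-ins in the grid concentrate on very few categories; a ratio near 0 means they are spread evenly over everything the grid offers.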
Spatial activity preference calculation substep S113:
Based on the interest grid set $G_u^I$, the spatial activity preference of the user at the current location is inferred. With the current location $l$ of the user known, the influence of each grid in the set is calculated first, i.e., spatial proximity is used to evaluate the influence of grid $g$ on $l$; the activity preferences over all grids are then calculated with a weighting method.
In particular, with the current location $l$ of the user known, spatial proximity is used to evaluate the influence of grid $g$ on $l$; following the conclusions of existing research, a weight function $w(l, g)$ of the distance is adopted (its expression is not preserved in this text),
where $d(l, g)$ indicates the distance between the current location $l$ of the user and grid $g$; the farther away the grid, the smaller $w(l, g)$, indicating less spatial appeal of the grid to the user.
Then the user's activity preferences for all categories at $l$ are calculated by a weighting method. Assuming that user $u$ has $n$ interest grids $g_1, \dots, g_n$, the spatial activity preference of user $u$ at location $l$ is:
$$P_S(c \mid u, l) = \sum_{i=1}^{n} w(l, g_i)\, f_{u,g_i}(c), \qquad c \in \mathcal{C}$$
where $f_{u,g_i}(c)$ represents the check-in frequency of user $u$ for category $c$ in grid $g_i$, and $\mathcal{C}$ represents all activity categories; the spatial activity preference of user $u$ for $c$ at $l$ is the sum of the user's preferences for $c$ in the interest grids, each weighted by the geographic influence of the grid on $l$.
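The weighted aggregation can be sketched as follows. The weight function here is a stand-in: the source's expression is not available, so `proximity_weight` uses a simple `1 / (1 + d)` decay purely as an assumption that satisfies the stated property (farther grids weigh less). The haversine distance, the kilometer unit, and all names are likewise choices made for this example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def proximity_weight(d_km):
    # Stand-in decay (assumption): the source's weight function is not available.
    return 1.0 / (1.0 + d_km)

def spatial_preference(location, interest_grids):
    """Sum each interest grid's category frequencies, weighted by proximity.

    location: (lat, lon) of the user.
    interest_grids: list of (center_lat, center_lon, {category: frequency}).
    """
    scores = {}
    for lat, lon, cat_freq in interest_grids:
        w = proximity_weight(haversine_km(location[0], location[1], lat, lon))
        for c, f in cat_freq.items():
            scores[c] = scores.get(c, 0.0) + w * f
    return scores
```

Swapping in a different decay only requires replacing `proximity_weight`; the aggregation itself does not depend on its form.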
The spatial success rate matrix construction sub-step S114:
based on the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids with the verification data set, and constructing for each user $u$ a spatial success rate matrix $M_u^S$, in which each row represents a time slot $t$ and each column represents a grid $g$; the recommendation success rate of the spatial activity preference model on the grids is calculated from the check-in records in the verification set with the spatial activity preference calculation method of substep S113.
In particular, $M_u^S$ denotes the spatial success rate matrix of user $u$; each row of the matrix represents a time stamp $t$ and each column represents a grid $g$. Considering the daily periodicity of user check-ins and the difference between working-day and non-working-day check-ins, time is divided at 1-hour intervals into 24 periods per day and 168 time slots per week, each slot representing one time stamp. For a given check-in time, the time stamp $t$ is determined by the day of the week $d$ and the hour $h$ of the check-in,
where $d \in \{1, \dots, 7\}$ denotes Monday through Sunday and $h$ denotes the hour. For example, for the time 2012-04-23 07:10:18, $d$ is 1 and $h$ is 7.
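The day-of-week and hour values in the example can be reproduced with the standard library. The slot index `24 * (d - 1) + h`, which yields values 0 to 167, is an assumption of this sketch; the source's own expression for the slot index is not available.

```python
from datetime import datetime

def time_slot(ts):
    """Return (d, h, slot) for a check-in time."""
    d = ts.isoweekday()        # 1 = Monday ... 7 = Sunday
    h = ts.hour                # 0 ... 23
    slot = 24 * (d - 1) + h    # assumed mapping onto the 168 weekly slots
    return d, h, slot
```

Applied to 2012-04-23 07:10:18, which was a Monday, this gives `d = 1` and `h = 7`, in agreement with the example in the text.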
First, the matrix $M_u^S$ is initialized with all elements set to 0. In the verification data set, for any user $u$, the check-in records of $u$ are taken out in sequence. For convenience of expression, a check-in record is denoted $(l, t, c, g)$, where $l$ represents the latitude and longitude; $t$ represents the time stamp, calculated from the check-in time as described above; $c$ represents the activity category; and $g$ represents the grid in which the user is currently located. Using the spatial activity preference calculation method of substep S113, the spatial preference scores of $u$ for all categories at the current location $l$ are calculated, and the categories are sorted by score in descending order to obtain the highest-scoring category $\hat{c}$. When $g$ belongs to the interest grid set $G_u^I$ and $\hat{c}$ equals $c$, the corresponding element $M_u^S[t, g]$ is increased by 1, where $t$ indexes the row and $g$ the column of $M_u^S$. Proceeding in the same way yields the spatial success rate matrices of all users.
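The counting loop for one user can be sketched as below. The callback `predict_top1`, the `(l, t, c, g)` tuple layout, and the function name are hypothetical; the same loop serves the time success rate matrix if the callback wraps the time preference model instead of the spatial one.

```python
import numpy as np

def build_success_matrix(validation_records, predict_top1, interest_grids,
                         n_grids, n_slots=168):
    """Count correct top-1 predictions per (time slot, grid) for one user.

    validation_records: iterable of (l, t, c, g) tuples.
    predict_top1: callable (l, t) -> highest-scoring category of the model.
    interest_grids: set of grid ids in the user's interest grid set.
    """
    m = np.zeros((n_slots, n_grids), dtype=int)
    for l, t, c, g in validation_records:
        # Only check-ins inside the interest grid set can score a success.
        if g in interest_grids and predict_top1(l, t) == c:
            m[t, g] += 1
    return m
```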
A step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method:
constructing a user-time-category three-dimensional tensor from the user check-in records, in which each element represents the check-in count of user $u$ selecting activity $c$ in time period $t$; obtaining from the three-dimensional tensor, with a non-negative tensor decomposition algorithm, a recovered tensor that describes the temporal activity preference of the user; inferring the activity preference distribution of the user at the current time from the tensor decomposition result; and, based on the temporal activity preference distribution, calculating the recommendation success rate of the temporal activity preference model on the grids and constructing a time success rate matrix.
Specifically, the method comprises the following substeps:
three-dimensional tensor construction sub-step S121:
constructing a user-time-activity three-dimensional tensor from the user check-in records, expressed as $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$. In the time dimension, considering the daily periodicity of user check-ins and the difference between working-day and non-working-day check-ins, time is divided at 1-hour intervals into 24 periods per day and 168 time slots per week. In the activity dimension, the 215 POI sub-categories are used to represent user activities. In the present embodiment there are 1083 active users and 215 categories in total, so the tensor is $\mathcal{X} \in \mathbb{R}^{1083 \times 168 \times 215}$. Element $x_{utc}$ of the tensor represents the number of check-ins of user $u$ selecting activity $c$ in time period $t$; if the user has not visited activity $c$ in time period $t$, $x_{utc}$ is 0. The purpose of tensor decomposition is to replace the 0 values in $\mathcal{X}$ with predicted values obtained through the decomposition algorithm.
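Building the count tensor is a single pass over the training check-ins. The sketch assumes users, time slots and categories have already been mapped to zero-based integer indices; the function name `build_tensor` is hypothetical.

```python
import numpy as np

def build_tensor(checkins, n_users, n_slots, n_categories):
    """checkins: iterable of (user_index, slot_index, category_index) triples."""
    x = np.zeros((n_users, n_slots, n_categories))
    for u, t, c in checkins:
        x[u, t, c] += 1  # repeated check-ins accumulate as counts
    return x
```

With the dimensions of the embodiment (1083 users, 168 slots, 215 categories) a dense float array holds about 39 million entries, which is manageable in memory; a sparse representation would be the natural choice for much larger data sets.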
Therefore, in the sub-step, three dimensions of the three-dimensional tensor are a user dimension, a time dimension and an activity dimension respectively, the user dimension represents each user as one dimension, the activity dimension is represented by a POI category, and the time dimension is represented by a time period divided according to a certain time interval.
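The tensor construction of substep S121 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record layout (`user`, `timestamp`, `category` as integer indices and a `datetime`), the function names, and the choice of Monday as the first day of the week are all assumptions made for the example.

```python
from datetime import datetime

SLOTS_PER_WEEK = 7 * 24  # 168 one-hour intervals per week


def time_slot(ts):
    """Map a timestamp to one of 168 weekly slots (assumed: Monday 00:00 is slot 0)."""
    return ts.weekday() * 24 + ts.hour


def build_tensor(checkins, n_users, n_categories):
    """Count check-ins per (user, weekly slot, category) in a nested-list tensor."""
    tensor = [[[0] * n_categories for _ in range(SLOTS_PER_WEEK)]
              for _ in range(n_users)]
    for rec in checkins:
        tensor[rec["user"]][time_slot(rec["timestamp"])][rec["category"]] += 1
    return tensor


# Hypothetical check-ins: user 0 visits category 2 twice on Thursday at 09:xx.
checkins = [
    {"user": 0, "timestamp": datetime(2022, 1, 13, 9, 5), "category": 2},
    {"user": 0, "timestamp": datetime(2022, 1, 13, 9, 40), "category": 2},
    {"user": 1, "timestamp": datetime(2022, 1, 15, 20, 0), "category": 0},
]
tensor = build_tensor(checkins, n_users=2, n_categories=3)
```

Combinations that never occur keep the value 0, which is what the decomposition step later replaces with predicted values.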
User time activity preference acquisition substep S122:
Since the check-in probability of a user cannot be negative, negative values in the recovered tensor are meaningless as user preferences. The constructed tensor is therefore decomposed into a sum of rank-one tensors by a non-negative CP decomposition model.
A non-negative tensor decomposition method is applied to the given tensor to obtain user-time-category preference values. Each element of the recovered tensor $\hat{\mathcal{X}}$ is calculated as follows:
$$\hat{\mathcal{X}}_{ijk} = \sum_{r=1}^{R} U_{ir}\, T_{jr}\, C_{kr}$$
where $U$, $T$ and $C$ are the factor matrices of users, time and categories, of sizes $I \times R$, $J \times R$ and $K \times R$ respectively; $R$ is the latent space dimension, which controls the number of features involved in the decomposition; and $U_{ir}$, $T_{jr}$ and $C_{kr}$ are elements of $U$, $T$ and $C$ respectively.
Adding a non-negative constraint to the least squares based decomposition algorithm in the CP decomposition model yields a recovery tensor describing the temporal activity preference of the user.
Preferably, CP decomposition (CANDECOMP/PARAFAC) decomposes the tensor into three factor matrices (the user, time and category factor matrices) and uses an alternating least squares method to minimize the loss function between the recovered tensor and the original tensor.
The tensor decomposition parameters include the latent space dimension, which can significantly affect recommendation performance, in particular tensor decomposition time and recommendation accuracy. The present embodiment also considers the impact of the latent space dimension on recommendation accuracy. FIG. 3 shows how recommendation performance changes as the latent space dimension increases from 8 to 128. The larger the dimension, the better the recommendation performance, but the rate of improvement gradually slows.
In one example, the recovered tensor is obtained according to the model construction described above, and the non-negative tensor decomposition can be carried out with TensorLy, an open-source Python package. TensorLy supports tensor decomposition, tensor learning and tensor algebra, and its `non_negative_parafac` function implements the NCP decomposition.
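To make the decomposition step concrete without depending on an installed library, the sketch below implements a non-negative CP decomposition in pure Python. It uses multiplicative updates rather than the alternating least squares routine or the TensorLy function named in the text, so it is a substitute technique chosen for brevity. The function names `ncp`, `gram`, `update` and `reconstruct` and all parameter defaults are assumptions for illustration.

```python
import random


def gram(M):
    """Return M^T M for a matrix stored as a list of rows."""
    R = len(M[0])
    return [[sum(row[a] * row[b] for row in M) for b in range(R)]
            for a in range(R)]


def update(F, numer, g1, g2, eps):
    """Multiplicative update of one factor matrix; entries stay non-negative."""
    R = len(F[0])
    for i, row in enumerate(F):
        denom = [sum(row[s] * g1[s][r] * g2[s][r] for s in range(R))
                 for r in range(R)]
        F[i] = [row[r] * numer(i, r) / (denom[r] + eps) for r in range(R)]


def ncp(X, rank, iters=100, seed=0, eps=1e-12):
    """Factor a non-negative 3-way tensor X into user, time, category factors."""
    rng = random.Random(seed)
    I, J, K = len(X), len(X[0]), len(X[0][0])
    U = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(I)]
    T = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(J)]
    C = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(K)]
    for _ in range(iters):
        update(U, lambda i, r: sum(X[i][j][k] * T[j][r] * C[k][r]
                                   for j in range(J) for k in range(K)),
               gram(T), gram(C), eps)
        update(T, lambda j, r: sum(X[i][j][k] * U[i][r] * C[k][r]
                                   for i in range(I) for k in range(K)),
               gram(U), gram(C), eps)
        update(C, lambda k, r: sum(X[i][j][k] * U[i][r] * T[j][r]
                                   for i in range(I) for j in range(J)),
               gram(U), gram(T), eps)
    return U, T, C


def reconstruct(U, T, C):
    """Recovered tensor: sum over r of U[i][r] * T[j][r] * C[k][r]."""
    return [[[sum(u[r] * t[r] * c[r] for r in range(len(u))) for c in C]
             for t in T] for u in U]


# Toy rank-one tensor built from known non-negative factors.
a, b, c = [1.0, 2.0], [1.0, 0.5, 2.0], [3.0, 1.0]
X = [[[x * y * z for z in c] for y in b] for x in a]
U, T, C = ncp(X, rank=1, iters=5)
X_hat = reconstruct(U, T, C)
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors never become negative, which is the property the text requires of the recovered tensor. For tensors of realistic size a library routine would be far faster than these nested loops.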
Activity preference inference substep S123:
The activity preference of the user at the current time is inferred from the tensor decomposition result. To infer the category preference of user $i$ at time $j$, that is, the likelihood that the user visits category $k$ at that time, the recovered tensor $\hat{\mathcal{X}}$ is normalized along the activity dimension:
$$p_{ijk} = \frac{\hat{\mathcal{X}}_{ijk}}{\sum_{k'=1}^{K} \hat{\mathcal{X}}_{ijk'}}$$
Thus, for a given user $i$ and time $j$, the preference values of all categories sum to 1, and every normalized element lies in $[0, 1]$. Normalization allows the temporal and spatial preferences to be fused. The normalized element $p_{ijk}$ is treated as the probability that user $i$ visits category $k$ at time $j$, and the time activity preference of user $i$ at time $j$ is expressed as:
$$P_{ij} = (p_{ij1}, p_{ij2}, \ldots, p_{ijK})$$
An example of a user time activity preference calculation based on the NCP decomposition model is shown in FIG. 4, which gives the probability that a user visits each activity within any given time period.
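The normalization of substep S123 can be illustrated with a short sketch. It assumes the recovered tensor is a nested list indexed as `[user][time_slot][category]`; the function name and the uniform fallback for an all-zero row are assumptions that the text does not specify.

```python
def temporal_preference(recovered, user, slot):
    """Normalize one (user, time slot) fiber so the category scores sum to 1."""
    scores = recovered[user][slot]
    total = sum(scores)
    if total == 0:
        # Assumed fallback: no signal at all, so treat every category as equally likely.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


# Hypothetical recovered tensor: 1 user, 2 time slots, 3 categories.
recovered = [[[1.0, 3.0, 0.0], [0.0, 0.0, 0.0]]]
probs = temporal_preference(recovered, user=0, slot=0)
```

Because the recovered values are non-negative, each normalized value lies between 0 and 1 and can be compared directly with the spatial preference scores.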
Time success rate matrix construction substep S124:
Based on the time activity preference distribution, the recommendation success rate of the time activity preference model on the grids is calculated with the verification data set, and a time success rate matrix is constructed for each user. As with the spatial success rate matrix in substep S114, each row of the matrix represents a time range and each column represents a grid. The success rate is computed from the check-in records in the verification set using the time activity preference calculation of substep S123.
First, the matrix is initialized with every element set to 0. In the verification data set, the check-in records of each user are taken out in sequence. Each record consists of the latitude and longitude of the check-in, a time stamp, the activity category, and the grid in which the user is currently located. Using the time activity preference calculation of substep S123, the time preference scores of all categories at the user's current time are computed, and the categories are sorted by score in descending order to obtain the category with the highest score. When the current grid belongs to the user's interest grid set and the category predicted by the time preference model equals the category actually checked in, the matrix element in the row of the record's time range and the column of the record's grid is increased by 1. Proceeding in the same way yields the time success rate matrices of all users.
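The counting procedure for a success rate matrix can be sketched as below. The same routine serves for the spatial model (S114) and the time model (S124) by passing a different predictor. The record fields, the sparse dictionary keyed by `(time_slot, grid)`, and the `predict_top_category` callback are assumptions for illustration, not details given in the text.

```python
from collections import defaultdict


def build_success_matrix(records, interest_grids, predict_top_category):
    """Count correct top-1 predictions per (time slot, grid) for one user.

    records: verification check-ins with 'time_slot', 'grid', 'category' keys.
    interest_grids: set of grid ids in the user's interest grid set.
    predict_top_category: callable returning the highest-scoring category
        for a record (hypothetical stand-in for the S113 or S123 model).
    """
    matrix = defaultdict(int)  # missing entries are implicitly 0
    for rec in records:
        if rec["grid"] not in interest_grids:
            continue  # only grids in the interest grid set are counted
        if predict_top_category(rec) == rec["category"]:
            matrix[(rec["time_slot"], rec["grid"])] += 1
    return dict(matrix)


# Hypothetical verification records for one user.
records = [
    {"time_slot": 9, "grid": "g1", "category": "cafe"},
    {"time_slot": 9, "grid": "g1", "category": "cafe"},
    {"time_slot": 9, "grid": "g1", "category": "gym"},
    {"time_slot": 20, "grid": "g2", "category": "bar"},
    {"time_slot": 20, "grid": "g3", "category": "bar"},
]


def always_cafe_or_bar(rec):
    """Toy predictor: 'cafe' before noon, 'bar' afterwards."""
    return "cafe" if rec["time_slot"] < 12 else "bar"


success = build_success_matrix(records, {"g1", "g2"}, always_cafe_or_bar)
```

A dense matrix with one row per time range and one column per grid would work equally well; the dictionary form is used here only because most cells stay at 0.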
A temporal preference and spatial preference fusion step S130:
in the case of a given user's spatial and temporal context, a fusion method needs to be adopted to fuse the temporal and spatial activity preferences, and methods such as linear weighting, multiplication and the like are common fusion methods. However, since the performance of the spatial and temporal models varies with time and place, it is difficult to dynamically assign these two weights according to the user context.
Recommendation list generation substep S131: and deducing the user activity preference according to the success rate matrix, comparing element values in the space success rate matrix and the time success rate matrix, and selecting a model result with a higher value as a final recommendation result.
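Specifically, the user check-in records are taken out of the test data set in sequence, and the following judgment is made for each record. For the given user and context (that is, the time and position of the check-in), the element of the spatial success rate matrix and the element of the time success rate matrix for the corresponding time range and grid are compared, and the result of the model with the higher value is selected as the final preference. If the two values are equal, the result predicted by the spatial activity preference model is adopted.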
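The selection rule of substep S131 amounts to a lookup and a comparison, as the following sketch shows. The dictionary representation of the two success rate matrices and the function name `fuse_recommendations` are assumptions made for the example.

```python
def fuse_recommendations(spatial_success, temporal_success,
                         time_slot, grid, spatial_ranking, temporal_ranking):
    """Pick the ranking of whichever model succeeded more often in this context.

    Both success matrices are assumed to be dicts keyed by (time_slot, grid);
    a missing key counts as 0. Ties go to the spatial model, as in the text.
    """
    s = spatial_success.get((time_slot, grid), 0)
    t = temporal_success.get((time_slot, grid), 0)
    return temporal_ranking if t > s else spatial_ranking


# Hypothetical success counts and per-model category rankings.
spatial_success = {(9, "g1"): 3, (20, "g2"): 1}
temporal_success = {(9, "g1"): 1, (20, "g2"): 4}
spatial_ranking = ["cafe", "gym", "bar"]
temporal_ranking = ["bar", "cafe", "gym"]

morning = fuse_recommendations(spatial_success, temporal_success,
                               9, "g1", spatial_ranking, temporal_ranking)
evening = fuse_recommendations(spatial_success, temporal_success,
                               20, "g2", spatial_ranking, temporal_ranking)
unseen = fuse_recommendations(spatial_success, temporal_success,
                              3, "g9", spatial_ranking, temporal_ranking)
```

Choosing one model per context avoids having to tune fusion weights, which the text notes are hard to assign dynamically.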
Experimental results show that the spatial activity preference model can better capture activity preference of the user.
Furthermore, the accuracy verification is carried out on the interest activity recommendation method based on the geographic grid through experiments.
Precision verification substep S132:
Based on the recommendation list, the performance of the recommendation model on the test data set is evaluated by accuracy, which is calculated as follows:
$$\mathrm{Accuracy@}k = \frac{1}{|D_{test}|} \sum_{r \in D_{test}} \mathbb{1}\left[\, c_r \in \mathrm{Top}_k(u_r, t_r, l_r) \,\right]$$
where $k$ is the length of the recommendation list; $r$ is a check-in record of a user in the test set $D_{test}$, with $c_r$ the category actually checked in; $\mathrm{Top}_k(u_r, t_r, l_r)$ is the set of the $k$ highest-scoring (Top-k) activities for user $u_r$ at time $t_r$ and location $l_r$; and $|D_{test}|$ is the number of check-in records in the test set.
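A Top-k accuracy computation of the kind described for substep S132 can be sketched as follows: a test check-in counts as a hit when its actual category appears among the first k recommended activities. The record layout and the `recommend` callback are hypothetical names introduced for the example.

```python
def accuracy_at_k(test_records, recommend, k):
    """Fraction of test check-ins whose category is in the top-k recommendations.

    recommend: callable returning a ranked list of categories for a record
        (hypothetical stand-in for the fused recommendation list).
    """
    if not test_records:
        return 0.0
    hits = sum(1 for rec in test_records
               if rec["category"] in recommend(rec)[:k])
    return hits / len(test_records)


# Hypothetical test records and a fixed ranking for every context.
test_records = [
    {"category": "cafe"},
    {"category": "gym"},
    {"category": "bar"},
    {"category": "museum"},
]


def fixed_ranking(rec):
    """Toy recommender that ignores the context."""
    return ["cafe", "gym", "bar"]


acc1 = accuracy_at_k(test_records, fixed_ranking, k=1)
acc3 = accuracy_at_k(test_records, fixed_ranking, k=3)
```

Accuracy cannot decrease as k grows, since a longer list contains every shorter one.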
The effect of the number of recommendations on accuracy is also considered in this experiment. Table 1 shows how the recommendation accuracy of several recommendation methods varies when the number of recommendations is 1, 5 and 10. The comparison methods include MFT (most frequently visited activity within a time period), CP (CP decomposition), NCP (non-negative CP decomposition), MFA (most frequently visited activity by a user), SPM (spatial preference model), and STUAP (the geographic grid-based interest activity recommendation method). It can be seen that, for the same number of recommendations, the interest activity recommendation method of the present invention has the highest accuracy.
Table 1. Recommendation performance comparison experiment
The experimental data set comprises a training set, a verification set and a test set, wherein the training data set is adopted to construct a model in steps S111, S112, S113, S121, S122 and S123, the verification data set is adopted to calculate the success rate in steps S114 and S124, and the test data set is adopted to verify the model in steps S131 and S132.
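A chronological three-way split of one user's check-in history might look like the sketch below. The text does not state the proportions, so the 70/10/20 split, the field name `timestamp` and the function name are assumptions for illustration only.

```python
def split_user_checkins(records, train_pct=70, val_pct=10):
    """Sort one user's check-ins by time and cut them into train/val/test.

    The percentages are assumed values; integer arithmetic avoids
    floating-point rounding at the cut points.
    """
    ordered = sorted(records, key=lambda rec: rec["timestamp"])
    n = len(ordered)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]
    return train, val, test


# Hypothetical history of ten check-ins given out of order.
history = [{"timestamp": t, "category": "cafe"}
           for t in [5, 3, 9, 1, 7, 2, 8, 4, 10, 6]]
train, val, test = split_user_checkins(history)
```

Splitting by time rather than at random keeps every verification and test record later than the training records of the same user.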
The present invention further discloses a storage medium for storing computer-executable instructions which, when executed by a processor, perform the above-mentioned method for recommending point of interest activities based on a geographic grid.
In conclusion, the spatial and temporal characteristics of user check-in activities in a location-based social network are considered separately, which reduces the complexity of the problem. The geographic grid and the tensor decomposition method alleviate the sparsity of user check-in data. By calculating the spatial and temporal activity distributions of the user, the interest activities recommended to the user are determined according to the prediction probabilities of the temporal and spatial distributions, and the user's regions of interest are analyzed quantitatively. This improves the accuracy of interest activity recommendation in location-based social networks and makes the recommendation results meet the personalized requirements of the user.
It will be apparent to those skilled in the art that the various elements or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device, or alternatively, they may be implemented using program code that is executable by a computing device, such that they may be stored in a memory device and executed by a computing device, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
While the invention has been described in further detail with reference to specific preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. An interest activity recommendation method based on a geographic grid comprises the following steps:
constructing a user spatial activity preference model based on the geographic grid step S110:
dividing a city area into a plurality of geographic grids, mapping user check-in information, calculating the check-in frequency and category preference deviation ratio of a user in each grid, acquiring an interest grid set of the user, inferring the spatial activity preference distribution of the user at the user's position, calculating the recommendation success rate of a spatial activity preference model on the grids by using the spatial activity preference distribution, and constructing a spatial success rate matrix;
a step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method:
constructing a three-dimensional user-time-category tensor from user check-in records, the elements of which represent the number of check-ins with which a user selects an activity in a time period; obtaining, from the three-dimensional tensor and based on a non-negative tensor decomposition algorithm, a recovered tensor describing the time activity preference of the user; inferring the activity preference distribution of the user at the current time according to the tensor decomposition result; based on the time activity preference distribution, calculating the recommendation success rate of a time activity preference model on the grids, and constructing a time success rate matrix;
a temporal preference and spatial preference fusion step S130:
recommendation list generation substep S131: and deducing the user activity preference according to the success rate matrix, comparing element values in the space success rate matrix and the time success rate matrix, and selecting a model result with a higher value as a final recommendation result.
2. The interest activity recommendation method of claim 1, wherein:
the step S110 of constructing a user spatial activity preference model based on a geographic grid includes the following sub-steps:
user sign-in information mapping substep S111: dividing the urban area into a plurality of regular grids with the same size, mapping the sign-in information of the user to the grids, and obtaining the number attribute of the grids;
the user interest mesh acquisition sub-step S112:
calculating the check-in frequency and the category preference deviation ratio of the user in each grid, using the check-in frequency and the category preference deviation ratio to characterize the user's preference on the grid, and screening by a set frequency threshold and a set preference deviation ratio threshold to obtain the interest grid set of the user,
wherein the check-in frequency is expressed as the proportion of the user's check-ins in the grid to the user's total number of check-ins,
$$f_{u,g} = \frac{n_{u,g}}{\sum_{g' \in G_u} n_{u,g'}}$$
where $n_{u,g}$ represents the number of check-ins of user $u$ in grid $g$, and $G_u$ represents the set of grids visited by the user;
the preference deviation ratio is used to measure the category preference of the user in a grid; assuming that the grid contains POIs of a certain number of categories in total, of which the user has visited POIs of some categories, the calculation formula is as follows:
where the formula uses the total number of categories of POIs present in the grid; the ratio of the number of the user's check-ins in a category to the user's total number of check-ins in the grid; the maximum entropy of the user over check-in categories, which assumes that the user is equally likely to check in at every category; and the categories in which the user has checked in within the grid;
after the check-in frequency and the preference deviation ratio of each grid are calculated, the frequency threshold and the preference deviation ratio threshold are introduced to obtain the set of grids of interest to the user, and the area formed by this set of grids is the user's region of interest;
spatial activity preference calculation substep S113:
when the current location of the user is known, spatial proximity is used to evaluate the influence of each grid on the current location, using the following weight function:
where the weight function depends on the distance between the current location of the user and the center point of the grid,
then, the activity preferences of the user at the current location for all categories are calculated by a weighting method; given that the user has a number of interest grids, the spatial activity preference of the user at the current location is:
where the formula uses the check-in frequency of the user for each category within each interest grid, taken over all activity categories, so that the spatial activity preference of the user for a category at the current location is the sum, over the user's interest grids, of the user's preference for that category in each grid weighted by the geographic influence of that grid;
the spatial success rate matrix construction sub-step S114:
based on the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids, and constructing a spatial success rate matrix for each user, wherein each row of the matrix represents a time range and each column represents a grid, and the recommendation success rate of the spatial activity preference model on the grids is calculated from the check-in records using the spatial activity preference calculation method in substep S113.
3. The interest activity recommendation method of claim 2, wherein:
the substep S114 of constructing the spatial success rate matrix specifically includes:
building a spatial success rate matrix for each user and initializing every element of the matrix to 0; for any user, taking the user's check-in records out of the verification data set in sequence, each record consisting of the latitude and longitude, a time stamp, the activity category, and the grid in which the user is currently located; calculating, based on the spatial activity preference calculation method in substep S113, the spatial preference scores of all categories at the user's current position, and sorting the categories by score in descending order to obtain the category with the highest score; when the current grid is located in the user's interest grid set and the highest-scoring category is equal to the category actually checked in, increasing by 1 the matrix element in the row of the record's time range and the column of the record's grid; and proceeding in the same way to calculate the spatial success rate matrices of all users.
4. The interest activity recommendation method of claim 3, wherein:
the step S120 of constructing the user time activity preference model by using the non-negative tensor decomposition method specifically includes:
three-dimensional tensor construction sub-step S121:
constructing a three-dimensional user-time-activity tensor from the user check-in records, wherein each element of the tensor represents the number of check-ins with which the user selects an activity in a time period;
user time activity preference acquisition substep S122:
obtaining user-time-category preference values for the given tensor using a non-negative tensor decomposition method, wherein each element of the recovered tensor $\hat{\mathcal{X}}$ is calculated as follows:
$$\hat{\mathcal{X}}_{ijk} = \sum_{r=1}^{R} U_{ir}\, T_{jr}\, C_{kr}$$
where $U$, $T$ and $C$ are the factor matrices of users, time and categories, of sizes $I \times R$, $J \times R$ and $K \times R$ respectively, $R$ is the latent space dimension, which controls the number of features involved in the decomposition, and $U_{ir}$, $T_{jr}$ and $C_{kr}$ are elements of $U$, $T$ and $C$ respectively,
adding a non-negative constraint to a least square-based decomposition algorithm in a CP decomposition model to obtain a recovery tensor which is used for describing the time activity preference of a user;
activity preference inference substep S123:
inferring the activity preference of the user at the current time based on the tensor decomposition result, by normalizing the recovered tensor $\hat{\mathcal{X}}$ along the activity dimension:
$$p_{ijk} = \frac{\hat{\mathcal{X}}_{ijk}}{\sum_{k'=1}^{K} \hat{\mathcal{X}}_{ijk'}}$$
for a given user $i$ and time $j$, the preference values of all categories sum to 1 and every normalized element lies in $[0, 1]$; the normalized element $p_{ijk}$ is treated as the probability that user $i$ visits category $k$ at time $j$, and the time activity preference of user $i$ at time $j$ is expressed as:
$$P_{ij} = (p_{ij1}, p_{ij2}, \ldots, p_{ijK})$$
time success rate matrix construction substep S124:
based on the time activity preference distribution, calculating the recommendation success rate of the time activity preference model on the grids, and constructing a time success rate matrix, wherein each row of the matrix represents a time range and each column represents a grid, and the recommendation success rate of the time activity preference model on the grids is calculated from the check-in records using the time activity preference calculation method in substep S123.
5. The interest activity recommendation method of claim 4, wherein:
the time success rate matrix construction substep S124 specifically comprises:
initializing every element of the time success rate matrix to 0; for any user, taking the user's check-in records out of the verification data set in sequence, each record consisting of the latitude and longitude, a time stamp, the activity category, and the grid in which the user is currently located; calculating, using the time activity preference calculation method in substep S123, the time preference scores of all categories at the user's current time, and sorting the categories by score in descending order to obtain the category with the highest score; when the current grid is located in the user's interest grid set and the category predicted by the time preference model is equal to the category actually checked in, increasing by 1 the matrix element in the row of the record's time range and the column of the record's grid; and proceeding in the same way to calculate the time success rate matrices of all users.
6. The interest activity recommendation method of claim 5, wherein:
after the recommendation list generation substep S131, there is also provided
Precision verification substep S132:
evaluating, based on the recommendation list, the performance of the recommendation model on the test data set by accuracy, which is calculated as follows:
$$\mathrm{Accuracy@}k = \frac{1}{|D_{test}|} \sum_{r \in D_{test}} \mathbb{1}\left[\, c_r \in \mathrm{Top}_k(u_r, t_r, l_r) \,\right]$$
where $k$ is the length of the recommendation list, $r$ is a check-in record of a user in the test data set $D_{test}$ with $c_r$ the category actually checked in, $\mathrm{Top}_k(u_r, t_r, l_r)$ is the set of the $k$ highest-scoring (Top-k) activities for user $u_r$ at time $t_r$ and location $l_r$, and $|D_{test}|$ is the number of check-in records in the test data set.
7. The interest activity recommendation method of claim 5, wherein:
the interest activity recommendation method divides a user sign-in data set, sorts sign-in historical records of each user according to sign-in time, and divides the sign-in historical records into a training data set, a verification data set and a test data set according to a certain proportion, steps S111, S112, S113, S121, S122 and S123 adopt the training data set to construct a model, steps S114 and S124 adopt the verification data set to calculate success rate, and steps S131 and S132 adopt the test data set to verify the model.
8. The interest activity recommendation method of claim 5, wherein:
in the sub-step S121 of constructing the three-dimensional tensor, three dimensions of the three-dimensional tensor are a user dimension, a time dimension, and an activity dimension, respectively, where the user dimension represents each user as one dimension, the activity dimension is represented by a category of POI, and the time dimension is represented by a time period divided according to a certain time interval.
9. The interest activity recommendation method of claim 5, wherein:
in the user time activity preference acquisition sub-step S122, the tensor decomposition parameters include potential spatial dimensions, the size of which affects tensor decomposition time and recommendation accuracy.
10. A storage medium for storing computer-executable instructions, characterized in that:
the computer-executable instructions, when executed by a processor, perform the method for recommending point of interest activities based on a geographic grid of any of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210034325.6A CN114048391B (en) | 2022-01-13 | 2022-01-13 | Interest activity recommendation method based on geographic grid |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210034325.6A CN114048391B (en) | 2022-01-13 | 2022-01-13 | Interest activity recommendation method based on geographic grid |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114048391A (en) | 2022-02-15 |
CN114048391B (en) | 2022-04-19 |
Family
ID=80196433
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210034325.6A Active CN114048391B (en) | 2022-01-13 | 2022-01-13 | Interest activity recommendation method based on geographic grid |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114048391B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115495665B (en) * | 2022-11-16 | 2023-04-25 | 中南大学 | Surface coverage updating crowdsourcing task recommendation method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460101A (en) * | 2018-02-05 | 2018-08-28 | 山东师范大学 | Point of interest of the facing position social networks based on geographical location regularization recommends method |
CN109492166A (en) * | 2018-08-06 | 2019-03-19 | 北京理工大学 | Continuous point of interest recommended method based on time interval mode of registering |
CN112905905A (en) * | 2021-01-22 | 2021-06-04 | 杭州电子科技大学 | Interest point-area joint recommendation method in location social network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8719198B2 (en) * | 2010-05-04 | 2014-05-06 | Microsoft Corporation | Collaborative location and activity recommendations |
Non-Patent Citations (1)
Title |
---|
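Dual fine-grained point-of-interest recommendation based on location-based social networks; Liao Guoqiong et al.; Journal of Computer Research and Development (计算机研究与发展); 2017-11-30; Vol. 54, No. 11; 2600-2610 * |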
Also Published As
Publication number | Publication date |
---|---|
CN114048391A (en) | 2022-02-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |