CN114048391B - Interest activity recommendation method based on geographic grid - Google Patents


Info

Publication number: CN114048391B
Authority: CN (China)
Prior art keywords: user, activity, time, preference, grid
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202210034325.6A
Other languages: Chinese (zh)
Other versions: CN114048391A (en)
Inventors: 仇阿根, 赵习枝, 张志然, 陶坤旺, 张福浩, 陈颂, 陈才
Current Assignee: Chinese Academy of Surveying and Mapping (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Chinese Academy of Surveying and Mapping

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An interest activity recommendation method based on geographic grids comprises: dividing a study area into regular grids, establishing a personal interest grid area for each user from the check-in frequency and the preference deviation ratio, and inferring the spatial activity preference of the user; capturing the activity preferences of other similar users with a non-negative tensor decomposition method, and collaboratively establishing the time activity preference of the user; and fusing the spatial activity preference and the time activity preference with a context-aware fusion method to jointly determine the interest activities recommended to the user. Based on the geographic grid and the tensor decomposition method, the model alleviates the sparsity of user check-in data, analyzes the interest activities of the user quantitatively, improves the accuracy of interest activity recommendation, and makes the recommendation results meet the personalized requirements of the user.

Description

Interest activity recommendation method based on geographic grid
Technical Field
The application belongs to the technical field of location recommendation, and in particular relates to an interest activity recommendation method based on a geographic grid.
Background
With the rapid development of location-based social networks (LBSNs) and mobile devices, mining users' personal preferences, activity trajectories and lifestyle patterns from the accumulated mass of user data and check-in data has become a core part of location services, and location recommendation has become an important technical means for it. At present, owing to problems such as data sparsity and cold start, the accuracy of point-of-interest recommendation algorithms may be low. Moreover, in many cases people do not need a very precise location. Research on interest activity recommendation algorithms and their application is therefore pursued in order to better understand users' movement behavior and to predict the activities users are likely to take part in, so as to meet users' needs for personalized and intelligent services.
The check-in behavior of users presents specific spatio-temporal distribution patterns, and modeling users' spatio-temporal behavior from their historical check-in data in a location-based social network is challenging, mainly in the following respects. First, check-in data is usually high-dimensional and sparse, being represented as user-time-location-activity quadruples, and finding regularities directly in sparse high-dimensional data is complex and difficult. Second, unlike continuously sampled user activity data, check-ins on social media are initiated by the users themselves: they are not sampled at regular intervals and vary in complex ways over space and time. Third, the check-in activity of a user is related to the user's context, that is, check-in behavior is generally affected by the user's location and time. How to mine user activity preferences by combining the temporal and spatial contexts of the user has therefore become a technical problem urgently to be solved in the prior art.
Disclosure of Invention
The invention aims to provide an interest activity recommendation method based on a geographic grid, which discovers user interest activities and improves recommendation performance. The spatial activity preference and the time activity preference of the user are modeled separately, using techniques such as the geographic grid and tensor decomposition, which reduces the complexity of the problem.
An interest activity recommendation method based on a geographic grid comprises the following steps:
Step S110 of constructing a user spatial activity preference model based on the geographic grid:
dividing a city area into a plurality of geographic grids, mapping user check-in information to the grids, calculating the check-in frequency and the category preference deviation ratio of the user in each grid, and acquiring the interest grid set of the user; inferring the spatial activity preference distribution of the user at the user's location $l$; and, using the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids and constructing a spatial success rate matrix;
Step S120 of constructing a user time activity preference model by a non-negative tensor decomposition method:
constructing a user-time-category three-dimensional tensor from the user check-in records, in which each element represents the check-in frequency of a user selecting activity $c$ in time period $t$; applying a non-negative tensor decomposition algorithm to the three-dimensional tensor to obtain a recovered tensor that describes the time activity preference of the user; inferring the activity preference distribution of the user at the current time from the tensor decomposition result; and, based on the time activity preference distribution, calculating the recommendation success rate of the time activity preference model on the grids and constructing a time success rate matrix;
Step S130 of fusing the time preference and the spatial preference:
recommendation list generation substep S131: inferring the user activity preference from the success rate matrices by comparing the corresponding element values of the spatial success rate matrix and the time success rate matrix, and selecting the result of the model with the higher value as the final recommendation result.
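The fusion rule of substep S131 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary-of-scores input format and the tie-breaking rule (ties go to the spatial model) are assumptions, since the source only states that the model with the higher success value is selected.

```python
import numpy as np

def choose_model(spatial_success, temporal_success, time_slot, grid_id):
    """Pick the model with the larger success value for this (time slot, grid) cell."""
    s = spatial_success[time_slot, grid_id]
    t = temporal_success[time_slot, grid_id]
    # Assumption: ties go to the spatial model; the source does not specify this.
    return "spatial" if s >= t else "temporal"

def recommend(spatial_scores, temporal_scores,
              spatial_success, temporal_success, time_slot, grid_id, k=5):
    """Return the top-k categories ranked by the scores of the selected model."""
    chosen = choose_model(spatial_success, temporal_success, time_slot, grid_id)
    scores = spatial_scores if chosen == "spatial" else temporal_scores
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]
```

Here the rows of both matrices index time slots and the columns index grids, matching the matrix layout described in the text.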
Optionally, step S110 of constructing the user spatial activity preference model based on the geographic grid comprises the following substeps:
User check-in information mapping substep S111: dividing the urban area into a plurality of regular grids of the same size, mapping the check-in information of the user to the grids, and obtaining the grid number attribute;
User interest grid acquisition substep S112:
calculating the check-in frequency $F_{u,g}$ and the category preference deviation ratio $D_{u,g}$ of the user in each grid, using $F_{u,g}$ and $D_{u,g}$ to characterize the user's preference for the grid, and screening with a frequency threshold $\theta_F$ and a preference deviation ratio threshold $\theta_D$ to obtain the interest grid set of the user,
wherein the check-in frequency $F_{u,g}$ is the proportion of the check-ins of user $u$ in grid $g$ to the user's total number of check-ins:
$$F_{u,g} = \frac{N_{u,g}}{\sum_{g' \in G_u} N_{u,g'}}$$
where $N_{u,g}$ denotes the number of check-ins of user $u$ in grid $g$, and $G_u$ denotes the set of grids visited by the user;
the preference deviation ratio $D_{u,g}$ measures the category preference of user $u$ in grid $g$. Assuming that grid $g$ contains POIs of $n$ categories in total, of which the user has visited POIs of the categories $c \in C_{u,g}$, it is calculated as:
$$D_{u,g} = \frac{H_{\max} - H_{u,g}}{H_{\max}}, \qquad H_{u,g} = -\sum_{c \in C_{u,g}} p_{u,g}(c) \log p_{u,g}(c), \qquad H_{\max} = \log n$$
where $n$ denotes the total number of categories of POIs present in grid $g$; $p_{u,g}(c)$ denotes the ratio of the user's check-ins at category $c$ to the user's total number of check-ins in $g$; $H_{\max}$ denotes the maximum entropy of the user over the check-in categories, which assumes that the user is equally likely to check in at all $n$ categories; and $C_{u,g}$ denotes the categories at which user $u$ has checked in within grid $g$;
after the check-in frequency and the preference deviation ratio of each grid are calculated, the frequency threshold $\theta_F$ and the preference deviation ratio threshold $\theta_D$ are introduced to obtain the set $G^{I}_u$ of grids of interest to the user, and the area formed by the grid set $G^{I}_u$ is the user's region of interest;
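The screening in substep S112 can be sketched as below. Several points are assumptions rather than statements of the source: the deviation ratio is taken as the relative gap between the maximum entropy (log of the number of categories present in the grid) and the entropy of the user's category distribution, a grid with fewer than two categories is given a ratio of 0, and a grid is kept when both values are greater than or equal to their thresholds. All function and parameter names are hypothetical.

```python
import math

def checkin_frequency(grid_counts):
    """Share of the user's check-ins that falls in each visited grid."""
    total = sum(grid_counts.values())
    return {g: n / total for g, n in grid_counts.items()}

def deviation_ratio(category_counts, n_categories):
    """Relative gap between maximum entropy and the user's category entropy."""
    if n_categories < 2:
        return 0.0  # assumption: undefined case (maximum entropy is 0) mapped to 0
    total = sum(category_counts.values())
    entropy = -sum((k / total) * math.log(k / total)
                   for k in category_counts.values() if k > 0)
    max_entropy = math.log(n_categories)
    return (max_entropy - entropy) / max_entropy

def interest_grids(grid_counts, grid_category_counts, grid_n_categories,
                   freq_threshold, dev_threshold):
    """Grids passing both thresholds (assumed direction: value >= threshold)."""
    freq = checkin_frequency(grid_counts)
    return {
        g for g, f in freq.items()
        if f >= freq_threshold
        and deviation_ratio(grid_category_counts[g], grid_n_categories[g]) >= dev_threshold
    }
```

A ratio near 1 means the user's check-ins in the grid concentrate on few categories; a ratio near 0 means they are spread evenly over the categories present.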
Spatial activity preference calculation substep S113:
given the known current location $l$ of the user, spatial proximity is used to evaluate the influence of grid $g_i$ on $l$ through the following weight function:
[Equation image not reproduced: weight function $w(l, g_i)$ of the distance $d(l, g_i)$]
where $d(l, g_i)$ denotes the distance between the user's current location $l$ and the center point of grid $g_i$;
the activity preference of the user at $l$ for all categories is then calculated by weighting. Assuming that user $u$ has $m$ interest grids $g_1, \dots, g_m$, the spatial activity preference of user $u$ at location $l$ is:
$$P^{S}_{u,l}(c) = \sum_{i=1}^{m} w(l, g_i)\, f_{u,g_i}(c)$$
where $f_{u,g_i}(c)$ denotes the check-in frequency of user $u$ for category $c$ within grid $g_i$, and $c$ ranges over all activity categories; that is, the spatial activity preference of the user for $c$ at $l$ is the sum of the geographic influences at $l$ of the user's preferences in the interest grids $G^{I}_u$;
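The weighted sum of substep S113 can be sketched as follows. The weight function of the source is not available here, so `inverse_distance_weight` and its `scale_m` parameter are purely hypothetical stand-ins for any function that decreases with distance. Planar coordinates in meters and Euclidean distance are also assumptions.

```python
import math

def inverse_distance_weight(distance_m, scale_m=1000.0):
    """Hypothetical proximity weight: 1 at distance 0, decreasing with distance."""
    return 1.0 / (1.0 + distance_m / scale_m)

def spatial_preference(location_xy, interest_grids, weight=inverse_distance_weight):
    """Sum, over the interest grids, of weight * per-category check-in frequency.

    interest_grids: iterable of (grid_center_xy, {category: frequency}) pairs,
    with coordinates assumed to be planar meters.
    """
    scores = {}
    for center_xy, category_freq in interest_grids:
        w = weight(math.dist(location_xy, center_xy))
        for category, freq in category_freq.items():
            scores[category] = scores.get(category, 0.0) + w * freq
    return scores
```

Sorting the returned dictionary by value in descending order gives the category ranking that the later success rate calculation relies on.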
Spatial success rate matrix construction substep S114:
based on the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids, and constructing a spatial success rate matrix $M^{S}_u$ for each user, in which each row represents a time interval $t$ and each column represents a grid $g$; the recommendation success rate of the spatial activity preference model on the grids is calculated from the check-in records with the spatial activity preference calculation method of substep S113.
Optionally, the spatial success rate matrix construction substep S114 specifically comprises:
constructing a spatial success rate matrix $M^{S}_u$ for each user and initializing all matrix elements to 0; for any user $u$, taking the check-in records of $u$ from the validation data set in sequence, each check-in record being expressed as $\langle l, t, c, g \rangle$, where $l$ denotes the latitude and longitude, $t$ the timestamp, $c$ the activity category and $g$ the grid of the user's current location; calculating, with the spatial activity preference calculation method of substep S113, the spatial preference scores of $u$ for all categories at the current location $l$, and sorting the categories by score in descending order to obtain the highest-scoring category $\hat{c}$; when $g$ is in the interest grid set $G^{I}_u$ and $\hat{c}$ equals $c$, increasing the corresponding element $M^{S}_u[t, g]$ by 1, where $t$ indexes the row and $g$ the column of $M^{S}_u$; and proceeding in the same way to calculate the spatial success rate matrices of all users.
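The counting procedure above can be sketched for one user as follows. The record layout `(location, time_slot, category, grid_id)`, the `predict_top` callable and the assumption that timestamps have already been mapped to integer time slots are illustrative choices, not details given in the source.

```python
import numpy as np

def success_matrix(records, predict_top, interest_grid_ids, n_time_slots, n_grids):
    """Count correct top-1 predictions per (time slot, grid) for one user.

    records: validation check-ins as (location, time_slot, category, grid_id).
    predict_top: callable (location, time_slot) -> highest-scoring category.
    """
    matrix = np.zeros((n_time_slots, n_grids), dtype=int)
    for location, time_slot, category, grid_id in records:
        # Only check-ins inside the user's interest grids are counted.
        if grid_id in interest_grid_ids and predict_top(location, time_slot) == category:
            matrix[time_slot, grid_id] += 1
    return matrix
```

The same function serves for the time success rate matrix of substep S124 when `predict_top` ranks categories with the time preference model instead of the spatial one.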
Optionally, step S120 of constructing the user time activity preference model by the non-negative tensor decomposition method specifically comprises:
Three-dimensional tensor construction substep S121:
constructing a user-time-activity three-dimensional tensor $\mathcal{X}$ from the user check-in records, in which each element $x_{utc}$ represents the number of check-ins of user $u$ selecting activity $c$ in time period $t$;
User time activity preference acquisition substep S122:
obtaining user-time-category preference values with the non-negative tensor decomposition method: for the given tensor $\mathcal{X}$, the value of each element of the recovered tensor $\hat{\mathcal{X}}$ is calculated as:
$$\hat{x}_{utc} = \sum_{r=1}^{R} U_{ur}\, T_{tr}\, C_{cr}$$
where $U$, $T$ and $C$ denote the factor matrices of users, time and categories, whose sizes are respectively $I \times R$, $J \times R$ and $K \times R$ for $I$ users, $J$ time periods and $K$ categories; $R$ is the latent space dimension, which controls the number of features involved in the decomposition; and $U_{ur}$, $T_{tr}$ and $C_{cr}$ are elements of $U$, $T$ and $C$ respectively;
a non-negativity constraint is added to the least-squares-based decomposition algorithm of the CP decomposition model to obtain the recovered tensor, which is used to describe the time activity preference of the user;
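A non-negative CP decomposition of a user-time-category tensor can be sketched in NumPy as below. Note that this sketch swaps in multiplicative updates (in the style of Lee and Seung) instead of the constrained least-squares algorithm named in the text, because they keep the factors non-negative with very little code. The function names, the random initialization and the iteration count are assumptions.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R): shape (I*J, R)."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])

def ncp(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Non-negative CP factors of a 3-way tensor via multiplicative updates."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((n, rank)) + 0.1 for n in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            # Mode-n unfolding; remaining axes keep their original order,
            # which matches the row order of khatri_rao(first, second).
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            gram = (first.T @ first) * (second.T @ second)
            numerator = unfolded @ khatri_rao(first, second)
            factors[mode] *= numerator / (factors[mode] @ gram + eps)
    return factors

def reconstruct(factors):
    """Recovered tensor: sum over r of U[u, r] * T[t, r] * C[c, r]."""
    u, t, c = factors
    return np.einsum("ur,tr,cr->utc", u, t, c)
```

The `rank` argument plays the role of the latent space dimension: larger values fit the check-in tensor more closely at a higher computational cost.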
Activity preference inference substep S123:
inferring the activity preference of the user at the current time from the tensor decomposition result by normalizing the recovered tensor $\hat{\mathcal{X}}$ along the activity dimension:
$$\tilde{x}_{utc} = \frac{\hat{x}_{utc}}{\sum_{c'} \hat{x}_{utc'}}$$
so that, for a given user $u$ and time $t$, the preference values of all categories sum to 1. All elements of the recovered tensor are non-negative, so the normalized element value $\tilde{x}_{utc}$ is treated as the probability that user $u$ visits category $c$ at time $t$, and the time activity preference of user $u$ at time $t$ is expressed as:
$$P^{T}_{u,t} = \left( \tilde{x}_{ut1}, \tilde{x}_{ut2}, \dots, \tilde{x}_{utK} \right)$$
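The normalization along the activity dimension can be sketched as follows, assuming the recovered tensor is a NumPy array with axes ordered (user, time, category). The handling of an all-zero (user, time) slice, which is left as zeros, is an assumption; the source does not discuss that case.

```python
import numpy as np

def temporal_preference(recovered, eps=1e-12):
    """Normalize a non-negative (user, time, category) tensor over categories.

    Each (user, time) slice then sums to 1 and can be read as a distribution
    over activity categories. Slices that are entirely zero stay zero.
    """
    totals = recovered.sum(axis=2, keepdims=True)
    return recovered / np.maximum(totals, eps)
```

For a user index `u` and time index `t`, `temporal_preference(recovered)[u, t]` is then the vector of category probabilities used for ranking.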
Time success rate matrix construction substep S124:
based on the time activity preference distribution, calculating the recommendation success rate of the time activity preference model on the grids, and constructing a time success rate matrix $M^{T}_u$, in which each row represents a time interval $t$ and each column represents a grid $g$; the recommendation success rate of the time activity preference model on the grids is calculated from the check-in records with the time activity preference calculation method of substep S123.
Optionally, the time success rate matrix construction substep S124 specifically comprises:
initializing the matrix $M^{T}_u$ with all elements set to 0; for any user $u$, taking the check-in records of $u$ from the validation data set in sequence, each check-in record being expressed as $\langle l, t, c, g \rangle$, where $l$ denotes the latitude and longitude, $t$ the timestamp, $c$ the activity category and $g$ the grid of the user's current location; calculating, with the time activity preference calculation method of substep S123, the time preference scores of $u$ for all categories at the current time $t$, and sorting the categories by score in descending order to obtain the highest-scoring category $\hat{c}$; when $g$ is in the interest grid set $G^{I}_u$ and the category $\hat{c}$ predicted by the time preference model equals $c$, increasing the corresponding element $M^{T}_u[t, g]$ by 1, where $t$ indexes the row and $g$ the column of $M^{T}_u$; and proceeding in the same way to calculate the time success rate matrices of all users.
Optionally, the recommendation list generation substep S131 is followed by
Precision verification substep S132:
evaluating the performance of the recommendation model on the test data set, based on the recommendation list, with the accuracy $\mathrm{Acc}@k$, which is calculated as:
$$\mathrm{Acc}@k = \frac{1}{|D_{test}|} \sum_{\langle u, l, t, c \rangle \in D_{test}} \left| \{c\} \cap \mathrm{Top}_k(u, t, l) \right|$$
where $k$ denotes the length of the recommendation list, $\langle u, l, t, c \rangle$ denotes a check-in record of a user in the test data set, $\mathrm{Top}_k(u, t, l)$ denotes the $k$ highest-scoring (Top-k) activities of user $u$ at time $t$ and location $l$, and $|D_{test}|$ denotes the number of check-in records in the test data set.
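The evaluation can be sketched as a hit ratio over the test check-ins: a record counts as a hit when its true category appears among the first k recommended activities. The record layout `(user, location, time, category)` and the `recommend` callable are hypothetical names used for illustration only.

```python
def accuracy_at_k(test_records, recommend, k):
    """Fraction of test check-ins whose category is in the top-k recommendations.

    test_records: iterable of (user, location, time, category) tuples.
    recommend: callable (user, time, location) -> ranked list of categories.
    """
    records = list(test_records)
    if not records:
        return 0.0  # assumption: an empty test set yields 0 rather than an error
    hits = sum(1 for user, location, time, category in records
               if category in recommend(user, time, location)[:k])
    return hits / len(records)
```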
Optionally, the method divides a user check-in data set, sorts the check-in history of each user according to check-in time, and divides the check-in history of each user into a training data set, a verification data set, and a test data set according to a certain proportion, wherein steps S111, S112, S113, S121, S122, and S123 use the training data set to construct a model, steps S114 and S124 use the verification data set to calculate a success rate, and steps S131 and S132 use the test data set to perform model verification.
Optionally, in the substep S121 of constructing the three-dimensional tensor, three dimensions of the three-dimensional tensor are a user dimension, a time dimension, and an activity dimension, respectively, where the user dimension represents each user as an independent dimension, the activity dimension is represented by a POI category, and the time dimension is represented by a time period divided according to a certain time interval.
Optionally, in the user time activity preference acquisition substep S122, the tensor decomposition parameters include the latent space dimension, whose size affects both the tensor decomposition time and the recommendation precision.
The invention further discloses a storage medium for storing computer executable instructions, which is characterized in that:
the computer-executable instructions, when executed by a processor, perform the above-described geographic grid-based point of interest activity recommendation method.
The invention considers the spatial and temporal characteristics of user check-in activities in the location-based social network separately, which reduces the complexity of the problem. The geographic grid and the tensor decomposition method alleviate the sparsity of user check-in data; by calculating the spatial and temporal activity distributions of the user, the interest activities recommended to the user are determined from the prediction probabilities of the temporal and spatial distributions; and the interest areas of the user are analyzed quantitatively. This improves the accuracy of interest activity recommendation in the location-based social network and makes the recommendation results meet the personalized requirements of the user.
Drawings
FIG. 1 is a flowchart of a method for recommending interest activities based on a geographic grid according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of the method for recommending interest activities based on a geographic grid according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the recommendation precision for different latent dimensions in the NCP (Nonnegative CANDECOMP/PARAFAC) model, in accordance with a specific embodiment of the present invention;
FIG. 4 is an example of a user time activity preference calculation based on the NCP decomposition model in accordance with a specific embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
The invention is characterized in that the check-in activity of users is modeled from their check-in history in a location-based social network, and a personalized urban interest activity recommendation method is provided. Based on the geographic grid, the method combines the user's geographic preference and time preference for urban areas, analyzes the spatio-temporal behavior of the user's check-ins in the city, and recommends more suitable interest activities to the target user.
FIG. 1 shows a flowchart of the interest activity recommendation method based on a geographic grid, and FIG. 2 shows a detailed flowchart of the recommendation method.
An interest activity recommendation method based on a geographic grid comprises the following steps:
Step S110 of constructing a user spatial activity preference model based on the geographic grid:
dividing a city area into a plurality of geographic grids, mapping user check-in information to the grids, calculating the check-in frequency and the category preference deviation ratio of the user in each grid, and acquiring the interest grid set of the user; inferring the spatial activity preference of the user at the user's location $l$; and, using the spatial activity preference distribution of the user, calculating the recommendation success rate of the spatial activity preference model on the grids and constructing a spatial success rate matrix.
Specifically, the step S110 may include the following sub-steps:
User check-in information mapping substep S111: the city area is divided into a plurality of regular grids of the same size, the check-in information of the user is mapped to the grids, and the grid number attribute is obtained.
Specifically, the maximum and minimum longitude and latitude values of the study area are obtained, the study area is divided at a given interval in meters into a plurality of regular grids of the same size, and all grids are numbered starting from 0. For any check-in record of a user, the check-in position is matched to the grids to determine the grid containing the check-in point and its grid number.
The experimental data is the Foursquare data set, whose spatial extent ranges from (40.54085247, -74.28476645) to (40.99833172, -73.6738252). A check-in record is expressed as a tuple of user number, POI number, latitude, longitude, check-in time and POI category. After the grid number attribute is added, the check-in record is expressed as the same tuple extended by the grid number.
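The mapping from a check-in position to a grid number can be sketched as follows. The row-major numbering, the equirectangular conversion from meters to degrees and the constant of about 111,320 m per degree of latitude are assumptions made for illustration; the source only states that grids are regular, equally sized and numbered from 0.

```python
import math

M_PER_DEG_LAT = 111_320.0  # approximate meters per degree of latitude (assumption)

def make_grid_index(min_lat, min_lon, max_lat, max_lon, cell_m):
    """Return (grid_id function, n_rows, n_cols) for a regular grid of cell_m meters."""
    mid_lat = math.radians((min_lat + max_lat) / 2.0)
    dlat = cell_m / M_PER_DEG_LAT
    dlon = cell_m / (M_PER_DEG_LAT * math.cos(mid_lat))
    n_rows = max(1, math.ceil((max_lat - min_lat) / dlat))
    n_cols = max(1, math.ceil((max_lon - min_lon) / dlon))

    def grid_id(lat, lon):
        """Row-major grid number starting from 0, or None outside the study area."""
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            return None
        row = min(int((lat - min_lat) / dlat), n_rows - 1)
        col = min(int((lon - min_lon) / dlon), n_cols - 1)
        return row * n_cols + col

    return grid_id, n_rows, n_cols
```

For example, `make_grid_index(40.54085247, -74.28476645, 40.99833172, -73.6738252, 500.0)` builds an index over the extent quoted above, where the 500 m cell size is a hypothetical value, not one taken from the source.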
Specifically, the Foursquare data contains the check-in records of users in New York from April 2012 to February 2013. To alleviate the influence of data sparsity, users with fewer than 10 check-ins and points of interest visited fewer than 10 times are removed. After data preprocessing, the Foursquare data set contains 1083 users, 38333 points of interest and 227428 check-ins in total. The points of interest are divided into 9 major categories and 215 subcategories, and the invention uses the subcategories as the activity categories.
The invention can divide the user check-in data set as follows: the check-in history of each user is sorted by check-in time and divided into a training data set, a validation data set and a test data set according to a certain proportion, for example 8:1:1.
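A chronological split of one user's check-in history can be sketched as below. The function name, the `time_key` parameter and the integer rounding (any remainder goes to the test set) are assumptions; the source only gives the ordering by check-in time and the example proportion 8:1:1.

```python
def chronological_split(records, ratios=(8, 1, 1), time_key=lambda r: r[0]):
    """Sort one user's check-ins by time and split into train, validation, test."""
    ordered = sorted(records, key=time_key)
    total = sum(ratios)
    n_train = len(ordered) * ratios[0] // total
    n_val = len(ordered) * ratios[1] // total
    train = ordered[:n_train]
    validation = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]  # remainder after integer rounding
    return train, validation, test
```

Applying the function per user, rather than to the pooled data set, keeps each user's earliest check-ins in the training portion and the latest ones in the test portion.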
User interest grid acquisition substep S112:
the check-in frequency $F_{u,g}$ and the category preference deviation ratio $D_{u,g}$ of the user in each grid are calculated and used to characterize the user's preference for the grid, and the interest grid set of the user is obtained by screening with a frequency threshold $\theta_F$ and a preference deviation ratio threshold $\theta_D$.
First, the check-in frequency of the user in a grid is a direct index of the popularity of the grid: the more often an area is checked in, the more attractive the points of interest in the area are to the user. At the same time, a user usually visits only a few categories, not all categories, in a frequently checked-in area, so the user's check-ins in a grid are unevenly distributed over the categories.
Thus, the present invention primarily passes through the check-in frequency
Figure 86335DEST_PATH_IMAGE101
And class preference bias ratio
Figure 288646DEST_PATH_IMAGE103
To evaluate the user's preference in the grid,
frequency of check-in
Figure 196559DEST_PATH_IMAGE101
Is represented as a user
Figure 784404DEST_PATH_IMAGE012
In that
Figure 486781DEST_PATH_IMAGE013
The number of check-ins in (a) is a proportion of the total number of check-ins,
Figure 910809DEST_PATH_IMAGE109
in the formula (I), the compound is shown in the specification,
Figure 356834DEST_PATH_IMAGE111
representing a user
Figure 566229DEST_PATH_IMAGE012
In that
Figure 173928DEST_PATH_IMAGE018
The number of check-ins of (c),
Figure 350832DEST_PATH_IMAGE019
representing a set of grids visited by a user;
The preference deviation ratio
Figure 69389DEST_PATH_IMAGE103
measures the category preference of user
Figure 900335DEST_PATH_IMAGE012
in grid
Figure 678935DEST_PATH_IMAGE013
. Suppose grid
Figure 343135DEST_PATH_IMAGE013
contains POIs (points of interest) of
Figure 865383DEST_PATH_IMAGE112
categories in total, and the user has visited POIs of
Figure 19677DEST_PATH_IMAGE022
categories
Figure 93812DEST_PATH_IMAGE023
. The preference deviation ratio
Figure 432259DEST_PATH_IMAGE103
measures, for user
Figure 758198DEST_PATH_IMAGE012
in grid
Figure 45960DEST_PATH_IMAGE018
, the relative difference between the entropy of the check-in category distribution
Figure DEST_PATH_IMAGE113
and the maximum entropy of the category distribution. It is calculated as follows:
Figure DEST_PATH_IMAGE115
where
Figure 244991DEST_PATH_IMAGE117
denotes the total number of POI categories present in grid
Figure 883783DEST_PATH_IMAGE013
,
Figure 339646DEST_PATH_IMAGE119
denotes the ratio of the user's number of check-ins in category
Figure 357280DEST_PATH_IMAGE028
to the user's total number of check-ins in grid
Figure 38797DEST_PATH_IMAGE013
,
Figure 774672DEST_PATH_IMAGE029
denotes the maximum entropy of the user's check-in categories, which assumes that the user is equally likely to check in to every category
Figure 583359DEST_PATH_IMAGE030
, and
Figure 189921DEST_PATH_IMAGE121
denotes the categories that user
Figure 307919DEST_PATH_IMAGE012
has checked in to in grid
Figure 531090DEST_PATH_IMAGE013
.
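As a rough illustration of the two indices defined above, the following Python sketch computes a user's check-in frequency per grid and the preference deviation ratio of one grid. The exact formulas appear only in the original figures, so the form used here, (H_max - H) / H_max with H_max equal to the logarithm of the number of POI categories in the grid, is an assumption inferred from the surrounding definitions. The function and variable names are hypothetical.

```python
import math
from collections import Counter


def checkin_frequency(grid_counts):
    """Share of one user's check-ins that fall in each grid.

    grid_counts maps grid id -> number of check-ins by that user.
    """
    total = sum(grid_counts.values())
    return {g: n / total for g, n in grid_counts.items()}


def preference_deviation_ratio(checked_categories, n_categories_in_grid):
    """Assumed form: (H_max - H) / H_max.

    checked_categories lists the category of each check-in the user made
    in one grid; n_categories_in_grid is the number of POI categories
    present in that grid.
    """
    if n_categories_in_grid < 2:
        # Assumption: with a single category the maximum entropy is 0,
        # so the preference is treated as fully concentrated.
        return 1.0
    counts = Counter(checked_categories)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    max_entropy = math.log(n_categories_in_grid)
    return (max_entropy - entropy) / max_entropy
```

Under this assumed form, a value near 1 means the user's check-ins in the grid concentrate on very few categories, and a value near 0 means they are spread evenly over all categories in the grid.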
After the check-in frequency and the preference deviation ratio of each grid have been calculated, a frequency threshold
Figure 110844DEST_PATH_IMAGE105
and a preference deviation ratio threshold
Figure 103071DEST_PATH_IMAGE107
are introduced to obtain the user's interest grid set
Figure 126391DEST_PATH_IMAGE122
. The region formed by the grid set
Figure 571279DEST_PATH_IMAGE122
is the user's region of interest.
The region of interest of a user is obtained as follows. First, the user's check-in counts in all regions are calculated, giving the set of grids in which the user has checked in,
Figure DEST_PATH_IMAGE123
. Then each grid visited by the user,
Figure 925031DEST_PATH_IMAGE124
, is scanned in turn and its check-in frequency
Figure DEST_PATH_IMAGE125
is calculated. If the grid is frequently visited by the user (i.e., its check-in frequency is greater than or equal to the threshold
Figure 630819DEST_PATH_IMAGE126
), the preference deviation ratio
Figure 811658DEST_PATH_IMAGE127
is calculated. If
Figure 743841DEST_PATH_IMAGE127
is greater than or equal to
Figure 416131DEST_PATH_IMAGE128
, then
Figure 382950DEST_PATH_IMAGE129
is added to the interest grid set
Figure 233226DEST_PATH_IMAGE130
as an interest grid of the user. After all of the user's check-in grids have been traversed in this way, the user's interest grid set
Figure 918285DEST_PATH_IMAGE130
is obtained.
Figure 394266DEST_PATH_IMAGE131
determines how active the user must be in a grid, and
Figure 215591DEST_PATH_IMAGE132
represents the degree of deviation of the user's activity preference in the grid. The larger the values of
Figure 882DEST_PATH_IMAGE131
and
Figure 704396DEST_PATH_IMAGE132
, the smaller the set
Figure 593855DEST_PATH_IMAGE130
.
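The screening procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name, the argument names and the idea of passing the deviation ratio as a callable are all hypothetical choices made for the example.

```python
def interest_grids(grid_counts, deviation_of, freq_threshold, dev_threshold):
    """Return the grids that pass both thresholds.

    grid_counts  -- grid id -> number of check-ins by one user
    deviation_of -- callable: grid id -> preference deviation ratio
    """
    total = sum(grid_counts.values())
    selected = []
    for grid, count in grid_counts.items():
        if count / total < freq_threshold:
            continue  # not a frequently visited grid
        # The deviation ratio is only evaluated for frequent grids.
        if deviation_of(grid) >= dev_threshold:
            selected.append(grid)
    return selected
```

Raising either threshold can only remove grids from the result, which matches the observation that larger threshold values give a smaller interest grid set.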
Spatial activity preference calculation substep S113:
Based on the interest grid set
Figure 659900DEST_PATH_IMAGE130
, the spatial activity preference of the user at the current location is inferred. Given the user's current location
Figure 976611DEST_PATH_IMAGE001
, the influence of each grid in the set is calculated first, i.e., spatial proximity is used to evaluate the influence of grid
Figure 246050DEST_PATH_IMAGE035
on
Figure 939199DEST_PATH_IMAGE001
; the activity preferences of all grids are then combined using a weighting method.
Specifically, given the user's current location
Figure 594172DEST_PATH_IMAGE001
, spatial proximity is used to evaluate the influence of grid
Figure 347364DEST_PATH_IMAGE035
on
Figure 605563DEST_PATH_IMAGE001
. Following conclusions from existing research, the following weight function is adopted:
Figure 836825DEST_PATH_IMAGE036
where
Figure 611883DEST_PATH_IMAGE037
denotes the distance between the user's current location
Figure 535976DEST_PATH_IMAGE001
and the center point of grid
Figure 780007DEST_PATH_IMAGE035
. The farther apart they are, the smaller
Figure 611697DEST_PATH_IMAGE133
becomes, indicating that the grid has less spatial appeal to the user.
Then, the user's activity preference for all categories at
Figure 116627DEST_PATH_IMAGE001
is calculated using a weighting method. Suppose user
Figure 336256DEST_PATH_IMAGE012
has
Figure 192217DEST_PATH_IMAGE038
interest grids
Figure 139182DEST_PATH_IMAGE134
; then the spatial activity preference of user
Figure 764198DEST_PATH_IMAGE012
at location
Figure 889149DEST_PATH_IMAGE001
is:
Figure 966826DEST_PATH_IMAGE135
where
Figure 484527DEST_PATH_IMAGE136
denotes the check-in frequency of user
Figure 964050DEST_PATH_IMAGE012
in grid
Figure 197585DEST_PATH_IMAGE013
for category
Figure 887192DEST_PATH_IMAGE028
, and
Figure 802058DEST_PATH_IMAGE003
denotes all activity categories. The spatial activity preference of the user at
Figure 778498DEST_PATH_IMAGE001
for
Figure 386197DEST_PATH_IMAGE028
is the sum of the geographic influences that the user's preferences in
Figure 563100DEST_PATH_IMAGE137
exert on
Figure 281658DEST_PATH_IMAGE001
.
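The weighted combination described above can be sketched in Python as follows. The weight function of the patent is shown only in a figure and cannot be recovered from this text, so the inverse-distance decay used here is a hypothetical stand-in. Planar coordinates and all names are likewise assumptions made for the example.

```python
import math


def distance_weight(distance_km):
    # Hypothetical decay; the patent's actual weight function is not
    # recoverable from this text. Any function that decreases with
    # distance could be substituted.
    return 1.0 / (1.0 + distance_km)


def spatial_activity_preference(location, interest_grids):
    """Score every category at `location`.

    interest_grids is a list of (center_xy, category_freq) pairs, where
    category_freq maps category -> the user's check-in frequency for that
    category inside the grid. Coordinates are planar (km) for simplicity.
    """
    scores = {}
    for center, category_freq in interest_grids:
        w = distance_weight(math.dist(location, center))
        for category, freq in category_freq.items():
            scores[category] = scores.get(category, 0.0) + w * freq
    return scores
```

Sorting the returned scores in descending order gives the category ranking that the spatial model would recommend at that location.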
The spatial success rate matrix construction sub-step S114:
Based on the spatial activity preference distribution, the recommendation success rate of the spatial activity preference model on each grid is calculated using the validation data set, and a spatial success rate matrix
Figure 611139DEST_PATH_IMAGE138
is constructed for each user. Each row of the matrix represents a time range
Figure 124160DEST_PATH_IMAGE045
and each column represents a grid
Figure 788359DEST_PATH_IMAGE046
. The recommendation success rate is calculated from the check-in records in the validation set, using the spatial activity preference calculation method of sub-step S113.
Specifically, let
Figure 310608DEST_PATH_IMAGE139
denote the spatial success rate matrix of user
Figure 993131DEST_PATH_IMAGE012
. Each row of the matrix represents a time stamp
Figure 942632DEST_PATH_IMAGE002
and each column represents a grid
Figure 500652DEST_PATH_IMAGE140
. Considering the daily periodicity of user check-ins and the difference between working-day and non-working-day check-ins, time is divided at 1-hour intervals into 24 time periods per day and 168 time intervals per week, and each time interval corresponds to a time stamp
Figure 951225DEST_PATH_IMAGE002
. Given a time
Figure 114353DEST_PATH_IMAGE141
, the time stamp
Figure 110122DEST_PATH_IMAGE002
is calculated by the following formula:
Figure 624280DEST_PATH_IMAGE142
where
Figure DEST_PATH_IMAGE143
is the day of the week, with values
Figure 816227DEST_PATH_IMAGE144
denoting Monday through Sunday, and
Figure DEST_PATH_IMAGE145
is the hour. For example, for the time 2012-04-23 07:10:18,
Figure 667816DEST_PATH_IMAGE143
is 1 and
Figure 349333DEST_PATH_IMAGE145
is 7.
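The mapping from a clock time to one of the 168 weekly time stamps can be sketched in Python as follows. The formula itself is only available as a figure, so the encoding 24 * (w - 1) + h, which yields indices 0 to 167, is an assumption consistent with the definitions of the weekday and hour values given above. The function name is hypothetical.

```python
from datetime import datetime


def time_slot(moment):
    """Map a datetime to one of 168 hourly slots of the week.

    Assumed encoding: slot = 24 * (w - 1) + h, with w = 1..7 for
    Monday..Sunday and h = 0..23, giving slots 0..167.
    """
    w = moment.isoweekday()  # Monday -> 1, Sunday -> 7
    h = moment.hour
    return 24 * (w - 1) + h
```

For the example time 2012-04-23 07:10:18, a Monday, this encoding gives w = 1 and h = 7.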
First, the matrix
Figure 85208DEST_PATH_IMAGE146
is initialized with all elements set to 0. In the validation data set, for any user
Figure 362736DEST_PATH_IMAGE012
, the check-in records of
Figure 500457DEST_PATH_IMAGE012
are taken out in sequence, each expressed as
Figure DEST_PATH_IMAGE147
. For convenience,
Figure 821716DEST_PATH_IMAGE148
is used to denote a check-in record (where
Figure 153210DEST_PATH_IMAGE001
denotes the latitude and longitude
Figure DEST_PATH_IMAGE149
;
Figure 218117DEST_PATH_IMAGE002
denotes the time stamp, calculated from
Figure 210344DEST_PATH_IMAGE141
;
Figure 718817DEST_PATH_IMAGE003
denotes the activity category
Figure 429284DEST_PATH_IMAGE150
; and
Figure 563462DEST_PATH_IMAGE035
denotes the grid in which the user is currently located). Using the spatial activity preference calculation method of sub-step S113, the spatial preference scores of
Figure 410196DEST_PATH_IMAGE012
for all categories at the current location
Figure 856614DEST_PATH_IMAGE001
are calculated, and the categories are sorted by score in descending order to obtain the highest-scoring category
Figure DEST_PATH_IMAGE151
. When
Figure 382273DEST_PATH_IMAGE035
belongs to the interest grid set
Figure 195508DEST_PATH_IMAGE152
and
Figure 037693DEST_PATH_IMAGE151
is equal to
Figure 012603DEST_PATH_IMAGE003
, then, in
Figure DEST_PATH_IMAGE153
, the corresponding element
Figure 556717DEST_PATH_IMAGE154
is increased by 1, where
Figure 016386DEST_PATH_IMAGE002
denotes the row of
Figure 103290DEST_PATH_IMAGE153
indexed by
Figure 373735DEST_PATH_IMAGE002
and
Figure 280511DEST_PATH_IMAGE035
denotes the column indexed by
Figure 310915DEST_PATH_IMAGE035
. The spatial success rate matrices of all users are calculated in the same way.
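The counting procedure for one user can be sketched in Python as follows. This is an illustrative sketch only: the record layout, the restriction of the columns to the user's interest grids and the `predict_top` callable (standing in for the top-ranked category of the preference model) are assumptions, and all names are hypothetical.

```python
def build_success_matrix(records, interest_grid_ids, predict_top, n_slots=168):
    """Count correct top-1 predictions per (time slot, grid).

    records           -- iterable of (location, slot, category, grid) tuples
                         from the validation set of one user
    interest_grid_ids -- ordered list of the user's interest grids
    predict_top       -- callable(location, slot) -> predicted category
    """
    column = {g: j for j, g in enumerate(interest_grid_ids)}
    matrix = [[0] * len(column) for _ in range(n_slots)]
    for location, slot, category, grid in records:
        if grid not in column:
            continue  # only interest grids are scored
        if predict_top(location, slot) == category:
            matrix[slot][column[grid]] += 1
    return matrix
```

Because the model is passed in as a callable, the same routine can in principle be reused with a different preference model to build a second matrix over the same rows and columns.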
A step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method:
A three-dimensional user-time-category tensor is constructed from the user check-in records, in which each element represents the check-in frequency with which the user, in time period
Figure 986747DEST_PATH_IMAGE002
, selects activity
Figure 693672DEST_PATH_IMAGE003
. Based on a non-negative tensor decomposition algorithm applied to the three-dimensional tensor, a recovered tensor describing the time activity preference of the user is obtained; the activity preference distribution of the user at the current time is inferred from the tensor decomposition result; and, based on the time activity preference distribution, the recommendation success rate of the time activity preference model on each grid is calculated, from which a time success rate matrix is constructed.
Specifically, the method comprises the following substeps:
three-dimensional tensor construction sub-step S121:
A three-dimensional user-time-activity tensor is constructed from the user check-in records and is expressed as
Figure 087744DEST_PATH_IMAGE156
. In the time dimension, considering the daily periodicity of user check-ins and the difference between working-day and non-working-day check-ins, time is divided at 1-hour intervals into 24 time periods per day and 168 time intervals per week. In the activity dimension, 251 sub-categories of POIs are used to represent user activity. In the present embodiment, there are 1083 active users in total, 215 categories. Thus, the tensor is represented as
Figure DEST_PATH_IMAGE157
. Each element of the tensor represents the number of check-ins made by user
Figure 360987DEST_PATH_IMAGE012
in time period
Figure 547117DEST_PATH_IMAGE002
for activity
Figure 034731DEST_PATH_IMAGE003
. If the user has not, in time period
Figure 791465DEST_PATH_IMAGE158
, visited activity
Figure DEST_PATH_IMAGE159
, then
Figure DEST_PATH_IMAGE161
is 0. The purpose of tensor decomposition is to assign, by means of the decomposition algorithm, a predicted value to each 0 entry of
Figure 085043DEST_PATH_IMAGE162
.
Therefore, in this sub-step, the three dimensions of the tensor are the user dimension, the time dimension and the activity dimension: each user corresponds to one index of the user dimension, the activity dimension is indexed by POI category, and the time dimension is indexed by time periods obtained by dividing time at a fixed interval.
User time activity preference acquisition substep S122:
Since a user's check-in probability cannot be negative, negative values in the recovered tensor are meaningless for describing user preference. The constructed tensor is therefore decomposed with a non-negative CP decomposition model into a sum of rank-one tensors, each being the outer product of three vectors.
User-time-category preference values are obtained using a non-negative tensor decomposition method. For a given tensor
Figure 374948DEST_PATH_IMAGE156
, the value of each element of
Figure 033462DEST_PATH_IMAGE162
is calculated as follows:
Figure DEST_PATH_IMAGE163
where
Figure 995602DEST_PATH_IMAGE060
denote the factor matrices of users, time and categories, whose sizes are respectively
Figure 905921DEST_PATH_IMAGE061
;
Figure 410851DEST_PATH_IMAGE062
is the latent space dimension, which controls the number of features involved in the decomposition; and
Figure 630480DEST_PATH_IMAGE063
are, respectively, the corresponding components of
Figure 220861DEST_PATH_IMAGE060
.
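For orientation, the element-wise reconstruction used by a standard CP model can be sketched in Python as follows. The patent's own formula is only available as a figure, so this sketch assumes the usual CP definition, in which element (i, j, k) is the sum over the latent dimension of the products of the corresponding factor entries. The names U, T, C and the function name are hypothetical.

```python
def cp_reconstruct(U, T, C):
    """Rebuild a 3-way tensor from CP factor matrices.

    U is I x R, T is J x R, C is K x R (lists of rows). Element (i, j, k)
    is the sum over r of U[i][r] * T[j][r] * C[k][r].
    """
    rank = len(U[0])
    return [
        [
            [sum(U[i][r] * T[j][r] * C[k][r] for r in range(rank))
             for k in range(len(C))]
            for j in range(len(T))
        ]
        for i in range(len(U))
    ]
```

If all three factor matrices are non-negative, every reconstructed element is non-negative as well, which is the property the non-negative variant relies on.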
Adding a non-negative constraint to the least-squares-based decomposition algorithm of the CP decomposition model yields a recovered tensor describing the time activity preference of the user.
Preferably, CP decomposition (CANDECOMP/PARAFAC) decomposes the tensor into three factor matrices (i.e., the user, time and category factor matrices) and uses an alternating least squares method to optimize the loss function between the recovered tensor
Figure 436335DEST_PATH_IMAGE162
and the original tensor
Figure 326931DEST_PATH_IMAGE164
.
The tensor decomposition parameters include the latent space dimension, which can significantly affect recommendation performance, in particular the tensor decomposition time and the recommendation accuracy. The present embodiment therefore also considers the impact of the latent space dimension on recommendation accuracy. FIG. 3 shows how recommendation performance changes as the latent space dimension increases from 8 to 128: the larger the dimension, the better the recommendation performance, but the rate of improvement gradually slows.
In one example, the recovered tensor is obtained according to the model construction idea above, and the non-negative tensor decomposition can be carried out using TensorLy, an open-source Python package. TensorLy supports tensor decomposition, tensor learning and tensor algebra, and the non_negative_parafac function provided by TensorLy can implement the NCP decomposition.
Activity preference inference substep S123:
The activity preference of the user at the current time is inferred from the tensor decomposition result. To infer the category preference of user
Figure 451882DEST_PATH_IMAGE012
at time
Figure 795138DEST_PATH_IMAGE002
, i.e., the likelihood that the user, at time
Figure 781680DEST_PATH_IMAGE002
, visits category
Figure 526782DEST_PATH_IMAGE003
, the recovered tensor
Figure 088214DEST_PATH_IMAGE162
is normalized along the activity dimension:
Figure DEST_PATH_IMAGE165
Thus, for a given user
Figure 495930DEST_PATH_IMAGE012
and time
Figure 676376DEST_PATH_IMAGE002
, the preference values of all categories sum to 1, and every element of the recovered tensor lies in the range
Figure 135039DEST_PATH_IMAGE065
. Normalization allows the temporal and spatial preferences to be fused. The normalized element value
Figure DEST_PATH_IMAGE167
is regarded as the probability that user
Figure 352525DEST_PATH_IMAGE012
, at time
Figure 873636DEST_PATH_IMAGE002
, visits category
Figure 982406DEST_PATH_IMAGE068
. The time activity preference of user
Figure 436521DEST_PATH_IMAGE012
at time
Figure 869250DEST_PATH_IMAGE002
is expressed as:
Figure 143237DEST_PATH_IMAGE168
An example of user time activity preference calculation based on the NCP decomposition model is shown in FIG. 4, which gives the probability that user
Figure 790119DEST_PATH_IMAGE012
, in any time period
Figure 833161DEST_PATH_IMAGE002
, visits activity
Figure 658029DEST_PATH_IMAGE003
.
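The normalization along the activity dimension can be sketched in Python as follows. The sketch assumes the recovered tensor is stored as nested lists indexed as [user][time][category] and that each score is divided by the sum over categories for the same user and time. The handling of an all-zero slice is an assumption added for the example, and the names are hypothetical.

```python
def normalize_over_activities(recovered):
    """Turn recovered[u][t][c] scores into per-(u, t) probabilities."""
    result = []
    for user_slice in recovered:
        user_rows = []
        for scores in user_slice:
            total = sum(scores)
            if total > 0:
                user_rows.append([s / total for s in scores])
            else:
                # Assumption: with no signal at all, keep an all-zero row.
                user_rows.append([0.0] * len(scores))
        result.append(user_rows)
    return result
```

After this step, each non-empty row can be read as a probability distribution over categories for one user and one time stamp.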
Time success rate matrix construction substep S124:
Based on the time activity preference distribution, the recommendation success rate of the time activity preference model on each grid is calculated using the validation data set, and a time success rate matrix
Figure 684891DEST_PATH_IMAGE170
is constructed. Each row of the matrix represents a time range
Figure 135464DEST_PATH_IMAGE073
and each column represents a grid
Figure 298592DEST_PATH_IMAGE074
. The recommendation success rate is calculated from the check-in records in the validation set, using the time activity preference calculation method of sub-step S123.
Figure 58475DEST_PATH_IMAGE172
denotes the time success rate matrix of user
Figure 041475DEST_PATH_IMAGE012
. As with
Figure DEST_PATH_IMAGE173
described in S114, each row of the matrix represents a time stamp
Figure 499001DEST_PATH_IMAGE002
and each column represents a grid
Figure 923160DEST_PATH_IMAGE074
. First, the matrix
Figure 214464DEST_PATH_IMAGE172
is initialized with all elements set to 0. In the validation data set, for any user
Figure 074973DEST_PATH_IMAGE012
, the check-in records of
Figure 742714DEST_PATH_IMAGE012
are taken out in sequence, each expressed as
Figure 460528DEST_PATH_IMAGE077
(where
Figure 047367DEST_PATH_IMAGE001
denotes the latitude and longitude
Figure 270538DEST_PATH_IMAGE078
;
Figure 617337DEST_PATH_IMAGE002
denotes the time stamp;
Figure 609564DEST_PATH_IMAGE003
denotes the activity category
Figure 367304DEST_PATH_IMAGE050
; and
Figure 077771DEST_PATH_IMAGE035
denotes the grid in which the user is currently located). Using the time activity preference calculation method of sub-step S123, the time preference scores of
Figure 461217DEST_PATH_IMAGE012
for all categories at the current time
Figure 307950DEST_PATH_IMAGE002
are calculated, and the categories are sorted by score in descending order to obtain the highest-scoring category
Figure 502171DEST_PATH_IMAGE079
. When
Figure 434355DEST_PATH_IMAGE035
belongs to the interest grid set
Figure 122957DEST_PATH_IMAGE174
and the category predicted by the time preference model,
Figure 620934DEST_PATH_IMAGE079
, is equal to
Figure 861422DEST_PATH_IMAGE003
, then, in
Figure 405536DEST_PATH_IMAGE172
, the corresponding element
Figure 756883DEST_PATH_IMAGE082
is increased by 1, where
Figure 955040DEST_PATH_IMAGE002
denotes the row of
Figure 366429DEST_PATH_IMAGE172
indexed by
Figure 397839DEST_PATH_IMAGE002
and
Figure 631506DEST_PATH_IMAGE035
denotes the column indexed by
Figure 104075DEST_PATH_IMAGE035
. The time success rate matrices of all users are calculated in the same way.
A temporal preference and spatial preference fusion step S130:
Given a user's spatial and temporal context, a fusion method is needed to fuse the temporal and spatial activity preferences; linear weighting and multiplication are common fusion methods. However, since the performance of the spatial and temporal models varies with time and place, it is difficult to assign the two weights dynamically according to the user context.
Recommendation list generation substep S131: the user activity preference is inferred from the success rate matrices. The element values of the spatial success rate matrix and the time success rate matrix are compared, and the result of the model with the higher value is selected as the final recommendation result.
Specifically, in the test data set, the user check-in records
Figure 014263DEST_PATH_IMAGE077
are taken out in sequence and the following judgment is made. For a given user
Figure 408335DEST_PATH_IMAGE012
and its context (i.e., time
Figure 475386DEST_PATH_IMAGE002
and location
Figure 271303DEST_PATH_IMAGE001
),
Figure 883550DEST_PATH_IMAGE175
and
Figure 764919DEST_PATH_IMAGE177
are compared, and the result of the model with the higher value is selected as the final preference. If the two are equal, the result predicted by the spatial activity preference model is adopted.
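The selection rule described above can be sketched in Python as follows. The sketch assumes both success rate matrices are stored as nested lists indexed as [time stamp][grid column] and that the two model results have already been computed. All names are hypothetical.

```python
def fused_prediction(slot, column, spatial_success, time_success,
                     spatial_result, time_result):
    """Pick the result of whichever model succeeded more often here.

    spatial_success / time_success are the user's success rate matrices,
    indexed as [slot][column]. Ties go to the spatial model.
    """
    if time_success[slot][column] > spatial_success[slot][column]:
        return time_result
    return spatial_result
```

Using a strict comparison means the spatial result is returned whenever the two matrix elements are equal, which mirrors the tie rule stated above.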
Experimental results show that the spatial activity preference model can better capture activity preference of the user.
Furthermore, the accuracy verification is carried out on the interest activity recommendation method based on the geographic grid through experiments.
Precision verification substep S132:
Based on the recommendation list, the performance of the recommendation model on the test data set is evaluated using the accuracy
Figure 137125DEST_PATH_IMAGE178
, which is calculated as follows:
Figure DEST_PATH_IMAGE179
where
Figure 381025DEST_PATH_IMAGE180
denotes the length of the recommendation list,
Figure DEST_PATH_IMAGE181
denotes a check-in record of the user in the test set,
Figure 150791DEST_PATH_IMAGE182
denotes the top-k activities with the highest scores for user
Figure 253876DEST_PATH_IMAGE012
at time
Figure 413462DEST_PATH_IMAGE002
and location
Figure 183972DEST_PATH_IMAGE001
, and
Figure DEST_PATH_IMAGE183
denotes the number of check-in records in the test set.
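A top-k accuracy of this kind can be sketched in Python as follows. The patent's formula is only available as a figure, so the sketch assumes the common definition: the fraction of test check-ins whose actual category appears in the top-k recommendation list. The record layout and the `recommend` callable are hypothetical.

```python
def accuracy_at_k(test_records, recommend, k):
    """Fraction of test check-ins whose category is in the top-k list.

    test_records -- list of (user, slot, location, category)
    recommend    -- callable(user, slot, location, k) -> list of categories
    """
    if not test_records:
        return 0.0
    hits = sum(
        1 for user, slot, location, category in test_records
        if category in recommend(user, slot, location, k)
    )
    return hits / len(test_records)
```

Evaluating the same recommender with k set to 1, 5 and 10 gives one accuracy value per list length.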
This experiment also considers the effect of the number of recommendations on accuracy. Table 1 shows how the recommendation accuracy of several recommendation methods varies when the number of recommendations is 1, 5 and 10. The compared methods are MFT (most frequently visited activity within a time period), CP (CP decomposition), NCP (non-negative CP decomposition), MFA (most frequently visited activity by a user), SPM (spatial preference model) and STUAP (the geographic grid-based interest activity recommendation method of the present invention). For the same number of recommendations, the interest activity recommendation method of the present invention achieves the highest accuracy.
Table 1. Recommendation performance comparison experiment
Figure 888754DEST_PATH_IMAGE184
The experimental data set comprises a training set, a validation set and a test set. The training set is used to construct the models in steps S111, S112, S113, S121, S122 and S123; the validation set is used to calculate the success rates in steps S114 and S124; and the test set is used to verify the model in steps S131 and S132.
The present invention further discloses a storage medium for storing computer-executable instructions which, when executed by a processor, perform the above-mentioned method for recommending point of interest activities based on a geographic grid.
In conclusion, the present invention considers the spatial and temporal characteristics of user check-in activities in location-based social networks separately, which reduces the complexity of the problem. The sparsity of user check-in data is alleviated by means of the geographic grid and the tensor decomposition method; the user's spatial and temporal activity distributions are calculated, and the interest activities recommended to the user are determined from the predicted probabilities of the temporal and spatial distributions. The user's regions of interest are analyzed quantitatively, the accuracy of interest activity recommendation in location-based social networks is improved, and the recommendation results meet the personalized needs of users.
It will be apparent to those skilled in the art that the various elements or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device, or alternatively, they may be implemented using program code that is executable by a computing device, such that they may be stored in a memory device and executed by a computing device, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
While the invention has been described in further detail with reference to specific preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An interest activity recommendation method based on a geographic grid comprises the following steps:
constructing a user spatial activity preference model based on the geographic grid step S110:
dividing a city area into a plurality of geographic grids, mapping user check-in information onto the grids, calculating the check-in frequency and category preference deviation ratio of a user in each grid, acquiring an interest grid set of the user, and inferring, for the position of the user
Figure 542874DEST_PATH_IMAGE001
, the spatial activity preference distribution of the user; calculating, using the spatial activity preference distribution, the recommendation success rate of a spatial activity preference model on the grids; and constructing a spatial success rate matrix;
a step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method:
constructing a three-dimensional user-time-category tensor from user check-in records, each element of which represents the check-in frequency with which the user, in time period
Figure 991172DEST_PATH_IMAGE002
, selects activity
Figure 302068DEST_PATH_IMAGE003
; obtaining, based on a non-negative tensor decomposition algorithm applied to the three-dimensional tensor, a recovered tensor that describes the time activity preference of the user; inferring the activity preference distribution of the user at the current time from the tensor decomposition result; and calculating, based on the time activity preference distribution, the recommendation success rate of a time activity preference model on the grids, and constructing a time success rate matrix;
a temporal preference and spatial preference fusion step S130:
recommendation list generation substep S131: and deducing the user activity preference according to the success rate matrix, comparing element values in the space success rate matrix and the time success rate matrix, and selecting a model result with a higher value as a final recommendation result.
2. The interest activity recommendation method of claim 1, wherein:
the step S110 of constructing a user spatial activity preference model based on a geographic grid includes the following sub-steps:
user sign-in information mapping substep S111: dividing the urban area into a plurality of regular grids with the same size, mapping the sign-in information of the user to the grids, and obtaining the number attribute of the grids;
the user interest mesh acquisition sub-step S112:
calculating the check-in frequency of the user in each grid
Figure 494015DEST_PATH_IMAGE005
And class preference bias ratio
Figure 842475DEST_PATH_IMAGE007
Using said check-in frequency
Figure 196096DEST_PATH_IMAGE005
And the class preference deviation ratio
Figure 259867DEST_PATH_IMAGE007
To characterize the user's preference on the grid by setting a frequency threshold
Figure 989926DEST_PATH_IMAGE009
And a preference deviation ratio threshold
Figure 189963DEST_PATH_IMAGE011
Screening to obtain an interest grid set of the user,
wherein the check-in frequency
Figure 714485DEST_PATH_IMAGE005
of user
Figure 999973DEST_PATH_IMAGE012
in grid
Figure 268143DEST_PATH_IMAGE013
is the number of the user's check-ins in that grid as a proportion of the user's total number of check-ins,
Figure 588266DEST_PATH_IMAGE015
where
Figure 283690DEST_PATH_IMAGE017
denotes the number of check-ins of user
Figure 322053DEST_PATH_IMAGE012
in grid
Figure 128335DEST_PATH_IMAGE019
, and
Figure 302964DEST_PATH_IMAGE021
denotes the set of grids visited by the user;
the preference deviation ratio
Figure 169289DEST_PATH_IMAGE007
measures the category preference of user
Figure 429369DEST_PATH_IMAGE012
in grid
Figure 030553DEST_PATH_IMAGE013
; assuming that grid
Figure 059689DEST_PATH_IMAGE013
contains POIs (points of interest) of
Figure 362494DEST_PATH_IMAGE023
categories in total, and that the user has visited POIs of
Figure 844291DEST_PATH_IMAGE024
categories
Figure 257955DEST_PATH_IMAGE025
, the calculation formula is as follows:
Figure 407177DEST_PATH_IMAGE027
where
Figure 880883DEST_PATH_IMAGE023
denotes the total number of POI categories present in grid
Figure 849976DEST_PATH_IMAGE013
,
Figure 801752DEST_PATH_IMAGE029
denotes the ratio of the user's number of check-ins in category
Figure 071059DEST_PATH_IMAGE030
to the user's total number of check-ins in grid
Figure 715667DEST_PATH_IMAGE013
,
Figure 172056DEST_PATH_IMAGE031
denotes the maximum entropy of the user's check-in categories, which assumes that the user is equally likely to check in to every category
Figure 927523DEST_PATH_IMAGE032
, and
Figure 520178DEST_PATH_IMAGE034
denotes the categories that user
Figure 335687DEST_PATH_IMAGE012
has checked in to in grid
Figure 547881DEST_PATH_IMAGE013
;
after the check-in frequency and the preference deviation ratio of each grid have been calculated, introducing a frequency threshold
Figure 372618DEST_PATH_IMAGE009
and a preference deviation ratio threshold
Figure 085359DEST_PATH_IMAGE011
to obtain the user's interest grid set
Figure 806190DEST_PATH_IMAGE035
, the region formed by the grid set
Figure 237172DEST_PATH_IMAGE035
being the user's region of interest;
spatial activity preference calculation substep S113:
given the user's current location
Figure 600020DEST_PATH_IMAGE001
, using spatial proximity to evaluate the influence of grid
Figure 432847DEST_PATH_IMAGE036
on
Figure 324579DEST_PATH_IMAGE001
, with the following weight function:
Figure 242857DEST_PATH_IMAGE038
where
Figure 409396DEST_PATH_IMAGE039
denotes the distance between the user's current location
Figure 831150DEST_PATH_IMAGE001
and the center point of grid
Figure 159363DEST_PATH_IMAGE036
,
then calculating, using a weighting method, the user's activity preference for all categories at
Figure 033778DEST_PATH_IMAGE001
; letting user
Figure 004008DEST_PATH_IMAGE012
have
Figure 280269DEST_PATH_IMAGE040
interest grids
Figure 047892DEST_PATH_IMAGE042
, the spatial activity preference of user
Figure 675183DEST_PATH_IMAGE012
at location
Figure 917945DEST_PATH_IMAGE001
is:
Figure 314291DEST_PATH_IMAGE044
where
Figure 984307DEST_PATH_IMAGE045
denotes the check-in frequency of user
Figure 098894DEST_PATH_IMAGE012
in grid
Figure 879768DEST_PATH_IMAGE013
for category
Figure 396200DEST_PATH_IMAGE030
, and
Figure 237117DEST_PATH_IMAGE003
denotes all activity categories; the spatial activity preference of the user at
Figure 839000DEST_PATH_IMAGE001
for
Figure 423565DEST_PATH_IMAGE030
is the sum of the geographic influences that the user's preferences in
Figure 794503DEST_PATH_IMAGE047
exert on
Figure 806322DEST_PATH_IMAGE001
;
the spatial success rate matrix construction sub-step S114:
based on the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids, and constructing a spatial success rate matrix
Figure 629921DEST_PATH_IMAGE049
for each user, each row of the matrix representing a time range
Figure 752598DEST_PATH_IMAGE050
and each column representing a grid
Figure 246552DEST_PATH_IMAGE051
, the recommendation success rate being calculated from the check-in records using the spatial activity preference calculation method of sub-step S113.
3. The interest activity recommendation method of claim 2, wherein:
the spatial success rate matrix construction sub-step S114 specifically comprises:
building a spatial success rate matrix for each user and initializing all elements of the matrix to 0;
for any user $u$, taking the check-in records of $u$ from the verification data set in sequence, each record comprising the location $l$ (latitude and longitude), the timestamp $t$, the activity category $c$, and the grid $g$ containing the user's current location;
calculating, with the spatial activity preference calculation method of sub-step S113, the spatial preference scores of $u$ for all categories at the current location $l$, and sorting the categories by score in descending order to obtain the highest-scoring category $c^{*}$;
when $g$ belongs to the interest grid set and $c^{*}$ equals $c$, incrementing the corresponding element of the matrix by 1, where $t$ is the row index and $g$ is the column index of that element;
and repeating this procedure to compute the spatial success rate matrices of all users.
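The counting procedure of sub-step S114 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `CheckIn` record layout, the function name `build_success_matrix`, and the `score_categories` callback (standing in for the S113 preference model) are all hypothetical. The claim describes only the incrementing of matrix elements, so the sketch returns raw counts and does not guess at any later normalization into a rate.

```python
from collections import namedtuple

# Hypothetical record layout: time_slot is a row index, grid a grid identifier.
CheckIn = namedtuple("CheckIn", "location time_slot category grid")


def build_success_matrix(records, score_categories, interest_grids,
                         n_time_slots, grid_ids):
    """Count, per (time slot, grid), how often the top-scored category
    equals the category the user actually checked in to."""
    column = {g: j for j, g in enumerate(grid_ids)}
    matrix = [[0] * len(grid_ids) for _ in range(n_time_slots)]
    for rec in records:                      # one user's validation records
        if rec.grid not in interest_grids:   # only grids in the interest set
            continue
        scores = score_categories(rec)       # dict: category -> preference score
        best = max(scores, key=scores.get)   # highest-scoring category
        if best == rec.category:
            matrix[rec.time_slot][column[rec.grid]] += 1
    return matrix


# Toy usage with a fixed scorer standing in for the preference model.
records = [
    CheckIn((39.9, 116.4), 0, "food", "g1"),
    CheckIn((39.9, 116.4), 1, "shop", "g1"),
    CheckIn((40.0, 116.5), 0, "food", "g2"),
    CheckIn((41.0, 117.0), 0, "food", "g9"),  # g9 is outside the interest set
]
scorer = lambda rec: {"food": 0.7, "shop": 0.3}
matrix = build_success_matrix(records, scorer, {"g1", "g2"}, 2, ["g1", "g2"])
print(matrix)
```

Because the scorer is passed in as a callback, the same routine could in principle be reused with a time-based scorer for the analogous time success rate matrix.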
4. The interest activity recommendation method of claim 3, wherein:
the step S120 of constructing the user time activity preference model by a non-negative tensor decomposition method specifically comprises:
three-dimensional tensor construction sub-step S121:
constructing a three-dimensional user-time-activity tensor $\mathcal{X}$ from the user check-in records, where each element $x_{utc}$ of the tensor is the number of check-ins in which user $u$ selected activity $c$ during time period $t$;
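As a rough illustration of sub-step S121, the sketch below counts check-ins into a nested-list tensor indexed by user, time slot, and activity category. The function name `build_checkin_tensor`, the `(user, timestamp, category)` tuple layout, and the choice of 6-hour hour-of-day slots are assumptions made for the example; the claims only state that time is divided at some interval.

```python
from datetime import datetime


def build_checkin_tensor(checkins, users, categories, hours_per_slot=6):
    """Return tensor[u][t][c] = number of check-ins of user u,
    in time slot t, to activity category c."""
    n_slots = 24 // hours_per_slot
    u_index = {u: i for i, u in enumerate(users)}
    c_index = {c: k for k, c in enumerate(categories)}
    tensor = [[[0] * len(categories) for _ in range(n_slots)] for _ in users]
    for user, timestamp, category in checkins:
        slot = timestamp.hour // hours_per_slot   # assumed hour-of-day binning
        tensor[u_index[user]][slot][c_index[category]] += 1
    return tensor


checkins = [
    ("alice", datetime(2022, 1, 13, 8, 30), "food"),
    ("alice", datetime(2022, 1, 14, 9, 0), "food"),
    ("alice", datetime(2022, 1, 14, 20, 0), "cinema"),
    ("bob", datetime(2022, 1, 13, 13, 0), "food"),
]
tensor = build_checkin_tensor(checkins, ["alice", "bob"], ["food", "cinema"])
print(tensor[0][1][0])  # alice, 06:00-12:00 slot, "food"
```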
user time activity preference acquisition sub-step S122:
obtaining user-time-category preference values by a non-negative tensor decomposition method; for the given tensor $\mathcal{X}$, the value of each element of the recovered tensor $\hat{\mathcal{X}}$ is calculated as follows:
$$\hat{x}_{utc} = \sum_{r=1}^{R} U_{ur}\, T_{tr}\, C_{cr}$$
where $U$, $T$ and $C$ are the factor matrices of users, time and categories respectively, each having $R$ columns and one row per user, time period or category; $R$ is the latent space dimension, which controls the number of features involved in the decomposition; and $U_{ur}$, $T_{tr}$ and $C_{cr}$ are elements of $U$, $T$ and $C$ respectively;
a non-negativity constraint is added to the least-squares-based decomposition algorithm of the CP decomposition model to obtain the recovered tensor, which describes the time activity preference of the user;
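A non-negative CP decomposition of this kind can be sketched with NumPy as below. Note the substitution: the claim refers to a least-squares-based algorithm with a non-negativity constraint, whereas this sketch uses multiplicative updates, a different but common way to keep the factors non-negative. The function names, the iteration count, and the rank used in the example are assumptions.

```python
import numpy as np


def nonnegative_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate X[u, t, c] by sum_r U[u, r] * T[t, r] * C[c, r]
    with all factors kept non-negative (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], rank))
    T = rng.random((X.shape[1], rank))
    C = rng.random((X.shape[2], rank))
    for _ in range(n_iter):
        # Each factor is multiplied by a non-negative ratio, so it stays >= 0.
        U *= np.einsum("utc,tr,cr->ur", X, T, C) / (U @ ((T.T @ T) * (C.T @ C)) + eps)
        T *= np.einsum("utc,ur,cr->tr", X, U, C) / (T @ ((U.T @ U) * (C.T @ C)) + eps)
        C *= np.einsum("utc,ur,tr->cr", X, U, T) / (C @ ((U.T @ U) * (T.T @ T)) + eps)
    return U, T, C


def reconstruct(U, T, C):
    """Recovered tensor: x_hat[u, t, c] = sum_r U[u, r] * T[t, r] * C[c, r]."""
    return np.einsum("ur,tr,cr->utc", U, T, C)


# Toy check-in counts: 4 users, 3 time slots, 5 categories.
rng = np.random.default_rng(1)
X = rng.integers(0, 5, size=(4, 3, 5)).astype(float)
U, T, C = nonnegative_cp(X, rank=3)
X_hat = reconstruct(U, T, C)
print(np.linalg.norm(X - X_hat))
```

The `rank` argument plays the role of the latent space dimension: larger values let the model fit more structure at the cost of longer decomposition time.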
activity preference inference sub-step S123:
inferring the activity preference of the user at the current time from the tensor decomposition result $\hat{\mathcal{X}}$, normalized along the activity dimension:
$$\tilde{x}_{utc} = \frac{\hat{x}_{utc}}{\sum_{c'} \hat{x}_{utc'}}$$
for a given user $u$ and time $t$, the preference values over all categories are normalized to sum to 1, so that every normalized element lies in $[0, 1]$; the normalized element $\tilde{x}_{utc}$ is regarded as the probability that user $u$ visits category $c$ at time $t$, and the time activity preference of user $u$ at time $t$ is expressed as:
[formula image]
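The normalization along the activity dimension in sub-step S123 amounts to dividing each recovered score by the sum over categories. A minimal sketch, assuming the recovered tensor is available as nested lists; the function name and the uniform fallback for an all-zero row are assumptions, not part of the claim:

```python
def time_activity_preference(recovered, user, slot):
    """Normalize recovered[user][slot] over categories so it sums to 1."""
    scores = recovered[user][slot]
    total = sum(scores)
    if total == 0:
        # Assumption: fall back to a uniform distribution when there is no signal.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


# One user, two time slots, three categories (non-negative recovered scores).
recovered = [[[2.0, 1.0, 1.0], [0.0, 0.0, 0.0]]]
prefs = time_activity_preference(recovered, 0, 0)
print(prefs)
```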
time success rate matrix construction sub-step S124:
calculating, based on the time activity preference distribution, the recommendation success rate of the time activity preference model on the grids, and constructing a time success rate matrix, wherein each row of the matrix represents a time range and each column represents a grid; the recommendation success rate of the time activity preference model on each grid is calculated from the check-in records using the time activity preference calculation method of sub-step S123.
5. The interest activity recommendation method of claim 4, wherein:
the time success rate matrix construction sub-step S124 specifically comprises:
initializing the time success rate matrix with all elements set to 0;
for any user $u$ in the verification data set, taking the check-in records of $u$ in sequence, each record comprising the location $l$ (latitude and longitude), the timestamp $t$, the activity category $c$, and the grid $g$ containing the user's current location;
calculating, with the time activity preference calculation method of sub-step S123, the time preference scores of $u$ for all categories at the current time $t$, and sorting the categories by score in descending order to obtain the highest-scoring category $c^{*}$;
when $g$ belongs to the interest grid set and the category $c^{*}$ predicted by the time preference model equals $c$, incrementing the corresponding element of the matrix by 1, where $t$ is the row index and $g$ is the column index of that element;
and repeating this procedure to compute the time success rate matrices of all users.
6. The interest activity recommendation method of claim 5, wherein:
after the recommendation list generation sub-step S131, the method further comprises
precision verification sub-step S132:
evaluating the performance of the recommendation model on the test data set, based on the recommendation list, using the accuracy [symbol image], which is calculated as follows:
[formula image]
where $k$ denotes the length of the recommendation list, [symbol image] denotes a check-in record of the user in the test data set, [symbol image] denotes the top-$k$ (Top-k) activities with the highest scores for user $u$ at time $t$ and location $l$, and [symbol image] denotes the number of check-in records in the test data set.
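Since the accuracy formula itself is not recoverable from the extracted text, the sketch below shows only one plausible reading of the definitions above: the fraction of test check-ins whose true category appears among the top-k recommended activities. The function names, the dictionary record layout, and the `score_fn` callback are hypothetical.

```python
def top_k(scores, k):
    """Return the k categories with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def hit_rate_at_k(test_records, score_fn, k):
    """Fraction of test check-ins whose true category is in the top-k list."""
    if not test_records:
        return 0.0
    hits = sum(1 for rec in test_records
               if rec["category"] in top_k(score_fn(rec), k))
    return hits / len(test_records)


# Toy usage: a fixed scorer stands in for the combined recommendation model.
score_fn = lambda rec: {"food": 0.5, "shop": 0.3, "cinema": 0.2}
test_records = [{"category": c} for c in ("food", "cinema", "shop", "food")]
print(hit_rate_at_k(test_records, score_fn, k=2))
```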
7. The interest activity recommendation method of claim 5, wherein:
the interest activity recommendation method divides the user check-in data set as follows: the check-in history of each user is sorted by check-in time and divided into a training data set, a verification data set and a test data set according to a certain proportion; steps S111, S112, S113, S121, S122 and S123 use the training data set to construct the models, steps S114 and S124 use the verification data set to calculate the success rates, and steps S131 and S132 use the test data set to verify the model.
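A per-user chronological split of this kind might look as follows. The 70/10/20 proportions, the function name, and the `(user, time, category)` tuple layout are assumptions for illustration; the claim specifies only "a certain proportion".

```python
from collections import defaultdict


def chronological_split(checkins, percents=(70, 10, 20)):
    """Sort each user's check-ins by time, then cut them into
    training, verification and test portions."""
    by_user = defaultdict(list)
    for rec in checkins:                 # rec = (user, time, category)
        by_user[rec[0]].append(rec)
    train, valid, test = [], [], []
    for recs in by_user.values():
        recs.sort(key=lambda r: r[1])    # oldest check-in first
        n = len(recs)
        a = n * percents[0] // 100       # end of the training portion
        b = a + n * percents[1] // 100   # end of the verification portion
        train += recs[:a]
        valid += recs[a:b]
        test += recs[b:]                 # the remainder is the test portion
    return train, valid, test


# Ten check-ins for one user, deliberately given in reverse time order.
checkins = [("u1", t, "food") for t in range(9, -1, -1)]
train, valid, test = chronological_split(checkins)
print(len(train), len(valid), len(test))
```

Splitting by time rather than at random keeps every verification and test record later than the records the models were built on.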
8. The interest activity recommendation method of claim 5, wherein:
in the three-dimensional tensor construction sub-step S121, the three dimensions of the tensor are a user dimension, a time dimension and an activity dimension, where each user corresponds to one entry along the user dimension, the activity dimension is represented by POI categories, and the time dimension is represented by time periods obtained by dividing time at a given interval.
9. The interest activity recommendation method of claim 5, wherein:
in the user time activity preference acquisition sub-step S122, the tensor decomposition parameters include the latent space dimension, whose size affects both the tensor decomposition time and the recommendation accuracy.
10. A storage medium for storing computer-executable instructions, characterized in that:
the computer-executable instructions, when executed by a processor, perform the method for recommending point of interest activities based on a geographic grid of any of claims 1-9.
CN202210034325.6A 2022-01-13 2022-01-13 Interest activity recommendation method based on geographic grid Active CN114048391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210034325.6A CN114048391B (en) 2022-01-13 2022-01-13 Interest activity recommendation method based on geographic grid

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210034325.6A CN114048391B (en) 2022-01-13 2022-01-13 Interest activity recommendation method based on geographic grid

Publications (2)

Publication Number Publication Date
CN114048391A CN114048391A (en) 2022-02-15
CN114048391B true CN114048391B (en) 2022-04-19

Family

ID=80196433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210034325.6A Active CN114048391B (en) 2022-01-13 2022-01-13 Interest activity recommendation method based on geographic grid

Country Status (1)

Country Link
CN (1) CN114048391B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115495665B (en) * 2022-11-16 2023-04-25 Central South University Crowdsourcing task recommendation method for land cover updating

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460101A (en) * 2018-02-05 2018-08-28 Shandong Normal University Point-of-interest recommendation method for location-based social networks based on geographic location regularization
CN109492166A (en) * 2018-08-06 2019-03-19 Beijing Institute of Technology Successive point-of-interest recommendation method based on check-in time interval patterns
CN112905905A (en) * 2021-01-22 2021-06-04 Hangzhou Dianzi University Joint point-of-interest and region recommendation method in location-based social networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8719198B2 (en) * 2010-05-04 2014-05-06 Microsoft Corporation Collaborative location and activity recommendations

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
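Dual fine-grained point-of-interest recommendation based on location-based social networks; Liao Guoqiong et al.; Journal of Computer Research and Development; 2017-11-30; Vol. 54, No. 11; pp. 2600-2610 *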

Also Published As

Publication number Publication date
CN114048391A (en) 2022-02-15

Similar Documents

Publication Publication Date Title
US10235683B2 (en) Analyzing mobile-device location histories to characterize consumer behavior
CN105532030B (en) For analyzing the devices, systems, and methods of the movement of target entity
US8423494B2 (en) Complex situation analysis system that generates a social contact network, uses edge brokers and service brokers, and dynamically adds brokers
CN107230108A (en) The processing method and processing device of business datum
Wu et al. Density-based place clustering using geo-social network data
CN106776925B (en) Method, server and system for predicting gender of mobile terminal user
EP2875623A1 (en) Method and system for traffic estimation
CN114048391B (en) Interest activity recommendation method based on geographic grid
CN113158038A (en) Interest point recommendation method and system based on STA-TCN neural network framework
Chen et al. A temporal recommendation mechanism based on signed network of user interest changes
EP3192061B1 (en) Measuring and diagnosing noise in urban environment
Tanton Spatial microsimulation: developments and potential future directions
CN111259268A (en) POI recommendation model construction method and system
CN116188052A (en) Method and device for throwing shared vehicle, computer equipment and storage medium
Doan et al. Attractiveness versus competition: towards an unified model for user visitation
Liao et al. A mobility model for synthetic travel demand from sparse traces
Zeng et al. LGSA: A next POI prediction method by using local and global interest with spatiotemporal awareness
CN112883292A (en) User behavior recommendation model establishment and position recommendation method based on spatio-temporal information
Hong et al. Revealing behavioral impact on mobility prediction networks through causal interventions
Mazzamurro et al. Dynamic spatial cluster process model of geo-tagged tweets in london
Hu et al. Implementation and optimization of real-time fine-grained air quality sensing networks in smart city
Su et al. Point-of-interest recommendation based on geographical influence and extended pairwise ranking
Doumèche et al. Human spatial dynamics for electricity demand forecasting: the case of France during the 2022 energy crisis
CN113010803B (en) Prediction method for user access position in geographic sensitive dynamic social environment
Xie et al. Modeling Human Mobility Based on Temporal Characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant