CN114048391B

CN114048391B - Interest activity recommendation method based on geographic grid

Info

Publication number: CN114048391B
Application number: CN202210034325.6A
Authority: CN
Inventors: 仇阿根; 赵习枝; 张志然; 陶坤旺; 张福浩; 陈颂; 陈才
Original assignee: Chinese Academy of Surveying and Mapping
Current assignee: Chinese Academy of Surveying and Mapping
Priority date: 2022-01-13
Filing date: 2022-01-13
Publication date: 2022-04-19
Anticipated expiration: 2042-01-13
Also published as: CN114048391A

Abstract

An interest activity recommendation method based on a geographic grid comprises: dividing a study area into regular grids, building a personal interest grid region for each user from check-in frequency and preference deviation ratio parameters, and inferring the user's spatial activity preference; capturing the activity preferences of similar users with a non-negative tensor decomposition method to collaboratively establish the user's temporal activity preference; and fusing the spatial activity preference and the temporal activity preference with a context-aware fusion method to jointly determine the interest activities recommended to the user. Based on the geographic grid and the tensor decomposition method, the model alleviates the sparsity of user check-in data, analyzes users' interest activities quantitatively, improves the accuracy of interest activity recommendation, and makes the recommendation results meet the personalized requirements of the user.

Description

Interest activity recommendation method based on geographic grid

Technical Field

The application belongs to the technical field of location recommendation, and particularly relates to an interest activity recommendation method based on a geographic grid.

Background

With the rapid development of location-based social networks (LBSNs) and mobile devices, mining users' latent personal preferences, activity trajectories, and lifestyle patterns from the accumulated mass of user data and check-in data has become a core part of location services, and location recommendation has become an important technical means for it. At present, because of problems such as data sparsity and cold start, the accuracy of the results produced by point-of-interest recommendation algorithms may be low. Moreover, in many cases people do not need a very precise location. Research on interest activity recommendation algorithms and their applications therefore aims to better understand users' movement behavior and to predict the activities they are likely to take part in, so as to meet users' requirements for personalized and intelligent services.

User check-in behavior exhibits specific spatio-temporal distribution patterns, and modeling users' spatio-temporal behavior from their historical check-in data in location-based social networks is challenging, mainly in the following respects. First, check-in data are usually high-dimensional and sparse, represented as a user-time-location-activity quadruple, and it is complex and difficult to find regularities directly in sparse high-dimensional data. Second, unlike continuously sampled user activity data, check-ins on social media are initiated by the users themselves; they are not sampled continuously at equal intervals and vary in complex ways in space and time. Third, a user's check-in activity is related to the user's context, i.e., check-in behavior is generally affected by the user's location and time. How to mine user activity preferences by combining the user's temporal and spatial contexts has therefore become a technical problem that urgently needs to be solved in the prior art.

Disclosure of Invention

The invention aims to provide an interest activity recommendation method based on a geographic grid that discovers users' interest activities and improves recommendation performance. The spatial activity preference and the time activity preference of the user are modeled separately using techniques such as the geographic grid and tensor decomposition, thereby reducing the complexity of the problem.

An interest activity recommendation method based on a geographic grid comprises the following steps:

a step S110 of constructing a user spatial activity preference model based on the geographic grid:

dividing a city area into a plurality of geographic grids, mapping user check-in information to the grids, calculating the check-in frequency and category preference deviation ratio of a user in each grid, acquiring the interest grid set of the user, inferring the spatial activity preference of the user at the user's location, and, using the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids and constructing a spatial success rate matrix;

a step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method:

constructing a user-time-category three-dimensional tensor from the user check-in records, wherein each element of the tensor represents the number of check-ins in which a user selects an activity during a time period; obtaining, from the three-dimensional tensor and based on a non-negative tensor decomposition algorithm, a recovered tensor that describes the time activity preference of the user; inferring the activity preference distribution of the user at the current time from the tensor decomposition result; and, based on the time activity preference distribution, calculating the recommendation success rate of the time activity preference model on the grids and constructing a time success rate matrix;

a temporal preference and spatial preference fusion step S130:

recommendation list generation substep S131: and deducing the user activity preference according to the success rate matrix, comparing element values in the space success rate matrix and the time success rate matrix, and selecting a model result with a higher value as a final recommendation result.
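The comparison in substep S131 can be sketched as follows. This is a minimal illustration, assuming that a user's two success rate matrices are NumPy arrays indexed by (time slot, grid) and that each model yields one score per category; the function name `fuse_preferences`, the parameter `k`, and the tie-break in favor of the spatial model are assumptions of this sketch, not details given in the text.

```python
import numpy as np

def fuse_preferences(space_success, time_success, t, g, space_pref, time_pref, k=5):
    """Return the indices of the top-k categories from whichever model has the
    higher success count at (time slot t, grid g).
    Ties go to the spatial model (an arbitrary choice made for this sketch)."""
    use_space = space_success[t, g] >= time_success[t, g]
    scores = space_pref if use_space else time_pref
    # Sort descending by score; the stable sort keeps index order for equal scores
    order = np.argsort(-np.asarray(scores), kind="stable")
    return order[:k].tolist()

# Toy data: 168 weekly time slots, 4 grids, 3 categories
space_success = np.zeros((168, 4), dtype=int)
time_success = np.zeros((168, 4), dtype=int)
time_success[7, 2] = 3                      # time model was right 3 times at (7, 2)
space_pref = np.array([0.1, 0.6, 0.3])
time_pref = np.array([0.5, 0.2, 0.3])

top_time = fuse_preferences(space_success, time_success, 7, 2, space_pref, time_pref, k=2)
top_space = fuse_preferences(space_success, time_success, 0, 0, space_pref, time_pref, k=2)
```

Selecting per (time slot, grid) cell lets each model be used only in the contexts where it has performed better on held-out check-ins.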

Optionally, the step S110 of constructing a user spatial activity preference model based on a geographic grid includes the following sub-steps:

user sign-in information mapping substep S111: dividing the urban area into a plurality of regular grids with the same size, mapping the sign-in information of the user to the grids, and obtaining the number attribute of the grids;

the user interest grid acquisition sub-step S112:

calculating the check-in frequency $f_{u,g}$ and the category preference deviation ratio $\rho_{u,g}$ of the user in each grid, using $f_{u,g}$ and $\rho_{u,g}$ to characterize the user's preference on the grid, and screening with a frequency threshold $\theta_f$ and a preference deviation ratio threshold $\theta_\rho$ to obtain the interest grid set of the user,

wherein the check-in frequency $f_{u,g}$ is the number of check-ins of user $u$ in grid $g$ as a proportion of the user's total number of check-ins:

$$f_{u,g} = \frac{n_{u,g}}{\sum_{g' \in G_u} n_{u,g'}}$$

in the formula, $n_{u,g}$ represents the number of check-ins of user $u$ in grid $g$, and $G_u$ represents the set of grids visited by the user;

the preference deviation ratio $\rho_{u,g}$ measures the category preference of user $u$ in grid $g$; assuming that grid $g$ contains POIs of $N_g$ categories in total, of which the user has visited POIs of $|C_{u,g}|$ categories, the calculation formula is as follows:

$$\rho_{u,g} = \frac{H_{\max} - H_{u,g}}{H_{\max}}, \qquad H_{u,g} = -\sum_{c \in C_{u,g}} p_{u,g,c} \log p_{u,g,c}, \qquad H_{\max} = \log N_g$$

in the formula, $N_g$ represents the total number of categories of POIs present in grid $g$; $p_{u,g,c}$ indicates the number of check-ins of the user in category $c$ as a proportion of the user's total number of check-ins in grid $g$; $H_{\max}$ represents the maximum entropy of the user over check-in categories, which assumes that the user is equally likely to check in at all $N_g$ categories; and $C_{u,g}$ represents the set of categories in which user $u$ has checked in within grid $g$;

after the check-in frequency and the preference deviation ratio of each grid are calculated, the frequency threshold $\theta_f$ and the preference deviation ratio threshold $\theta_\rho$ are introduced to obtain the set of grids of interest to the user, $G_u^{I}$; the area formed by the grids in $G_u^{I}$ is the user's region of interest;

spatial activity preference calculation substep S113:

given the known current location $l$ of the user, the influence of each interest grid $g_i$ on $l$ is evaluated using spatial proximity through a weight function $w(l, g_i)$ of the distance $d(l, g_i)$,

in which $d(l, g_i)$ indicates the distance between the current location $l$ of the user and the center point of grid $g_i$;

then, the activity preference of the user at $l$ for all categories is calculated by a weighting method; assuming that user $u$ has $m$ interest grids $g_1, \dots, g_m$, the spatial activity preference of user $u$ at location $l$ is:

$$P^{S}_{u}(c \mid l) = \sum_{i=1}^{m} w(l, g_i)\, f_{u,g_i,c}, \qquad c \in \mathcal{C}$$

in the formula, $f_{u,g_i,c}$ represents the check-in frequency of user $u$ for category $c$ within grid $g_i$, and $\mathcal{C}$ denotes all activity categories; the spatial activity preference of user $u$ at $l$ for category $c$ is the sum of the user's preferences for $c$ in the interest grids, weighted by their geographic influence;

the spatial success rate matrix construction sub-step S114:

based on the spatial activity preference distribution, calculating the recommendation success rate of the spatial activity preference model on the grids, and further constructing a spatial success rate matrix $M^{S}_{u}$ for each user, in which each row represents a time interval $t$ and each column represents a grid $g$; the recommendation success rate of the spatial activity preference model on the grids is calculated from the check-in records using the spatial activity preference calculation method of substep S113.

Optionally, the substep S114 of constructing the spatial success rate matrix specifically includes:

building a spatial success rate matrix $M^{S}_{u}$ for each user; initializing the matrix $M^{S}_{u}$ by setting its elements to 0; for any user $u$, taking the check-in records of $u$ out of the verification data set in sequence, each check-in record being denoted $r = (l, t, c, g)$, in which $l$ represents the latitude and longitude, $t$ represents the timestamp, $c$ represents the activity category, and $g$ represents the grid in which the user is currently located; calculating, based on the spatial activity preference calculation method in substep S113, the spatial preference scores of $u$ for all categories at the current location $l$, and sorting the categories by score from large to small to obtain the category with the highest score, $\hat{c}$; when $g$ is in the interest grid set $G_u^{I}$ and $\hat{c}$ is equal to $c$, increasing the corresponding element $M^{S}_{u}[t, g]$ by 1, where $t$ indexes the row and $g$ the column of $M^{S}_{u}$; and so on, calculating the spatial success rate matrices of all users.
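The counting procedure above can be sketched as follows. This is a minimal illustration for one user, assuming check-in records are tuples of (location, time slot, category, grid) with integer slot and grid indices; `build_success_matrix` and the `predict_top1` callback are hypothetical names standing in for the preference model being evaluated.

```python
import numpy as np

def build_success_matrix(records, interest_grids, predict_top1, n_slots=168, n_grids=4):
    """Count, per (time slot, grid), how often the model's top category
    matched the category actually checked in.
    records: (location, time_slot, category, grid) tuples from the
    verification set of one user.
    predict_top1(location, time_slot): highest-scoring category of the model."""
    m = np.zeros((n_slots, n_grids), dtype=int)
    for loc, t, c, g in records:
        # Only check-ins inside the user's interest grids are scored
        if g in interest_grids and predict_top1(loc, t) == c:
            m[t, g] += 1
    return m

records = [
    ((40.70, -74.00), 7, "cafe", 2),   # hit: grid 2 is an interest grid, prediction matches
    ((40.70, -74.00), 7, "gym", 2),    # miss: prediction is "cafe"
    ((40.80, -73.90), 9, "cafe", 3),   # skipped: grid 3 is not an interest grid
]
matrix = build_success_matrix(records, interest_grids={2}, predict_top1=lambda loc, t: "cafe")
```

The same routine serves for both the spatial and the time model; only the prediction callback changes.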

Optionally, the step S120 of constructing the user time activity preference model by using a non-negative tensor decomposition method specifically includes:

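three-dimensional tensor construction sub-step S121:

constructing a user-time-activity three-dimensional tensor from the user check-in records, expressed as $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$, where $I$, $J$, and $K$ are the numbers of users, time periods, and activity categories, and the element $x_{utc}$ of the tensor represents the number of check-ins in which user $u$ selects activity $c$ during time period $t$;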

user time activity preference acquisition substep S122:

obtaining the user-time-category preference values $\hat{\mathcal{X}}$ for the given tensor $\mathcal{X}$ using a non-negative tensor decomposition method, the value of each element of $\hat{\mathcal{X}}$ being calculated as follows:

$$\hat{x}_{utc} = \sum_{r=1}^{R} U_{ur}\, T_{tr}\, C_{cr}$$

in the formula, $U$, $T$, and $C$ represent the factor matrices of users, time, and categories, with sizes $I \times R$, $J \times R$, and $K \times R$ respectively, where $I$, $J$, and $K$ are the numbers of users, time periods, and categories; $R$ is the latent space dimension, which controls the number of features involved in the decomposition; and $U_{ur}$, $T_{tr}$, and $C_{cr}$ are elements of $U$, $T$, and $C$ respectively;

adding a non-negativity constraint to the least-squares-based decomposition algorithm of the CP decomposition model to obtain a recovered tensor, which is used to describe the time activity preference of the user;

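activity preference inference substep S123:

inferring the activity preference of the user at the current time based on the tensor decomposition result, by normalizing the recovered tensor $\hat{\mathcal{X}}$ along the activity dimension:

$$\tilde{x}_{utc} = \frac{\hat{x}_{utc}}{\sum_{c'} \hat{x}_{utc'}}$$

for a given user $u$ and time $t$, the sum of the preference measures over all categories is normalized to 1, and all elements of the recovered tensor take values in $[0, 1]$; the normalized element value $\tilde{x}_{utc}$ is treated as the probability that user $u$ visits category $c$ at time $t$, and the time activity preference of user $u$ at time $t$ is expressed as:

$$P^{T}_{u}(t) = (\tilde{x}_{ut1}, \tilde{x}_{ut2}, \dots, \tilde{x}_{utK})$$

where $K$ is the number of activity categories;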

time success rate matrix construction substep S124:

based on the time activity preference distribution, calculating the recommendation success rate of the time activity preference model on the grids, and further constructing a time success rate matrix $M^{T}_{u}$, in which each row represents a time range $t$ and each column represents a grid $g$; the recommendation success rate of the time activity preference model on the grids is calculated from the check-in records using the time activity preference calculation method of substep S123.

Optionally, the substep S124 of constructing the time success rate matrix specifically includes:

initializing the time success rate matrix $M^{T}_{u}$ by setting its elements to 0; in the verification data set, for any user $u$, taking the check-in records of $u$ out in sequence, each check-in record being denoted $r = (l, t, c, g)$, in which $l$ represents the latitude and longitude, $t$ represents the timestamp, $c$ represents the activity category, and $g$ represents the grid in which the user is currently located; calculating, using the time activity preference calculation method in substep S123, the time preference scores of $u$ for all categories at the current time $t$, and sorting the categories by score from large to small to obtain the category with the highest score, $\hat{c}$; when $g$ is in the interest grid set $G_u^{I}$ and the category $\hat{c}$ predicted by the time preference model is equal to $c$, increasing the corresponding element $M^{T}_{u}[t, g]$ by 1, where $t$ indexes the row and $g$ the column of $M^{T}_{u}$; and so on, calculating the time success rate matrices of all users.

Optionally, after the sub-step S131 of generating the recommendation list, there is further provided a precision verification substep S132:

evaluating the performance of the recommendation model on the test data set using the accuracy $\mathrm{Accuracy@}k$, based on the recommendation list; the accuracy is calculated as follows:

$$\mathrm{Accuracy@}k = \frac{1}{|D_{test}|} \sum_{(u,t,l,c) \in D_{test}} \left| \{c\} \cap \mathrm{Top}_k(u,t,l) \right|$$

in the formula, $k$ indicates the length of the recommendation list, $(u, t, l, c)$ indicates a check-in record of a user in the test data set, $\mathrm{Top}_k(u,t,l)$ represents the $k$ highest-scoring (Top-k) activities for user $u$ at time $t$ and location $l$, and $|D_{test}|$ represents the number of check-in records in the test data set.
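The accuracy measure can be sketched as a hit rate over the test check-ins. This is a minimal illustration, assuming each test record is a (user, time slot, location, category) tuple; `accuracy_at_k` and the `recommend` callback are hypothetical names, and the toy recommender simply returns a fixed ranking.

```python
def accuracy_at_k(test_records, recommend, k):
    """Fraction of test check-ins whose true category appears in the top-k list.
    test_records: list of (user, time_slot, location, category) tuples.
    recommend(user, time_slot, location): categories ranked best-first."""
    if not test_records:
        return 0.0
    hits = sum(1 for u, t, l, c in test_records if c in recommend(u, t, l)[:k])
    return hits / len(test_records)

def fixed_ranking(user, time_slot, location):
    # Toy recommender used only for this example
    return ["cafe", "gym", "park"]

test_records = [
    ("u1", 7, (40.70, -74.00), "cafe"),
    ("u1", 8, (40.70, -74.00), "gym"),
    ("u1", 9, (40.70, -74.00), "park"),
    ("u1", 10, (40.70, -74.00), "bar"),
]
acc_at_1 = accuracy_at_k(test_records, fixed_ranking, k=1)
acc_at_2 = accuracy_at_k(test_records, fixed_ranking, k=2)
```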

Optionally, the method divides a user check-in data set, sorts the check-in history of each user according to check-in time, and divides the check-in history of each user into a training data set, a verification data set, and a test data set according to a certain proportion, wherein steps S111, S112, S113, S121, S122, and S123 use the training data set to construct a model, steps S114 and S124 use the verification data set to calculate a success rate, and steps S131 and S132 use the test data set to perform model verification.

Optionally, in the substep S121 of constructing the three-dimensional tensor, three dimensions of the three-dimensional tensor are a user dimension, a time dimension, and an activity dimension, respectively, where the user dimension represents each user as an independent dimension, the activity dimension is represented by a POI category, and the time dimension is represented by a time period divided according to a certain time interval.

Optionally, in the sub-step S122 of obtaining the time activity preference of the user, the tensor decomposition parameters include a latent space dimension, and the size of the latent space dimension affects the tensor decomposition time and the recommendation precision.

The invention further discloses a storage medium for storing computer executable instructions, which is characterized in that:

the computer-executable instructions, when executed by a processor, perform the above-described geographic grid-based point of interest activity recommendation method.

The invention considers the spatial and temporal characteristics of user check-in activities in location-based social networks separately, which reduces the complexity of the problem. Based on the geographic grid and the tensor decomposition method, it alleviates the sparsity of user check-in data; by calculating the user's spatial and temporal activity distributions, it determines the interest activities recommended to the user from the predicted probabilities of the temporal and spatial distributions, analyzes the user's regions of interest quantitatively, improves the accuracy of interest activity recommendation in location-based social networks, and makes the recommendation results meet the personalized requirements of the user.

Drawings

FIG. 1 is a flowchart of a method for recommending interest activities based on a geographic grid according to an embodiment of the present invention;

FIG. 2 is a detailed flowchart of the method for recommending interest activities based on a geographic grid according to an embodiment of the present invention;

FIG. 3 is a diagram illustrating the recommendation precision for different latent dimensions in the NCP (Nonnegative CANDECOMP/PARAFAC) model, in accordance with a specific embodiment of the present invention;

FIG. 4 is an example of a user time activity preference calculation based on the NCP decomposition model in accordance with a specific embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

The invention is characterized in that the user's check-in activity is modeled from the user's check-in history in a location-based social network, and a personalized urban interest activity recommendation method is provided. Based on the geographic grid, the method combines the user's geographic preference and time preference for urban areas, analyzes the spatio-temporal behavior of the user's check-ins in the city, and recommends more suitable interest activities to the target user.

Referring to fig. 1, a flowchart of a method for recommending interest activities based on a geographic grid is shown, and fig. 2, a specific flowchart of the recommendation method is shown.

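Step S110 of constructing a user spatial activity preference model based on the geographic grid: the city area is divided into a plurality of geographic grids, the user check-in information is mapped to the grids, the check-in frequency and category preference deviation ratio of the user in each grid are calculated, and the interest grid set of the user is acquired. The spatial activity preference of the user is then calculated, and the spatial activity preference distribution is used to calculate the recommendation success rate of the spatial activity preference model on the grids and to construct a spatial success rate matrix.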

Specifically, the step S110 may include the following sub-steps:

user sign-in information mapping substep S111: the city area is divided into a plurality of regular grids with the same size, the sign-in information of the user is mapped to the grids, and the number attribute of the grids is obtained.

In particular, the maximum and minimum longitude and latitude values of the study area are obtained, the study area is divided at a fixed interval in meters into a plurality of regular grids of the same size, and all grids are numbered, with numbering starting from 0. For any check-in record of a user, the check-in position is matched to the grids to determine the grid in which the check-in point is located and its number $g$.
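The grid numbering can be sketched as follows. This is a minimal illustration that uses the bounding box of the Foursquare experiment reported in this document; the 500 m cell size, the row-major numbering, and the equirectangular meters-per-degree approximation are all assumptions of this sketch, and `grid_number` is a hypothetical helper name.

```python
import math

# Bounding box of the study area (values from the Foursquare experiment in this document)
LAT_MIN, LON_MIN = 40.54085247, -74.28476645
LAT_MAX, LON_MAX = 40.99833172, -73.6738252
CELL_M = 500.0  # hypothetical cell size in meters; not a value taken from the source

# Rough equirectangular conversion between degrees and meters (an approximation)
M_PER_DEG_LAT = 111_320.0
M_PER_DEG_LON = M_PER_DEG_LAT * math.cos(math.radians((LAT_MIN + LAT_MAX) / 2))

N_ROWS = math.ceil((LAT_MAX - LAT_MIN) * M_PER_DEG_LAT / CELL_M)
N_COLS = math.ceil((LON_MAX - LON_MIN) * M_PER_DEG_LON / CELL_M)

def grid_number(lat, lon):
    """Return the row-major grid number (starting at 0), or None outside the area."""
    if not (LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX):
        return None
    # Clamp so that points exactly on the upper boundary fall in the last row/column
    row = min(int((lat - LAT_MIN) * M_PER_DEG_LAT / CELL_M), N_ROWS - 1)
    col = min(int((lon - LON_MIN) * M_PER_DEG_LON / CELL_M), N_COLS - 1)
    return row * N_COLS + col
```

For a city-sized area the approximation error is small relative to the cell size; a projected coordinate system would be the more careful choice.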

The experimental data are Foursquare data, ranging spatially from (40.54085247, -74.28476645) to (40.99833172, -73.6738252). A check-in record is represented as a tuple consisting of the user number, the POI number, the latitude, the longitude, the check-in time, and the POI category. After the grid number attribute is added, the check-in record additionally carries the grid number $g$.

Specifically, the Foursquare data record user check-ins in New York from April 2012 to February 2013. To alleviate the influence of data sparsity, users with fewer than 10 check-ins and points of interest visited fewer than 10 times are removed. After preprocessing, the Foursquare data set contains 1083 users, 38333 points of interest, and 227428 check-ins in total. The points of interest are divided into 9 major categories and 215 subcategories, and the invention uses the subcategories as the activity categories.

The invention can divide the user check-in data set: the check-in history of each user is sorted by check-in time and divided into a training data set, a verification data set, and a test data set in a certain proportion, such as 8:1:1.
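The per-user chronological split can be sketched as follows. This is a minimal illustration, assuming each check-in is a dict with a sortable `time` key; the record layout and the name `chronological_split` are assumptions of this sketch.

```python
def chronological_split(checkins, ratios=(8, 1, 1)):
    """Sort one user's check-ins by time and cut them into
    training, verification, and test parts in the given ratio."""
    ordered = sorted(checkins, key=lambda r: r["time"])
    total = sum(ratios)
    n = len(ordered)
    cut1 = n * ratios[0] // total
    cut2 = n * (ratios[0] + ratios[1]) // total
    return ordered[:cut1], ordered[cut1:cut2], ordered[cut2:]

# 20 toy check-ins supplied in reverse time order
history = [{"time": t, "category": "cafe"} for t in range(20, 0, -1)]
train, valid, test = chronological_split(history)
```

Splitting by time rather than at random keeps the verification and test check-ins strictly later than the training ones, which matches how a recommender is used in practice.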

The user interest grid acquisition sub-step S112:

The check-in frequency $f_{u,g}$ and the category preference deviation ratio $\rho_{u,g}$ of the user in each grid are calculated; $f_{u,g}$ and $\rho_{u,g}$ characterize the user's preference on the grid, and screening with a frequency threshold $\theta_f$ and a preference deviation ratio threshold $\theta_\rho$ yields the interest grid set of the user.

First, the check-in frequency of a user in a grid is a direct indicator of the grid's popularity: the more often a region is checked into, the more attractive its points of interest are to the user. Meanwhile, because a user usually visits only a few categories, rather than all categories, in a frequently visited area, the user's check-ins in a grid are skewed toward particular categories.

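Thus, the present invention evaluates the user's preference in a grid mainly through the check-in frequency $f_{u,g}$ and the category preference deviation ratio $\rho_{u,g}$.

The check-in frequency $f_{u,g}$ is the number of check-ins of user $u$ in grid $g$ as a proportion of the user's total number of check-ins:

$$f_{u,g} = \frac{n_{u,g}}{\sum_{g' \in G_u} n_{u,g'}}$$

in the formula, $n_{u,g}$ represents the number of check-ins of user $u$ in grid $g$, and $G_u$ represents the set of grids visited by the user.

The preference deviation ratio $\rho_{u,g}$ measures the category preference of user $u$ in grid $g$. Assuming that grid $g$ contains POIs of $N_g$ categories in total, of which the user has visited POIs of $|C_{u,g}|$ categories, $\rho_{u,g}$ measures the fractional difference between the entropy $H_{u,g}$ of the check-in category distribution of $u$ in $g$ and the maximum entropy $H_{\max}$ of the category distribution. The calculation formula is as follows:

$$\rho_{u,g} = \frac{H_{\max} - H_{u,g}}{H_{\max}}, \qquad H_{u,g} = -\sum_{c \in C_{u,g}} p_{u,g,c} \log p_{u,g,c}, \qquad H_{\max} = \log N_g$$

in the formula, $N_g$ represents the total number of categories of POIs present in grid $g$; $p_{u,g,c}$ indicates the number of check-ins of the user in category $c$ as a proportion of the user's total number of check-ins in grid $g$; $H_{\max}$ represents the maximum entropy of the user over check-in categories, which assumes that the user is equally likely to check in at all $N_g$ categories; and $C_{u,g}$ represents the set of categories in which user $u$ has checked in within grid $g$.

After the check-in frequency and the preference deviation ratio of each grid are calculated, the frequency threshold $\theta_f$ and the preference deviation ratio threshold $\theta_\rho$ are introduced to obtain the set of grids of interest to the user, $G_u^{I}$; the area formed by the grids in $G_u^{I}$ is the user's region of interest.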

The specific method for acquiring the user's region of interest is as follows. First, the numbers of check-ins of the user in all areas are counted to obtain the set $G_u$ of grids in which the user has checked in. The grids $g$ visited by the user are then scanned in sequence and the check-in frequency $f_{u,g}$ is calculated. If the grid is one frequently visited by the user (i.e., $f_{u,g}$ is greater than or equal to the threshold $\theta_f$), the preference deviation ratio $\rho_{u,g}$ is calculated; if $\rho_{u,g}$ is greater than or equal to $\theta_\rho$, $g$ is added to the interest grid set $G_u^{I}$ as an interest grid of the user. Traversing all check-in grids of the user in sequence yields the user's interest grid set $G_u^{I}$.

$f_{u,g}$ determines the activity level of the user in the grid, and $\rho_{u,g}$ represents the degree of deviation of the user's activity preference in the grid. The larger the values of $\theta_f$ and $\theta_\rho$, the smaller the set $G_u^{I}$.
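The two-stage screening can be sketched as follows. This is a minimal illustration for one user, assuming the preference deviation ratio is the entropy-based quantity (maximum entropy minus observed entropy, divided by maximum entropy) described in the text; the function name `interest_grids`, the input layout, and the decision to skip grids that contain only one POI category (where the maximum entropy is zero) are assumptions of this sketch.

```python
import math
from collections import Counter, defaultdict

def interest_grids(checkins, n_categories_in_grid, theta_f, theta_rho):
    """Select the grids that pass both thresholds.
    checkins: list of (grid, category) pairs for one user.
    n_categories_in_grid: dict grid -> number of POI categories present there."""
    per_grid = defaultdict(Counter)
    for g, c in checkins:
        per_grid[g][c] += 1
    total = len(checkins)
    selected = set()
    for g, counts in per_grid.items():
        n_g = sum(counts.values())
        if n_g / total < theta_f:            # check-in frequency filter
            continue
        h_max = math.log(n_categories_in_grid[g])
        if h_max == 0:                       # single-category grid: ratio undefined, skipped here
            continue
        h = -sum((k / n_g) * math.log(k / n_g) for k in counts.values())
        if (h_max - h) / h_max >= theta_rho:  # preference deviation filter
            selected.add(g)
    return selected

checkins = [(0, "cafe")] * 8 + [(1, "cafe"), (1, "gym"), (2, "park")]
chosen = interest_grids(checkins, {0: 4, 1: 2, 2: 3}, theta_f=0.15, theta_rho=0.5)
```

In the toy data, grid 0 is both frequent and concentrated on one category, grid 1 is frequent enough but evenly spread over its two categories, and grid 2 is too rarely visited.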

Spatial activity preference calculation substep S113:

Based on the interest grid set $G_u^{I}$, the spatial activity preference of the user at the current location is inferred. Given the known current location $l$ of the user, the influence of each grid in the set is first calculated, i.e., the influence of grid $g_i$ on $l$ is evaluated using spatial proximity; the activity preferences over all grids are then combined by a weighting method.

In particular, given the known current location $l$ of the user, the influence of grid $g_i$ on $l$ is evaluated using spatial proximity. Following conclusions from existing research, a weight function $w(l, g_i)$ of the distance $d(l, g_i)$ is adopted, in which $d(l, g_i)$ indicates the distance between the current location $l$ of the user and the center point of grid $g_i$. The farther away the grid is, the smaller the value of $w(l, g_i)$, indicating that the grid has less spatial appeal to the user.

Then, the activity preference of the user at $l$ for all categories is calculated by a weighting method. Assuming that user $u$ has $m$ interest grids $g_1, \dots, g_m$, the spatial activity preference of user $u$ at location $l$ is:

$$P^{S}_{u}(c \mid l) = \sum_{i=1}^{m} w(l, g_i)\, f_{u,g_i,c}, \qquad c \in \mathcal{C}$$

in the formula, $f_{u,g_i,c}$ represents the check-in frequency of user $u$ for category $c$ within grid $g_i$, and $\mathcal{C}$ denotes all activity categories. The spatial activity preference of user $u$ at $l$ for category $c$ is the sum of the user's preferences for $c$ in the interest grids, weighted by their geographic influence.
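The distance-weighted aggregation can be sketched as follows. The weight function used in the source is not recoverable from this text, so the exponential decay in `weight` is purely a stand-in assumption; the haversine distance, the 1 km scale, and the input layout are likewise choices made only for this sketch.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def weight(d_km, scale_km=1.0):
    """Hypothetical distance-decay weight (NOT the source's function):
    decreases monotonically as the grid gets farther away."""
    return math.exp(-d_km / scale_km)

def spatial_preference(location, grids):
    """Weighted sum of per-grid category frequencies.
    grids: list of (grid_center, {category: check-in frequency}) for the
    user's interest grids. Returns {category: preference score}."""
    scores = {}
    for center, freqs in grids:
        w = weight(haversine_km(location, center))
        for c, f in freqs.items():
            scores[c] = scores.get(c, 0.0) + w * f
    return scores

here = (40.75, -73.99)
grids = [
    ((40.75, -73.99), {"cafe": 0.6, "gym": 0.4}),   # grid centered on the user
    ((40.80, -73.95), {"park": 1.0}),               # grid several kilometers away
]
scores = spatial_preference(here, grids)
best = max(scores, key=scores.get)
```

Any monotonically decreasing weight could be substituted without changing the structure of the aggregation.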

The spatial success rate matrix construction sub-step S114:

Based on the spatial activity preference distribution, the recommendation success rate of the spatial activity preference model on the grids is calculated using the verification data set, and a spatial success rate matrix $M^{S}_{u}$ is constructed for each user, in which each row represents a time range $t$ and each column represents a grid $g$. The recommendation success rate is calculated from the check-in records in the verification set using the spatial activity preference calculation method of substep S113.

In particular, $M^{S}_{u}$ denotes the spatial success rate matrix of user $u$; each row of the matrix represents a timestamp $t$ and each column represents a grid $g$. Considering the periodicity of user check-ins within a day, on working days, and on non-working days, time is divided at 1-hour intervals into 24 periods per day and 168 intervals per week, and each interval represents one timestamp $t$. For a given time, $t$ is calculated from the day of the week $d$, with the values 1 to 7 indicating Monday through Sunday, and the hour $h$. For example, for the time 2012-04-23 07:10:18, $d$ is 1 and $h$ is 7.
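The mapping from a check-in time to one of the 168 weekly intervals can be sketched as follows. The exact formula is not recoverable from this text; the layout `24 * (d - 1) + h`, which numbers the intervals from 0 (Monday, hour 0) to 167 (Sunday, hour 23), is an assumption of this sketch, as is the name `time_slot`.

```python
from datetime import datetime

def time_slot(ts):
    """Map a timestamp to one of 168 weekly hour slots.
    Assumed layout: slot = 24 * (d - 1) + h, with d = 1 (Monday) .. 7 (Sunday)."""
    d = ts.isoweekday()   # Monday == 1 ... Sunday == 7
    return 24 * (d - 1) + ts.hour

example = datetime(2012, 4, 23, 7, 10, 18)   # the example time from the text (a Monday)
slot = time_slot(example)
```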

First, the matrix $M^{S}_{u}$ is initialized with all elements set to 0. In the verification data set, for any user $u$, the check-in records of $u$ are taken out in sequence. For convenience of expression, a check-in record is denoted $r = (l, t, c, g)$ (where $l$ represents the latitude and longitude; $t$ represents the timestamp, calculated from the check-in time as described above; $c$ represents the activity category; and $g$ indicates the grid in which the user is currently located). Based on the spatial activity preference calculation method in substep S113, the spatial preference scores of $u$ for all categories at the current location $l$ are calculated, and the highest-scoring category $\hat{c}$ is obtained. When $g$ is in the interest grid set $G_u^{I}$ and $\hat{c}$ is equal to $c$, the element $M^{S}_{u}[t, g]$ corresponding to $r$ is increased by 1, where $t$ indexes the row and $g$ the column of $M^{S}_{u}$.

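Step S120 of constructing a user time activity preference model by using a non-negative tensor decomposition method: a user-time-category three-dimensional tensor is constructed from the user check-in records, in which each element represents the number of check-ins in which a user selects an activity during a time period; based on a non-negative tensor decomposition algorithm, a recovered tensor describing the time activity preference of the user is obtained from the three-dimensional tensor; the activity preference distribution of the user at the current time is inferred from the tensor decomposition result; and, based on the time activity preference distribution, the recommendation success rate of the time activity preference model on the grids is calculated and a time success rate matrix is constructed.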

Specifically, the method comprises the following substeps:

three-dimensional tensor construction sub-step S121:

In the time dimension, considering the periodicity of user check-ins within a day, on working days, and on non-working days, time is divided at 1-hour intervals into 24 periods per day and 168 intervals per week. In the activity dimension, the 215 POI subcategories are used to represent user activity. In the present embodiment there are 1083 active users and 215 categories in total. Thus, the tensor is represented as $\mathcal{X} \in \mathbb{R}^{1083 \times 168 \times 215}$. The element $x_{utc}$ of the tensor represents the number of check-ins in which user $u$ selects activity $c$ during time period $t$; if the user has not visited activity $c$ during time period $t$, $x_{utc}$ is 0. The purpose of tensor decomposition is to assign predicted values to the zero-valued entries of $\mathcal{X}$ through the decomposition algorithm.

Therefore, in the sub-step, three dimensions of the three-dimensional tensor are a user dimension, a time dimension and an activity dimension respectively, the user dimension represents each user as one dimension, the activity dimension is represented by a POI category, and the time dimension is represented by a time period divided according to a certain time interval.
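The tensor construction can be sketched as follows. This is a minimal illustration, assuming users, weekly time slots, and categories have already been mapped to integer indices; `build_tensor` is a hypothetical helper name, and a dense array is used only because the toy sizes are small.

```python
import numpy as np

def build_tensor(checkins, n_users, n_slots, n_categories):
    """Build a dense check-in count tensor of shape (n_users, n_slots, n_categories).
    checkins: iterable of (user_idx, slot_idx, category_idx) integer triples."""
    x = np.zeros((n_users, n_slots, n_categories), dtype=np.float64)
    idx = np.asarray(list(checkins), dtype=int)
    if idx.size:
        # np.add.at accumulates repeated (u, t, c) triples instead of overwriting them
        np.add.at(x, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return x

toy = [(0, 7, 2), (0, 7, 2), (1, 30, 0)]
x = build_tensor(toy, n_users=2, n_slots=168, n_categories=3)
```

At the scale of the embodiment (1083 users, 168 slots, 215 categories) a dense float64 tensor still fits in memory, though most of its entries are zero.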

User time activity preference acquisition substep S122:

Since the check-in probability of a user cannot be negative, negative values in the recovered tensor are meaningless for the user's preference; the constructed tensor is therefore decomposed into a sum of rank-one tensors using a non-negative CP decomposition model. The value of each element of the recovered tensor $\hat{\mathcal{X}}$ is calculated as follows:

$$\hat{x}_{utc} = \sum_{r=1}^{R} U_{ur}\, T_{tr}\, C_{cr}$$

in the formula, $U$, $T$, and $C$ are the factor matrices of users, time, and categories, $R$ is the latent space dimension, and $U_{ur}$, $T_{tr}$, and $C_{cr}$ are elements of $U$, $T$, and $C$ respectively.

Adding a non-negativity constraint to the least-squares-based decomposition algorithm of the CP decomposition model yields a recovered tensor describing the time activity preference of the user.

Preferably, CP decomposition (CANDECOMP/PARAFAC) decomposes the tensor into three factor matrices (i.e., the user, time, and category factor matrices) and uses the alternating least squares method to minimize the loss function between the recovered tensor $\hat{\mathcal{X}}$ and the original tensor $\mathcal{X}$.

The tensor decomposition parameters include the latent space dimension, which can significantly affect recommendation performance, particularly the tensor decomposition time and the recommendation accuracy. The present embodiment also examines the impact of the latent space dimension on recommendation accuracy. FIG. 3 shows how recommendation performance changes as the latent space dimension increases from 8 to 128: the larger the dimension, the better the recommendation performance, but the rate of improvement gradually slows.

In one example, following the model construction described above, the recovered tensor can be obtained by performing the non-negative tensor decomposition with the TensorLy open-source package for Python. TensorLy is an open-source package for tensor decomposition, tensor learning, and tensor algebra, and its non_negative_parafac function implements NCP decomposition.
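For readers without TensorLy at hand, a non-negative CP decomposition can also be sketched directly in NumPy. Note the swapped technique: the sketch below uses multiplicative updates rather than the non-negative alternating least squares described in the text, and it is not TensorLy's implementation; the names `khatri_rao`, `ncp`, and `reconstruct`, the iteration count, and the toy tensor sizes are all assumptions of this illustration.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product; rows are indexed by (row of a, row of b)."""
    return (a[:, None, :] * b[None, :, :]).reshape(-1, a.shape[1])

def ncp(x, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP decomposition of a 3-way tensor via multiplicative updates."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # C-order unfolding along `mode` matches khatri_rao(others[0], others[1])
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            kr = khatri_rao(others[0], others[1])
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # Multiplicative step keeps every entry non-negative
            factors[mode] *= (unfolded @ kr) / (factors[mode] @ gram + eps)
    return factors

def reconstruct(factors):
    """Recovered tensor: sum over r of U[u, r] * T[t, r] * C[c, r]."""
    u, t, c = factors
    return np.einsum("ur,tr,cr->utc", u, t, c)

# Toy non-negative tensor with exact rank 2
rng = np.random.default_rng(1)
truth = [rng.random((4, 2)), rng.random((5, 2)), rng.random((3, 2))]
x = reconstruct(truth)

err_start = np.linalg.norm(x - reconstruct(ncp(x, rank=2, n_iter=1)))
factors = ncp(x, rank=2, n_iter=300)
err_end = np.linalg.norm(x - reconstruct(factors))
```

The `rank` argument plays the role of the latent space dimension: larger values fit the check-in tensor more closely at the cost of longer decomposition time.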

Activity preference inference substep S123:

The activity preference of the user at the current time is inferred from the tensor decomposition result. To infer the category preference of user $u$ at time $t$, i.e. the likelihood that user $u$ visits category $c$ at time $t$, the recovered tensor $\hat{\mathcal{X}}$ is normalized along the activity dimension:

$\tilde{x}_{u,t,c} = \frac{\hat{x}_{u,t,c}}{\sum_{c'} \hat{x}_{u,t,c'}}$

Thus, for a given user $u$ and time $t$, the category preference values sum to 1, and every element of the normalized recovered tensor lies in $[0, 1]$. Normalization allows the temporal and spatial preferences to be fused. The normalized element value $\tilde{x}_{u,t,c}$ is treated as the probability that user $u$ visits category $c$ at time $t$, and the time activity preference of user $u$ at time $t$ is expressed as:

$P^{T}_{u,t} = (\tilde{x}_{u,t,c_1}, \tilde{x}_{u,t,c_2}, \dots, \tilde{x}_{u,t,c_{N_c}})$

where $c_1, \dots, c_{N_c}$ are the activity categories. An example of a user time activity preference calculation based on the NCP decomposition model is shown in FIG. 4, which shows the probability that a user $u$ visits activity $c$ within any time period $t$.
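The normalization along the activity dimension can be sketched in a few lines of NumPy. The function name `normalize_over_categories` is illustrative, and the fallback to a uniform distribution when a user has no mass at a given time is an assumption of this sketch that the text does not specify.

```python
import numpy as np

def normalize_over_categories(x_hat, eps=1e-12):
    """Scale each (user, time) slice so that its category scores sum to 1."""
    x_hat = np.clip(np.asarray(x_hat, dtype=float), 0.0, None)
    totals = x_hat.sum(axis=2, keepdims=True)
    # Assumed fallback: a uniform distribution when all scores are zero.
    uniform = np.full_like(x_hat, 1.0 / x_hat.shape[2])
    return np.where(totals > eps, x_hat / np.maximum(totals, eps), uniform)

# One user, two time periods, two categories.
toy_scores = [[[1.0, 3.0], [0.0, 0.0]]]
toy_preference = normalize_over_categories(toy_scores)
```

Here `toy_preference[0, 0]` is the time activity preference of the single user in the first time period, a probability vector over the two categories.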

Time success rate matrix construction substep S124:

Based on the time activity preference distribution, the recommendation success rate of the time activity preference model on each grid is calculated using the validation data set, and a time success rate matrix $M^{T}_{u}$ is constructed, where $M^{T}_{u}$ denotes the time success rate matrix of user $u$. As with the spatial success rate matrix $M^{S}_{u}$ described in S114, each row of the matrix represents a time range $t$ and each column represents a grid $g$. The success rate is calculated from the check-in records in the validation set using the time activity preference calculation method of substep S123.

First, the matrix $M^{T}_{u}$ is initialized by setting all of its elements to 0. In the validation data set, for any user $u$, the check-in records of $u$ are taken out in sequence, each represented as $(l, t, c, g)$ (where $l$ denotes the latitude and longitude, $t$ denotes the timestamp, $c$ denotes the activity category, and $g$ denotes the grid in which the user is currently located). The time activity preference of $u$ at the current time $t$ is calculated using the method of substep S123, giving the predicted category $c^{*}$. When $g$ is in the interest grid set $IG_u$, if the category $c^{*}$ predicted by the time preference model is equal to $c$, the corresponding element $M^{T}_{u}[t, g]$ is increased by 1, where $t$ denotes the $t$-th row of $M^{T}_{u}$ and $g$ denotes the $g$-th column. The time success rate matrices of all users are calculated by analogy.
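The counting procedure for one user's time success rate matrix can be sketched as follows. This is a minimal illustration, not the patented implementation: the matrix is stored as a dictionary keyed by (time slot, grid) so that missing entries count as 0, the latitude and longitude are omitted from the records because only the grid is used here, and `predict_top_category` is a hypothetical callable standing in for the highest-scoring category of the time preference model.

```python
from collections import defaultdict

def build_success_matrix(validation_records, interest_grids, predict_top_category):
    """Count correct top-1 predictions per (time slot, grid) for one user.

    validation_records: iterable of (time_slot, grid_id, category) tuples.
    interest_grids: set of grid ids in the user's interest grid set.
    predict_top_category: hypothetical callable, time_slot -> predicted category.
    """
    success = defaultdict(int)  # entries that are never incremented read as 0
    for time_slot, grid_id, category in validation_records:
        if grid_id not in interest_grids:
            continue  # only grids in the interest grid set are scored
        if predict_top_category(time_slot) == category:
            success[(time_slot, grid_id)] += 1
    return success

toy_records = [(0, "g1", "food"), (0, "g1", "shop"), (1, "g2", "food"), (0, "g9", "food")]
toy_success = build_success_matrix(toy_records, {"g1", "g2"}, lambda time_slot: "food")
```

The same loop, with a spatial predictor in place of the temporal one, would yield the spatial success rate matrix of S114.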

A temporal preference and spatial preference fusion step S130:

Given a user's spatial and temporal context, a fusion method is needed to combine the temporal and spatial activity preferences. Linear weighting and multiplication are common fusion methods. However, because the performance of the spatial and temporal models varies with time and place, it is difficult to assign the two weights dynamically according to the user context.

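Specifically, the user check-in records $(l, t, c, g)$ in the test data set are taken out in sequence and the following judgment is made for each. For a given user $u$ and its context (i.e. time $t$ and position $l$, which lies in grid $g$), the corresponding elements $M^{S}_{u}[t, g]$ and $M^{T}_{u}[t, g]$ of the spatial and time success rate matrices are compared, and the preference of the model with the higher value is taken as the final preference. If the two are equal, the result predicted by the spatial activity preference model is adopted.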
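The selection rule above can be sketched as a small function. The sketch assumes that each success rate matrix is a dictionary keyed by (time slot, grid) and that each model's preference is a dictionary mapping category to score; these data structures and the names `fuse_preferences` and `top_k` are illustrative choices, not taken from the source.

```python
def fuse_preferences(spatial_scores, temporal_scores,
                     spatial_success, temporal_success, time_slot, grid_id):
    """Return the scores of whichever model succeeded more often in this context."""
    key = (time_slot, grid_id)
    if temporal_success.get(key, 0) > spatial_success.get(key, 0):
        return temporal_scores
    return spatial_scores  # ties go to the spatial model, as the text specifies

def top_k(scores, k):
    """Categories sorted by descending score, truncated to k items."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [category for category, _ in ranked[:k]]

spatial_scores = {"food": 0.6, "shop": 0.3, "park": 0.1}
temporal_scores = {"food": 0.2, "shop": 0.1, "park": 0.7}
spatial_success = {(0, "g1"): 3}
temporal_success = {(0, "g1"): 5}

# The temporal model has the better record at (0, "g1"), so its ranking is used.
list_a = top_k(fuse_preferences(spatial_scores, temporal_scores,
                                spatial_success, temporal_success, 0, "g1"), 2)
# No record for (1, "g1") in either matrix: a tie, so the spatial model is used.
list_b = top_k(fuse_preferences(spatial_scores, temporal_scores,
                                spatial_success, temporal_success, 1, "g1"), 1)
```

Selecting one model per context avoids having to tune fusion weights, which is the difficulty with linear weighting noted above.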

Experimental results show that the spatial activity preference model can better capture activity preference of the user.

Furthermore, the accuracy of the geographic-grid-based interest activity recommendation method is verified through experiments.

Precision verification substep S132:

The accuracy is calculated as follows:

$Precision@k = \frac{\left|\{(u, l, t, c) \in D_{test} : c \in Top_k(u, t, l)\}\right|}{|D_{test}|}$

In the formula, $k$ denotes the length of the recommendation list, $(u, l, t, c)$ denotes a user check-in record in the test set $D_{test}$, $Top_k(u, t, l)$ denotes the $k$ highest-scoring (Top-k) activities for user $u$ at time $t$ and location $l$, and $|D_{test}|$ denotes the number of check-in records in the test set.
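A minimal sketch of this accuracy measure, assuming it is the fraction of test check-ins whose true activity appears in the Top-k list, is given below. The function name `precision_at_k` and the `recommend` callable (which returns a ranked list of activities for a user, time and location) are hypothetical.

```python
def precision_at_k(test_records, recommend, k):
    """Fraction of test check-ins whose true category is in the top-k list.

    test_records: list of (user, location, time_slot, category) tuples.
    recommend: hypothetical callable (user, time_slot, location) -> ranked list.
    """
    if not test_records:
        return 0.0
    hits = 0
    for user, location, time_slot, category in test_records:
        if category in recommend(user, time_slot, location)[:k]:
            hits += 1
    return hits / len(test_records)

def toy_recommend(user, time_slot, location):
    # Fixed ranking, used only to exercise the metric.
    return ["food", "shop", "park"]

toy_test = [("u1", "l1", 0, "food"), ("u1", "l2", 1, "shop"),
            ("u2", "l1", 0, "park"), ("u2", "l3", 2, "bar")]
p1 = precision_at_k(toy_test, toy_recommend, 1)
p2 = precision_at_k(toy_test, toy_recommend, 2)
p3 = precision_at_k(toy_test, toy_recommend, 3)
```

By construction the value cannot decrease as k grows, since a longer list can only contain more of the true activities.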

This experiment also considers the effect of the number of recommendations on accuracy. Table 1 shows the recommendation accuracy of several recommendation methods when the number of recommendations is 1, 5 and 10. The compared methods are MFT (the most frequently visited activity within a time period), CP (CP decomposition), NCP (non-negative CP decomposition), MFA (the activity most frequently visited by a user), SPM (spatial preference model) and STUAP (the geographic-grid-based interest activity recommendation method). The interest activity recommendation method of the present invention has the highest accuracy for the same number of recommendations.

Table 1. Recommendation performance comparison experiment

The experimental data set comprises a training set, a validation set and a test set. Steps S111, S112, S113, S121, S122 and S123 use the training data set to construct the models, steps S114 and S124 use the validation data set to calculate the success rates, and steps S131 and S132 use the test data set to verify the model.
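A per-user chronological split of this kind can be sketched as follows, assuming each user's check-in history is sorted by check-in time before being divided. The default ratios are placeholders chosen for illustration, since the split proportion is not stated, and the function name `chronological_split` is hypothetical.

```python
def chronological_split(checkins, train_ratio=0.7, valid_ratio=0.15):
    """Split one user's check-ins into train/validation/test sets by time.

    checkins: list of tuples whose first field is the check-in timestamp.
    train_ratio, valid_ratio: illustrative proportions, not taken from the source.
    """
    ordered = sorted(checkins, key=lambda record: record[0])
    n_train = int(len(ordered) * train_ratio)
    n_valid = int(len(ordered) * valid_ratio)
    train = ordered[:n_train]
    valid = ordered[n_train:n_train + n_valid]
    test = ordered[n_train + n_valid:]
    return train, valid, test

# Eight check-ins given in reverse time order; the split restores chronological order.
toy_history = [(ts, "category") for ts in range(7, -1, -1)]
train, valid, test = chronological_split(toy_history, train_ratio=0.5, valid_ratio=0.25)
```

Splitting by time rather than at random keeps later check-ins out of the training data, so the validation and test sets reflect behavior the models have not seen.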

The present invention further discloses a storage medium for storing computer-executable instructions which, when executed by a processor, perform the above-mentioned method for recommending point of interest activities based on a geographic grid.

In conclusion, the present invention considers the spatial and temporal characteristics of user check-in activities in location-based social networks separately, which reduces the complexity of the problem. The method alleviates the sparsity of user check-in data by means of the geographic grid and tensor decomposition, calculates the spatial and temporal activity distributions of the user, and determines the interest activities recommended to the user according to the prediction probabilities of the temporal and spatial distributions. It quantitatively analyzes the interest areas of the user, improves the accuracy of interest activity recommendation in location-based social networks, and makes the recommendation results meet the personalized requirements of the user.

It will be apparent to those skilled in the art that the elements or steps of the invention described above may be implemented on a general-purpose computing device. They may be centralized on a single computing device; alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a memory device and executed by the computing device; or they may be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

While the invention has been described in further detail with reference to specific preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. An interest activity recommendation method based on a geographic grid comprises the following steps:

Calculating the spatial activity preference distribution of the user, calculating the recommendation success rate of a spatial activity preference model on the grid by using the spatial activity preference distribution, and constructing a spatial success rate matrix;

constructing a three-dimensional user-time-category tensor from the user check-in records, the elements of which represent the user selecting an activity $c$ within a time period $t$;

a temporal preference and spatial preference fusion step S130:

2. The interest activity recommendation method of claim 1, wherein:

the step S110 of constructing a user spatial activity preference model based on a geographic grid includes the following sub-steps:

the user interest mesh acquisition sub-step S112:

calculating the check-in frequency $f_{u,g}$ and the category preference deviation ratio $r_{u,g}$ of the user in each grid, and screening with the check-in frequency $f_{u,g}$, the category preference deviation ratio $r_{u,g}$ and a preference deviation ratio threshold $\theta$ to obtain the interest grid set of the user,

wherein the check-in frequency $f_{u,g}$ is expressed as the check-in frequency of user $u$ in grid $g$:

$f_{u,g} = \frac{n_{u,g}}{\sum_{g' \in G_u} n_{u,g'}}$

in the formula, $n_{u,g}$ denotes the number of check-ins of user $u$ in grid $g$, and $G_u$ denotes the set of grids visited by the user;

the preference deviation ratio

For measuring users

In a grid

Class preference in (1), assuming mesh

All of them share

POI (POI of interest) of a category to which a user has access

A category

The POI of (1), the calculation formula is as follows:

in the formula (I), the compound is shown in the specification,

representation grid

The total number of categories of POIs present in,

indicates the user is

The number of sign-ins

The ratio of the total number of check-in times in the table,

The likelihood of a check-in is the same,

representing a user

In a grid

A category of checked-in;

and a preference deviation ratio threshold $\theta$, obtaining the set $IG_u$ of grids of interest to the user, the area formed by the grid set $IG_u$ being the user's region of interest;

spatial activity preference calculation substep S113:

when the current location $l$ of the user is known, evaluating, by spatial proximity, the influence of grid $g$ on $l$ using a weight function $w(l, g)$ of $d(l, g)$, where $d(l, g)$ denotes the distance between the user's current location $l$ and the center point of grid $g$,

then calculating, by a weighting method, the user's activity preference at $l$ for all categories: supposing user $u$ has $n$ interest grids $g_1, \dots, g_n$, the spatial activity preference of user $u$ at position $l$ is:

$P^{S}_{u,l} = \left\{ \sum_{i=1}^{n} w(l, g_i)\, f_{u,g_i,c} \;\middle|\; c \in C \right\}$

in the formula, $f_{u,g_i,c}$ denotes the check-in frequency of user $u$ for category $c$ within grid $g_i$, $w(l, g_i)$ denotes the weight of grid $g_i$ with respect to $l$, and $C$ denotes the set of all activity categories; the spatial activity preference of user $u$ at $l$ for category $c$ is the sum, over the user's interest grids, of the user's preference for $c$ in each grid weighted by its geographic influence;

the spatial success rate matrix construction sub-step S114:

each row of the matrix represents a time range $t$ and each column represents a grid $g$.

3. The interest activity recommendation method of claim 2, wherein:

the substep S114 of constructing the spatial success rate matrix specifically includes:

building a spatial success rate matrix $M^{S}_{u}$ for each user; initializing the matrix $M^{S}_{u}$ by setting all matrix elements to 0; for any user $u$, taking the check-in records of $u$ out of the validation data set in sequence, each check-in record being denoted $(l, t, c, g)$, where $l$ denotes the latitude and longitude, $t$ denotes the timestamp, $c$ denotes the activity category and $g$ denotes the grid in which the user is located; determining the highest-scoring category $c^{*}$ of $u$ at the current position $l$; when $g$ is in the interest grid set $IG_u$, if $c^{*}$ is equal to $c$, increasing the corresponding element $M^{S}_{u}[t, g]$ by 1, where $t$ denotes the $t$-th row of $M^{S}_{u}$ and $g$ denotes the $g$-th column; and calculating the spatial success rate matrices of all users by analogy.

4. The interest activity recommendation method of claim 3, wherein:

the step S120 of constructing the user time activity preference model by using the non-negative tensor decomposition method specifically includes:

three-dimensional tensor construction sub-step S121:

constructing a three-dimensional user-time-activity tensor from the user check-in records, denoted $\mathcal{X}$, in which each element represents the number of check-ins in which user $u$ selects activity $c$ within time period $t$;

user time activity preference acquisition substep S122:

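$\mathcal{X} \approx \hat{\mathcal{X}} = \sum_{r=1}^{R} \mathbf{u}_r \circ \mathbf{t}_r \circ \mathbf{c}_r$,

the value of each element of $\hat{\mathcal{X}}$ being calculated as follows:

$\hat{x}_{u,t,c} = \sum_{r=1}^{R} U_{u,r}\, T_{t,r}\, C_{c,r}$

in the formula, $U$, $T$ and $C$ denote the factor matrices of users, time and categories, respectively, with sizes $N_u \times R$, $N_t \times R$ and $N_c \times R$, where $N_u$, $N_t$ and $N_c$ are the numbers of users, time periods and categories and $R$ is the number of latent components; $\mathbf{u}_r$, $\mathbf{t}_r$ and $\mathbf{c}_r$ are the $r$-th columns of $U$, $T$ and $C$, $\circ$ denotes the outer product, and $U_{u,r}$, $T_{t,r}$ and $C_{c,r}$ are elements of $U$, $T$ and $C$, respectively,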

activity preference inference substep S123:

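normalizing along the activity dimension:

$\tilde{x}_{u,t,c} = \frac{\hat{x}_{u,t,c}}{\sum_{c'} \hat{x}_{u,t,c'}}$,

for a given user $u$ and time $t$, the normalized element value $\tilde{x}_{u,t,c}$ is treated as the probability that user $u$ visits category $c$ at time $t$, and the time activity preference of user $u$ at time $t$ is expressed as:

$P^{T}_{u,t} = (\tilde{x}_{u,t,c_1}, \tilde{x}_{u,t,c_2}, \dots, \tilde{x}_{u,t,c_{N_c}})$;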

time success rate matrix construction substep S124:

each row of the matrix represents a time range $t$ and each column represents a grid $g$, and the recommendation success rate of the time activity preference model on the grid is calculated from the check-in records using the time activity preference calculation method of substep S123.

5. The interest activity recommendation method of claim 4, wherein:

the time success rate matrix construction substep S124 specifically comprises:

initializing a matrix $M^{T}_{u}$ by setting all matrix elements to 0; in the validation data set, for any user $u$, taking out the check-in records of $u$ in sequence, each check-in record being represented as $(l, t, c, g)$, wherein $l$ denotes the latitude and longitude, $t$ denotes the timestamp, $c$ denotes the activity category and $g$ denotes the grid in which the user is located; calculating the time preference scores of $u$ for all categories at the current time $t$, and sorting the categories by score from largest to smallest to obtain the highest-scoring category $c^{*}$; when $g$ is in the interest grid set $IG_u$, if the category $c^{*}$ predicted by the time preference model is equal to $c$, increasing the corresponding element $M^{T}_{u}[t, g]$ by 1, where $t$ denotes the $t$-th row of $M^{T}_{u}$ and $g$ denotes the $g$-th column; and calculating the time success rate matrices of all users by analogy.

6. The interest activity recommendation method of claim 5, wherein:

after the recommendation list generation substep S131, there is also provided

Precision verification substep S132:

the accuracy is calculated as follows:

$Precision@k = \frac{\left|\{(u, l, t, c) \in D_{test} : c \in Top_k(u, t, l)\}\right|}{|D_{test}|}$

in the formula, $k$ denotes the length of the recommendation list, $(u, l, t, c)$ denotes a user check-in record in the test data set $D_{test}$, $Top_k(u, t, l)$ denotes the $k$ highest-scoring (Top-k) activities for user $u$ at time $t$ and location $l$, and $|D_{test}|$ denotes the number of check-in records in the test data set.

7. The interest activity recommendation method of claim 5, wherein:

the interest activity recommendation method partitions the user check-in data set: the check-in history of each user is sorted by check-in time and divided in a given proportion into a training data set, a validation data set and a test data set; steps S111, S112, S113, S121, S122 and S123 use the training data set to construct the models, steps S114 and S124 use the validation data set to calculate the success rates, and steps S131 and S132 use the test data set to verify the model.

8. The interest activity recommendation method of claim 5, wherein:

in the sub-step S121 of constructing the three-dimensional tensor, three dimensions of the three-dimensional tensor are a user dimension, a time dimension, and an activity dimension, respectively, where the user dimension represents each user as one dimension, the activity dimension is represented by a category of POI, and the time dimension is represented by a time period divided according to a certain time interval.

9. The interest activity recommendation method of claim 5, wherein:

in the user time activity preference acquisition sub-step S122, the tensor decomposition parameters include the latent space dimension, the size of which affects the tensor decomposition time and the recommendation accuracy.

10. A storage medium for storing computer-executable instructions, characterized in that:

the computer-executable instructions, when executed by a processor, perform the method for recommending point of interest activities based on a geographic grid of any of claims 1-9.