CN113158089B - Social network position vectorization modeling method - Google Patents


Info

Publication number
CN113158089B
Authority
CN
China
Prior art keywords
matrix
location
latent
factor
factors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110414543.8A
Other languages
Chinese (zh)
Other versions
CN113158089A (en)
Inventor
蔡国永
陈心怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202110414543.8A priority Critical patent/CN113158089B/en
Publication of CN113158089A publication Critical patent/CN113158089A/en
Application granted granted Critical
Publication of CN113158089B publication Critical patent/CN113158089B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Molecular Biology (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a social network location vectorization modeling method, belongs to the technical field of personalized recommendation, and aims to solve the technical problem that multiple features cannot be effectively fused into location modeling. The method comprises the following steps: constructing feature matrices of a location according to the feature information in a public data set; converting the feature matrices into latent factors of the features by a decomposition method; concatenating the latent factors of the features to obtain a feature vector of the location; obtaining a latent factor of the location from the user-location graph; and connecting the latent factor of the location and the feature vector of the location into a location vector $z_j$, which completes the vectorization modeling. The method can improve the quality of recommendation results and alleviate the data sparsity problem.

Description

Social network position vectorization modeling method
Technical Field
The invention relates to a social network position vectorization modeling method, and belongs to the technical field of personalized recommendation.
Background
Most existing location recommendation methods use only the check-in information of users when modeling locations and disregard the rich location feature information available in LBSNs (location-based social networks). Some methods use the geographic features of a location, but they learn only linear or low-order interactions between features and cannot effectively fuse multiple features into the location model. As a result, they cannot provide effective personalized location recommendation or alleviate the data sparsity problem. To solve these problems, the present application provides a social network location vectorization modeling method.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a social network location vectorization modeling method that solves the technical problem that multiple features cannot be effectively fused into location modeling.
To achieve this purpose, the invention adopts the following technical scheme:
The invention provides a social network location vectorization modeling method, which comprises the following steps:
constructing feature matrices of a location according to the feature information in a public data set;
converting the feature matrices into latent factors of the features by a decomposition method;
concatenating the latent factors of the features to obtain a feature vector of the location;
obtaining a latent factor of the location from the user-location graph;
connecting the latent factor of the location and the feature vector of the location into a location vector $z_j$, which completes the vectorization modeling.
As an optional implementation, the constructing a feature matrix of a location according to the feature information in the public data set includes:
constructing a common access matrix of locations from check-in records of users, wherein common access counts between locations are pair-wise recorded;
constructing a geographical proximity matrix of the locations according to the physical distances between the locations;
constructing a category correlation matrix of the position according to the category information of the position;
and constructing an access time matrix of the position according to the check-in timestamp of the user.
As an optional implementation, converting the feature matrices into latent factors of the features by a decomposition method comprises:
converting the common access matrix into latent factors of the feature by a decomposition method, comprising:
because the common access matrix $O$ is symmetric, non-negative symmetric matrix factorization is used to decompose $O$ into the product of a low-rank latent factor matrix $E_O$ and its transpose $E_O^{\top}$. The objective minimizes the reconstruction error between $O$ and $E_O E_O^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_O$;
after the decomposition is complete, the row of $E_O$ corresponding to location $v$ is the common access latent factor of location $v$.
Converting the geographic proximity matrix into latent factors of the feature by a decomposition method, comprising:
because the geographic proximity matrix $G$ is symmetric, non-negative symmetric matrix factorization is used to decompose $G$ into the product of a low-rank latent factor matrix $E_G$ and its transpose $E_G^{\top}$. The objective minimizes the reconstruction error between $G$ and $E_G E_G^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_G$;
after the decomposition is complete, the row of $E_G$ corresponding to location $v$ is the geographic proximity latent factor of location $v$.
Converting the category correlation matrix into latent factors of the feature by a decomposition method, comprising:
because the category correlation matrix $C$ is symmetric, non-negative symmetric matrix factorization is used to decompose $C$ into the product of a low-rank latent factor matrix $E_C$ and its transpose $E_C^{\top}$. The objective minimizes the reconstruction error between $C$ and $E_C E_C^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_C$;
after the decomposition is complete, the row of $E_C$ corresponding to location $v$ is the category latent factor of location $v$.
Converting the access time matrix into latent factors of the feature by a decomposition method, comprising:
because the access time matrix $S$ is symmetric, non-negative symmetric matrix factorization is used to decompose $S$ into the product of a low-rank latent factor matrix $E_S$ and its transpose $E_S^{\top}$. The objective minimizes the reconstruction error between $S$ and $E_S E_S^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_S$;
after the decomposition is complete, the row of $E_S$ corresponding to location $v$ is the access time latent factor of location $v$.
As an optional implementation, concatenating the latent factors of the features to obtain the feature vector of the location comprises:
forming the feature vector of a location by concatenating its common access, geographic proximity, category and access time latent factors.
As an optional implementation, obtaining the latent factor of the location from the user-location graph comprises:
defining the latent factor of location $j$ as the output of a neural network layer with weight $W$, bias $b$ and nonlinear activation function $\sigma$, applied to the aggregation $Agg_{users}$ of the initial embedding vectors $p_t$ of the users $t \in B(j)$, where $B(j)$ denotes the set of users who have interacted with location $j$;
the aggregation function $Agg_{users}$ weights each user embedding $p_t$ by $\mu_{jt}$, the attention weight of the interaction between location $v_j$ and user $t$;
parameterizing the attention weight with a two-layer neural network whose input is the concatenation $p_t \oplus q_j$ and whose output is the unnormalized attention weight, wherein $W_1$ and $W_2$ denote the weights of the two-layer neural network, $b_1$ and $b_2$ denote its biases, $p_t$ is the initial embedding vector of the user, $q_j$ is the embedding vector of location $j$, $\sigma$ denotes the nonlinear activation function, and $\oplus$ denotes the concatenation operator of two vectors;
normalizing the unnormalized attention weights with the Softmax function to obtain the final attention weight $\mu_{jt}$;
obtaining the latent factor of the location by applying the attention weights $\mu_{jt}$ in the aggregation defined above.
As an optional implementation, connecting the latent factor of the location and the feature vector of the location into the location vector $z_j$ comprises:
computing the location vector $z_j$ by concatenating the latent factor of the location with the feature vector of the location and passing the result through a neural network layer,
wherein $\sigma$ denotes the nonlinear activation function, $W$ and $b$ denote the weight and bias of the neural network, and $\oplus$ denotes the concatenation operator of two vectors.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a social network location vectorization modeling method that fuses multiple kinds of location feature information during location modeling: a matrix is constructed for each kind of feature information, each matrix is converted into latent vector representations by matrix factorization, and the joint influence of these representations on user behavior is learned. Modeling locations in this way captures the interaction between users and locations while also fusing the feature information of the locations, which improves the quality of the recommendation results and alleviates the data sparsity problem.
Drawings
FIG. 1 is a schematic flow diagram of the method of the present invention;
FIG. 2 is a histogram of the first group of experimental results on the Foursquare dataset;
FIG. 3 is a histogram of the first group of experimental results on the Gowalla dataset;
FIG. 4 is a histogram of the second group of experimental results on the Foursquare dataset;
FIG. 5 is a histogram of the second group of experimental results on the Gowalla dataset.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Embodiment 1:
As shown in FIG. 1, the invention provides a social network location vectorization modeling method, which includes the following steps:
Step 1, constructing feature matrices of the locations according to the feature information in a public data set.
Step 1.1, constructing a common access matrix of the locations from the check-in records of users, in which the common access counts between locations are recorded pairwise.
Step 1.2, constructing a geographic proximity matrix of the locations according to the physical distances between the locations.
Step 1.3, constructing a category correlation matrix of the locations according to the category information of the locations.
Step 1.4, constructing an access time matrix of the locations according to the check-in timestamps of users.
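The four matrices of step 1 can be sketched as follows. The text does not specify how each entry is computed, so every definition below is an assumption made for illustration: co-visit counts are the number of users who checked in at both locations, geographic proximity is an exponential kernel over the haversine distance, category correlation is 1 for a shared category, and access time similarity is the cosine similarity of hour-of-day histograms. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def co_visit_matrix(checkins, n_locations):
    """O[i][j] = number of users who checked in at both i and j (assumed definition)."""
    visited = defaultdict(set)
    for user, loc, _hour in checkins:
        visited[user].add(loc)
    O = [[0] * n_locations for _ in range(n_locations)]
    for locs in visited.values():
        for i, j in combinations(sorted(locs), 2):
            O[i][j] += 1
            O[j][i] += 1
    return O

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def geo_proximity_matrix(coords, scale_km=1.0):
    """G[i][j] = exp(-distance / scale_km); the kernel is an assumption."""
    n = len(coords)
    return [[math.exp(-haversine_km(coords[i], coords[j]) / scale_km)
             for j in range(n)] for i in range(n)]

def category_matrix(categories):
    """C[i][j] = 1.0 when two locations share a category (assumed definition)."""
    n = len(categories)
    return [[1.0 if categories[i] == categories[j] else 0.0
             for j in range(n)] for i in range(n)]

def visit_time_matrix(checkins, n_locations, n_bins=24):
    """S[i][j] = cosine similarity of hour-of-day check-in histograms (assumed)."""
    hist = [[0.0] * n_bins for _ in range(n_locations)]
    for _user, loc, hour in checkins:
        hist[loc][hour % n_bins] += 1.0

    def cosine(x, y):
        dot = sum(a * b for a, b in zip(x, y))
        nx = math.sqrt(sum(a * a for a in x))
        ny = math.sqrt(sum(b * b for b in y))
        return dot / (nx * ny) if nx and ny else 0.0

    return [[cosine(hist[i], hist[j]) for j in range(n_locations)]
            for i in range(n_locations)]

# Toy data: (user, location, hour of day) check-ins for three locations.
checkins = [(0, 0, 9), (0, 1, 10), (1, 0, 9), (1, 1, 21), (1, 2, 21)]
coords = [(35.68, 139.76), (35.69, 139.70), (35.66, 139.70)]
categories = ["cafe", "cafe", "park"]

O = co_visit_matrix(checkins, 3)
G = geo_proximity_matrix(coords)
C = category_matrix(categories)
S = visit_time_matrix(checkins, 3)
```

All four matrices are symmetric by construction, which is the property the factorization in step 2 relies on.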
Step 2, converting the feature matrices into latent factors of the features by a decomposition method.
Step 2.1, converting the common access matrix into latent factors of the feature by a decomposition method, which comprises the following:
because the common access matrix $O$ is symmetric, non-negative symmetric matrix factorization is used to decompose $O$ into the product of a low-rank latent factor matrix $E_O$ and its transpose $E_O^{\top}$. The objective minimizes the reconstruction error between $O$ and $E_O E_O^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_O$;
after the decomposition is complete, the row of $E_O$ corresponding to location $v$ is the common access latent factor of location $v$.
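The formula image for step 2.1 is not reproduced in this text. A form consistent with the definitions given (non-negative factor $E_O$, Frobenius norm, regularization parameter $\lambda_O$) would be the following; the squared norms and the exact shape of the regularization term are assumptions, not the original formula:

```latex
\min_{E_O \ge 0} \; \left\lVert O - E_O E_O^{\top} \right\rVert_F^{2}
  + \lambda_O \left\lVert E_O \right\rVert_F^{2}
```

Under the same assumptions, the objectives for $G$, $C$ and $S$ would follow by substituting $(G, E_G, \lambda_G)$, $(C, E_C, \lambda_C)$ and $(S, E_S, \lambda_S)$.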
Step 2.2, converting the geographic proximity matrix into latent factors of the feature by a decomposition method, which comprises the following:
because the geographic proximity matrix $G$ is symmetric, non-negative symmetric matrix factorization is used to decompose $G$ into the product of a low-rank latent factor matrix $E_G$ and its transpose $E_G^{\top}$. The objective minimizes the reconstruction error between $G$ and $E_G E_G^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_G$;
after the decomposition is complete, the row of $E_G$ corresponding to location $v$ is the geographic proximity latent factor of location $v$.
Step 2.3, converting the category correlation matrix into latent factors of the feature by a decomposition method, which comprises the following:
because the category correlation matrix $C$ is symmetric, non-negative symmetric matrix factorization is used to decompose $C$ into the product of a low-rank latent factor matrix $E_C$ and its transpose $E_C^{\top}$. The objective minimizes the reconstruction error between $C$ and $E_C E_C^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_C$;
after the decomposition is complete, the row of $E_C$ corresponding to location $v$ is the category latent factor of location $v$.
Step 2.4, converting the access time matrix into latent factors of the feature by a decomposition method, which comprises the following:
because the access time matrix $S$ is symmetric, non-negative symmetric matrix factorization is used to decompose $S$ into the product of a low-rank latent factor matrix $E_S$ and its transpose $E_S^{\top}$. The objective minimizes the reconstruction error between $S$ and $E_S E_S^{\top}$, measured in the Frobenius norm $\lVert\cdot\rVert_F$, together with a regularization term whose parameter is $\lambda_S$;
after the decomposition is complete, the row of $E_S$ corresponding to location $v$ is the access time latent factor of location $v$.
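A minimal sketch of the non-negative symmetric factorization used in step 2 is given below. The text names the technique but not the solver, so projected gradient descent with a backtracking step size is an assumption chosen for brevity, as are the squared-norm objective, the rank and the value of the regularization parameter. The function name `symnmf` is hypothetical.

```python
import numpy as np

def symnmf(A, rank, lam=0.01, n_iter=200, seed=0):
    """Approximate a symmetric matrix A by E @ E.T with E >= 0.

    Minimizes ||A - E E^T||_F^2 + lam * ||E||_F^2 (assumed objective)
    by projected gradient descent with backtracking.
    """
    rng = np.random.default_rng(seed)
    E = rng.random((A.shape[0], rank))

    def loss(M):
        return (np.linalg.norm(A - M @ M.T, "fro") ** 2
                + lam * np.linalg.norm(M, "fro") ** 2)

    history = [loss(E)]
    step = 1.0
    for _ in range(n_iter):
        # Gradient of the objective for symmetric A.
        grad = 4.0 * (E @ E.T - A) @ E + 2.0 * lam * E
        # Halve the step until the projected update does not increase the loss.
        while step > 1e-12:
            candidate = np.maximum(E - step * grad, 0.0)  # projection onto E >= 0
            if loss(candidate) <= history[-1]:
                break
            step *= 0.5
        else:
            break  # no acceptable step found: stop early
        E = candidate
        history.append(loss(E))
        step *= 2.0  # let the step grow again for the next iteration
    return E, history

# Toy common access matrix for three locations.
O = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
E_O, history = symnmf(O, rank=2)
co_visit_factor_of_location_0 = E_O[0]  # row v of E_O is the latent factor of location v
```

The same routine would be applied to $G$, $C$ and $S$ to obtain $E_G$, $E_C$ and $E_S$.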
Step 3, concatenating the latent factors of the features to obtain the feature vector of the location.
The feature vector of a location is the concatenation of its common access, geographic proximity, category and access time latent factors obtained in step 2.
Step 4, obtaining the latent factor of the location from the user-location graph.
Step 4.1, defining the latent factor of location $j$ as the output of a neural network layer with weight $W$, bias $b$ and nonlinear activation function $\sigma$, applied to the aggregation $Agg_{users}$ of the initial embedding vectors $p_t$ of the users $t \in B(j)$, where $B(j)$ denotes the set of users who have interacted with location $j$. The aggregation function $Agg_{users}$ weights each user embedding $p_t$ by $\mu_{jt}$, the attention weight of the interaction between location $v_j$ and user $t$.
Step 4.2, parameterizing the attention weight with a two-layer neural network whose input is the concatenation $p_t \oplus q_j$ and whose output is the unnormalized attention weight, wherein $W_1$ and $W_2$ denote the weights of the two-layer neural network, $b_1$ and $b_2$ denote its biases, $p_t$ is the initial embedding vector of the user, $q_j$ is the embedding vector of location $j$, $\sigma$ denotes the nonlinear activation function, and $\oplus$ denotes the concatenation operator of two vectors.
Step 4.3, normalizing the unnormalized attention weights with the Softmax function to obtain the final attention weight $\mu_{jt}$.
Step 4.4, obtaining the latent factor of the location by applying the attention weights $\mu_{jt}$ in the aggregation defined in step 4.1.
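The attention-based aggregation of step 4 can be sketched with NumPy as follows. The formula images are not available, so the exact layer shapes are assumptions: the attention network is taken to be one hidden ReLU layer followed by a linear output unit, and the aggregation is taken to be the attention-weighted sum of the user embeddings. Parameter names such as `w2` and the sizes used are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def location_latent_factor(P, q_j, params):
    """Aggregate the embeddings of users in B(j) into a latent factor for location j.

    P   : (n_users, d) array, one row p_t per user t in B(j)
    q_j : (d,) embedding of location j
    """
    n_users = P.shape[0]
    # Step 4.2: score each concatenation p_t (+) q_j with a two-layer network.
    pairs = np.hstack([P, np.tile(q_j, (n_users, 1))])        # (n_users, 2d)
    hidden = relu(pairs @ params["W1"].T + params["b1"])       # (n_users, hidden)
    scores = hidden @ params["w2"] + params["b2"]              # (n_users,)
    # Step 4.3: softmax over the users in B(j), shifted for numerical stability.
    exp_scores = np.exp(scores - scores.max())
    mu = exp_scores / exp_scores.sum()
    # Steps 4.1 and 4.4: attention-weighted sum, then one layer with activation.
    aggregated = mu @ P                                        # (d,)
    return relu(params["W"] @ aggregated + params["b"]), mu

rng = np.random.default_rng(0)
d, hidden_size = 4, 8
params = {
    "W1": rng.normal(0.0, 0.1, (hidden_size, 2 * d)),
    "b1": np.zeros(hidden_size),
    "w2": rng.normal(0.0, 0.1, hidden_size),
    "b2": 0.0,
    "W": rng.normal(0.0, 0.1, (d, d)),
    "b": np.zeros(d),
}
P = rng.normal(0.0, 0.1, (3, d))   # three users have interacted with location j
q_j = rng.normal(0.0, 0.1, d)
h_j, mu = location_latent_factor(P, q_j, params)
```

The weights `mu` sum to 1 over the users in $B(j)$, so users whose embeddings score higher against $q_j$ contribute more to the location's latent factor.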
Step 5, connecting the latent factor of the location and the feature vector of the location into the location vector $z_j$, which completes the vectorization modeling.
The location vector $z_j$ is computed by concatenating the latent factor of the location with the feature vector of the location and passing the result through a neural network layer,
wherein $\sigma$ denotes the nonlinear activation function, $W$ and $b$ denote the weight and bias of the neural network, and $\oplus$ denotes the concatenation operator of two vectors.
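Steps 3 and 5 together amount to two concatenations followed by one layer, as in the sketch below. It assumes a single fully connected layer with ReLU as $\sigma$; the dimensions, the random inputs and the function name `location_vector` are hypothetical.

```python
import numpy as np

def location_vector(latent_factor, feature_factors, W, b):
    """Fuse the graph-based latent factor with the feature vector of a location."""
    # Step 3: concatenate the four feature latent factors into one feature vector.
    feature_vector = np.concatenate(feature_factors)
    # Step 5: join latent factor and feature vector, then apply one layer.
    joined = np.concatenate([latent_factor, feature_vector])
    return np.maximum(W @ joined + b, 0.0)  # ReLU as the activation (assumed)

rng = np.random.default_rng(0)
k, d, out_dim = 2, 4, 4                      # factor rank, embedding size, output size
feature_factors = [rng.random(k) for _ in range(4)]  # co-visit, geo, category, time
latent_factor = rng.random(d)
W = rng.normal(0.0, 0.1, (out_dim, d + 4 * k))
b = np.zeros(out_dim)
z_j = location_vector(latent_factor, feature_factors, W, b)
```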
Step 6, obtaining the feature vector representation of the user from the check-in records of the user. Given a user $u$ whose set of visited locations is $V_u$, the feature vector $h_u$ of user $u$ is computed from the locations in $V_u$,
where $|V_u|$ is the number of locations visited by user $u$.
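The formula image for $h_u$ is not reproduced in this text. Because the definition normalizes by $|V_u|$, one plausible reading is the average of the vectors of the visited locations. This is an assumption; the original formula may include a further transformation:

```latex
h_u = \frac{1}{|V_u|} \sum_{j \in V_u} z_j
```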
Step 7, based on the above location vectorization modeling method, a model applied to location recommendation, named MFRec, is given below. The model concatenates the location vector representation $z_j$ and the user vector representation $h_u$ obtained through the above steps and feeds the result into a multi-layer perceptron. The activation function of the output layer limits the output to the range (0, 1).
The MFRec model is trained and optimized using an objective function
in which $F$ denotes all trainable model parameters and $\lambda$ is a regularization coefficient that prevents overfitting. For parameter settings, the dimension of the user embedding vector $p$ and of the location embedding vector $q$ is set to 64, the hidden layer size is set to 64, the nonlinear activation function $\sigma$ is set to ReLU, the batch size is set to 256, and the learning rate is set to 0.002. During training, dropout is used to prevent overfitting, with the dropout rate set to 0.2. For all neural network methods, the model parameters are initialized randomly from a Gaussian distribution with mean 0 and standard deviation 0.1.
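The scoring step and the stated settings can be sketched as below. The text gives the hidden size (64), ReLU, dropout rate (0.2) and Gaussian initialization (mean 0, standard deviation 0.1), but not the depth of the perceptron or the form of the loss. A single hidden layer, a sigmoid output and a binary cross-entropy loss with L2 regularization are therefore assumptions, and all names and the input size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Gaussian initialization with mean 0 and standard deviation 0.1."""
    return rng.normal(0.0, 0.1, (n_out, n_in)), rng.normal(0.0, 0.1, n_out)

def predict(z_j, h_u, layers, drop_rate=0.0):
    """Score a (location, user) pair; the sigmoid keeps the output in (0, 1)."""
    x = np.concatenate([z_j, h_u])
    for W, b in layers[:-1]:
        x = np.maximum(W @ x + b, 0.0)             # ReLU hidden layer
        if drop_rate > 0.0:                        # inverted dropout, training only
            keep = rng.random(x.shape) >= drop_rate
            x = x * keep / (1.0 - drop_rate)
    W_out, b_out = layers[-1]
    logit = (W_out @ x + b_out).item()
    return 1.0 / (1.0 + np.exp(-logit))

def objective(preds, labels, layers, lam=1e-4):
    """Binary cross-entropy plus lam * ||F||^2 over all parameters (assumed form)."""
    preds = np.clip(np.asarray(preds, dtype=float), 1e-7, 1.0 - 1e-7)
    labels = np.asarray(labels, dtype=float)
    bce = -np.mean(labels * np.log(preds) + (1.0 - labels) * np.log(1.0 - preds))
    l2 = sum((W ** 2).sum() + (b ** 2).sum() for W, b in layers)
    return bce + lam * l2

vec_dim = 4                                        # toy size for z_j and h_u
layers = [init_layer(2 * vec_dim, 64), init_layer(64, 1)]
z_j = rng.normal(0.0, 0.1, vec_dim)
h_u = rng.normal(0.0, 0.1, vec_dim)
score = predict(z_j, h_u, layers)                          # inference: no dropout
train_score = predict(z_j, h_u, layers, drop_rate=0.2)     # training-time forward pass
loss_value = objective([score], [1.0], layers)
```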
Two representative public data sets were chosen to verify this embodiment: the first is the check-in records for Tokyo on Foursquare and the second is the check-in records for New York on Gowalla. To reduce the negative impact of spam accounts and unpopular locations on the experiment, we removed users who visited fewer than 3 locations and locations visited by fewer than 5 users from both data sets. The processed Foursquare data set has 2293 users, 7873 locations, 447,512 check-in records and 176 categories; the Gowalla data set has 5426 users, 8065 locations, 349,203 check-in records and 268 categories, as shown in Table 1:
Table 1
Data set     Users   Locations   Check-in records   Categories
Foursquare   2293    7873        447,521            176
Gowalla      5426    8065        349,203            268
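The filtering rule described above (drop users with fewer than 3 visited locations and locations with fewer than 5 visitors) can be sketched as follows. Whether the authors applied the rule once or repeated it until no record was removed is not stated; the sketch repeats it, which is an assumption. The function name and the toy records are hypothetical.

```python
from collections import defaultdict

def filter_checkins(checkins, min_locs_per_user=3, min_users_per_loc=5):
    """Drop sparse users and locations, repeating until nothing changes."""
    records = list(checkins)
    while True:
        locs_of = defaultdict(set)   # user -> distinct locations visited
        users_of = defaultdict(set)  # location -> distinct visiting users
        for user, loc in records:
            locs_of[user].add(loc)
            users_of[loc].add(user)
        kept = [(user, loc) for user, loc in records
                if len(locs_of[user]) >= min_locs_per_user
                and len(users_of[loc]) >= min_users_per_loc]
        if len(kept) == len(records):
            return kept
        records = kept

# Toy example with thresholds lowered to 2 so the effect is visible.
toy = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "x"), ("d", "z")]
filtered = filter_checkins(toy, min_locs_per_user=2, min_users_per_loc=2)
```

Users `c` and `d` each visited a single location, so their records are removed; locations `x` and `y` still have two visitors afterwards and are kept.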
For each data set, we randomly selected 70% of the historical interactions of each user to form a training set, then randomly selected 10% of the interactions as a validation set to optimize parameters, and the rest as a test set.
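A per-user split in these proportions can be sketched as follows. The 10% validation share is read here as 10% of the user's full history, giving a 70/10/20 split; that reading, the integer rounding and the function name are assumptions.

```python
import random

def split_user_history(interactions, seed=0):
    """Shuffle one user's interactions and split them 70% / 10% / 20%."""
    rng = random.Random(seed)
    shuffled = list(interactions)
    rng.shuffle(shuffled)
    n_train = (7 * len(shuffled)) // 10   # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

history_of_user = list(range(10))         # ten visited location ids
train, val, test = split_user_history(history_of_user)
```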
To evaluate the performance of the method proposed in this embodiment, two common metrics were chosen to evaluate the results of the top-N location recommendation experiments: precision and recall. These two evaluation metrics are briefly described below.
Precision@N: given a user $u$ and the test set of that user, the precision of a top-N recommendation is the fraction of the N recommended locations that appear in the user's test set,
where $V_N$ is the top-N recommendation list given by the algorithm.
Recall@N: similarly to Precision@N, the recall of a top-N recommendation is the fraction of the locations in the user's test set that appear in $V_N$.
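Both metrics can be computed per user as in the sketch below, which uses the standard definitions of precision and recall for a top-N list. How the per-user values are combined over all users is not stated in the text and is left out. The function name and the example data are hypothetical.

```python
def precision_recall_at_n(ranked_locations, test_locations, n):
    """Precision@N and Recall@N for one user.

    ranked_locations : recommended location ids, best first
    test_locations   : location ids in the user's test set
    """
    top_n = ranked_locations[:n]
    hits = len(set(top_n) & set(test_locations))
    precision = hits / n
    recall = hits / len(test_locations) if test_locations else 0.0
    return precision, recall

ranked = [3, 7, 1, 9, 4, 2]
held_out = {7, 4, 8}
precision, recall = precision_recall_at_n(ranked, held_out, n=5)
```

Here two of the five recommended locations (7 and 4) are in the test set, so precision is 2/5 and recall is 2/3.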
Comparative experiments were performed using POIR-GNN-L, a variant of the POIR-GNN model. Unlike POIR-GNN, POIR-GNN-L uses only the check-in records of the user when modeling users and does not consider the social information of the user. In the experiments, POIR-GNN-L is compared with the following location recommendation algorithms to verify the effectiveness of the method. These baseline algorithms are briefly summarized below:
WRMF: weighted regularized matrix factorization, a recommendation technique widely used on implicit feedback data sets, which decomposes the user-location matrix to obtain vector representations of users and locations.
USG: a unified location recommendation framework that uses user-based collaborative filtering and naive Bayes to fuse user preferences for locations with social and geographic influence.
GeoMF: a weighted matrix factorization method for location recommendation that extends WRMF by introducing the spatial clustering phenomenon in LBSNs into the matrix factorization, thereby improving recommendation performance.
ARMF: a hybrid recommendation framework that uses the potential check-ins of a user's friends to make accurate location recommendations and takes the geographic and category correlations between users and locations into account in its recommendation process.
RecNet: converts the co-visit, geographic and category information of locations into feature vector representations of locations and users by feature embedding, and then inputs the embedded locations and users into a DNN.
Among these comparison algorithms, WRMF is a classical factorization method for implicit feedback data sets. USG, GeoMF, ARMF and RecNet all use geographic influence or other features in LBSNs to improve recommendation performance, and RecNet also uses deep neural network technology.
In practice, large values of N are rarely of interest for a typical top-N recommendation task, so results are reported only for N = 5 and N = 10. The experimental results of all models on the Foursquare data set are shown in FIG. 2 and those on the Gowalla data set in FIG. 3.
The experimental results in FIG. 2 and FIG. 3 both show that the POIR-GNN-L model, which contains the method proposed in the present application, outperforms all baselines. This indicates that the model effectively fuses four kinds of feature information (co-visit, geographic proximity, category information and visit time of a location) and improves the quality of personalized location recommendation. The recommendation performance of the various methods is analyzed and compared in detail below:
(1) WRMF lags far behind the other algorithms because it simply decomposes the user-location matrix without using geographic influence or other features in LBSNs. It is therefore very susceptible to the data sparsity problem.
(2) USG and GeoMF perform better than WRMF, indicating that modeling geographic influence in LBSNs is very important for location recommendation.
(3) ARMF uses both the geographic and the category information of locations in LBSNs, but it still performs worse than POIR-GNN-L, indicating that POIR-GNN-L combines the various features of a location more effectively than ARMF.
(4) RecNet uses the most location feature information of all baselines, yet POIR-GNN-L performs better than RecNet, which shows that taking the visit time of a location into account is also important for location recommendation.
(5) POIR-GNN-L outperforms all comparison methods on both data sets, which shows that the various kinds of location information in LBSNs can alleviate the data sparsity problem and that the model effectively fuses four kinds of feature information (co-visit, geographic proximity, category information and visit time of locations) into the vector representation of a location.
To further demonstrate that the co-visit, geographic proximity, category information and visit time of locations proposed in the present application each have a positive impact on personalized location recommendation, and to exclude the influence of the graph neural network approach on the above comparative experiments, we performed comparative experiments on POIR-GNN-L and its four variants. POIR-GNN-L1 uses only the geographic information of locations in the location modeling process, POIR-GNN-L2 uses only the co-visit information, POIR-GNN-L3 uses only the category information, and POIR-GNN-L4 uses only the visit time. In these experiments N is again set to 5 and 10. The experimental results of all models on the Foursquare data set are shown in FIG. 4 and those on the Gowalla data set in FIG. 5.
The experimental results in FIG. 4 and FIG. 5 both show that the POIR-GNN-L model outperforms all of its variants. The results indicate that the proposed method effectively fuses the four features of locations in LBSNs (co-visit pattern, geographic distance, category information and visit time), improves the quality of personalized location recommendation, and has a positive effect on mitigating data sparsity. In addition, of these four features, geographic distance is the most important for improving location recommendation.
The present application studies how to effectively fuse multiple features of a location and learn their joint influence on the recommendation effect, so as to achieve accurate location recommendation. First, a common access matrix of the locations is constructed from the check-in records of users in LBSNs, in which the common access counts between locations are recorded pairwise. Next, a geographic proximity matrix is constructed from the physical distances between locations, a category correlation matrix is constructed from the category information of the locations, and an access time matrix is constructed from the check-in timestamps of users. These matrices are converted into latent vector representations by matrix factorization, and their joint influence on user behavior is learned to obtain a location vector representation that fuses the four features of co-visit pattern, geographic distance, category information and visit time. Extensive experiments were then carried out on 2 public data sets. They verify that the proposed model outperforms the state-of-the-art location recommendation models it was compared with, and that location modeling which fuses the four features of co-visit pattern, geographic distance, category information and visit time has a positive effect on location recommendation and on alleviating data sparsity.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (4)

1. A social network location vectorization modeling method is characterized by comprising the following steps:
constructing a feature matrix of the position according to the feature information in the public data set;
converting the feature matrix into latent factors of the features according to a decomposition method;
splicing the latent factors of the features to obtain a feature vector of the location (formula image FDA0003545323790000011);
obtaining a latent factor of the location (formula image FDA0003545323790000012) from a user-location graph;
connecting the latent factor of the location (formula image FDA0003545323790000013) and the feature vector of the location (formula image FDA0003545323790000014) into a location vector $z_j$, thereby completing the vectorization modeling;
wherein the constructing the feature matrix of the location comprises:
constructing a common access matrix of locations from check-in records of users, wherein common access counts between locations are pair-wise recorded;
constructing a geographical proximity matrix of the locations according to the physical distances between the locations;
constructing a category correlation matrix of the locations according to the category information of the locations;
constructing an access time matrix of the locations according to the check-in timestamps of the users;
wherein converting the feature matrix into latent factors of the features according to the decomposition method comprises:
converting the common access matrix into latent factors of the features by the decomposition method, comprising:
because the common access matrix $O$ is symmetric, a non-negative symmetric matrix factorization method is used to decompose $O$ into the product of a low-rank latent factor matrix $E_O$ and its transpose $E_O^{\top}$; the formula is as follows:
(formula image FDA0003545323790000016)
wherein $\|\cdot\|_F$ denotes the Frobenius norm of a matrix and $\lambda_O$ denotes the regularization parameter;
after the decomposition is complete, the latent factor matrix $E_O$ gives the common access latent factor of location $v$, denoted as (formula image FDA0003545323790000021);
converting the geographic proximity matrix into latent factors of the features by the decomposition method, comprising:
because the geographic proximity matrix $G$ is symmetric, a non-negative symmetric matrix factorization method is used to decompose $G$ into the product of a low-rank latent factor matrix $E_G$ and its transpose $E_G^{\top}$; the formula is as follows:
(formula image FDA0003545323790000023)
wherein $\|\cdot\|_F$ denotes the Frobenius norm of a matrix and $\lambda_G$ is the regularization parameter;
after the decomposition is complete, the latent factor matrix $E_G$ gives the common access latent factor of location $v$, denoted as (formula image FDA0003545323790000024);
converting the category correlation matrix into latent factors of the features by the decomposition method, comprising:
because the category correlation matrix $C$ is symmetric, a non-negative symmetric matrix factorization method is used to decompose $C$ into the product of a low-rank latent factor matrix $E_C$ and its transpose $E_C^{\top}$; the formula is as follows:
(formula image FDA0003545323790000026)
wherein $\|\cdot\|_F$ denotes the Frobenius norm of a matrix and $\lambda_C$ is the regularization parameter;
after the decomposition is complete, the latent factor matrix $E_C$ gives the common access latent factor of location $v$, denoted as (formula image FDA0003545323790000027);
converting the access time matrix into latent factors of the features by the decomposition method, comprising:
because the access time matrix $S$ is symmetric, a non-negative symmetric matrix factorization method is used to decompose $S$ into the product of a low-rank latent factor matrix $E_S$ and its transpose $E_S^{\top}$; the formula is as follows:
(formula image FDA0003545323790000029)
wherein $\|\cdot\|_F$ denotes the Frobenius norm of a matrix and $\lambda_S$ is the regularization parameter;
after the decomposition is complete, the latent factor matrix $E_S$ gives the common access latent factor of location $v$, denoted as (formula image FDA0003545323790000031).
2. The social network location vectorization modeling method according to claim 1, wherein splicing the latent factors of the features to obtain the feature vector of the location (formula image FDA0003545323790000032) comprises:
the feature vector (formula image FDA0003545323790000033) is expressed as:
(formula image FDA0003545323790000034)
3. The social network location vectorization modeling method according to claim 1, wherein obtaining the latent factor of the location (formula image FDA0003545323790000035) from the user-location graph comprises:
defining the latent factor (formula image FDA0003545323790000036) by the following expression:
(formula image FDA0003545323790000037)
wherein $B(j)$ denotes the set of users who have interacted with the location, $t$ denotes a user, $p_t$ is the initial embedding vector of the user, $W$ and $b$ denote the weight and bias of the two-layer neural network, $\sigma$ denotes a nonlinear activation function, and $Agg_{users}$ is an aggregation function expressed as:
(formula image FDA0003545323790000038)
wherein $\mu_{jt}$ denotes the attention weight of the interaction between location $v_j$ and user $t$;
the attention weight $\mu_{jt}$ is parameterized by a two-layer neural network, and the parameterized attention weight (formula image FDA0003545323790000039) is expressed as:
(formula image FDA00035453237900000310)
wherein (formula image FDA00035453237900000311) and $W_1$ denote the weights of the two-layer neural network, $b_1$ and $b_2$ denote its biases, $p_t$ is the initial embedding vector of the user, $q_j$ denotes the embedding vector of location $j$, $\sigma$ denotes the nonlinear activation function, and (formula image FDA0003545323790000041) denotes the concatenation operator of two vectors;
the above attention weight (formula image FDA0003545323790000042) is normalized by a Softmax function to obtain the final attention weight $\mu_{jt}$, expressed as:
(formula image FDA0003545323790000043)
according to the attention weight $\mu_{jt}$, the latent factor (formula image FDA0003545323790000044) is obtained by the following expression:
(formula image FDA0003545323790000045)
4. The social network location vectorization modeling method according to claim 1, wherein connecting the latent factor of the location (formula image FDA0003545323790000046) and the feature vector of the location (formula image FDA0003545323790000047) into the location vector $z_j$ comprises:
the location vector $z_j$ is expressed as:
(formula image FDA0003545323790000048)
wherein $\sigma$ denotes a nonlinear activation function, $W$ and $b$ denote the weight and bias of the two-layer neural network, and (formula image FDA0003545323790000049) denotes the concatenation operator of two vectors.
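The attention-based aggregation of claim 3 and the final connection of claim 4 can be sketched as follows. Because the claimed formulas are available here only as images, the sketch assumes a form commonly used for such models: a raw score `w2 . sigma(W1 [p_t ⊕ q_j] + b1) + b2` per user, a Softmax over the users in B(j), a weighted sum of user embeddings, and a dense layer on top. All function and parameter names (`location_latent_factor`, `location_vector`, `w2`, `W_z`, `b_z`), the ReLU activation, and the layer shapes are hypothetical and are not taken from the patent.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the maximum for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()


def location_latent_factor(P, q_j, W1, b1, w2, b2, W, b):
    """P holds the embeddings p_t of the users in B(j), one per row;
    q_j is the embedding of location j. Returns the latent factor of
    the location and the normalized attention weights mu_jt."""
    # Concatenate each user embedding with the location embedding: [p_t ⊕ q_j]
    pairs = np.hstack([P, np.tile(q_j, (P.shape[0], 1))])
    hidden = relu(pairs @ W1.T + b1)   # first attention layer
    scores = hidden @ w2 + b2          # second attention layer, one raw score per user
    mu = softmax(scores)               # normalized attention weights
    aggregated = mu @ P                # attention-weighted sum of user embeddings
    return relu(W @ aggregated + b), mu


def location_vector(h_j, f_j, W_z, b_z):
    """Connect the latent factor h_j and the feature vector f_j into z_j."""
    return relu(W_z @ np.concatenate([h_j, f_j]) + b_z)
```

Here `f_j` stands for the spliced feature vector of claim 2, for example the four per-feature latent factors of a location joined end to end.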
CN202110414543.8A 2021-04-16 2021-04-16 Social network position vectorization modeling method Active CN113158089B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110414543.8A CN113158089B (en) 2021-04-16 2021-04-16 Social network position vectorization modeling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110414543.8A CN113158089B (en) 2021-04-16 2021-04-16 Social network position vectorization modeling method

Publications (2)

Publication Number Publication Date
CN113158089A CN113158089A (en) 2021-07-23
CN113158089B true CN113158089B (en) 2022-04-19

Family

ID=76868602

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110414543.8A Active CN113158089B (en) 2021-04-16 2021-04-16 Social network position vectorization modeling method

Country Status (1)

Country Link
CN (1) CN113158089B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608174A (en) * 2015-12-21 2016-05-25 西北工业大学 Cross-modal node link clustering based community discovery method
CN106372072A (en) * 2015-07-20 2017-02-01 北京大学 Location-based recognition method for user relations in mobile social network
CN108460101A (en) * 2018-02-05 2018-08-28 山东师范大学 Point-of-interest recommendation method for location-based social networks based on geographical location regularization
CN111460277A (en) * 2020-02-19 2020-07-28 天津大学 Personalized recommendation method based on mobile social network tree-shaped transmission path
CN111563770A (en) * 2020-04-27 2020-08-21 杭州金智塔科技有限公司 Click rate estimation method based on feature differentiation learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120010867A1 (en) * 2002-12-10 2012-01-12 Jeffrey Scott Eder Personalized Medicine System
US11074495B2 (en) * 2013-02-28 2021-07-27 Z Advanced Computing, Inc. (Zac) System and method for extremely efficient image and pattern recognition and artificial intelligence platform

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
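Social network dominance based on analysis of asymmetry; Yuemeng Li et al.; 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM); 20161124; 146-151 *
Research on personalized location recommendation algorithms based on LBSNs; 陈心怡; China Master's Theses Full-text Database, Information Science and Technology; 20220215 (No. 02, 2022); I138-1340 *
Research and implementation of a multi-factor matrix factorization recommendation algorithm; 张文博; China Master's Theses Full-text Database, Information Science and Technology; 20190815 (No. 08, 2019); I138-1443 *
Research and implementation of a community discovery method based on geographic location features in social networks; 蒋江涛; China Master's Theses Full-text Database, Information Science and Technology; 20150215 (No. 02, 2015); I139-139 *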

Also Published As

Publication number Publication date
CN113158089A (en) 2021-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210723

Assignee: Guilin Huajieyu Network Technology Co.,Ltd.

Assignor: Guilin University of Electronic Technology

Contract record no.: X2022450000361

Denomination of invention: A Method of Social Network Location Vector Modeling

Granted publication date: 20220419

License type: Common License

Record date: 20221219

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210723

Assignee: Guilin Jinghui Software Technology Co.,Ltd.

Assignor: Guilin University of Electronic Technology

Contract record no.: X2022450000428

Denomination of invention: A Method of Social Network Location Vector Modeling

Granted publication date: 20220419

License type: Common License

Record date: 20221227
