CN116166885A - Interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling


Info

Publication number
CN116166885A
Authority
CN
China
Prior art keywords
interest
point
user
points
sign
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310037251.6A
Other languages
Chinese (zh)
Inventor
顾晶晶
李瀚哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202310037251.6A priority Critical patent/CN116166885A/en
Publication of CN116166885A publication Critical patent/CN116166885A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a point-of-interest recommendation method based on user multi-behavior enhancement and efficient, information-rich negative sampling, which comprises the following steps: acquiring user multi-behavior data and preprocessing the data; constructing three context relationship graphs from the user multi-behavior data; learning multi-view representations of users and points of interest with a Skip-Gram model; constructing a pairwise data set, calculating a likelihood score for each user-point-of-interest check-in pair with a pairwise ranking model, and calculating a BPR loss function; replacing the negative sampling part of both models with a negative sampling method based on a reference-point graph; designing a joint learning framework that combines the loss of the Skip-Gram model with the BPR loss function; and calculating, with the pairwise ranking model, the likelihood score that the user will check in at any point of interest in the future, and recommending points of interest to the user accordingly. The method recommends points of interest to users more accurately, and its graph construction approach can be flexibly applied to different recommendation scenarios.

Description

Interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling
Technical Field
The invention belongs to the technical field of Point-of-Interest (POI) recommendation, and particularly relates to a Point-of-Interest recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling.
Background
With the rapid development of internet technology, the volume of online information has grown explosively. Screening out the parts that interest a user from this mass of information has become increasingly difficult, which has given rise to personalized recommendation systems. According to statistics, the number of mobile internet users in China exceeded 986 million by 2020. These users continuously provide governments, enterprises, and institutions with mobility information about mass travel, urban development, and the like. Location-Based Networking Services (LBNS) built on GPS information have promoted the development of route planning, point-of-interest search, and public welfare services; they bring convenience and security to urban residents, make urban planning and business site selection more rational, and deliver considerable commercial benefit.
Point-of-interest (POI) recommendation is a challenging research hotspot in urban computing and plays an important role in LBNS. As wireless communication and location acquisition technologies have matured, many location-based applications have emerged, such as DiDi, Google Maps, and Uber. Using this software, users generate a large number of behavior records (e.g., check-in information, movement trajectories, and map queries) that reflect their real-life experiences. The task of point-of-interest recommendation is to use these behavior records to recommend, in a personalized way, places that users are interested in or have never visited, thereby helping users explore urban life, providing travel services, improving quality of life, and creating new business value.
Traditional point-of-interest recommendation methods are based on collaborative filtering (CF) and use user check-in data to learn user preferences. However, because sufficient check-in logs are difficult to collect and users' range of movement is limited, severe data sparsity significantly reduces recommendation performance. To address this problem, some studies have constructed ranking pairs to describe user preferences more fully. Meanwhile, Jiao et al. use contextual information, such as the category and geographic influence of points of interest, to better understand users' visiting behavior. Recently, Xie, Zhang et al. have proposed graph-based point-of-interest recommendation methods that extract relationship graphs from various contexts and learn representations through graph embedding or GNN models for more accurate modeling.
Although existing research has made progress on point-of-interest recommendation, most work relies only on users' historical check-in data, and because check-in data is severely sparse, user check-in behavior is still insufficiently mined. In addition, the above work neglects user map query behavior and the selection of effective negative samples.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a point-of-interest recommendation method based on user multi-behavior enhancement and efficient, information-rich negative sampling. The method is suitable for user check-in data sets and map query data sets, extracts user multi-behavior relationships more completely, further explores how more informative negative samples obtained from geographic factors influence a user's choice of points of interest when traveling, improves recommendation performance, and recommends points of interest that better match user preferences.
The technical solution for achieving the object of the invention is as follows: a point-of-interest recommendation method based on user multi-behavior enhancement and efficient, information-rich negative sampling, the method comprising the following steps:
step 1, acquiring user multi-behavior data, and preprocessing the data to remove outlier data;
step 2, constructing three context relationship graphs from the user multi-behavior data: a check-in point of interest to query point of interest relationship graph $G_{p-q}$, a user to query point of interest relationship graph $G_{u-q}$, and a check-in point of interest to check-in point of interest relationship graph $G_{p-p}$, wherein each node in a relationship graph represents a user or a point of interest;
step 3, using a Skip-Gram model to mine the user multi-behavior relationships in the three context relationship graphs, and learning multi-view representations of users and points of interest;
step 4, constructing a pairwise data set, calculating a likelihood score for each user-point-of-interest check-in pair with a pairwise ranking model, and calculating the BPR loss;
step 5, replacing the negative sampling part of the Skip-Gram and pairwise ranking models with a negative sampling method based on a reference-point graph;
step 6, designing a joint learning framework that combines the Skip-Gram model with the loss function of the BPR method, and jointly learning the representations of users and points of interest in the check-in and query spaces;
step 7, calculating, with the pairwise ranking model, the likelihood score that the user will check in at any point of interest in the future, and recommending points of interest to the user.
A point of interest recommendation system based on user multi-behavior enhancement and efficient negative sampling of rich information, the system comprising:
the first module is used for acquiring multi-behavior data of the user, preprocessing the data and eliminating outlier data;
the second module is used for constructing three context relationship graphs from the user multi-behavior data: a check-in point of interest to query point of interest relationship graph $G_{p-q}$, a user to query point of interest relationship graph $G_{u-q}$, and a check-in point of interest to check-in point of interest relationship graph $G_{p-p}$, wherein each node in a relationship graph represents a user or a point of interest;
the third module is used for utilizing the Skip-Gram model to mine the multi-behavior relation of the users in the three context relation diagrams and learning multi-view representation of the users and the interest points;
a fourth module for constructing a pairwise data set, calculating a likelihood score for each user-point of interest sign-in pair occurrence using a pairwise ordering model, and calculating BPR loss;
a fifth module for replacing the negative sampling part in the Skip-Gram and pairwise ordering model with a negative sampling method based on a reference point diagram;
a sixth module, configured to design a joint learning framework, combine the Skip-Gram model with a loss function of the BPR method, and jointly learn representations of the user and the interest point in a check-in and query space;
and a seventh module for calculating the likelihood score of the user to any interest point in the future by using the pairwise ordering model, and recommending the interest point for the user.
Compared with the prior art, the invention has the remarkable advantages that:
1) The implicit information and internal relationships in user multi-behavior data are explored; combining user check-in behavior with map query behavior alleviates the data sparsity problem.
2) Three context relationship graphs are constructed for user check-in behavior and map query behavior, so that more accurate representations of users and points of interest are learned.
3) A negative sampling method based on a reference-point graph incorporates geographic factors into the sampling process and generates more informative negative samples, improving model convergence speed and representation learning accuracy.
4) A joint learning framework shares the representations of users and points of interest, which avoids overfitting.
The invention is described in further detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a flow chart of a point of interest recommendation method based on user multi-behavior enhancement and efficient negative sampling of rich information.
FIG. 2 is a performance comparison graph for one embodiment, where panel (a) compares four variants of the invention on Precision and panel (b) compares them on Recall.
FIG. 3 is a performance comparison graph for one embodiment, where panel (a) compares four variants of the invention on MAP and panel (b) compares them on NDCG.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
It should be noted that where descriptions such as "first" and "second" appear in the embodiments of the invention, they are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, it should be considered not to exist and falls outside the scope of protection claimed by the invention.
In one embodiment, there is provided a point of interest recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling, the method comprising the steps of:
step 1, acquiring user multi-behavior data (user interest point sign-in data and user map query data), and preprocessing the data to eliminate outlier data;
here, the user multi-behavior data includes:
user interest point sign-in data comprising user ID, sign-in time, interest point ID, interest point name, interest point category and interest point location;
the user map query data includes user ID, query time, point of interest ID, point of interest name, point of interest category, and point of interest location.
Step 2, constructing three context relationship graphs from the user multi-behavior data: a check-in point of interest to query point of interest relationship graph $G_{p-q}$, a user to query point of interest relationship graph $G_{u-q}$, and a check-in point of interest to check-in point of interest relationship graph $G_{p-p}$, wherein each node in a graph represents a user or a point of interest;
step 3, utilizing a Skip-Gram model to mine user multi-behavior relations in the three context relation diagrams, and learning multi-view representation of the user and the interest points;
step 4, constructing a pair data set, calculating a probability score of occurrence of each user-interest point sign-in pair by using a pair ordering model, and calculating BPR loss;
step 5, replacing the negative sampling part in the Skip-Gram and pairwise ordering model with a negative sampling method based on a reference point diagram;
here, because ordinary negative sampling cannot screen the huge pool of candidate negative samples, selecting negative samples that are more important and more informative improves model convergence speed and representation learning accuracy. A negative sampling method based on the reference-point graph is therefore adopted.
Step 6, designing a joint learning framework, combining the Skip-Gram model with a loss function of the BPR method, and jointly learning the representations of users and interest points in a sign-in and query space;
step 7, calculating, with the pairwise ranking model, the likelihood score that the user will check in at any point of interest in the future, and recommending points of interest to the user.
Further, in one embodiment, three context graphs are constructed in step 2, specifically:
step 2-1, first dividing a day into four time periods: (1) a morning period, 03:00 to 10:59; (2) an afternoon period, 11:00 to 15:59; (3) an evening period, 16:00 to 21:59; and (4) a midnight period, 22:00 to 02:59; and combining these with the distinction between weekdays and weekends to obtain eight time periods;
step 2-2, constructing three context graphs according to the time period:
the check-in point of interest to query point of interest relationship graph $G_{p-q}$ is constructed by establishing an edge between points of interest for which a check-in action and a query action occur on the same day or in the same time period, expressed as $(v_j^{c}, v_k^{q})$, where $v_j^{c}$ denotes the representation of point of interest $v_j$ in the check-in space and $v_k^{q}$ denotes the representation of point of interest $v_k$ in the query space; the graph is denoted $G_{p-q} = (V, E_{p-q}, W_{p-q})$, where $V$ is the set of points of interest, $E_{p-q}$ is the set of edges in the graph, and $W_{p-q}$ records the number of times each edge occurs;
the user to query point of interest relationship graph $G_{u-q}$ is constructed by establishing an edge between a user and each point of interest that the user queries during the same time period, expressed as $(u_i, v_k^{q})$, where $u_i$ denotes user $i$ and $v_k^{q}$ denotes the representation of point of interest $v_k$ in the query space; the graph is denoted $G_{u-q} = (U, V, E_{u-q}, W_{u-q})$, where $U$ is the set of users, $V$ is the set of points of interest, $E_{u-q}$ is the set of edges in the graph, and $W_{u-q}$ records the number of times each edge occurs;
the check-in point of interest to check-in point of interest relationship graph $G_{p-p}$ is constructed by establishing an edge between points of interest at which a user checks in on the same day or in the same time period, expressed as $(v_j^{c}, v_k^{c})$, where $v_j^{c}$ and $v_k^{c}$ denote the representations of points of interest $v_j$ and $v_k$ in the check-in space, respectively; the graph is denoted $G_{p-p} = (V, E_{p-p}, W_{p-p})$, where $V$ is the set of points of interest, $E_{p-p}$ is the set of edges in the graph, and $W_{p-p}$ records the number of times each edge occurs.
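The graph construction of step 2 can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function names `time_slot` and `build_graphs`, the `(user_id, datetime, poi_id)` record layout, and the choice to group records by user, calendar date, and time period are all assumptions made for the example. In particular, the midnight period spans two calendar dates, which this sketch does not merge.

```python
from collections import Counter, defaultdict
from datetime import datetime
from itertools import product


def time_slot(ts: datetime) -> int:
    """Map a timestamp to one of eight slots: four day periods x {weekday, weekend}."""
    h = ts.hour
    if 3 <= h < 11:
        period = 0  # morning, 03:00 to 10:59
    elif 11 <= h < 16:
        period = 1  # afternoon, 11:00 to 15:59
    elif 16 <= h < 22:
        period = 2  # evening, 16:00 to 21:59
    else:
        period = 3  # midnight, 22:00 to 02:59
    is_weekend = ts.weekday() >= 5  # Saturday = 5, Sunday = 6
    return period + (4 if is_weekend else 0)


def build_graphs(checkins, queries):
    """Return edge-weight counters (W_pq, W_uq, W_pp) for the three context graphs.

    checkins / queries: iterables of (user_id, datetime, poi_id).
    ASSUMPTION: co-occurrence is judged per user, per calendar date, per slot.
    """
    def bucket(records):
        groups = defaultdict(set)
        for user, ts, poi in records:
            groups[(user, ts.date(), time_slot(ts))].add(poi)
        return groups

    c_groups, q_groups = bucket(checkins), bucket(queries)
    w_pq, w_uq, w_pp = Counter(), Counter(), Counter()

    for key, c_pois in c_groups.items():
        # G_p-q: checked-in POI linked to POIs queried in the same bucket
        for vj, vk in product(c_pois, q_groups.get(key, ())):
            w_pq[(vj, vk)] += 1
        # G_p-p: pairs of distinct POIs checked in within the same bucket
        for vj, vk in product(c_pois, repeat=2):
            if vj != vk:
                w_pp[(vj, vk)] += 1

    # G_u-q: user linked to every POI queried within a bucket
    for (user, _, _), q_pois in q_groups.items():
        for vk in q_pois:
            w_uq[(user, vk)] += 1

    return w_pq, w_uq, w_pp
```

The counters play the role of $W_{p-q}$, $W_{u-q}$, and $W_{p-p}$: their keys are the edges and their values are the number of times each edge occurs.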
Further, in one embodiment, the Skip-Gram model used in step 3 is specifically:
for the check-in point of interest to query point of interest relationship graph $G_{p-q}$ described in step 2, the loss function of the Skip-Gram model is:
[equation image: Skip-Gram loss for $G_{p-q}$]
where $\sigma$ is the logistic (sigmoid) function $\sigma(x) = 1/(1 + e^{-x})$, $L$ is the number of negative samples drawn, $P_v$ is the distribution from which negative samples are drawn, and $v_k^{q}$ and $v_q$ are vector representations of points of interest in the query space.
For the user to query point of interest relationship graph $G_{u-q}$, the loss function of the Skip-Gram model is:
[equation image: Skip-Gram loss for $G_{u-q}$]
where $u_i^{q}$ is the vector representation of user $i$ in the query space;
for the check-in point of interest to check-in point of interest relationship graph $G_{p-p}$, the loss function of the Skip-Gram model is:
[equation image: Skip-Gram loss for $G_{p-p}$]
where the point-of-interest vectors are representations in the check-in space.
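As a rough illustration of what a Skip-Gram loss with negative sampling looks like for a single edge, the following Python sketch uses the widely known form "negative log-sigmoid of the positive pair plus negative log-sigmoid of each negated negative pair". This standard form is an assumption; the exact loss in the patent's equation images (for example, whether edges are weighted by their occurrence counts) is not recoverable from the text. The helper names are hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigma(x)) with sigma(x) = 1 / (1 + e^(-x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))


def sgns_edge_loss(center, context, negatives) -> float:
    """Negative-sampling Skip-Gram loss for one edge (center, context).

    center, context: embedding vectors (lists of floats) of the two end nodes.
    negatives: list of L embedding vectors drawn from the negative distribution.
    ASSUMPTION: standard SGNS form, one term per negative sample.
    """
    positive_term = log_sigmoid(dot(context, center))
    negative_term = sum(log_sigmoid(-dot(n, center)) for n in negatives)
    return -(positive_term + negative_term)
```

For the graph $G_{p-q}$, `center` would be a check-in-space vector and `context` and `negatives` query-space vectors; for $G_{u-q}$, `center` would be a user's query-space vector.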
Further, in one embodiment, the pairwise data set in step 4 is constructed as:
[equation image: definition of the pairwise data set $D_s$]
where the set appearing in the definition is the set of points of interest that user $i$ has checked in to; $(u_i, v_j, v_k) \in D_s$ means that user $i$ prefers a visited point of interest $v_j$ over an unvisited point of interest $v_k$, which may also be written $v_j >_{u_i,poi} v_k$, where the relation $>_{u_i,poi}$ satisfies 1) completeness (totality), 2) antisymmetry, and 3) transitivity.
Further, in one embodiment, step 4 uses a pairwise ranking model to calculate the likelihood score of a user-point-of-interest check-in pair, specifically:
the ranking relationship between points of interest $v_j$ and $v_k$ can also be expressed as:
[equation image: ranking relationship between $v_j$ and $v_k$]
where $v_j$ is a point of interest visited by user $u_i$, $v_k$ is a point of interest the user has not visited, and $x_{ij}$ and $x_{ik}$ are the scores for the user checking in at $v_j$ and $v_k$ in the future, respectively; $x_{ij}$ is calculated as:
[equation image: score $x_{ij}$]
where $w$ is a hyper-parameter that balances the user's query preference and check-in preference; $u_i^{c}$, $u_i^{q}$, and $v_j^{c}$ are, respectively, the vector representations of user $i$ in the check-in space and the query space and the vector representation of point of interest $v_j$ in the check-in space; $v_j^{q}$ is not used here because it reflects the probability that the user queries a point of interest rather than the probability of checking in at it. The BPR loss function is:
[equation image: BPR loss]
where $D_s$ is the pairwise data set and $\sigma$ is the logistic (sigmoid) function $\sigma(x) = 1/(1 + e^{-x})$.
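The pairwise ranking step can be illustrated with the Python sketch below. Two points are assumptions, since the patent's equation images are not reproduced in the text: the score is taken to be a convex combination of the check-in-space and query-space user vectors applied to the check-in-space point-of-interest vector, and the BPR loss is taken in its common unregularized form, the sum of $-\ln\sigma(x_{ij} - x_{ik})$ over the triples. The names `score` and `bpr_loss` and the dictionary-based embedding tables are hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigma(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))


def score(u_c, u_q, v_c, w: float) -> float:
    """Check-in score x_ij for one user and one point of interest.

    ASSUMPTION: x_ij = (1 - w) * <u_c, v_c> + w * <u_q, v_c>; the patent only
    states that w balances query preference against check-in preference.
    """
    return (1.0 - w) * dot(u_c, v_c) + w * dot(u_q, v_c)


def bpr_loss(triples, U_c, U_q, V_c, w: float) -> float:
    """Sum of -ln sigma(x_ij - x_ik) over triples (user i, visited j, unvisited k)."""
    total = 0.0
    for i, j, k in triples:
        x_ij = score(U_c[i], U_q[i], V_c[j], w)
        x_ik = score(U_c[i], U_q[i], V_c[k], w)
        total -= log_sigmoid(x_ij - x_ik)
    return total
```

The loss is small when visited points of interest score higher than unvisited ones and grows when the order is reversed, which is the behavior a pairwise ranking objective is meant to reward.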
Further, in one embodiment, in step 5 the negative sampling part of the Skip-Gram and pairwise ranking models is replaced with a negative sampling method based on a reference-point graph; the specific process includes:
step 5-1, randomly selecting points of interest from the point-of-interest set $V$ at a given proportion $\tau$ to serve as reference points, and defining the set of reference points as $V_{base}$;
step 5-2, taking each reference point as a center, recording the other points of interest within radius $r$, defining all points of interest in that area as geographically adjacent to one another, and using this adjacency to construct the reference-point graph $G_{base} = (V, E_{nei})$, where $V$ is the set of points of interest and $E_{nei}$ is the set of edges in the reference-point graph, of the form:
$$E_{nei} = \{(v_i, v_j) \mid v_{base} \in V_{base},\ dis(v_i, v_{base}) < r,\ dis(v_j, v_{base}) < r\}$$
where $V_{base}$ is the set of reference points and $dis(x, y)$ is the geographic distance between two points of interest;
step 5-3, for each edge $(v_j, v_k)$ of the context graphs $G_{p-p}$ and $G_{p-q}$, if $v_j$ appears in an edge of $E_{nei}$ in the reference-point graph, randomly drawing $L$ negative samples from the set of its adjacent points $V_{j-nei} = \{v \mid (v_j, v) \in E_{nei}\}$; otherwise, sampling randomly; for the context graph $G_{u-q}$, defining the historical check-in trajectory of user $i$ as $S_i = \{s_1, s_2, \ldots, s_n\}$ with trajectory center point $v_c$, whose coordinates are calculated as:
[equation images: longitude and latitude of the trajectory center point $v_c$]
where the two coordinate symbols denote the longitude and latitude of the user's $j$-th check-in record, respectively;
a negative sample point set is then defined for user $i$, consisting of points whose distance from the trajectory center point does not exceed $r$:
[equation image: negative sample point set for user $i$]
where $dis(x, y)$ is the distance between two points of interest;
step 5-4, the Skip-Gram loss functions of the three context graphs in step 3 can then be rewritten as:
[equation images: rewritten Skip-Gram losses for $G_{p-q}$, $G_{u-q}$, and $G_{p-p}$]
where $V_{j-nei}$ is the set of adjacent sample points of point of interest $v_j$, $v_{neg}^{q}$ is the vector representation of point of interest $v_{neg}$ in the query space, and $v_{neg}^{c}$ is the vector representation of point of interest $v_{neg}$ in the check-in space;
step 5-5, the BPR loss function in step 4 can be rewritten as:
[equation image: rewritten BPR loss]
where $D_{base}$ is the pairwise ranking training set collected with the reference-point-graph negative sampling, defined as:
[equation image: definition of $D_{base}$]
where $V_{j-nei}$ is the set of adjacent sample points of point of interest $v_j$.
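Steps 5-1 to 5-3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the patented implementation: the haversine formula stands in for the unspecified distance $dis(x, y)$, the trajectory center is taken to be the arithmetic mean of the check-in coordinates (the patent's formula is not reproduced in the text), self-pairs are left out of the neighbor sets, and all function names are hypothetical.

```python
import math
import random


def haversine_km(a, b) -> float:
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def build_neighbor_sets(coords, tau, r, rng):
    """Steps 5-1 and 5-2: pick reference points, then link POIs sharing an area.

    coords: {poi_id: (lat, lon)}; tau: fraction of POIs used as reference points;
    r: radius in km. Returns {poi_id: set of adjacent poi_ids} (V_j-nei).
    """
    pois = sorted(coords)
    n_base = max(1, round(tau * len(pois)))
    base_points = rng.sample(pois, n_base)
    neighbors = {}
    for b in base_points:
        area = [v for v in pois if haversine_km(coords[v], coords[b]) < r]
        for v in area:
            # every POI in the area is adjacent to every other POI in it
            neighbors.setdefault(v, set()).update(x for x in area if x != v)
    return neighbors


def sample_negatives(v_j, neighbors, all_pois, L, rng):
    """Step 5-3: draw L negatives from V_j-nei, else fall back to random POIs."""
    pool = sorted(neighbors.get(v_j, ()))
    if not pool:
        pool = [v for v in all_pois if v != v_j]
    return [rng.choice(pool) for _ in range(L)]


def track_center(track):
    """Center of a check-in trajectory given as a list of (lat, lon) pairs.

    ASSUMPTION: arithmetic mean of the coordinates.
    """
    n = len(track)
    return (sum(p[0] for p in track) / n, sum(p[1] for p in track) / n)
```

A short usage example: with `rng = random.Random(0)`, calling `build_neighbor_sets(coords, tau=0.1, r=1.0, rng=rng)` once and reusing the returned dictionary for every edge keeps sampling cheap, since the geographic computation is done only at construction time.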
Further, in one embodiment, the joint learning framework described in step 6 is calculated as:
[equation image: joint learning objective]
where $\alpha$, $\beta$, and $\gamma$ are regularization parameters that control the contributions of the three context graphs; the parameter set $\Theta$ to be trained comprises $U^{c}$, $V^{c}$, $U^{q}$, and $V^{q}$, the vector representations of users and points of interest in the check-in space and the query space, respectively; the remaining symbols are as defined above.
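Since the objective itself survives only as an image, the following LaTeX gives one plausible shape of such a joint objective, assuming a simple weighted sum. The symbols $\mathcal{L}_{\mathrm{BPR}}$, $\mathcal{L}_{p\text{-}q}$, $\mathcal{L}_{u\text{-}q}$, and $\mathcal{L}_{p\text{-}p}$ are introduced here for illustration, and the assignment of $\alpha$, $\beta$, and $\gamma$ to particular graphs is an assumption, not something the text states.

```latex
\mathcal{L}(\Theta)
  = \mathcal{L}_{\mathrm{BPR}}
  + \alpha\,\mathcal{L}_{p\text{-}q}
  + \beta\,\mathcal{L}_{u\text{-}q}
  + \gamma\,\mathcal{L}_{p\text{-}p},
\qquad
\Theta = \{U^{c},\, V^{c},\, U^{q},\, V^{q}\}
```

Under this reading, the BPR term and the three Skip-Gram terms update the same embedding tables, which is how the shared representations couple the check-in and query spaces.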
Further, in one embodiment, step 7 uses the pairwise ranking model to calculate the likelihood score that the user will check in at any point of interest in the future and recommends points of interest to the user based on that score, specifically:
step 7-1, with the user and point-of-interest representations $U^{c}$, $V^{c}$, and $U^{q}$ learned by the joint learning framework of step 6, calculating the score for user $i$ checking in at point of interest $v_j$ in the future using the formula from step 4:
[equation image: score $x_{ij}$]
where $w$ is a hyper-parameter that balances the user's query preference and check-in preference; $u_i^{c}$, $u_i^{q}$, and $v_j^{c}$ are, respectively, the vector representations of user $u_i$ in the check-in space and the query space and the vector representation of point of interest $v_j$ in the check-in space;
step 7-2, after calculating the scores of all points of interest that user $i$ has not visited, sorting them by score from high to low and recommending the highest-scoring points of interest to the user.
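Step 7-2 amounts to a filtered top-N selection, sketched below in Python. The function name `recommend` and the dictionary layout of the scores are assumptions for the example; ties are broken by point-of-interest ID only to make the output deterministic.

```python
def recommend(scores, visited, n):
    """Return the n highest-scoring points of interest the user has not visited.

    scores: {poi_id: x_ij} for one user; visited: set of poi_ids already visited.
    """
    candidates = [(s, p) for p, s in scores.items() if p not in visited]
    # sort by score descending, then by POI id for a deterministic order
    candidates.sort(key=lambda t: (-t[0], t[1]))
    return [p for _, p in candidates[:n]]
```

For large point-of-interest sets, `heapq.nlargest` would avoid sorting every candidate, but a full sort keeps the sketch easy to follow.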
In one embodiment, a point of interest recommendation system based on user multi-behavior enhancement and efficient negative sampling of rich information is provided, the system comprising:
the first module is used for acquiring multi-behavior data of the user, preprocessing the data and eliminating outlier data;
the second module is used for constructing three context relationship graphs from the user multi-behavior data: a check-in point of interest to query point of interest relationship graph $G_{p-q}$, a user to query point of interest relationship graph $G_{u-q}$, and a check-in point of interest to check-in point of interest relationship graph $G_{p-p}$, wherein each node in a relationship graph represents a user or a point of interest;
the third module is used for utilizing the Skip-Gram model to mine the multi-behavior relation of the users in the three context relation diagrams and learning multi-view representation of the users and the interest points;
a fourth module for constructing a pairwise data set, calculating a likelihood score for each user-point of interest sign-in pair occurrence using a pairwise ordering model, and calculating BPR loss;
a fifth module for replacing the negative sampling part in the Skip-Gram and pairwise ordering model with a negative sampling method based on a reference point diagram;
a sixth module, configured to design a joint learning framework, combine the Skip-Gram model with a loss function of the BPR method, and jointly learn representations of the user and the interest point in a check-in and query space;
and a seventh module for calculating the likelihood score of the user to any interest point in the future by using the pairwise ordering model, and recommending the interest point for the user.
Specific limitations regarding the point of interest recommendation system based on user multi-behavior enhancement and efficient rich information negative sampling can be found in the above description of the point of interest recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling, and will not be described in detail herein. The various modules in the point of interest recommendation system based on user multi-behavior enhancement and efficient rich information negative sampling described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
As a specific example, in one embodiment, the present invention is further illustrated.
This embodiment uses real-world urban user check-in and map query data sets for the experiments. The data sets contain 474,969 check-in records and 443,404 query records from July 1, 2018 to September 30, 2018; each record includes a user ID, a check-in/query time, a point-of-interest ID, a point-of-interest category, a point-of-interest location, and a point-of-interest name. The data are split by time: the user check-in records of the last week form the test set, those of the second-to-last week form the validation set, and the remaining data form the training set. The learning rate is set to 0.001. Because the three context graphs have different numbers of edges, the data scale of each sub-task differs; to balance them, the batch size is set to 128 for graphs $G_{p-q}$ and $G_{u-q}$ and to 512 for graph $G_{p-p}$. The latent representation dimension $d$ of users and points of interest is set to 32, and the model is optimized with the Skip-Gram and BPR loss functions, which are minimized by backpropagation. Model performance is evaluated with Precision, Recall, mean average precision (MAP), and normalized discounted cumulative gain (NDCG).
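For readers unfamiliar with these metrics, the Python sketch below shows one common way to compute them for a single user with binary relevance. The patent does not specify its exact definitions, so the normalization choices here (for example, dividing average precision by min(|relevant|, k)) are assumptions; MAP is then the mean of the per-user average precision values.

```python
import math


def precision_recall_at_k(ranked, relevant, k):
    """Precision@k and Recall@k for one user.

    ranked: recommended poi_ids in rank order; relevant: set of test-set poi_ids.
    """
    hits = sum(1 for p in ranked[:k] if p in relevant)
    recall = hits / len(relevant) if relevant else 0.0
    return hits / k, recall


def average_precision_at_k(ranked, relevant, k):
    """AP@k: mean of the precision values at each rank where a hit occurs."""
    hits, total = 0, 0.0
    for rank, p in enumerate(ranked[:k], start=1):
        if p in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2(rank + 1) discount."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, p in enumerate(ranked[:k], start=1) if p in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```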
In this embodiment, several currently popular point of interest recommendation methods are selected as baselines for comparison experiments: weighted regularized matrix factorization (WRMF), Bayesian personalized ranking (BPR), a personalized Top-N sequential recommendation model based on convolutional sequence embedding (CASER), a preference and context embedding model (PACE), neural graph collaborative filtering (NGCF), disentangled graph collaborative filtering (DGCF), a self-attentive autoencoder with neighbor-aware influence (SAE-NAD), a collaborative recommendation method combined with geographical influence (USG), a negative sampling model for graph-based collaborative filtering (MixGCF), and an ultra-simplified graph convolution model (UltraGCN). The method of the present invention is denoted UMPRec, and the variant of this model that uses ordinary random negative sampling is denoted UMPRec-r.
Tables 1 and 2 show the results of point of interest recommendations made by different methods on a real dataset.
Table 1 Performance comparison of different methods on the real data set (Precision@k, Recall@k)
Figure BDA0004049307580000101
Table 2 Performance comparison of different methods on the real data set (MAP@k, NDCG@k)
Figure BDA0004049307580000111
Tables 1 and 2 show that the method proposed by the present invention (UMPRec) performs best. The tables also show the following: (1) CASER and PACE perform relatively poorly, probably because the data are too sparse; the poor performance of PACE may additionally be caused by its multi-layer perceptron (MLP) being ineffective at extracting features from sparse data. (2) Of the two methods WRMF and BPR, the latter performs better, mainly because negative sampling enables the trained model to distinguish positive samples from negative samples effectively; on implicit data, the pairwise ranking method therefore outperforms the pointwise optimization algorithm. (3) Both USG and SAE-NAD take the influence of geographical factors into account and therefore outperform most of the comparison methods. (4) NGCF, DGCF and UltraGCN are all based on GNNs and collaborative filtering. The performance of NGCF and DGCF differs little, mainly because the data are severely sparse and the two methods adopt similar GNN structures; UltraGCN performs better because it filters out redundant relations between users and points of interest, preventing excessive noise from being introduced. (5) MixGCF also achieves good performance because it makes use of effective negative samples. (6) Finally, the UMPRec model of the present invention outperforms the other comparison methods on every metric; moreover, the comparison with the UMPRec-r variant demonstrates the superiority of the key negative sample selection method based on the reference point graph and confirms that geographic distance is an important factor when a user selects a point of interest.
To explore the effect of user check-in and query behaviors on point of interest recommendation performance, in addition to the complete model "All", which considers all context relations, three variants of the present invention are set up: (1) BPR, which ignores all other context information and recommends points of interest using only the users' check-in data; (2) WithPOI-POI, which considers only the context graph $G_{p-p}$ between points of interest; (3) WithQuery, which considers only the context graphs $G_{u-q}$ and $G_{p-q}$ related to map query behavior. It should be noted that, because the UMPRec and UMPRec-r models both consider the user's multi-behavior relations and differ only in the negative sampling method, their results differ little; the analysis of the UMPRec model is therefore mainly presented here. The final experimental results are shown in Fig. 2 and Fig. 3.
As can be seen from Fig. 2 and Fig. 3, first, the models that use the user multi-behavior relations outperform the BPR method, which does not consider any context information. Second, WithPOI-POI outperforms WithQuery, because a user's check-in behavior clearly indicates the user's preference for a point of interest, whereas a user's map query behavior contains more noise and does not necessarily convert into check-in behavior. Finally, the complete model All provided by the invention outperforms the three variants regardless of the value of k, that is, the number of points of interest recommended to the user. This indicates that the three context relations extracted from the users' various behaviors are useful and that the check-in behavior and the query behavior of a user are complementary to some degree, so that together they improve point of interest recommendation.
In summary, the point of interest recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling makes full use of user multi-behavior data and mines three context relations, which alleviates data sparsity and improves point of interest recommendation performance. In addition, the negative sampling method based on the reference point graph extracts more informative negative samples, enabling efficient training and more accurate representations of users and points of interest. Finally, the comparison with other related algorithms further verifies that the method of the invention recommends points of interest to users more accurately.
The foregoing has outlined and described the basic principles, features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims.

Claims (9)

1. A point of interest recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling, the method comprising the steps of:
step 1, acquiring user multi-behavior data, and preprocessing the data to remove outlier data;
step 2, constructing three context relationship graphs from the user multi-behavior data: a check-in point of interest and query point of interest relationship graph $G_{p-q}$, a user and query point of interest relationship graph $G_{u-q}$, and a check-in point of interest and check-in point of interest relationship graph $G_{p-p}$; wherein each node in a relationship graph represents a user or a point of interest;
step 3, using a Skip-Gram model to mine the user multi-behavior relations in the three context relationship graphs, and learning multi-view representations of the users and the points of interest;
step 4, constructing a pairwise data set, calculating a probability score for the occurrence of each user-point of interest check-in pair using a pairwise ranking model, and calculating the BPR loss;
step 5, replacing the negative sampling part of the Skip-Gram model and of the pairwise ranking model with a negative sampling method based on a reference point graph;
step 6, designing a joint learning framework that combines the loss functions of the Skip-Gram model and the BPR method, and jointly learning the representations of users and points of interest in the check-in space and the query space;
step 7, calculating, with the pairwise ranking model, the probability score of the user checking in at any point of interest in the future, and recommending points of interest to the user.
2. The user multi-behavior enhancement and efficient rich information negative sampling-based point of interest recommendation method as recited in claim 1, wherein the user multi-behavior data in step 1 comprises:
user interest point sign-in data comprising user ID, sign-in time, interest point ID, interest point name, interest point category and interest point location;
the user map query data includes a user ID, a query time, a point of interest ID, a point of interest name, a point of interest category, and a point of interest location.
3. The interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling according to claim 1, wherein in step 2, three context graphs are respectively constructed by using user multi-behavior data, and specifically:
step 2-1, dividing a day into four time periods: (1) a morning period, from 3:00 to 10:59; (2) an afternoon period, from 11:00 to 15:59; (3) an evening period, from 16:00 to 21:59; and (4) a midnight period, from 22:00 to 2:59 of the following day; combining these with the two day types, weekday and weekend, to obtain eight time periods;
step 2-2, constructing three context graphs according to the time periods divided in the step 2-1:
the check-in point of interest and query point of interest relationship graph $G_{p-q}$: an edge is established between points of interest at which check-in and query actions occur within the same day or time period, expressed as $(v_j^{c}, v_k^{q})$, where $v_j^{c}$ denotes the representation of point of interest $v_j$ in the check-in space and $v_k^{q}$ denotes the representation of point of interest $v_k$ in the query space; the graph is denoted $G_{p-q} = (V, E_{p-q}, W_{p-q})$, where $V$ denotes the set of points of interest, $E_{p-q}$ denotes the set of edges in the graph, and $W_{p-q}$ records the number of times each edge occurs in the graph;
the user and query point of interest relationship graph $G_{u-q}$: an edge is established between a user and each point of interest that the user queries during the same time period, expressed as $(u_i, v_k^{q})$, where $u_i$ denotes user i and $v_k^{q}$ denotes the representation of point of interest $v_k$ in the query space; the graph is denoted $G_{u-q} = (U, V, E_{u-q}, W_{u-q})$, where $U$ denotes the set of users, $V$ denotes the set of points of interest, $E_{u-q}$ denotes the set of edges in the graph, and $W_{u-q}$ records the number of times each edge occurs in the graph;
the check-in point of interest and check-in point of interest relationship graph $G_{p-p}$: an edge is established between points of interest at which users check in on the same day or in the same time period, expressed as $(v_j^{c}, v_k^{c})$, where $v_j^{c}$ and $v_k^{c}$ respectively denote the representations of points of interest $v_j$ and $v_k$ in the check-in space; the graph is denoted $G_{p-p} = (V, E_{p-p}, W_{p-p})$, where $V$ denotes the set of points of interest, $E_{p-p}$ denotes the set of edges in the graph, and $W_{p-p}$ records the number of times each edge occurs in the graph.
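The time-period division of step 2-1 and the construction of one of the context graphs can be sketched as follows. This is an illustrative sketch only, not the claimed implementation: the function names are hypothetical, hours 0:00 to 2:59 are assumed to belong to the midnight period that began on the previous calendar day, and an edge of $G_{p-p}$ is assumed to link two points of interest checked in by the same user on the same day and in the same period.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import combinations

def time_slot(ts):
    """Map a timestamp to (date, day type, period); 2 day types x 4 periods = 8 slots."""
    h = ts.hour
    if 3 <= h < 11:
        period = "morning"
    elif 11 <= h < 16:
        period = "afternoon"
    elif 16 <= h < 22:
        period = "evening"
    else:
        period = "midnight"
    # Assumption: 0:00-2:59 is counted with the previous day's midnight period.
    day = ts - timedelta(days=1) if h < 3 else ts
    day_type = "weekend" if day.weekday() >= 5 else "weekday"
    return day.date(), day_type, period

def build_pp_graph(checkins):
    """checkins: iterable of (user_id, poi_id, datetime).

    Returns a Counter mapping an undirected edge (poi_a, poi_b) to the
    number of times it occurs, i.e. the edge set and the weights W_{p-p}.
    """
    groups = {}
    for user, poi, ts in checkins:
        day, _day_type, period = time_slot(ts)
        groups.setdefault((user, day, period), set()).add(poi)
    weights = Counter()
    for pois in groups.values():
        for a, b in combinations(sorted(pois), 2):
            weights[(a, b)] += 1
    return weights
```

The graphs $G_{p-q}$ and $G_{u-q}$ could be built the same way by grouping query records alongside check-in records and pairing across the two record types.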
4. The interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling according to claim 3, wherein the Skip-Gram model used in step 3 is specifically:
for the sign-in interest point and query interest point relationship graph G described in step 2 p-q The loss function of Skip-Gram model is:
Figure FDA0004049307570000029
where σ is the sigmoid (logistic) function, $\sigma(x) = 1/(1+e^{-x})$; L denotes the number of negative samples drawn; $P_v$ denotes the distribution from which negative samples are drawn;
Figure FDA00040493075700000210
and $v^{q}$ are both vector representations of points of interest in the query space;
graph G for user and query interest point u-q The loss function of Skip-Gram model is:
Figure FDA00040493075700000211
where
Figure FDA0004049307570000031
is a vector representation of user i in the query space;
for the sign-in interest point and sign-in interest point relation graph G p-p The loss function of Skip-Gram model is:
Figure FDA0004049307570000032
where
Figure FDA0004049307570000033
and $v^{c}$ are both vector representations of points of interest in the check-in space.
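Because the loss formulas of claim 4 survive here only as image placeholders, the following sketch shows the generic Skip-Gram negative-sampling loss for a single edge, which is the standard form such losses usually take. It is an assumption for illustration, not a transcription of the claimed formulas; the function names and the `weight` parameter (standing in for an edge count such as an entry of $W_{p-q}$) are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgns_edge_loss(center, context, negatives, weight=1.0):
    """Generic Skip-Gram negative-sampling loss for one graph edge.

    center, context: embedding vectors of the two endpoints of the edge.
    negatives: list of L embedding vectors drawn from a noise distribution.
    weight: assumed per-edge weight (e.g. how often the edge occurs).
    """
    positive_term = math.log(sigmoid(dot(context, center)))
    negative_term = sum(math.log(sigmoid(-dot(neg, center))) for neg in negatives)
    return -weight * (positive_term + negative_term)
```

Summing this quantity over all edges of a context graph gives a per-graph loss of the kind the claim refers to, with L equal to `len(negatives)`.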
5. The method for recommending interest points based on user multi-behavior enhancement and efficient negative sampling of rich information according to claim 4, wherein the paired data sets in step 4 are specifically:
Figure FDA0004049307570000034
where
Figure FDA0004049307570000035
denotes the set of points of interest at which user i has checked in; $(u_i, v_j, v_k) \in D_s$ indicates that user i prefers point of interest $v_j$, which the user has visited, to point of interest $v_k$, which the user has not visited; this can also be written as $v_j >_{u_i,poi} v_k$, where the relation $>_{u_i,poi}$ satisfies 1) totality (completeness), 2) antisymmetry, and 3) transitivity.
6. The method for recommending points of interest based on user multi-behavior enhancement and efficient negative sampling of rich information according to claim 5, wherein the calculating the likelihood score of the occurrence of the user-point of interest check-in pair using the pairwise ordering model in step 4 is specifically:
point of interest v j And v k The ordering relationship between them is expressed as:
Figure FDA00040493075700000310
where $v_j$ denotes a point of interest that user i has visited, $v_k$ denotes a point of interest that the user has not visited, and $x_{ij}$ and $x_{ik}$ respectively denote the scores of the user checking in at points of interest $v_j$ and $v_k$ in the future; $x_{ij}$ is calculated as:
Figure FDA0004049307570000036
where w denotes a hyper-parameter for balancing the user's query preference and check-in preference;
Figure FDA0004049307570000037
and
Figure FDA0004049307570000038
respectively denote the vector representations of user i in the check-in space and the query space and the vector representation of point of interest $v_j$ in the check-in space;
the loss function formula of the BPR is specifically:
Figure FDA0004049307570000039
where $D_s$ is the pairwise data set and σ is the sigmoid (logistic) function: $\sigma(x) = 1/(1+e^{-x})$.
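The scoring and BPR loss of claim 6 can be illustrated with the sketch below. The exact formula for $x_{ij}$ is available here only as an image placeholder, so the combination used in `score` (check-in preference plus w times query preference, both taken against the check-in representation of the point of interest) is an assumption, as are the function and parameter names. The loss itself is the standard BPR form, the negative sum of $\log \sigma(x_{ij} - x_{ik})$ over the pairwise data set, without any regularization term.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def score(u_c, u_q, v_c, w):
    """ASSUMED score form: x_ij = u_i^c . v_j^c + w * (u_i^q . v_j^c)."""
    return dot(u_c, v_c) + w * dot(u_q, v_c)

def bpr_loss(triples, U_c, U_q, V_c, w):
    """Standard BPR loss over triples (i, j, k): user i, visited j, unvisited k.

    U_c, U_q: user embeddings in the check-in and query spaces.
    V_c: point of interest embeddings in the check-in space.
    """
    loss = 0.0
    for i, j, k in triples:
        x_ij = score(U_c[i], U_q[i], V_c[j], w)
        x_ik = score(U_c[i], U_q[i], V_c[k], w)
        loss -= math.log(sigmoid(x_ij - x_ik))
    return loss
```

The loss shrinks as visited points of interest are scored further above unvisited ones, which is the ordering property the pairwise data set $D_s$ encodes.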
7. The user multi-behavior enhancement and high-efficiency rich information negative sampling-based interest point recommendation method according to claim 6, wherein in step 5, the negative sampling part in Skip-Gram and pairwise ordering model is replaced by a reference point diagram-based negative sampling method, and the specific process comprises:
step 5-1, randomly selecting points of interest from the point of interest set V in a given proportion τ to serve as reference points, and defining the set of reference points as $V_{base}$;
step 5-2, for each reference point, recording the other points of interest that lie within radius r of it, defining all points of interest within that area as adjacent in geographic distance, and using this adjacency to construct a reference point graph $G_{base} = (V, E_{nei})$, where V denotes the set of points of interest and $E_{nei}$ denotes the edges of the reference point graph, of the form:
$$E_{nei} = \{(v_i, v_j) \mid v_{base} \in V_{base},\ dis(v_i, v_{base}) < r,\ dis(v_j, v_{base}) < r\}$$
where $V_{base}$ denotes the set of reference points and $dis(v_i, v_{base})$ denotes the geographic distance between the two points of interest $v_i$ and $v_{base}$;
step 5-3, for each edge $(v_j, v_k)$ in the context graphs $G_{p-p}$ and $G_{p-q}$: if $v_j$ lies on an edge of the reference point graph, that is, an edge in $E_{nei}$, randomly drawing L negative samples from the set of its neighboring points $V_{j-nei} = \{v \mid (v_j, v) \in E_{nei}\}$; otherwise, drawing negative samples at random; for the context graph $G_{u-q}$, defining the historical check-in trajectory of user i as $S_i = \{s_1, s_2, \ldots, s_n\}$ and the trajectory center point as $v_c$, the coordinates of $v_c$ being calculated as:
Figure FDA0004049307570000041
Figure FDA0004049307570000042
where
Figure FDA0004049307570000043
and
Figure FDA0004049307570000044
respectively denote the longitude and latitude coordinates of the j-th check-in record of the user;
define
Figure FDA0004049307570000045
as the set of negative sample points collected for user i whose distance from the trajectory center point does not exceed r; then
Figure FDA0004049307570000046
is defined as follows:
Figure FDA0004049307570000047
where $dis(v, v_c)$ denotes the distance between the two points of interest $v$ and $v_c$, and
Figure FDA0004049307570000048
denotes the check-in history data of the user;
step 5-4, rewriting the loss functions of Skip-Gram models of the three context graphs in the step 3 as follows:
Figure FDA0004049307570000051
Figure FDA0004049307570000052
Figure FDA0004049307570000053
where $V_{j-nei}$ denotes the set of neighboring sample points of the point of interest
Figure FDA0004049307570000054
;
Figure FDA0004049307570000055
denotes the vector representation of point of interest $v_{neg}$ in the query space; and
Figure FDA0004049307570000056
denotes the vector representation of point of interest $v_{neg}$ in the check-in space;
step 5-5, rewriting the loss function of the BPR in the step 4 as follows:
Figure FDA0004049307570000057
where $D_{base}$ is the pairwise ranking training set collected using the negative sampling method based on the reference point graph, as follows:
Figure FDA0004049307570000058
where $V_{j-nei}$ denotes the set of neighboring sample points of point of interest $v_j$.
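Steps 5-1 to 5-3 for the graphs $G_{p-p}$ and $G_{p-q}$ can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, geographic distance is taken to be the haversine great-circle distance in kilometers, known positive points of interest are excluded from the sampling pool, and the sampler falls back to uniform sampling whenever the neighbor pool is empty.

```python
import math
import random

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def build_reference_neighbors(coords, tau, r_km, rng):
    """coords: {poi_id: (lat, lon)}. Returns E_nei as an adjacency dict.

    A proportion tau of the points of interest is picked as reference points;
    all points within r_km of the same reference point become mutual neighbors.
    """
    pois = sorted(coords)
    n_base = max(1, round(tau * len(pois)))
    base = rng.sample(pois, n_base)
    neighbors = {}
    for b in base:
        region = [v for v in pois if haversine_km(coords[v], coords[b]) < r_km]
        for v in region:
            neighbors.setdefault(v, set()).update(x for x in region if x != v)
    return neighbors

def sample_negatives(v_j, positives, neighbors, all_pois, L, rng):
    """Draw L negatives for v_j from V_{j-nei}, else uniformly from all POIs."""
    pool = sorted(v for v in neighbors.get(v_j, ()) if v not in positives)
    if not pool:
        # v_j has no usable neighbors in the reference point graph:
        # fall back to ordinary random sampling (assumed behavior).
        pool = sorted(v for v in all_pois if v not in positives and v != v_j)
    return [rng.choice(pool) for _ in range(L)]
```

Negatives drawn this way are geographically close to the positive point of interest, which is the source of the "rich information" in the sampling scheme. If every other point of interest is a known positive, the fallback pool is empty and `rng.choice` raises an `IndexError`, so a caller would need to handle that case.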
8. The interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling according to claim 7, wherein the joint learning framework calculation formula in step 6 is:
Figure FDA0004049307570000059
where α, β and γ are regularization parameters that respectively control the contributions of the three context graphs, and the parameter set Θ to be trained by the model comprises $U^{c}$, $V^{c}$, $U^{q}$ and $V^{q}$, which are the vector representations of the users and the points of interest in the check-in space and the query space, respectively.
9. The interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling according to claim 8, wherein step 7 uses the pairwise ranking model to calculate a likelihood score of the user checking in at any point of interest in the future, and recommends points of interest to the user based on the likelihood score, specifically:
step 7-1, using the user and point of interest representations $U^{c}$, $V^{c}$ and $U^{q}$ learned by the joint learning framework in step 6, calculating the score of user i checking in at point of interest $v_j$ in the future by the following formula from step 4:
Figure FDA0004049307570000061
where w denotes a hyper-parameter for balancing the user's query preference and check-in preference;
Figure FDA0004049307570000062
and
Figure FDA0004049307570000063
respectively denote the vector representations of user i in the check-in space and the query space, and
Figure FDA0004049307570000064
denotes the vector representation of the point of interest in the check-in space;
step 7-2, after the scores of all points of interest that user i has not yet visited have been calculated, sorting the points of interest by score from high to low, and recommending the points of interest with the highest scores to the user.
CN202310037251.6A 2023-01-10 2023-01-10 Interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling Pending CN116166885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310037251.6A CN116166885A (en) 2023-01-10 2023-01-10 Interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310037251.6A CN116166885A (en) 2023-01-10 2023-01-10 Interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling

Publications (1)

Publication Number Publication Date
CN116166885A true CN116166885A (en) 2023-05-26

Family

ID=86419417

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310037251.6A Pending CN116166885A (en) 2023-01-10 2023-01-10 Interest point recommendation method based on user multi-behavior enhancement and efficient rich information negative sampling

Country Status (1)

Country Link
CN (1) CN116166885A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination