CN115408618B - Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features - Google Patents

Info

Publication number
CN115408618B
Authority
CN
China
Prior art keywords
interest
user
point
target user
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211174747.XA
Other languages
Chinese (zh)
Other versions
CN115408618A (en
Inventor
朱俊
韩立新
李振旺
徐逸卿
梁太波
汪洋
李树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Vocational University of Industry Technology NUIT
Original Assignee
Nanjing Vocational University of Industry Technology NUIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Vocational University of Industry Technology NUIT
Priority to CN202211174747.XA
Publication of CN115408618A
Application granted
Publication of CN115408618B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships, comprising the following steps: organizing the check-in data set and the social relationship data set to generate a three-dimensional scoring matrix and a two-dimensional relationship matrix; deleting the weakly correlated rows and weakly correlated columns of the scoring matrix, generating a personalized scoring matrix for each target user, and calculating a friendship-based predicted score; calculating the time-aware dynamic popularity of each point of interest; mining the influence of geographic features on the visit probability of points of interest with a power-law distribution model; and, considering the user's social relationships, the dynamic popularity of locations, and the influence of geographic distance together, fusing the friendship-based predicted score, the dynamic popularity of the point of interest, and the distance-based visit probability to generate a final predicted score. The method can recommend several points of interest to a user in real time according to the user's historical visit interests, the social relationships among users, the geographic distances among locations, and the popularity of locations at the current time, and has important practical application value.

Description

Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
Technical Field
The invention relates to a point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships, and belongs to the technical field of artificial intelligence and machine learning.
Background
In recent years, advances in mobile computing, wireless communication, and location acquisition technologies have driven the popularity and development of location-based social networks (LBSNs). Many mature location-based social network platforms exist at home and abroad, such as Foursquare, Gowalla, Facebook, Twitter, and Yelp abroad, and Dianping, Sina Weibo, and WeChat Moments in China, and more and more people use online social networks through smartphones. The 49th Statistical Report on the Development of the Internet in China, issued by the China Internet Network Information Center (CNNIC) in February 2022, shows that by December 2021 the number of Chinese internet users had reached 1.032 billion, of whom 99.7% went online with mobile phones. Location-based social networks have become a new form of media through which people share and pass on information. On the one hand, users establish social relationships in LBSNs, publish content of interest, and share their current location, pictures, audio, video, comments, and so on at any time and place. On the other hand, when facing a huge amount of information resources, users can relieve the information overload (Information Overload) problem by using the personalized services provided in LBSNs to obtain recommended content that matches their preferences, such as locations, friends, music, and advertisements.
In LBSNs, location check-in is an important and widely used service. LBSNs determine the user's current location through the Global Positioning System (GPS) or WiFi positioning in combination with a geographic location system. With the rapid development of cities, more and more places such as tourist attractions, theatres, hotels, banks, and shops have appeared, and people often choose where to go among these places according to their personal interests and preferences. Such places, which actually exist in the physical world and interest the user, are called points of interest (POIs). However, each user has visited only a limited number of points of interest. When a user faces a huge number of unvisited places or visits an unfamiliar city, the problems of "location information overload" and "choice anxiety" arise, and how to recommend, from the huge number of places in the real world, locations that match the user's interests and preferences is a problem that urgently needs to be solved. Point-of-interest recommendation (POI Recommendation), an emerging recommendation field, can effectively relieve the difficulty of choice that location information overload brings to users, helps improve users' experience in social networks and in real life, can help merchants analyze and mine potential users for advertisement pushing services, and has become a research hotspot.
Point-of-interest recommendation is an important branch of recommender systems and shares both its development history and its key technologies with traditional recommender systems. Part of the research on point-of-interest recommendation therefore treats a location as an ordinary item, similar to a film or a piece of music, and generates recommendations with traditional recommendation methods. By design strategy, traditional recommendation methods mainly comprise collaborative filtering algorithms, content-based recommendation algorithms, and hybrid recommendation algorithms. Collaborative filtering algorithms further comprise memory-based algorithms (such as user-based collaborative filtering, UBCF, and item-based collaborative filtering, IBCF) and model-based algorithms (such as singular value decomposition, SVD, clustering models, and probabilistic latent semantic analysis). Content-based point-of-interest recommendation extracts relevant information, such as tags, categories, and user reviews, from visited locations; user preferences are extracted from the user's profile and then matched against the location profiles to obtain recommendations. Collaborative filtering converts users' check-in behavior into a user-point-of-interest scoring matrix, uses the existing check-in records to find users similar to the current active user or locations highly similar to the addresses the active user previously liked, predicts the active user's scores for places not yet checked in according to the similarity between users or between addresses, and recommends the points of interest with the highest predicted scores to the current user. Singular value decomposition (SVD) is a classical representative of matrix factorization, whose main task is to generate low-rank approximations. The low-dimensional orthogonal matrices obtained by SVD reduce the noise in the original matrix and can reveal the latent associations between users and items more effectively.
The conventional recommendation techniques above ignore the influence that social relationships, geographic distance, and location popularity in a location-based social network have on users' check-in behavior at different times. In reality, however, a user's check-in habits are always closely related to relationship, location, and temporal context. For example, a user often checks in at the same point of interest as his or her friends; a user visiting a tourist attraction prefers to stay in a nearby hotel; catering points of interest are visited most (have the highest popularity) at about 12:00 and 18:00, while the popularity of bars starts to rise from 21:00. How to introduce social relationships, geographic features, and location popularity into a recommendation algorithm, and provide users with a suitable point-of-interest recommendation list for a specific time period, has become an urgent need for social application platforms.
Currently, some point-of-interest recommendation techniques consider one or more factors such as geographic influence, temporal influence, and social influence, but they still have drawbacks, summarized as follows:
(1) The time-dimension dynamics of location popularity are ignored. Compared with other kinds of context such as social relationships and geographic features, few point-of-interest recommendation techniques focus on location popularity, and the few related studies analyze only the overall popularity of a location from a macroscopic perspective (the ratio of the number of visits to a location to the total number of visits in the data set), ignoring the time-dimension dynamics of location popularity at the microscopic level. In fact, applying one global location popularity to every time period of the day does not match reality. How to mine the way location popularity changes over time is therefore a problem worth studying.
(2) The location popularity calculation method is one-sided. Existing popularity calculation methods only compare the number of visits to a point of interest horizontally with the number of visits to other locations: if the current location is visited relatively more often, it is considered to have higher popularity. Such methods are limited to the horizontal comparison between the target location and other locations and neglect the vertical comparison between the popularity of the target location at the current time and its popularity in other time periods. They are therefore one-sided, which reduces the accuracy of location popularity estimation and harms the recommendation accuracy of a location recommender system.
(3) The user-time-point-of-interest three-dimensional matrix computation has high time complexity. The scoring matrix in a traditional recommender system contains only two-dimensional information about users and items. To explore user behavior patterns in a target time period, a point-of-interest recommender system needs to expand the two-dimensional user-point-of-interest matrix into a three-dimensional user-time-point-of-interest scoring matrix, which inevitably increases the computational complexity of the recommendation process. As the volume of data to be computed grows exponentially, computational cost has become one of the bottlenecks restricting the rapid development of recommender systems. A recommendation method that reduces computational complexity must therefore be studied to improve the operating efficiency of the recommender system.
These defects of existing point-of-interest recommendation techniques cause considerable difficulties in the design, development, deployment, and operation of location-based social network platforms; in particular, on network platforms with massive item information they reduce the service quality of the recommender system and thereby affect the sales performance of e-commerce systems.
Disclosure of Invention
To address the defects of the prior art, the invention provides a point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships. Considering the large data volume and high computational complexity of the user-time-point-of-interest three-dimensional matrix, the invention preprocesses the scoring matrix on the basis of friendship, generates a personalized scoring matrix for each user, and improves the traditional user-based collaborative filtering algorithm, thereby improving recommendation efficiency. The invention also proposes a method for calculating the dynamic popularity of points of interest, which improves the effectiveness of score prediction by extracting the different popularity of a point of interest in different time periods, thereby enhancing the service quality of the recommender system.
The technical scheme adopted by the invention to solve the technical problem is as follows: a point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships, comprising the following steps:
Step 1: Collect and organize the original check-in data set and the social relationship data set, and convert them into a user-time-point-of-interest three-dimensional scoring matrix and a two-dimensional relationship matrix, respectively;
Step 2: Select an active user in the location-based social network as the recommendation service object. Delete the weakly correlated rows and weakly correlated columns of the scoring matrix for the target user, improve the traditional user-based collaborative filtering algorithm, and generate a friendship-based predicted score for the target user;
Step 3: Count the number of visits to each point of interest in each time slot, compare these counts with each point of interest's total number of visits over all time slots and with the number of visits to all points of interest in the same time slot, and calculate the time-aware dynamic popularity of each point of interest;
Step 4: Calculate the geographic distances between different points of interest from the longitude and latitude information of the locations in the check-in data set, and mine the influence of geographic features on the visit probability of points of interest with a power-law distribution model;
Step 5: Considering together the influence of the user's social relationships, the dynamic popularity of locations, and geographic distance on the user's visit behavior, fuse the friendship-based predicted score, the dynamic popularity of the point of interest, and the distance-based visit probability to generate, for the target user, a final predicted score for each unvisited address. Sort all unvisited addresses by final predicted score and provide the target user with a recommendation list composed of the top-ranked addresses;
Step 6: Use Precision, Recall, and the combined index F1 as the accuracy evaluation indexes of the recommender system, and evaluate the applicability and effectiveness of the proposed technique by comparing the prediction accuracy of the proposed recommendation method with that of other related classical recommendation methods.
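The evaluation indexes named in Step 6 can be illustrated with a minimal sketch. It assumes the usual top-N definitions (hits divided by list length for Precision, hits divided by the number of actually visited addresses for Recall, and their harmonic mean for F1); the function name `precision_recall_f1` and the sample lists are hypothetical and are not taken from the patent.

```python
def precision_recall_f1(recommended, visited):
    """Return (precision, recall, f1) for one user's top-N recommendation list.

    recommended: addresses proposed to the user (the top-N list).
    visited: addresses the user actually visited in the test data.
    """
    hits = len(set(recommended) & set(visited))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(visited) if visited else 0.0
    # F1 is the harmonic mean of precision and recall; define it as 0 when both are 0.
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical example: 4 recommended addresses, 3 actually visited, 2 in common.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])
```

In practice these per-user values would typically be averaged over all test users before methods are compared.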
The beneficial effects are that:
1. The method can recommend several points of interest to a user in real time according to the user's historical visit interests, the social relationships among users, the geographic distances among locations, and the popularity of locations at the current time; it can also help merchants push advertisements accurately to potential customers, and has important practical application value.
2. The method preprocesses the three-dimensional check-in matrix with the users' social relationship network and generates a personalized scoring matrix for each target user. This reduces the scale of the scoring data, lowers the computational complexity of the recommendation algorithm, and improves the operating efficiency of the recommender system, thereby improving the satisfaction of users and merchants with the social network platform, which is of great significance in practical applications.
3. The method mines the time-dimension dynamics of location popularity. At the microscopic level, it compares the number of visits to a point of interest in the current time period with other statistics both vertically and horizontally, calculates the time-aware dynamic popularity of the point of interest, fully mines the way location popularity changes over time, and effectively improves the accuracy with which the recommender system predicts user behavior patterns. The method has a degree of generality and portability: it can be applied not only to point-of-interest recommender systems but also to personalized recommendation of other traditional items, and has broad prospects for industrial application.
Drawings
Fig. 1 is a schematic overview of the point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships.
Fig. 2 is a flowchart of the specific steps of the point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships.
Fig. 3 is a schematic diagram of users' check-in records in a location-based social network in an embodiment of the invention.
Fig. 4 is a schematic diagram of users' social relationship records in a location-based social network in an embodiment of the invention.
Fig. 5 is a schematic diagram of the visit probability in each time slot of the three most visited points of interest in an embodiment of the invention.
Fig. 6 is a schematic diagram of the probability distribution of the geographic distances between adjacent points of interest visited by all users on the same day in an embodiment of the invention.
Fig. 7 is a bar chart comparing, in the embodiment, the Precision of the recommendation method proposed by the invention with that of the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD, and the kernel-density-estimation-based visit probability prediction method KDE.
Fig. 8 is a bar chart comparing, in the embodiment, the Recall of the recommendation method proposed by the invention with that of the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD, and the kernel-density-estimation-based visit probability prediction method KDE.
Fig. 9 is a bar chart comparing, in the embodiment, the combined index F1 of the recommendation method proposed by the invention with that of the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD, and the kernel-density-estimation-based visit probability prediction method KDE.
Detailed Description
The invention will be described in further detail with reference to the drawings.
As shown in Figs. 1-2, the invention provides a point-of-interest recommendation method that fuses location dynamic popularity and geographic features on the basis of social relationships, which specifically comprises the following: preprocessing the three-dimensional scoring matrix by deleting, for the target user, the weakly correlated rows (other users who have no social relationship with the target user) and the weakly correlated columns (addresses visited by neither the target user nor the target user's friends), and generating a personalized scoring matrix; improving the traditional user-based collaborative filtering algorithm to generate a friendship-based predicted score; dividing a day into 24 time slots, counting the check-ins at each point of interest in each time slot according to the time labels, comparing the number of visits to the point of interest horizontally with the number of visits to other locations, comparing the popularity of the target location at the current time vertically with its popularity in other time periods, and calculating the time-aware dynamic popularity of the target location; modeling the influence of the geographic distance between locations on the visit probability of a point of interest with a power-law distribution model; and combining the influence of the three contexts (social relationships, dynamic popularity, and geographic distance) on the user's visit behavior, calculating and ranking the predicted scores of all unvisited addresses, and recommending the top-ranked points of interest to the user, as shown in Fig. 1.
The method comprises the following specific steps:
Step 1: Collect and organize the original check-in data set and the social relationship data set, and convert them into a user-time-point-of-interest three-dimensional scoring matrix and a two-dimensional relationship matrix, respectively. The operation steps are as follows:
Step 1-1: Organize the original check-in data set C to obtain n check-in records, denoted $C = \{c_1, c_2, \ldots, c_n\}$. All users in the check-in data set C form the user set U, all points of interest form the address set L, and the number of users and the number of points of interest are denoted NU and NL, respectively. The check-in time in each check-in record is rounded, converting the discrete check-in times into 24 time slots; the time slot set is $T = \{0, 1, 2, \ldots, 23\}$.
Step 1-2: Count the number of times user u visited address l in time slot t in the check-in data set C. If the number of check-ins is 0, the score of user u for address l in time slot t is $r_{u,t,l} = 0$; otherwise $r_{u,t,l} = 1$. All scores together form the user-time-point-of-interest three-dimensional scoring matrix $R = \{ r_{u,t,l} \mid u \in U,\ t \in [0, 23],\ l \in L \}$, where U and L are the user set and the point-of-interest set, respectively.
Step 1-3: Organize the original social relationship data set F to obtain m social relationship records, denoted $F = \{f_1, f_2, \ldots, f_m\}$. Each social relationship record is formalized as a pair $\langle u_x, u_y \rangle$ of users, with $x \in [1, NU]$, $y \in [1, NU]$, where NU is the number of all users in the data set.
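Step 1-4: Construct a two-dimensional user social relationship matrix $S = \{s_{xy}\}$, whose numbers of rows and columns both equal the number of users NU, with $x \in [1, NU]$, $y \in [1, NU]$. Its element $s_{xy}$ indicates whether a social relationship exists between users $u_x$ and $u_y$:
$$s_{xy} = \begin{cases} 1, & \text{if a social relationship exists between } u_x \text{ and } u_y \\ 0, & \text{otherwise} \end{cases}$$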
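Steps 1-1 to 1-4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: each check-in is assumed to arrive as a `(user, hour, poi)` tuple with the hour already rounded to a slot in 0..23, the scoring matrix R is stored sparsely as the set of triples whose score is 1, and friendship is assumed to be mutual. The function names `build_score_tensor` and `build_social_matrix` are hypothetical.

```python
def build_score_tensor(checkins):
    """Return the non-zero entries of R as a set of (user, slot, poi) triples.

    A triple in the set means r[u, t, l] = 1; any absent triple means 0,
    so repeated check-ins at the same (user, slot, poi) collapse to one score.
    """
    return {(u, t, l) for u, t, l in checkins}


def build_social_matrix(pairs, users):
    """Return the NU x NU 0/1 matrix S as nested lists, indexed by position in users."""
    index = {u: i for i, u in enumerate(users)}
    S = [[0] * len(users) for _ in users]
    for ux, uy in pairs:
        S[index[ux]][index[uy]] = 1
        S[index[uy]][index[ux]] = 1  # assumption: friendship is mutual
    return S


# Hypothetical toy data: u1 checks in twice at the same place and slot.
checkins = [("u1", 12, "cafe"), ("u1", 12, "cafe"), ("u2", 18, "bar")]
R = build_score_tensor(checkins)
S = build_social_matrix([("u1", "u2")], ["u1", "u2", "u3"])
```

The sparse set is a design choice for the sketch: a dense NU x 24 x NL array holds mostly zeros, which is the scale problem the later pruning steps address.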
Step 2: Select an active user in the location-based social network as the recommendation service object. Delete the weakly correlated rows and weakly correlated columns of the scoring matrix for the target user, improve the traditional user-based collaborative filtering algorithm, and generate a friendship-based predicted score for the target user. The specific operation steps are as follows:
Step 2-1: Determine a target user $u_a$ in the location-based social network as the recommendation service object, and locate the row of $u_a$ in the social relationship matrix S. The column numbers whose element value in that row is 1 give the friend set $F_a$ of $u_a$, whose number of elements is denoted $FNum_a$; the column numbers whose element value in that row is 0 form the set $UnF_a$ of users who have no social relationship with $u_a$.
In the user-time-point-of-interest three-dimensional scoring matrix R, if user $u_i \in UnF_a$, the score information of $u_i$ is deleted from R. Deleting the users who have no social relationship with $u_a$ (the weakly correlated rows) yields the scoring matrix $R_1 = \{r_{i',t,j}\}$, $i' \in [1, FNum_a]$, $t \in [0, 23]$, $j \in [1, NL]$, where $i'$ is the user number, t is the time slot value, j is the address number, $FNum_a$ is the number of friends of the target user $u_a$, NL is the total number of points of interest, and $r_{i',t,j}$ is the score of user $u_{i'}$ for address $l_j$ in time slot t.
Screening by social relationship effectively reduces the row scale of the original scoring matrix R, that is, $|i'| = FNum_a \ll NU$, where NU is the total number of users.
Step 2-2: In the scoring matrix $R_1$, calculate, address by address, the sum of the scores of each address over all time slots. If the sum equals 0, none of the friends of the target user $u_a$ has visited that address at any time, and the address is added to the set $unvisit\_F_a$.
If address $l_j \in unvisit\_F_a$, the score information of $l_j$ is deleted from $R_1$. Deleting the irrelevant addresses (the weakly correlated columns) yields the scoring matrix $R_2 = \{r_{i',t,j'}\}$, $i' \in [1, FNum_a]$, $t \in [0, 23]$, $j' \in [1, NL - |unvisit\_F_a|]$, where $i'$ is the user number, t is the time slot value, $j'$ is the address number, $FNum_a$ is the number of friends of the target user $u_a$, NL is the total number of points of interest, $|unvisit\_F_a|$ is the number of addresses that none of the target user's friends has visited, and $r_{i',t,j'}$ is the score of user $u_{i'}$ for address $l_{j'}$ in time slot t.
Deleting the irrelevant addresses further reduces the column scale relative to $R_1$, that is, $|j'| = NL - |unvisit\_F_a| \ll NL$, where NL is the total number of points of interest.
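The two pruning steps can be sketched on a sparse representation of R. The sketch below is illustrative only: it assumes R is a set of `(user, slot, poi)` triples with score 1 and S is a 0/1 nested list, and it keeps the target user's own row alongside the friends' rows, which is an assumption made here because the similarity in Step 2-3 needs the target user's scores. The name `personalize` is hypothetical.

```python
def personalize(R, S, users, target):
    """Return (friends, unvisited, R2) for one target user.

    R: set of (user, slot, poi) triples whose score is 1.
    S: 0/1 social matrix as nested lists, indexed by position in users.
    """
    a = users.index(target)
    # Step 2-1: friends are the columns holding 1 in the target user's row.
    friends = {users[j] for j, s in enumerate(S[a]) if s == 1}
    # Step 2-2: an address is weakly correlated if no friend ever visited it.
    all_pois = {l for (_, _, l) in R}
    friend_pois = {l for (u, _, l) in R if u in friends}
    unvisited = all_pois - friend_pois
    # Keep only friends' rows (plus the target's own row, an assumption here)
    # and only the columns that at least one friend has visited.
    keep_users = friends | {target}
    R2 = {(u, t, l) for (u, t, l) in R if u in keep_users and l in friend_pois}
    return friends, unvisited, R2


# Hypothetical toy data: "a" is the target, "b" is a friend, "c" is a stranger.
users = ["a", "b", "c"]
S = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
R = {("a", 9, "p1"), ("b", 9, "p1"), ("b", 12, "p2"), ("c", 20, "p3"), ("a", 8, "p3")}
friends, unvisited, R2 = personalize(R, S, users, "a")
```

Here the stranger's row and the address `p3`, which no friend visited, both drop out of the personalized matrix.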
Step 2-3: Based on the preprocessed scoring matrix $R_2$, calculate the scoring similarity between the target user $u_a$ and each of its friend users. For a user $v \in F_a$, the scoring similarity $sim(u_a, v)$ between the target user $u_a$ and user v is:
where $u_a$ is the target object currently served by the recommender system, v is a friend user of the target user $u_a$, t is a time slot, T is the time slot set, L is the set of all points of interest, $unvisit\_F_a$ is the set of addresses that none of the friends of $u_a$ has visited at any time, and $r_{u_a,t,l}$ and $r_{v,t,l}$ are the scores of user $u_a$ and user v, respectively, for point of interest l in time slot t.
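The similarity formula itself does not appear in this text, so the following sketch substitutes a common choice, cosine similarity over the (time slot, point of interest) scores, purely as an assumption. With 0/1 scores the cosine reduces to the number of shared (slot, poi) pairs divided by the square root of the product of each user's pair counts. The function name `similarity` and the toy data are hypothetical.

```python
import math


def similarity(R2, ua, v):
    """Cosine similarity between two users' binary score vectors in R2.

    R2: set of (user, slot, poi) triples whose score is 1, already restricted
    to addresses outside unvisit_F_a.
    NOTE: cosine similarity is an assumed stand-in, not the patent's formula.
    """
    a = {(t, l) for (u, t, l) in R2 if u == ua}
    b = {(t, l) for (u, t, l) in R2 if u == v}
    if not a or not b:
        return 0.0  # a user with no scores is treated as dissimilar to everyone
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical toy data: the users share one of the friend's two check-in cells.
R2 = {("a", 9, "p1"), ("b", 9, "p1"), ("b", 12, "p2")}
sim_ab = similarity(R2, "a", "b")
```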
Step 2-4: Improve the traditional user-based collaborative filtering algorithm and use the compressed scoring matrix $R_2$ to calculate, on the basis of social relationships, the predicted score of the target user $u_a$ for visiting point of interest l at time $t_r$:
where $u_a$ is the target object currently served by the recommender system, $t_r$ is the time slot corresponding to the current recommendation time, l is a point of interest in the location-based social network that the target user has not visited, $unvisit\_F_a$ is the set of addresses that none of the friends of $u_a$ has visited at any time, v is a friend user of the target user $u_a$, $F_a$ is the friend set of $u_a$, $sim(u_a, v)$ is the scoring similarity between user $u_a$ and user v obtained in Step 2-3, and $r_{v,t_r,l}$ is the score of user v for point of interest l at time $t_r$.
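The prediction formula is likewise not reproduced in this text. As a hedged illustration, the sketch below uses the standard user-based collaborative filtering form restricted to friends: a similarity-weighted average of the friends' scores for the candidate point of interest in the recommendation time slot. This form is an assumption consistent with the symbols listed above, not a confirmed reading of the patent; the name `predict_score` is hypothetical.

```python
def predict_score(R2, sims, t_r, poi):
    """Similarity-weighted average of friends' scores for poi in slot t_r.

    R2: set of (user, slot, poi) triples whose score is 1.
    sims: dict mapping each friend v in F_a to sim(u_a, v).
    NOTE: the weighted-average form is an assumed stand-in for the patent's formula.
    """
    # A friend contributes sim * 1 if they scored the cell, and sim * 0 otherwise.
    weighted = sum(s for v, s in sims.items() if (v, t_r, poi) in R2)
    total = sum(sims.values())
    return weighted / total if total > 0 else 0.0


# Hypothetical toy data: only friend "b" visited p2 in slot 12.
R2 = {("b", 12, "p2"), ("c", 9, "p1")}
sims = {"b": 0.5, "c": 0.25}
score = predict_score(R2, sims, 12, "p2")
```

Addresses in $unvisit\_F_a$ never appear in R2, so under this form they receive a friendship-based score of 0.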
Step 3: Count the number of visits to each point of interest in each time slot, compare these counts with each point of interest's total number of visits over all time slots and with the number of visits to all points of interest in the same time slot, and calculate the time-aware dynamic popularity of each point of interest. The implementation steps are as follows:
step 3-1: counting the number of times Cnum that a point of interest l was accessed in time slot t in an original check-in dataset C l,t
Cnum l,t =∑ u∈U Cnum u,t,l Equation 4
Step 3-2: counting the total number of times Cnum that a point of interest l is accessed in a check-in dataset C l
Cnum l =∑ t∈Tu∈U Cnum u,t,l Equation 5
Step 3-3: counting all accessed times Cnum occurring within a time slot t in a check-in dataset C t
Cnum t =∑ l∈Lu∈U Cnum u,t,l Equation 6
In the formulas 4, 5 and 6, U is all user sets, L is all address sets, T is a time slot set, cnum u,l,t Indicating the number of times a user u accesses the point of interest i in the time slot t in the check-in dataset C.
Step 3-4: The longitudinal popularity of the point of interest is obtained as the ratio of the number of visits to point of interest l in time slot t to the total number of visits to point of interest l over all time slots:
$popu_1(l,t) = \frac{Cnum_{l,t}}{Cnum_{l}}$ (Equation 7)
where Cnum_{l,t} is the number of times point of interest l was visited in time slot t, and Cnum_l is the total number of times point of interest l was visited over all time slots.
Step 3-5: The transverse popularity of the point of interest is obtained as the ratio of the number of visits to point of interest l in time slot t to the number of visits to all points of interest in time slot t:
$popu_2(l,t) = \frac{Cnum_{l,t}}{Cnum_{t}}$ (Equation 8)
where Cnum_{l,t} is the number of times point of interest l was visited in time slot t, and Cnum_t is the number of times all points of interest were visited in time slot t.
Step 3-6: Combine the results of the longitudinal and transverse comparisons to obtain the time-aware dynamic popularity of point of interest l in time slot t:
$popu(l,t) = popu_1(l,t) \times popu_2(l,t)$ (Equation 9)
where popu_1(l, t) is the longitudinal popularity of point of interest l in time slot t, and popu_2(l, t) is the transverse popularity of point of interest l in time slot t.
The time-aware dynamic popularity of every point of interest in every time slot is collected into a two-dimensional point-of-interest-time popularity matrix P with NL rows and 24 columns.
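The counting and ratio steps of Step 3 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `dynamic_popularity`, the tuple layout `(user, slot, poi)` and the sample data are assumptions made for the example, and the matrix P is stored as a dictionary keyed by `(l, t)` rather than as a dense array.

```python
from collections import Counter

def dynamic_popularity(checkins, num_slots=24):
    """Return {(l, t): popu(l, t)} from (user, slot, poi) check-in tuples."""
    cnum_lt = Counter()  # Cnum_{l,t}: visits to POI l in slot t
    cnum_l = Counter()   # Cnum_l: visits to POI l over all slots
    cnum_t = Counter()   # Cnum_t: visits to all POIs in slot t
    for _user, t, l in checkins:
        cnum_lt[(l, t)] += 1
        cnum_l[l] += 1
        cnum_t[t] += 1

    popularity = {}
    for l in cnum_l:
        for t in range(num_slots):
            n = cnum_lt[(l, t)]
            popu1 = n / cnum_l[l]                        # longitudinal ratio
            popu2 = n / cnum_t[t] if cnum_t[t] else 0.0  # transverse ratio (0 for an empty slot)
            popularity[(l, t)] = popu1 * popu2           # product of the two ratios
    return popularity

# Hypothetical sample data, for illustration only.
sample_checkins = [("u1", 8, "cafe"), ("u2", 8, "cafe"),
                   ("u1", 20, "cafe"), ("u3", 8, "gym")]
P = dynamic_popularity(sample_checkins)
```

Treating an empty time slot as popularity 0 is a choice made here to avoid division by zero; the source text does not say how that case is handled.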
Step 4: Using the longitude and latitude of each location in the check-in dataset, calculate the geographic distance between points of interest, and mine the influence of geographic features on the visit probability of points of interest based on a power-law distribution model. The implementation steps are as follows:
Step 4-1: Let L_u_a be the set of points of interest that the target user u_a has visited, and let l' ∈ L_u_a. Obtain the geographic longitude Lng_l' and latitude Lat_l' of l' from the check-in dataset C and write l' = ⟨Lng_l', Lat_l'⟩. Let l be a point of interest that the target user has never visited, l = ⟨Lng_l, Lat_l⟩. Calculate the geographic distance dist(l, l') between points of interest l and l':
where R is the radius of the Earth, R = 6371 km.
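As an illustration of Step 4-1, the sketch below computes a great-circle distance with the haversine formula. The choice of the haversine formula is an assumption: the distance equation itself is not reproduced in this text, and only the Earth radius R = 6371 km is taken from it. The function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # R, as stated in Step 4-1

def haversine_km(lng1, lat1, lng2, lat2):
    """Great-circle distance in km between two (longitude, latitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding error before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))
```

For example, `haversine_km(0, 0, 0, 1)` gives the length of one degree of latitude, roughly 111.19 km.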
Step 4-2: The probability of a user checking in at different places follows a power-law distribution, so a power-law function of the geographic distance is constructed:
$pr(dist(l,l')) = a \times dist(l,l')^{b}$ (Equation 11)
where a and b are the two parameters of the power-law function; their values can be obtained by maximum likelihood estimation.
Step 4-3: Calculate the conditional probability pr(l|l') that the user visits candidate point of interest l given that the user is currently at point of interest l':
where L is the set of all addresses.
Step 4-4: Using the naive Bayes method, calculate the probability that the target user u_a visits candidate point of interest l:
where L_u_a is the set of points of interest that the target user u_a has visited.
Step 5: Taking into account the influence of user social relationships, location dynamic popularity and geographic distance on user visiting behavior, fuse the friendship-based prediction score, the dynamic popularity of the point of interest and the distance-based visit probability to generate a final prediction score for each unvisited address for the target user. Sort all unvisited addresses by final prediction score and provide the target user with a recommendation list consisting of the top-ranked addresses. The implementation steps are as follows:
Step 5-1: Determine a target user u_a in the location-based social network as the recommendation service object, and convert the current recommendation time time_r into time slot t_r.
Step 5-2: Taking into account the influence of user social relationships, location dynamic popularity and geographic distance on user visiting behavior, calculate the prediction score for the target user u_a visiting point of interest l at time t_r:
where u_a is the target user currently served by the recommendation system, t_r is the time slot corresponding to the current recommendation time, l is a point of interest in the location-based social network that the target user has not visited, the social-relationship-based term is the prediction score, calculated in Step 2-4, for the target user u_a visiting point of interest l at time t_r, popu(l, t_r) is the real-time popularity of point of interest l at time t_r, pr(l|L_u_a) is the predicted visit probability obtained by mining the influence of geographic distance with the power-law distribution model, and L_u_a is the set of points of interest that the target user u_a has visited.
Step 5-3: Sort all addresses that the target user u_a has not visited by prediction score, form the recommendation list TopNList_a from the N top-ranked locations, and return it to the target user.
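The ranking in Step 5-3 can be sketched as follows. The sketch assumes the final prediction scores of Step 5-2 have already been computed and are passed in as a dictionary; the function name `top_n` and the sample scores are hypothetical.

```python
import heapq

def top_n(final_scores, n=10):
    """Return the n unvisited POIs with the highest final prediction score.

    final_scores: {poi_id: score} for the addresses the target user has not visited.
    """
    ranked = heapq.nlargest(n, final_scores.items(), key=lambda item: item[1])
    return [poi for poi, _score in ranked]

# Hypothetical scores, for illustration only.
TopNList_a = top_n({"a": 0.2, "b": 0.9, "c": 0.5}, n=2)
```

`heapq.nlargest` avoids sorting the full candidate list when N is small compared with the number of unvisited addresses.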
Step 6: Precision, Recall and the combined accuracy index F1 are used as the accuracy evaluation indexes of the recommendation system. The applicability and effectiveness of the proposed technique are evaluated by comparing the prediction accuracy of the proposed recommendation method with that of other related classical recommendation methods. The implementation steps are as follows:
Step 6-1: Randomly select NU × 20% of the users as the target user set TestU, where NU is the total number of users. Run each recommendation method for every target user in TestU and generate a recommendation list.
Step 6-2: Calculate the precision and recall of the recommendation method in time slot t:
where TestU is the set of all target users, R(u, t) is the recommendation list that a recommendation method provides to target user u at time t, and Like(u, t) is the set of points of interest that user u actually visited at time t.
Step 6-3: Calculate the overall precision and recall of the recommendation method, each being the average of the corresponding evaluation index over all time slots:
where T is the set of time slots, and precision(t) and recall(t) are the precision and recall of the recommendation method in time slot t, respectively.
Step 6-4: Calculate the combined accuracy F1 value of the recommendation system:
where precision and recall are the overall precision and recall, respectively, of one run of the recommendation method.
Step 6-5: Repeat Steps 6-1 to 6-4 Ntimes times; the final prediction accuracy of the recommendation method (the values of Precision, Recall and the combined accuracy index F1) is the average of the corresponding index results over the Ntimes runs.
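The evaluation in Steps 6-2 to 6-4 can be sketched as below. Two points are assumptions, because the metric equations are not reproduced in this text: per-slot precision and recall are pooled over all target users (hits divided by the total number of recommended items and of actually visited items, respectively), and F1 is the usual harmonic mean of precision and recall. The function names are hypothetical.

```python
def precision_recall_at_t(recommended, visited):
    """recommended: {user: list R(u, t)}; visited: {user: set Like(u, t)} for one slot t."""
    hits = sum(len(set(recommended[u]) & visited.get(u, set())) for u in recommended)
    n_rec = sum(len(recommended[u]) for u in recommended)
    n_like = sum(len(visited.get(u, set())) for u in recommended)
    precision = hits / n_rec if n_rec else 0.0
    recall = hits / n_like if n_like else 0.0
    return precision, recall

def overall_metrics(per_slot):
    """per_slot: list of (precision(t), recall(t)); returns (precision, recall, F1)."""
    precision = sum(p for p, _ in per_slot) / len(per_slot)  # average over time slots
    recall = sum(r for _, r in per_slot) / len(per_slot)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Averaging the three values returned by `overall_metrics` over repeated random draws of TestU corresponds to Step 6-5.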
Step 6-6: Compare and analyze the results of each index. If the Precision of the point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features is greater than the Precision values of the other recommendation methods, the proposed technique helps users find addresses of interest more accurately; if its Recall is greater than the Recall values of the other recommendation methods, the proposed technique covers the addresses of interest to users more comprehensively; if its F1 value is greater than those of the other recommendation methods, the proposed technique has stronger overall capability in terms of prediction accuracy.
With reference to Figs. 3-9, a specific location-based social network is taken as an example to describe in detail how the point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features of the present invention works.
The SNAP laboratory at Stanford University in the United States collected user check-in records over a period of time from Gowalla, a location-based social networking site popular outside China, forming the well-known public dataset Gowalla. The Gowalla dataset contains 6442892 check-ins by 196591 users at 1256379 addresses between February 2009 and October 2010. The 196591 users form 950327 social relations. The invention selects Travis County, which has the richest check-in data in the Gowalla dataset, as the example.
Step 1: Collect and sort the original check-in dataset and the social relation dataset, and convert them into a user-time-point-of-interest three-dimensional scoring matrix and a two-dimensional relation matrix, respectively. The operation steps are as follows:
Step 1-1: Collect and sort the check-in data and social data of Travis County in the example dataset Gowalla, obtaining 200817 check-in records of 3280 users at 3335 addresses and forming the check-in dataset C = {c_1, c_2, …, c_200817}. Each check-in record consists of a user ID, a check-in time, a geographic latitude, a geographic longitude and a point-of-interest ID, as shown in Fig. 3. The users form 36050 social relations; a schematic diagram of the social relation records of users in the Gowalla network is shown in Fig. 4.
All users in the check-in data set C form a user set U, all points of interest form an address set L, the number of users NU is 3280, and the number of points of interest NL is 3335.
The check-in time in each check-in record is rounded to convert the check-in times into 24 discrete time slots. For example, the check-in time 00:01:25 corresponds to time slot t = 0, and the check-in time 23:11:56 corresponds to time slot t = 23. The time slot set is T = {0, 1, 2, …, 23}.
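The time-slot conversion can be sketched as follows. It is assumed here that rounding means truncating to the hour, which is consistent with both examples above and with a slot range of 0 to 23, and that the time is given as an HH:MM:SS string; the function name `to_time_slot` is hypothetical.

```python
from datetime import datetime

def to_time_slot(checkin_time):
    """Map an 'HH:MM:SS' check-in time to its hourly time slot in 0..23."""
    return datetime.strptime(checkin_time, "%H:%M:%S").hour

slots = [to_time_slot("00:01:25"), to_time_slot("23:11:56")]
```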
Step 1-2: In the check-in dataset C, count the number of times user u visited address l in time slot t. If the number of check-ins is 0, the score r_{u,t,l} of user u for address l in time slot t is 0; otherwise r_{u,t,l} = 1. All scores are collected into a user-time-point-of-interest three-dimensional scoring matrix R = {r_{u,t,l}}, u ∈ [1, 3280], t ∈ [0, 23], l ∈ [1, 3335].
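Because the scores are binary, the three-dimensional matrix R can be held sparsely as the set of (u, t, l) triples whose score is 1. The sketch below takes that approach; the sparse representation and the names `build_scoring_matrix` and `score` are choices made for this illustration, not something the source prescribes.

```python
def build_scoring_matrix(checkins):
    """checkins: iterable of (user, slot, poi). Returns the set of triples with r_{u,t,l} = 1."""
    return {(u, t, l) for u, t, l in checkins}

def score(R, u, t, l):
    """r_{u,t,l}: 1 if user u checked in at address l in slot t at least once, else 0."""
    return 1 if (u, t, l) in R else 0

# Hypothetical check-ins: user 1 visits address 5 twice in slot 0.
R = build_scoring_matrix([(1, 0, 5), (1, 0, 5), (2, 23, 7)])
```

Repeated check-ins collapse into a single entry, which matches the rule that any non-zero count yields a score of 1.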
Step 1-3: Sort the original social relation dataset F to obtain 36050 social relation records, recorded as F = {f_1, f_2, …, f_36050}. Each social relation record is formalized as a pair ⟨user u_x, user u_y⟩, x ∈ [1, 3280], y ∈ [1, 3280].
Step 1-4: Construct a two-dimensional user social relation matrix S = {s_xy}. The matrix has 3280 rows and 3280 columns, x ∈ [1, 3280], y ∈ [1, 3280], and its element s_xy indicates whether a social relationship exists between users u_x and u_y:
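A minimal sketch of the matrix construction in Step 1-4 follows. It uses the convention that Step 2-1 relies on when reading the matrix: an element of 1 marks a social relationship and 0 marks none. Treating each relation as symmetric is an assumption of this sketch, and the names `build_social_matrix` and `friend_set` are hypothetical.

```python
def build_social_matrix(relations, num_users):
    """relations: iterable of (x, y) pairs of 1-based user IDs. Returns S as a list of rows."""
    S = [[0] * num_users for _ in range(num_users)]
    for x, y in relations:
        S[x - 1][y - 1] = 1
        S[y - 1][x - 1] = 1  # assumption: friendship is undirected
    return S

def friend_set(S, a):
    """F_a: 1-based IDs of the columns whose value is 1 in the row of user a."""
    return {y + 1 for y, value in enumerate(S[a - 1]) if value == 1}

S = build_social_matrix([(1, 2), (2, 3)], num_users=3)
```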
Step 2: Select an active user in the location-based social network as the recommendation service object. For this target user, delete the weakly related rows and weakly related columns of the scoring matrix, improve the traditional user-based collaborative filtering algorithm, and generate a friendship-based prediction score. The specific operation steps are as follows:
Step 2-1: Determine a target user u_a in the location-based social network as the recommendation service object. Look up the row of the target user u_a in the social relation matrix S; the column numbers (user IDs) whose element value in that row is 1 give the friend set F_a of the target user u_a, and the number of elements in F_a is denoted FNum_a. The column numbers (user IDs) whose element value in that row is 0 are also recorded, forming the set UnF_a of users that have no social relationship with the target user u_a.
In the user-time-point-of-interest three-dimensional scoring matrix R, if user u_i ∈ UnF_a, delete the scoring information of user u_i from R. This yields the scoring matrix R_1 = {r_{i',t,j}} with the users that have no social relationship with the target user u_a (the weakly related rows) removed, i' ∈ [1, FNum_a], t ∈ [0, 23], j ∈ [1, 3335], where i' is the user number, t is the time slot, j is the address number, FNum_a is the number of friends of the target user u_a, and r_{i',t,j} is the score given by user u_i' to address l_j in time slot t.
Screening by social relationship effectively reduces the row size of the original scoring matrix R, i.e. |i'| = FNum_a ≪ 3280.
Step 2-2: In the scoring matrix R_1, calculate one by one the sum of the scores of each address over all time slots. If the sum is 0, none of the friends of the target user u_a has visited this address at any time, and the address is added to the set unvisit_F_a.
If address l_j ∈ unvisit_F_a, delete the scoring information of l_j from R_1. This yields the scoring matrix R_2 = {r_{i',t,j'}} with the irrelevant addresses (the weakly related columns) removed, i' ∈ [1, FNum_a], t ∈ [0, 23], j' ∈ [1, 3335 - |unvisit_F_a|], where i' is the user number, t is the time slot, j' is the address number, FNum_a is the number of friends of the target user u_a, |unvisit_F_a| is the number of addresses that none of the friends of the target user has visited, and r_{i',t,j'} is the score given by user u_i' to address l_j' in time slot t.
Deleting the irrelevant addresses further reduces the column size relative to the scoring matrix R_1, i.e. |j'| = 3335 - |unvisit_F_a| ≪ 3335.
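The two compression passes of Steps 2-1 and 2-2 can be sketched on a sparse scoring matrix stored as a set of (u, t, l) triples with score 1. The sparse representation and the name `compress_for_target` are assumptions of this example. In this representation the non-zero entries of R_1 and R_2 coincide, so R_2 is described by the same triples together with the list of retained address columns.

```python
def compress_for_target(R, friends, all_pois):
    """R: set of (u, t, l) triples with score 1; friends: F_a; all_pois: the address set L.

    Returns (R1, kept_pois, unvisit_F_a).
    """
    # Step 2-1: drop the rows of users outside F_a (the weakly related rows).
    R1 = {(u, t, l) for (u, t, l) in R if u in friends}
    # Step 2-2: an address whose scores sum to 0 over all friends and slots is irrelevant.
    visited_by_friends = {l for (_u, _t, l) in R1}
    unvisit_F_a = set(all_pois) - visited_by_friends
    kept_pois = sorted(visited_by_friends)  # the columns that remain in R_2
    return R1, kept_pois, unvisit_F_a

# Hypothetical data: user 1 is not a friend of the target user.
R1, kept_pois, unvisit_F_a = compress_for_target(
    {(1, 8, "a"), (2, 8, "b"), (3, 9, "c")}, friends={2, 3}, all_pois={"a", "b", "c", "d"})
```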
Step 2-3: Based on the preprocessed scoring matrix R_2, calculate the scoring similarity between the target user u_a and each of its friends. If user v ∈ F_a, the scoring similarity between the target user u_a and user v is:
where u_a is the target user currently served by the recommendation system, v is a friend of the target user u_a, t is a time slot, unvisit_F_a is the set of addresses that none of the friends of the target user u_a has visited at any time, |unvisit_F_a| is the number of addresses in unvisit_F_a, and r_{u_a,t,l} and r_{v,t,l} are the scores given to point of interest l at time t by user u_a and user v, respectively.
Step 2-4: Improve the traditional user-based collaborative filtering algorithm and, using the compressed scoring matrix R_2, calculate the social-relationship-based prediction score for the target user u_a visiting point of interest l at time t_r:
where u_a is the target user currently served by the recommendation system, t_r is the time slot corresponding to the current recommendation time, l is a point of interest in the location-based social network that the target user has not visited, unvisit_F_a is the set of addresses that none of the friends of the target user u_a has visited at any time, v is a friend of the target user u_a, F_a is the friend set of the target user u_a, sim(u_a, v) is the scoring similarity between user u_a and user v obtained in Step 2-3, and r_{v,t_r,l} is the score given to point of interest l by user v at time t_r.
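For orientation only, the sketch below shows one common way to realize friend-based collaborative filtering of this kind. The similarity and prediction equations of Steps 2-3 and 2-4 are not reproduced in this text, so both formulas here are stand-ins chosen for the example: cosine similarity over binary (t, l) score vectors, and a similarity-weighted average of the friends' scores. They should not be read as the patented equations. All function names are hypothetical.

```python
import math

def cosine_similarity(R, u_a, v):
    """Assumed similarity: cosine between the binary (t, l) score vectors of u_a and v."""
    a = {(t, l) for (u, t, l) in R if u == u_a}
    b = {(t, l) for (u, t, l) in R if u == v}
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def predict_score(R, u_a, friends, t_r, l):
    """Assumed prediction: similarity-weighted average of friends' scores r_{v,t_r,l}."""
    sims = {v: cosine_similarity(R, u_a, v) for v in friends}
    total = sum(sims.values())
    if total == 0:
        return 0.0
    return sum(s for v, s in sims.items() if (v, t_r, l) in R) / total

# Hypothetical scores: friend f1 shares two check-ins with target user a, friend f2 shares none.
R = {("a", 8, "x"), ("a", 9, "y"),
     ("f1", 8, "x"), ("f1", 9, "y"), ("f1", 10, "z"),
     ("f2", 11, "w")}
```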
Step 3: Count the number of visits to each point of interest in each time slot, compare this count with the total number of visits to the same point of interest over all time slots and with the number of visits to all points of interest in the same time slot, and calculate the time-aware dynamic popularity of each point of interest. The implementation steps are as follows:
Step 3-1: Count the number of times Cnum_{l,t} that point of interest l was visited in time slot t in the check-in dataset C:
$Cnum_{l,t} = \sum_{u \in [1,3280]} Cnum_{u,t,l}$ (Equation 23)
Step 3-2: Count the total number of times Cnum_l that point of interest l was visited in the check-in dataset C:
$Cnum_{l} = \sum_{t \in [0,23]} \sum_{u \in [1,3280]} Cnum_{u,t,l}$ (Equation 24)
Fig. 5 shows, for the three most-visited points of interest in the embodiment, the probability of being visited in each time slot. It illustrates that the visit probability of a point of interest varies widely across time slots. It is therefore necessary to explore the dynamic popularity of points of interest in each time slot; mining this time-aware dynamic popularity can effectively improve the accuracy of point-of-interest recommendation.
Step 3-3: Count the total number of visits Cnum_t that occurred in time slot t in the check-in dataset C:
$Cnum_{t} = \sum_{l \in [1,3335]} \sum_{u \in [1,3280]} Cnum_{u,t,l}$ (Equation 25)
In Equations 23, 24 and 25, Cnum_{u,t,l} is the number of times user u visited point of interest l in time slot t in the check-in dataset C.
Step 3-4: The longitudinal popularity of the point of interest is obtained as the ratio of the number of visits to point of interest l in time slot t to the total number of visits to point of interest l over all time slots:
$popu_1(l,t) = \frac{Cnum_{l,t}}{Cnum_{l}}$ (Equation 26)
where Cnum_{l,t} is the number of times point of interest l was visited in time slot t, and Cnum_l is the total number of times point of interest l was visited over all time slots.
Step 3-5: The transverse popularity of the point of interest is obtained as the ratio of the number of visits to point of interest l in time slot t to the number of visits to all points of interest in time slot t:
$popu_2(l,t) = \frac{Cnum_{l,t}}{Cnum_{t}}$ (Equation 27)
where Cnum_{l,t} is the number of times point of interest l was visited in time slot t, and Cnum_t is the number of times all points of interest were visited in time slot t.
Step 3-6: Combine the results of the longitudinal and transverse comparisons to obtain the time-aware dynamic popularity of point of interest l in time slot t:
$popu(l,t) = popu_1(l,t) \times popu_2(l,t)$ (Equation 28)
where popu_1(l, t) is the longitudinal popularity of point of interest l in time slot t, and popu_2(l, t) is the transverse popularity of point of interest l in time slot t.
The time-aware dynamic popularity of every point of interest in every time slot is collected into a two-dimensional point-of-interest-time popularity matrix P with 3335 rows and 24 columns.
Step 4: Using the longitude and latitude of each location in the check-in dataset, calculate the geographic distance between points of interest, and mine the influence of geographic features on the visit probability of points of interest based on a power-law distribution model. The implementation steps are as follows:
Step 4-1: Let L_u_a be the set of points of interest that the target user u_a has visited, and let l' ∈ L_u_a. Obtain the geographic longitude Lng_l' and latitude Lat_l' of l' from the check-in dataset C and write l' = ⟨Lng_l', Lat_l'⟩. Let l be a point of interest that the target user has never visited, l = ⟨Lng_l, Lat_l⟩. Calculate the geographic distance dist(l, l') between points of interest l and l':
where R is the radius of the Earth, R = 6371 km.
Step 4-2: Fig. 6 shows the probability distribution of the geographic distances between adjacent points of interest visited by all users on the same day in the embodiment. It can be seen that the probability of a user checking in at different places follows a power-law distribution. A power-law function of the geographic distance is constructed:
$pr(dist(l,l')) = a \times dist(l,l')^{b}$ (Equation 30)
where a and b are the two parameters of the power-law function; their values can be obtained by maximum likelihood estimation, as follows.
First, take the logarithm of both sides of Equation 30:
$\log_2(pr(dist(l,l'))) = \log_2(a) + b \times \log_2(dist(l,l'))$ (Equation 31)
Let y = log_2(pr(dist(l, l'))), x = log_2(dist(l, l')), ω_0 = log_2(a), ω_1 = b and ω = (ω_0, ω_1). This gives a linear regression:
$y(\omega, x) = \omega_0 + \omega_1 x$ (Equation 32)
In learning the parameter ω, the least squares method is used to solve the linear curve-fitting problem, and the loss function of the linear regression model is defined as:
where each input x_n has a corresponding true value, λ is the coefficient of the regularization term, and M is the number of input data points. Taking minimization of the loss function as the optimization objective, the corresponding value of the parameter ω is found, which in turn gives the power-law parameters a and b. In the embodiment, for example, the power-law parameters computed for one group of test users from the distances between the addresses they visited are a = 0.145587 and b = -0.985544.
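The log-log fit described in Step 4-2 can be sketched with ordinary least squares. The sketch omits the regularization term (that is, it assumes λ = 0), because the loss function is not reproduced in this text; the function name `fit_power_law` and the synthetic data are hypothetical.

```python
import math

def fit_power_law(distances, probabilities):
    """Fit pr = a * dist**b by least squares on log2(pr) = log2(a) + b * log2(dist)."""
    xs = [math.log2(d) for d in distances]
    ys = [math.log2(p) for p in probabilities]
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx              # omega_1 = b
    w0 = mean_y - w1 * mean_x   # omega_0 = log2(a)
    return 2 ** w0, w1

# Synthetic, noise-free data generated from the example parameters in the text.
dists = [1.0, 2.0, 4.0, 8.0, 16.0]
probs = [0.145587 * d ** -0.985544 for d in dists]
a_hat, b_hat = fit_power_law(dists, probs)
```

On noise-free data the fit recovers the generating parameters up to floating-point error; with real check-in data the distances would first be binned into an empirical probability distribution.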
Step 4-3: Calculate the conditional probability pr(l|l') that the user visits candidate point of interest l given that the user is currently at point of interest l':
where L is the set of all addresses.
Step 4-4: Using the naive Bayes method, calculate the probability that the target user u_a visits candidate point of interest l:
where L_u_a is the set of points of interest that the target user u_a has visited.
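One plausible reading of Steps 4-3 and 4-4 is sketched below. Both formulas are assumptions, since the equations are not reproduced in this text: the conditional probability is taken as the power-law value for dist(l, l') normalized over all other addresses in L, and the naive Bayes step is taken as the product of these conditional probabilities over the visited set L_u_a. The function names and the one-dimensional toy coordinates are hypothetical.

```python
def power_law(d, a, b):
    """pr(dist) = a * dist**b."""
    return a * d ** b

def conditional_prob(l, l_prime, all_pois, dist, a, b):
    """Assumed pr(l | l'): power-law value normalized over every address other than l'."""
    numerator = power_law(dist(l, l_prime), a, b)
    denominator = sum(power_law(dist(k, l_prime), a, b) for k in all_pois if k != l_prime)
    return numerator / denominator

def visit_prob(l, visited, all_pois, dist, a, b):
    """Assumed pr(l | L_u_a): product of pr(l | l') over the visited set (independence assumption)."""
    prob = 1.0
    for l_prime in visited:
        prob *= conditional_prob(l, l_prime, all_pois, dist, a, b)
    return prob

# Toy example: three addresses on a line, distances in arbitrary units.
coords = {"A": 0.0, "B": 1.0, "C": 3.0}
dist = lambda p, q: abs(coords[p] - coords[q])
p_b = visit_prob("B", ["A"], list(coords), dist, a=1.0, b=-1.0)
```

The sketch requires distinct addresses to have a non-zero distance, since a negative exponent b makes the power-law value undefined at distance 0.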
Step 5: Taking into account the influence of user social relationships, location dynamic popularity and geographic distance on user visiting behavior, fuse the friendship-based prediction score, the dynamic popularity of the point of interest and the distance-based visit probability to generate a final prediction score for each unvisited address for the target user. Sort all unvisited addresses by final prediction score and provide the target user with a recommendation list consisting of the top-ranked addresses. The implementation steps are as follows:
Step 5-1: Determine a target user u_a in the location-based social network as the recommendation service object, and convert the current recommendation time time_r into time slot t_r.
Step 5-2: Taking into account the influence of user social relationships, location dynamic popularity and geographic distance on user visiting behavior, calculate the prediction score for the target user u_a visiting point of interest l at time t_r:
where u_a is the target user currently served by the recommendation system, t_r is the time slot corresponding to the current recommendation time, l is a point of interest in the location-based social network that the target user has not visited, the social-relationship-based term is the prediction score, calculated in Step 2-4, for the target user u_a visiting point of interest l at time t_r, popu(l, t_r) is the real-time popularity of point of interest l at time t_r, pr(l|L_u_a) is the predicted visit probability obtained by mining the influence of geographic distance with the power-law distribution model, and L_u_a is the set of points of interest that the target user u_a has visited.
Step 5-3: Sort all addresses that the target user u_a has not visited by prediction score, form the recommendation list TopNList_a from the N top-ranked locations, and return it to the target user. N may be a multiple of 10; typical values are 10, 20, 30, 40 and 50.
Step 6: Precision, Recall and the combined accuracy index F1 are used as the accuracy evaluation indexes of the recommendation system. The applicability and effectiveness of the proposed technique are evaluated by comparing the prediction accuracy of the proposed recommendation method with that of other related classical recommendation methods. The implementation steps are as follows:
Step 6-1: Randomly select 656 users as the target user set TestU. For these 656 target users, run the point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features, the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD and the kernel-density-estimation-based visit probability prediction method KDE, and generate a recommendation list with each method.
Step 6-2: Calculate the precision and recall of each recommendation method in time slot t:
where TestU is the set of all target users, R(u, t) is the recommendation list that a recommendation method provides to target user u at time t, and Like(u, t) is the set of points of interest that user u actually visited at time t.
Step 6-3: Calculate the overall precision and recall of the recommendation method, each being the average of the corresponding evaluation index over all time slots:
where precision(t) and recall(t) are the precision and recall of the recommendation method in time slot t, respectively.
Step 6-4: Calculate the combined accuracy F1 value of the recommendation system:
where precision and recall are the overall precision and recall, respectively, of one run of the recommendation method.
Step 6-5: Repeat Steps 6-1 to 6-4 100 times; the final prediction accuracy of the recommendation method (the values of Precision, Recall and the combined accuracy index F1) is the average of the corresponding index results over the 100 runs.
When N is 10, 20, 30, 40 and 50, the Precision, Recall and combined accuracy index F1 results of each recommendation method are shown in Tables 1, 2 and 3, respectively; the value in bold in each row is the maximum of the index in that row:
Table 1 Precision index values of the different recommendation methods
Table 2 Recall index values of the different recommendation methods
Table 3 F1 index values of the different recommendation methods
In this example, histograms comparing the Precision, Recall and combined accuracy index F1 of the proposed recommendation method with those of the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD and the kernel-density-estimation-based visit probability prediction method KDE are shown in Fig. 7, Fig. 8 and Fig. 9, respectively.
Step 6-6: Compare and analyze the results of each index. The Precision of the point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features is greater than that of all the other methods, which shows that the proposed technique helps users find addresses of interest more accurately. Its Recall is greater than the Recall values of the other recommendation methods, which shows that the proposed technique covers the addresses of interest to users more comprehensively. Its F1 value is greater than those of the other recommendation methods, which shows that the proposed technique has stronger overall capability in terms of prediction accuracy.
Unlike conventional point-of-interest recommendation methods, the invention aims to build a real-time, accurate and dynamic point-of-interest recommendation system that considers the influence of user social relationships, location dynamic popularity and geographic distance on user check-in behavior. It uses social relationships to compress the data and generates a personalized scoring matrix for each user, which improves the operating efficiency of the recommendation system. It also proposes a method for calculating the dynamic popularity of points of interest, which improves the effectiveness of score prediction by extracting the different popularity of a point of interest in different time periods and thereby strengthens the service quality of the recommendation system. The proposed technique has broad application prospects and is expected to be widely applied in location-based social network markets in China and abroad.
The above technical process is only a preferred embodiment of the present invention and does not represent all the details of the present invention. Any modification, equivalent replacement or improvement made by those skilled in the art within the scope of the present disclosure and within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. The interest point recommendation method based on the social relation fusion position dynamic popularity and geographic characteristics is characterized by comprising the following steps:
step 1: collecting and sorting the original check-in data set and the social relation data set, and respectively converting the original check-in data set and the social relation data set into a user-time-interest point three-dimensional scoring matrix and a two-dimensional relation matrix, wherein the method comprises the following steps of:
step 1-1: the original check-in data set C is arranged to obtain n check-in records, and the n check-in records are recorded as C= { C 1 ,c 2 ,…,c n All users in the check-in data set C form a user set U, all interest points form an address set L, the number of users and the number of interest points are respectively marked as NU and NL, check-in time in a check-in record is rounded, discrete check-in time is converted into 24 time slots, and the time slot set T= {0,1,2, …,23};
step 1-2: counting the number of times a user u accesses an address l in a time slot t in a check-in data set C, and if the number of times of check-in is 0, scoring r of the user u on the address l in the time slot t u,t,l 0, otherwise, r u,t,l =1, summarizing all scores, formingThree-dimensional scoring matrix R= { R of user-time-interest point u,t,l U e U, t e [0,23 ]]L epsilon L, U and L are a user set and a point of interest set respectively;
step 1-3: the original social relation data set F is arranged to obtain m social relation records, and the m social relation records are recorded as F= { F 1 ,f 2 ,…,f m Formalize each social relationship record into<User u x User u y >Binary group, x.epsilon.1, NU],y∈[1,NU]NU is the number of all users in the dataset;
step 1-4: constructing a two-dimensional user social relation matrix S= { S xy }:
The number of rows and columns of the matrix are the number of users NU, x E [1, NU],y∈[1,NU]Its element s xy Representing user u x And u is equal to y Whether social relationship exists between the two;
step 2: select an active user in the location-based social network as the recommendation service object, delete the weakly related rows and weakly related columns of the scoring matrix for that target user, improve the traditional user-based collaborative filtering algorithm, and generate a friendship-based predicted score for the target user, comprising:
step 2-1: determine a target user $u_a$ in the location-based social network as the recommendation service object and locate the row of $u_a$ in the social relation matrix $S$. The column numbers whose element value in that row is 1 give the friend set $F_a$ of $u_a$, and the number of elements in $F_a$ is denoted $FNum_a$. The column numbers whose element value in that row is 0 are recorded as well and form the set $UnF_a$ of users with no social relationship to $u_a$;
In the three-dimensional user-time-point-of-interest scoring matrix $R$, if user $u_i \in UnF_a$, the score information of $u_i$ is deleted from $R$. This yields the scoring matrix with the users unrelated to $u_a$, i.e. the weakly related rows, removed: $R_1 = \{r_{i',t,j}\}$, $i' \in [1, FNum_a]$, $t \in [0, 23]$, $j \in [1, NL]$, where $i'$ is the user number, $t$ is the time slot value, $j$ is the address number, $FNum_a$ is the number of friends of target user $u_a$, $NL$ is the total number of points of interest, and $r_{i',t,j}$ is the score of user $u_{i'}$ for address $l_j$ in time slot $t$;
screening by social relationship effectively reduces the row scale of the original scoring matrix $R$, i.e. $|i'| = FNum_a \ll NU$, where $NU$ is the total number of users;
step 2-2: in the scoring matrix $R_1$, compute for each address the sum of its scores over all time slots. A sum of 0 means that no friend of target user $u_a$ has visited the address at any time, and the address is added to the set $unvisit\_F_a$;
if address $l_j \in unvisit\_F_a$, the scores of $l_j$ are deleted from $R_1$. This yields the scoring matrix with the irrelevant addresses, i.e. the weakly related columns, removed: $R_2 = \{r_{i',t,j'}\}$, $i' \in [1, FNum_a]$, $t \in [0, 23]$, $j' \in [1, NL - |unvisit\_F_a|]$, where $i'$ is the user number, $t$ is the time slot value, $j'$ is the address number, $FNum_a$ is the number of friends of target user $u_a$, $NL$ is the total number of points of interest, $|unvisit\_F_a|$ is the number of addresses that no friend of the target user has visited, and $r_{i',t,j'}$ is the score of user $u_{i'}$ for address $l_{j'}$ in time slot $t$;
deleting the irrelevant addresses further reduces the column scale relative to the scoring matrix $R_1$, i.e. $|j'| = NL - |unvisit\_F_a| \ll NL$, where $NL$ is the total number of points of interest;
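The row and column pruning of steps 2-1 and 2-2 can be illustrated as follows. This is a sketch under stated assumptions: the scoring matrix is a sparse dict keyed by (user, slot, poi), the social matrix is a list of lists indexed in the same order as `users`, and the function name `prune_for_target` is hypothetical.

```python
def prune_for_target(ratings, social, users, pois, target):
    """Return (friends, unvisited, r2) for one target user.

    ratings: dict {(user, slot, poi): 1}, the sparse scoring matrix R.
    social:  0/1 matrix S, rows and columns ordered like `users`.
    """
    row = social[users.index(target)]
    # F_a: columns holding 1 in the target's row; all other users form UnF_a.
    friends = {users[y] for y, flag in enumerate(row) if flag == 1}

    # R_1: delete every row that belongs to a user outside F_a.
    r1 = {key: score for key, score in ratings.items() if key[0] in friends}

    # Sum each address's scores over all friends and all time slots.
    totals = {poi: 0 for poi in pois}
    for (_, _, poi), score in r1.items():
        totals[poi] = totals.get(poi, 0) + score
    unvisited = {poi for poi, total in totals.items() if total == 0}

    # R_2: delete every column that belongs to an address in unvisit_F_a.
    r2 = {key: score for key, score in r1.items() if key[2] not in unvisited}
    return friends, unvisited, r2
```

With a sparse representation the column deletion removes nothing that is physically stored, since zero scores have no keys; the `unvisited` set is still worth keeping because later steps exclude those addresses from scoring.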
step 2-3: based on the preprocessed scoring matrix $R_2$, compute the score similarity between target user $u_a$ and each friend. For a user $v \in F_a$, the score similarity between target user $u_a$ and user $v$ is:
where $u_a$ is the target user currently served by the recommendation system, $v$ is a friend of target user $u_a$, $t$ is a time slot, $T$ is the time slot set, $L$ is the set of all points of interest, $unvisit\_F_a$ is the set of addresses that no friend of $u_a$ has visited at any time, and $r_{u_a,t,l}$ and $r_{v,t,l}$ are the scores of user $u_a$ and user $v$ respectively for point of interest $l$ at time $t$;
step 2-4: improve the traditional user-based collaborative filtering algorithm and use the compressed scoring matrix $R_2$ to compute, based on social relationships, the predicted score of target user $u_a$ for visiting point of interest $l$ at time $t_r$:
where $u_a$ is the target user currently served by the recommendation system, $t_r$ is the time slot corresponding to the current recommendation time, $l$ is a point of interest in the location-based social network that the target user has not visited, $unvisit\_F_a$ is the set of addresses that no friend of $u_a$ has visited at any time, $v$ is a friend of target user $u_a$, $F_a$ is the friend set of $u_a$, $sim(u_a, v)$ is the score similarity between user $u_a$ and user $v$ obtained in step 2-3, and $r_{v,t_r,l}$ is the score of user $v$ for point of interest $l$ at time $t_r$;
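The similarity and prediction formulas of steps 2-3 and 2-4 are not reproduced in this text, so the sketch below substitutes two standard choices and should be read as an assumption rather than as the claimed formulas: cosine similarity over the 0/1 scores, and a similarity-weighted average of the friends' scores. The function names are hypothetical.

```python
import math


def cosine_similarity(ratings, target, friend, unvisited):
    """Cosine similarity of two users' 0/1 scores over (slot, poi) cells.

    Assumption: stands in for the similarity formula of step 2-3.
    Addresses in `unvisited` (unvisit_F_a) are excluded.
    """
    def cells(user):
        return {(t, l) for (u, t, l) in ratings if u == user and l not in unvisited}

    a, v = cells(target), cells(friend)
    if not a or not v:
        return 0.0
    # For binary vectors the dot product is the overlap size and each
    # norm is the square root of the number of non-zero cells.
    return len(a & v) / math.sqrt(len(a) * len(v))


def friend_based_score(ratings, target, friends, unvisited, slot, poi):
    """Similarity-weighted average of friends' scores for (slot, poi).

    Assumption: stands in for the prediction formula of step 2-4.
    """
    if poi in unvisited:
        return 0.0  # no friend ever visited this address
    weighted = total = 0.0
    for v in friends:
        sim = cosine_similarity(ratings, target, v, unvisited)
        weighted += sim * ratings.get((v, slot, poi), 0)
        total += sim
    return weighted / total if total > 0 else 0.0
```

Here `ratings` is expected to still contain the target user's own scores, since the similarity compares them with each friend's scores.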
step 3: count the number of visits to each point of interest in each time slot, compare it with the total number of visits to that point of interest over all time slots and with the number of visits to all points of interest in the same time slot, and compute the time-aware dynamic popularity of each point of interest, comprising:
step 3-1: count the number of times $Cnum_{l,t}$ that point of interest $l$ is visited in time slot $t$ in the original check-in data set $C$:
$Cnum_{l,t} = \sum_{u \in U} Cnum_{u,t,l}$ (Equation 4)
step 3-2: count the total number of times $Cnum_l$ that point of interest $l$ is visited in the check-in data set $C$:
$Cnum_l = \sum_{t \in T} \sum_{u \in U} Cnum_{u,t,l}$ (Equation 5)
step 3-3: count the total number of visits $Cnum_t$ that occur within time slot $t$ in the check-in data set $C$:
$Cnum_t = \sum_{l \in L} \sum_{u \in U} Cnum_{u,t,l}$ (Equation 6)
In Equations 4, 5 and 6, $U$ is the set of all users, $L$ is the set of all addresses, $T$ is the time slot set, and $Cnum_{u,t,l}$ is the number of times user $u$ visits point of interest $l$ in time slot $t$ in the check-in data set $C$;
step 3-4: the longitudinal popularity of a point of interest is the ratio of the number of visits to point of interest $l$ in time slot $t$ to the total number of visits to $l$ over all time slots:
$popu_1(l, t) = \frac{Cnum_{l,t}}{Cnum_l}$ (Equation 7)
where $Cnum_{l,t}$ is the number of times point of interest $l$ is visited in time slot $t$ and $Cnum_l$ is the total number of times point of interest $l$ is visited over all time slots;
step 3-5: the lateral popularity of a point of interest is the ratio of the number of visits to point of interest $l$ in time slot $t$ to the number of visits to all points of interest in time slot $t$:
$popu_2(l, t) = \frac{Cnum_{l,t}}{Cnum_t}$ (Equation 8)
where $Cnum_{l,t}$ is the number of times point of interest $l$ is visited in time slot $t$ and $Cnum_t$ is the number of visits to all points of interest in time slot $t$;
step 3-6: combine the longitudinal comparison of step 3-4 and the lateral comparison of step 3-5 to obtain the time-aware dynamic popularity of point of interest $l$ in time slot $t$:
$popu(l, t) = popu_1(l, t) \times popu_2(l, t)$ (Equation 9)
where $popu_1(l, t)$ is the longitudinal popularity of point of interest $l$ in time slot $t$ and $popu_2(l, t)$ is the lateral popularity of point of interest $l$ in time slot $t$;
The time-aware dynamic popularity of every point of interest in every time slot is collected into a two-dimensional point-of-interest-time popularity matrix $P$ with $NL$ rows and 24 columns;
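The counting and ratio steps 3-1 to 3-6 translate directly into code. The sketch below assumes the raw check-ins are available as (user, slot, poi) tuples with repeats preserved; the function name and the sparse dict standing in for the matrix $P$ are illustrative choices.

```python
from collections import Counter


def popularity_matrix(visits):
    """Compute popu(l, t) = popu1(l, t) * popu2(l, t) for every observed cell.

    visits: iterable of (user, slot, poi) check-ins, repeated visits included.
    Returns {(poi, slot): popularity}; cells with no visits are omitted (value 0).
    """
    per_poi_slot = Counter()  # Cnum_{l,t}
    per_poi = Counter()       # Cnum_l
    per_slot = Counter()      # Cnum_t
    for _, slot, poi in visits:
        per_poi_slot[(poi, slot)] += 1
        per_poi[poi] += 1
        per_slot[slot] += 1

    popularity = {}
    for (poi, slot), count in per_poi_slot.items():
        longitudinal = count / per_poi[poi]  # share of this POI's visits in slot t
        lateral = count / per_slot[slot]     # share of slot t's visits at this POI
        popularity[(poi, slot)] = longitudinal * lateral
    return popularity
```

For example, a point of interest with 2 of its 3 visits in slot 8, in a slot that sees 3 visits overall, gets a popularity of (2/3) × (2/3) = 4/9 for that slot.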
step 4: compute the geographic distance between different points of interest from the longitude and latitude of each location in the check-in data set, and mine the influence of geographic features on the access probability of points of interest based on a power-law distribution model;
step 5: considering the combined influence of user social relationships, location dynamic popularity and geographic distance on user access behavior, fuse the friendship-based predicted score, the dynamic popularity of the point of interest and the distance-based access probability to generate, for the target user, a final predicted score for each unvisited address; sort all unvisited addresses by final predicted score and provide the target user with a recommendation list consisting of the top-ranked addresses;
step 6: use Precision, Recall and the comprehensive index F1 as the accuracy evaluation indexes of the recommendation system, compare the prediction accuracy of the recommendation method with that of other related classical recommendation methods, and evaluate the applicability and effectiveness of the proposed technique.
2. The point-of-interest recommendation method based on social relations fusing location dynamic popularity and geographic features according to claim 1, wherein step 4 comprises:
step 4-1: let the set of points of interest visited by target user $u_a$ be $L\_u_a$, with $l' \in L\_u_a$. Obtain the geographic longitude $Lng_{l'}$ and latitude $Lat_{l'}$ of $l'$ from the check-in data set $C$, so that $l' = \langle Lng_{l'}, Lat_{l'} \rangle$. Let $l$ be a point of interest that the target user has never visited, $l = \langle Lng_l, Lat_l \rangle$. Compute the geographic distance $dist(l, l')$ between points of interest $l$ and $l'$:
where $R$ is the radius of the earth, $R = 6371$ km;
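The distance formula of step 4-1 is not reproduced in this text. A common way to get a great-circle distance from longitude and latitude with an earth radius of 6371 km is the haversine formula, sketched below as an assumption; the claimed formula may use a different but equivalent spherical form.

```python
import math

EARTH_RADIUS_KM = 6371.0  # value of R given in step 4-1


def haversine_km(lng1, lat1, lng2, lat2):
    """Great-circle distance in km between two (longitude, latitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
```

One degree of longitude along the equator comes out at 6371 × π / 180 km, roughly 111.2 km, which is a quick sanity check for the argument order.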
step 4-2: the check-in probability of users at different places follows a power-law distribution, and a power-law function of geographic distance is constructed:
$pr(dist(l, l')) = a \times dist(l, l')^{b}$ (Equation 11)
where the set of points of interest visited by target user $u_a$ is $L\_u_a$, $l' \in L\_u_a$, the geographic longitude $Lng_{l'}$ and latitude $Lat_{l'}$ of $l'$ are obtained from the check-in data set $C$ so that $l' = \langle Lng_{l'}, Lat_{l'} \rangle$, $l$ is a point of interest that the target user has never visited, $l = \langle Lng_l, Lat_l \rangle$, $dist(l, l')$ is the geographic distance between points of interest $l$ and $l'$, and $a$ and $b$ are the two parameters of the power-law function, whose values can be obtained by maximum likelihood estimation;
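The claim obtains $a$ and $b$ by maximum likelihood estimation. The sketch below deliberately swaps in a different, simpler technique: ordinary least squares on the log-log form $\log pr = \log a + b \log dist$. It is shown only to make the role of the two parameters concrete; the function names and the input layout (paired lists of distances and observed check-in probabilities) are assumptions.

```python
import math


def fit_power_law(distances, probabilities):
    """Fit pr = a * dist**b by least squares in log-log space.

    Note: this is NOT maximum likelihood estimation as named in step 4-2;
    it is a simpler substitute used here for illustration.
    All inputs must be strictly positive.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)    # intercept mapped back from log space
    return a, b


def pr_dist(a, b, dist):
    """Evaluate the power-law function of Equation 11."""
    return a * dist ** b
```

A negative fitted $b$ corresponds to check-in probability decaying with distance.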
step 4-3: compute the conditional probability $pr(l \mid l')$ that the user visits candidate point of interest $l$ given that the user is currently at point of interest $l'$:
where $l' \in L\_u_a$ is a point of interest visited by target user $u_a$, $l$ is a point of interest that the target user has never visited, $dist(l, l')$ is the geographic distance between points of interest $l$ and $l'$ as defined in step 4-1, and $L$ is the set of all addresses;
step 4-4: use the naive Bayes method to compute the probability that target user $u_a$ visits candidate point of interest $l$:
where $L\_u_a$ is the set of points of interest visited by target user $u_a$, $l' \in L\_u_a$, $l$ is a point of interest that the target user has never visited, and $dist(l, l')$ is the geographic distance between points of interest $l$ and $l'$ as defined in step 4-1.
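The formulas of steps 4-3 and 4-4 are not reproduced in this text. The sketch below shows one plausible reading, stated as an assumption: the conditional probability normalizes the power-law weight of $l$ over all other addresses in $L$, and the naive Bayes step multiplies the conditional probabilities over the visited set $L\_u_a$. The function names and the `dist` callback are hypothetical.

```python
def conditional_prob(l, l_prime, all_pois, dist, a, b):
    """pr(l | l'): power-law weight of l, normalized over every other address.

    Assumption: normalizing over L minus l' is a guess at the missing formula.
    Distinct addresses must be at non-zero distance when b is negative.
    """
    def weight(x):
        return a * dist(x, l_prime) ** b

    total = sum(weight(x) for x in all_pois if x != l_prime)
    return weight(l) / total if total > 0 else 0.0


def geo_visit_prob(l, visited, all_pois, dist, a, b):
    """pr(l | L_u_a): product of pr(l | l') over visited addresses l'.

    Assumption: the naive Bayes step treats visited addresses as independent.
    """
    prob = 1.0
    for l_prime in visited:
        prob *= conditional_prob(l, l_prime, all_pois, dist, a, b)
    return prob
```

Because the result is a product of many values below 1, a practical implementation would likely sum logarithms instead to avoid underflow for users with long histories.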
3. The point-of-interest recommendation method based on social relations fusing location dynamic popularity and geographic features according to claim 1, wherein step 5 comprises:
step 5-1: determine a target user $u_a$ in the location-based social network as the recommendation service object, and convert the current recommendation time $time_r$ into the time slot $t_r$;
step 5-2: considering the combined influence of user social relationships, location dynamic popularity and geographic distance on user access behavior, compute the predicted score of target user $u_a$ for visiting point of interest $l$ at time $t_r$:
where $u_a$ is the target user currently served by the recommendation system, $t_r$ is the time slot corresponding to the current recommendation time, $l$ is a point of interest in the location-based social network that the target user has not visited, the social-relationship term is the predicted score of target user $u_a$ for visiting point of interest $l$ at time $t_r$ computed from social relationships, $popu(l, t_r)$ is the real-time popularity of point of interest $l$ at time $t_r$, $pr(l \mid L\_u_a)$ is the predicted access probability obtained by mining the influence of geographic distance with the power-law distribution model, and $L\_u_a$ is the set of points of interest that target user $u_a$ has visited;
step 5-3: sort all addresses that target user $u_a$ has not visited by predicted score, form a recommendation list from the $N$ top-ranked addresses, and return the recommendation list $TopNList_a$ to the target user.
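The fusion formula of step 5-2 is not reproduced in this text. The sketch below assumes the simplest fusion, a plain product of the three components, purely to illustrate the ranking of step 5-3; the actual rule may weight or combine the terms differently, and the function name is hypothetical.

```python
def top_n(candidates, friend_score, popularity, geo_prob, n):
    """Rank unvisited addresses by a fused score and return the best n.

    candidates:   list of unvisited addresses for the target user.
    friend_score: {poi: friendship-based predicted score at slot t_r}.
    popularity:   {poi: popu(l, t_r)}.
    geo_prob:     {poi: pr(l | L_u_a)}.
    Assumption: the three components are multiplied together.
    """
    fused = {
        poi: friend_score[poi] * popularity[poi] * geo_prob[poi]
        for poi in candidates
    }
    # sorted() is stable, so ties keep their order in `candidates`.
    return sorted(candidates, key=lambda poi: fused[poi], reverse=True)[:n]
```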
4. The point-of-interest recommendation method based on social relations fusing location dynamic popularity and geographic features according to claim 1, wherein step 6 comprises:
step 6-1: randomly select $NU \times 20\%$ of the users as the target user set $Testu$, where $NU$ is the total number of users, and run each recommendation method for each target user in $Testu$ to generate a recommendation list;
step 6-2: compute the precision and recall of the recommendation method in time slot $t$:
where $Testu$ is the set of all target users, $R(u, t)$ is the recommendation list that the recommendation method provides to target user $u$ at time $t$, and $Like(u, t)$ is the set of points of interest actually visited by user $u$ at time $t$;
step 6-3: compute the overall precision and recall of the recommendation method, each being the average of the corresponding evaluation index over all time slots:
$precision = \frac{1}{|T|} \sum_{t \in T} precision(t)$, $recall = \frac{1}{|T|} \sum_{t \in T} recall(t)$
where $T$ is the time slot set, and $precision(t)$ and $recall(t)$ are the precision and recall of the recommendation method in time slot $t$;
step 6-4: compute the comprehensive accuracy F1 value of the recommendation system:
$F1 = \frac{2 \times precision \times recall}{precision + recall}$
where $precision$ and $recall$ are the overall precision and recall of one run of the recommendation method;
step 6-5: repeat steps 6-1 to 6-4 $Ntimes$ times; the final prediction accuracy of the recommendation method, comprising the values of Precision, Recall and the comprehensive index F1, is the average of the corresponding index results over the $Ntimes$ runs;
step 6-6: compare and analyze the results of each index: if the Precision of the proposed method is greater than that of the other recommendation methods, the proposed technique helps users find addresses of interest more accurately; if the Recall of the proposed method is greater than that of the other recommendation methods, the proposed technique covers the addresses of interest to users more comprehensively; if the F1 value of the proposed method is greater than that of the other recommendation methods, the proposed technique has stronger overall capability in terms of prediction accuracy.
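The evaluation of steps 6-2 to 6-4 can be sketched as follows. The per-slot formulas are not reproduced in this text, so the usual set-overlap definitions of precision and recall, pooled over all target users, are assumed; the averaging over time slots and the F1 combination follow the description in steps 6-3 and 6-4. Function names and the dict-of-sets input layout are illustrative.

```python
def slot_metrics(recommended, liked):
    """Precision and recall for one time slot, pooled over target users.

    recommended: {user: set of recommended POIs}, i.e. R(u, t).
    liked:       {user: set of POIs actually visited}, i.e. Like(u, t).
    Assumption: hits are summed over users before dividing.
    """
    hits = sum(len(recs & liked.get(u, set())) for u, recs in recommended.items())
    n_recommended = sum(len(recs) for recs in recommended.values())
    n_liked = sum(len(liked.get(u, set())) for u in recommended)
    precision = hits / n_recommended if n_recommended else 0.0
    recall = hits / n_liked if n_liked else 0.0
    return precision, recall


def overall_metrics(per_slot):
    """Average per-slot (precision, recall) pairs and derive F1."""
    precision = sum(p for p, _ in per_slot) / len(per_slot)
    recall = sum(r for _, r in per_slot) / len(per_slot)
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

Step 6-5 would then call these functions once per random draw of the target user set and average the three values over the repeated runs.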
CN202211174747.XA 2022-09-26 2022-09-26 Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features Active CN115408618B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211174747.XA CN115408618B (en) 2022-09-26 2022-09-26 Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features

Publications (2)

Publication Number Publication Date
CN115408618A CN115408618A (en) 2022-11-29
CN115408618B true CN115408618B (en) 2023-10-20

Family

ID=84165499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211174747.XA Active CN115408618B (en) 2022-09-26 2022-09-26 Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features

Country Status (1)

Country Link
CN (1) CN115408618B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115604130B (en) * 2022-12-01 2023-03-14 中南大学 APP popularity prediction model construction method, prediction method, device and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182543A (en) * 2014-09-05 2014-12-03 上海理工大学 Similarity propagation and popularity dimensionality reduction based mixed recommendation method
CN106682114A (en) * 2016-12-07 2017-05-17 广东工业大学 Personalized recommending method fused with user trust relationships and comment information
CN107341261A (en) * 2017-07-13 2017-11-10 南京邮电大学 A kind of point of interest of facing position social networks recommends method
CN107909108A (en) * 2017-11-15 2018-04-13 东南大学 Edge cache system and method based on content popularit prediction
CN108460101A (en) * 2018-02-05 2018-08-28 山东师范大学 Point of interest of the facing position social networks based on geographical location regularization recommends method
CN109284449A (en) * 2018-10-23 2019-01-29 厦门大学 The recommended method and device of point of interest
CN114036376A (en) * 2021-10-26 2022-02-11 南京理工大学紫金学院 Time-aware self-adaptive interest point recommendation method based on K-means clustering
CN114385915A (en) * 2022-01-11 2022-04-22 未鲲(上海)科技服务有限公司 Content recommendation method and device, storage medium and electronic equipment
CN114528480A (en) * 2022-01-21 2022-05-24 朱俊 Time-sensing self-adaptive interest point recommendation method based on K-means clustering

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Effects of Social Interaction Dynamics on Platforms;Thies, F 等;《JOURNAL OF MANAGEMENT INFORMATION SYSTEMS》;843-873 *
Yaqian Duan 等.POI Popularity Prediction via Hierarchical Fusion of Multiple Social Clues.《SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval》.2017,1001–1004. *
陈远成.中国绿色公共产品投资效率统计分析.《中国优秀硕士学位论文全文数据库 经济与管理科学辑》.2016,J147-63. *

Also Published As

Publication number Publication date
CN115408618A (en) 2022-11-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant