CN115408618B

CN115408618B - Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features

Info

Publication number: CN115408618B
Application number: CN202211174747.XA
Authority: CN
Inventors: 朱俊; 韩立新; 李振旺; 徐逸卿; 梁太波; 汪洋; 李树
Original assignee: Nanjing Vocational University of Industry Technology NUIT
Current assignee: Nanjing Vocational University of Industry Technology NUIT
Priority date: 2022-09-26
Filing date: 2022-09-26
Publication date: 2023-10-20
Anticipated expiration: 2042-09-26
Also published as: CN115408618A

Abstract

The invention discloses a point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features. The method comprises the following steps: organizing a check-in data set and a social relationship data set to generate a three-dimensional scoring matrix and a two-dimensional relationship matrix; deleting weakly related rows and weakly related columns from the scoring matrix, generating a personalized scoring matrix for each target user, and calculating a friendship-based predicted score; calculating the time-aware dynamic popularity of each point of interest; mining the influence of geographic features on the access probability of points of interest based on a power-law distribution model; and, taking into account the user's social relationships, the dynamic popularity of locations and the influence of geographic distance, fusing the friendship-based predicted score, the dynamic popularity of the point of interest and the distance-based access probability to generate a final predicted score. The method can recommend a plurality of points of interest to a user in real time according to the user's historical access interests, the social relationships among users, the geographic distances among locations and the popularity of locations at the current time, and has important practical application value.

Description

Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features

Technical Field

The invention relates to a point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features, and belongs to the technical field of artificial intelligence and machine learning.

Background

In recent years, advances in mobile computing, wireless communication and location acquisition technologies have driven the popularity and development of location-based social networks (LBSNs). Many mature location-based social network platforms exist in China and abroad, such as Foursquare, Gowalla, Facebook, Twitter and Yelp abroad, and Dianping, Sina Weibo and WeChat Moments in China, and more and more people use online social networks through smartphones. According to the 49th Statistical Report on China's Internet Development, issued by the China Internet Network Information Center (CNNIC) in February 2022, the number of Internet users in China had reached 1.032 billion by December 2021, of whom 99.7% accessed the Internet by mobile phone. Location-based social networks have become a new media form through which people share and transmit information. On the one hand, users establish social relationships in LBSNs, publish content of interest, and share their current location, pictures, audio, video, comments and the like anytime and anywhere; on the other hand, when facing massive information resources, users can alleviate the information overload problem by using the personalized services provided in LBSNs to obtain recommended content that matches their preferences, such as locations, friends, music and advertisements.

In LBSNs, location check-in is an important and widely used service. LBSNs determine a user's current location through GPS or WiFi positioning combined with a geographic location system. With the rapid development of cities, more and more places such as tourist attractions, theatres, hotels, banks and shops have appeared, and people choose among them according to personal interests and preferences. Such places, which exist in the physical world and are of interest to users, are called points of interest (POIs). However, each user has visited only a limited number of points of interest. When a user faces a huge number of unvisited places or arrives in an unfamiliar city, "place information overload" and "choice paralysis" arise, and how to recommend locations that match the user's interests from among the huge number of places in the real world is a problem to be solved urgently. As an emerging recommendation field, point-of-interest recommendation (POI recommendation) can effectively relieve the difficulty of choice that location information overload brings to users, helps improve users' experience in social networks and in real life, and can help merchants analyze and discover potential customers for advertisement push services; it has therefore become a research hotspot.

Point-of-interest recommendation is an important branch of recommender systems, and both its development history and its key technologies share a common lineage with traditional recommender systems. Part of the point-of-interest recommendation research therefore treats a location as an ordinary item similar to a film or a piece of music and generates recommendations with traditional methods. According to their design strategies, traditional recommendation methods mainly comprise collaborative filtering algorithms, content-based recommendation algorithms and hybrid recommendation algorithms. Collaborative filtering algorithms further comprise memory-based collaborative filtering algorithms (such as user-based collaborative filtering, UBCF, and item-based collaborative filtering, IBCF) and model-based collaborative filtering algorithms (such as singular value decomposition, SVD, clustering models and probabilistic latent semantic analysis). Content-based point-of-interest recommendation techniques extract relevant information, such as tags, categories and user reviews, from visited locations; user preferences are extracted from the user's profile and then matched against location profiles to obtain accurate recommendations. Collaborative filtering techniques convert users' check-in behavior into a user-POI scoring matrix, use the existing check-in records to find users similar to the current active user or locations highly similar to those the active user previously liked, predict the active user's scores for places not yet checked in according to the similarity between users or between locations, and recommend the points of interest with the highest predicted scores to the current user. Singular value decomposition (SVD) is a classical representative of matrix factorization whose main task is to generate low-rank approximations. The low-dimensional orthogonal matrices obtained by SVD reduce noise relative to the original matrix and can more effectively reveal latent associations between users and items.

The conventional recommendation techniques above ignore the influence that social relationships, geographic distance and location popularity in a location-based social network exert on users' check-in behavior at different times. In reality, however, a user's check-in habits are closely related to social, spatial and temporal context. For example, a user often checks in at the same point of interest as his or her friends; a user visiting a tourist attraction prefers to stay in a nearby hotel; catering points of interest receive the most visits (the highest popularity) at around 12:00 and 18:00, while the popularity of bars starts to rise from 21:00. How to introduce social relationships, geographic features and location popularity into a recommendation algorithm, so as to provide users with a suitable point-of-interest recommendation list for a specific time period, has become an urgent need of social application platforms.

Currently, some point-of-interest recommendation solutions consider one or more factors such as geographic influence, temporal influence and social influence, but they still have the following shortcomings:

(1) The temporal dynamics of location popularity are ignored. Compared with other kinds of context such as social relationships and geographic features, relatively few point-of-interest recommendation techniques focus on location popularity, and the few related studies analyze only the overall popularity of a location from a macroscopic perspective (the ratio of the number of times a location is accessed to the total number of accesses in the data set), ignoring the temporal dynamics of location popularity at the microscopic level. In fact, applying a single global popularity value to every time period of the day does not match reality. How to mine the way location popularity changes over time is therefore a problem worth studying.

(2) Existing location popularity calculation methods are one-sided. Existing methods only make a transverse comparison between the number of times a point of interest is accessed and the number of times other locations are accessed: if the current location is accessed relatively more often, it is considered more popular. Such methods are limited to the transverse comparison between the target location and other locations and neglect the longitudinal comparison between the popularity of the target location at the current time and its popularity in other time periods. They are therefore one-sided, which reduces the accuracy of location popularity estimation and in turn the recommendation accuracy of a location recommendation system.

(3) The user-time-POI three-dimensional matrix has high computational complexity. The scoring matrix in a traditional recommendation system contains only two-dimensional user and item information. To explore users' behavior patterns in a target time period, a point-of-interest recommendation system has to expand the two-dimensional user-POI matrix into a three-dimensional user-time-POI scoring matrix, which inevitably increases the computational complexity of the recommendation process. As the volume of data to be processed grows exponentially, computational cost has become one of the bottlenecks restricting the rapid development of recommendation systems. A recommendation method that reduces computational complexity is therefore needed to improve the operating efficiency of the recommendation system.

These shortcomings of conventional point-of-interest recommendation techniques cause considerable difficulties in the design, development, deployment and operation of location-based social network platforms. In particular, on network platforms with massive item information they degrade the service quality of the recommendation system and thereby affect the sales performance of e-commerce systems.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides a point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features. Considering the large data volume and high computational complexity of the user-time-POI three-dimensional matrix, the invention preprocesses the scoring matrix on the basis of friendship, generates a personalized scoring matrix for each user, and improves the traditional user-based collaborative filtering algorithm, thereby improving recommendation efficiency. In addition, the invention proposes a method for calculating the dynamic popularity of points of interest, which improves the effectiveness of score prediction by extracting the different popularity of a point of interest in different time periods, thereby enhancing the service quality of the recommendation system.

The technical solution adopted by the invention to solve the technical problem is a point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features, comprising the following steps:

Step 1: collecting and organizing an original check-in data set and a social relationship data set, and converting them into a user-time-POI three-dimensional scoring matrix and a two-dimensional relationship matrix, respectively;

Step 2: selecting an active user in the location-based social network as the recommendation service object (the target user), deleting the weakly related rows and weakly related columns of the scoring matrix for the target user, improving the traditional user-based collaborative filtering algorithm, and generating a friendship-based predicted score for the target user;

Step 3: counting the number of times each point of interest is accessed in each time slot, comparing it with the total number of times that point of interest is accessed at all times and with the number of times all points of interest are accessed in that time slot, and calculating the time-aware dynamic popularity of each point of interest;

Step 4: calculating the geographic distances between different points of interest according to the longitude and latitude of the locations in the check-in data set, and mining the influence of geographic features on the access probability of points of interest based on a power-law distribution model;

Step 5: taking into account the influence of the user's social relationships, the dynamic popularity of locations and geographic distance on the user's access behavior, fusing the friendship-based predicted score, the dynamic popularity of the point of interest and the distance-based access probability to generate, for the target user, a final predicted score for each unvisited address, sorting all unvisited addresses by final predicted score, and providing the target user with a recommendation list composed of the top-ranked addresses;

Step 6: using Precision, Recall and the combined index F1 as accuracy evaluation indexes of the recommendation system, and evaluating the applicability and effectiveness of the proposed technique by comparing the prediction accuracy of the proposed recommendation method with that of other related classical recommendation methods.

The beneficial effects are that:

1. The method can recommend a plurality of points of interest to a user in real time according to the user's historical access interests, the social relationships among users, the geographic distances among locations and the popularity of locations at the current time. It can also help merchants push advertisements accurately to potential customers, and has important practical application value.

2. The method preprocesses the three-dimensional check-in matrix using the users' social relationship network and generates a personalized scoring matrix for each target user. This reduces the scale of the scoring data, lowers the computational complexity of the recommendation algorithm and improves the operating efficiency of the recommendation system, thereby improving the satisfaction of users and merchants with the social network platform, which is of great significance in practical applications.

3. The method mines the temporal dynamics of location popularity. At the microscopic level, it compares the number of times a point of interest is accessed in the current time period with other statistics both longitudinally and transversely, calculates the time-aware dynamic popularity of the point of interest, fully captures how location popularity changes over time, and effectively improves the accuracy with which the recommendation system predicts user behavior patterns. The method is general and portable: it can be applied not only to point-of-interest recommendation systems but also to personalized recommendation of other traditional items, and has broad prospects for industrial application.

Drawings

Fig. 1 is a schematic overview of the point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features.

Fig. 2 is a flowchart of the specific steps of the point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features.

Fig. 3 is a schematic diagram of the check-in records of users in a location-based social network in an embodiment of the invention.

Fig. 4 is a schematic diagram of the social relationship records of users in a location-based social network in an embodiment of the invention.

Fig. 5 is a schematic diagram of the probability of being accessed in each time slot for the three most frequently accessed points of interest in an embodiment of the invention.

Fig. 6 is a schematic diagram of the probability distribution of the geographic distances between adjacent points of interest visited by all users on the same day in an embodiment of the invention.

Fig. 7 is a bar chart comparing, in an embodiment, the Precision of the recommendation method proposed by the invention with that of the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based access probability prediction method PLD, and the kernel-density-estimation-based access probability prediction method KDE.

Fig. 8 is a bar chart comparing, in an embodiment, the Recall of the recommendation method proposed by the invention with that of UBCF, SCF, PLD and KDE.

Fig. 9 is a bar chart comparing, in an embodiment, the combined index F1 of the recommendation method proposed by the invention with that of UBCF, SCF, PLD and KDE.

Detailed Description

The invention will be described in further detail with reference to the drawings.

As shown in Figs. 1-2, the invention provides a point-of-interest recommendation method that fuses social relationships with dynamic location popularity and geographic features, which specifically comprises the following: preprocessing the three-dimensional scoring matrix by deleting, for the target user, the weakly related rows (other users who have no social relationship with the target user) and the weakly related columns (addresses that neither the target user nor the target user's friends have visited), so as to generate a personalized scoring matrix; improving the traditional user-based collaborative filtering algorithm to generate a friendship-based predicted score; dividing a day into 24 time slots, counting the check-ins of each point of interest in each time slot according to the time labels, transversely comparing the number of times the point of interest is accessed with the number of times other locations are accessed, longitudinally comparing the popularity of the target location at the current time with its popularity in other time periods, and calculating the time-aware dynamic popularity of the target location; modelling the influence of the geographic distance between locations on the probability that a point of interest is accessed with a power-law distribution model; and integrating the influence of the three kinds of context (social relationship, dynamic popularity and geographic distance) on the user's access behavior, calculating and ranking the predicted scores of all unvisited addresses, and recommending the top-ranked points of interest to the user, as shown in Fig. 1.

The method comprises the following specific steps:

Step 1: Collect and organize the original check-in data set and the social relationship data set, and convert them into a user-time-POI three-dimensional scoring matrix and a two-dimensional relationship matrix, respectively. The operation steps are as follows:

Step 1-1: Organize the original check-in data set $C$ to obtain $n$ check-in records, denoted $C = \{c_1, c_2, \ldots, c_n\}$. All users in $C$ form the user set $U$ and all points of interest form the address set $L$; the number of users and the number of points of interest are denoted $NU$ and $NL$, respectively. The check-in time in each record is rounded so that check-in times are mapped to 24 time slots, giving the time slot set $T = \{0, 1, 2, \ldots, 23\}$.

Step 1-2: Count the number of times user $u$ accesses address $l$ in time slot $t$ in the check-in data set $C$. If the number of check-ins is 0, the score of user $u$ for address $l$ in time slot $t$ is $r_{u,t,l} = 0$; otherwise $r_{u,t,l} = 1$. All scores together form the user-time-POI three-dimensional scoring matrix $R = \{r_{u,t,l}\}$, $u \in U$, $t \in [0, 23]$, $l \in L$, where $U$ and $L$ are the user set and the point-of-interest set, respectively.

Step 1-3: Organize the original social relationship data set $F$ to obtain $m$ social relationship records, denoted $F = \{f_1, f_2, \ldots, f_m\}$. Each social relationship record is formalized as a pair $\langle u_x, u_y \rangle$, $x \in [1, NU]$, $y \in [1, NU]$, where $NU$ is the number of all users in the data set.

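Step 1-4: Construct the two-dimensional user social relationship matrix $S = \{s_{xy}\}$, whose numbers of rows and columns both equal the number of users $NU$, $x \in [1, NU]$, $y \in [1, NU]$. The element $s_{xy}$ indicates whether a social relationship exists between users $u_x$ and $u_y$:

$$s_{xy} = \begin{cases} 1, & \text{if } u_x \text{ and } u_y \text{ have a social relationship} \\ 0, & \text{otherwise} \end{cases} \qquad \text{(Equation 1)}$$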
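Step 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layouts `(user, timestamp, poi)` and `(user_x, user_y)`, the timestamp format, the interpretation of "rounding" as taking the hour of the check-in, and the symmetric treatment of friendships are all assumptions made for illustration, and the function names are hypothetical.

```python
from datetime import datetime


def to_time_slot(timestamp: str) -> int:
    """Map a check-in timestamp to one of the 24 hourly time slots.

    Assumption: timestamps look like "2022-09-26 12:40:00" and the slot
    is simply the hour of the check-in.
    """
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour


def build_scoring_matrix(checkins):
    """Return a sparse binary scoring matrix {(user, slot, poi): 1}.

    Any (user, slot, poi) key that is absent has an implicit score of 0,
    so repeated check-ins in the same slot still give a score of 1.
    """
    scores = {}
    for user, timestamp, poi in checkins:
        scores[(user, to_time_slot(timestamp), poi)] = 1
    return scores


def build_relation_matrix(users, friendships):
    """Return the NU x NU relationship matrix S as a list of lists."""
    index = {user: i for i, user in enumerate(users)}
    size = len(users)
    s = [[0] * size for _ in range(size)]
    for x, y in friendships:
        s[index[x]][index[y]] = 1
        s[index[y]][index[x]] = 1  # assumption: friendship is symmetric
    return s
```

A sparse dictionary is used for the scoring matrix because most (user, slot, POI) combinations are never observed; a dense NU x 24 x NL array would work equally well for small data sets.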

Step 2: Select an active user in the location-based social network as the recommendation service object (the target user). Delete the weakly related rows and weakly related columns of the scoring matrix for the target user, improve the traditional user-based collaborative filtering algorithm, and generate a friendship-based predicted score for the target user. The specific operation steps are as follows:

Step 2-1: Determine a target user $u_a$ in the location-based social network as the recommendation service object, and look up the row of the social relationship matrix $S$ in which $u_a$ is located. The column numbers whose element value in that row is 1 give the friend set $F_a$ of the target user $u_a$; the number of elements in $F_a$ is denoted $FNum_a$. The column numbers whose element value in that row is 0 form the set $UnF_a$ of users who have no social relationship with $u_a$.

In the user-time-POI three-dimensional scoring matrix $R$, if user $u_i \in UnF_a$, the score information of $u_i$ is deleted from $R$. Deleting the users who have no social relationship with the target user $u_a$ (the weakly related rows) yields the scoring matrix $R_1 = \{r_{i',t,j}\}$, $i' \in [1, FNum_a]$, $t \in [0, 23]$, $j \in [1, NL]$, where $i'$ is the user number, $t$ is the time slot value, $j$ is the address number, $FNum_a$ is the number of friends of the target user $u_a$, $NL$ is the total number of points of interest, and $r_{i',t,j}$ is the score of user $u_{i'}$ for address $l_j$ in time slot $t$.

Screening by social relationship effectively reduces the row dimension of the original scoring matrix $R$, that is, $|i'| = FNum_a \ll NU$, where $NU$ is the total number of users.

Step 2-2: In the scoring matrix $R_1$, calculate, address by address, the sum of the scores of each address over all time slots. If the sum is 0, none of the friends of the target user $u_a$ has accessed this address at any time, and the address is added to the set $unvisit\_F_a$.

If address $l_j \in unvisit\_F_a$, the score information of $l_j$ is deleted from $R_1$. Deleting the irrelevant addresses (the weakly related columns) yields the scoring matrix $R_2 = \{r_{i',t,j'}\}$, $i' \in [1, FNum_a]$, $t \in [0, 23]$, $j' \in [1, NL - |unvisit\_F_a|]$, where $i'$ is the user number, $t$ is the time slot value, $j'$ is the address number, $FNum_a$ is the number of friends of the target user $u_a$, $NL$ is the total number of points of interest, $|unvisit\_F_a|$ is the number of addresses that none of the friends of the target user has visited, and $r_{i',t,j'}$ is the score of user $u_{i'}$ for address $l_{j'}$ in time slot $t$.

Deleting the irrelevant addresses further reduces the column dimension relative to the scoring matrix $R_1$, that is, $|j'| = NL - |unvisit\_F_a| \ll NL$, where $NL$ is the total number of points of interest.
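The row and column pruning of Steps 2-1 and 2-2 can be sketched as follows. This is a minimal illustration under stated assumptions: the scoring matrix is a dense nested list indexed as `R[u][t][l]`, users and addresses are identified by integer indices, and the function name `personalized_matrix` is hypothetical. The target user's own row is not part of the pruned matrix here and would be read from the original matrix when needed.

```python
def personalized_matrix(R, S, target):
    """Return (friends, kept_pois, R2) for one target user.

    R[u][t][l] is the binary score of user u for POI l in time slot t.
    S[x][y] is 1 when users x and y have a social relationship, else 0.
    """
    # Step 2-1: keep only the rows of the target user's friends
    # (users with a 0 in the target's row of S are weakly related rows).
    friends = [v for v, flag in enumerate(S[target]) if flag == 1 and v != target]
    r1 = [R[v] for v in friends]

    # Step 2-2: a POI column survives only if at least one friend
    # visited it in at least one time slot (its score sum is non-zero).
    num_pois = len(R[0][0])
    kept_pois = [
        l for l in range(num_pois)
        if any(slot_scores[l] for user_scores in r1 for slot_scores in user_scores)
    ]

    # Build the compressed matrix with the surviving columns only.
    r2 = [
        [[slot_scores[l] for l in kept_pois] for slot_scores in user_scores]
        for user_scores in r1
    ]
    return friends, kept_pois, r2
```

Returning `friends` and `kept_pois` alongside the compressed matrix keeps the mapping from compressed indices back to the original user and address numbers.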

Step 2-3: Based on the preprocessed scoring matrix $R_2$, calculate the scoring similarity between the target user $u_a$ and each of its friend users. For a user $v \in F_a$, the scoring similarity between the target user $u_a$ and user $v$ is:

where $u_a$ is the target object currently served by the recommendation system, $v$ is a friend user of the target user $u_a$, $t$ is a time slot, $T$ is the time slot set, $L$ is the set of all points of interest, $unvisit\_F_a$ is the set of addresses that none of the friends of $u_a$ has accessed at any time, and $r_{u_a,t,l}$ and $r_{v,t,l}$ are the scores of user $u_a$ and user $v$, respectively, for point of interest $l$ at time $t$.

Step 2-4: Improve the traditional user-based collaborative filtering algorithm and use the compressed scoring matrix $R_2$ to calculate, based on social relationships, the predicted score of the target user $u_a$ for accessing point of interest $l$ at time $t_r$:

where $u_a$ is the target object currently served by the recommendation system, $t_r$ is the time slot corresponding to the current recommendation time, $l$ is a point of interest in the location-based social network that the target user has not visited, $unvisit\_F_a$ is the set of addresses that none of the friends of $u_a$ has accessed at any time, $v$ is a friend user of the target user $u_a$, $F_a$ is the friend set of the target user $u_a$, $sim(u_a, v)$ is the scoring similarity between user $u_a$ and user $v$ obtained in Step 2-3, and $r_{v,t_r,l}$ is the score of user $v$ for point of interest $l$ at time $t_r$.
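The similarity and prediction formulas of Steps 2-3 and 2-4 are not reproduced in this text, so the following Python sketch substitutes two common choices and should be read as an assumption rather than as the patented formulas: cosine similarity over all (time slot, address) scores, and a similarity-weighted average of the friends' scores. The function names are hypothetical, and each user's scores are given as a nested list indexed `[slot][poi]`.

```python
import math


def cosine_similarity(scores_a, scores_v):
    """Cosine similarity between two users' [slot][poi] score tables.

    Assumption: cosine similarity stands in for the similarity formula
    that is missing from the text.
    """
    dot = sum(a * b
              for row_a, row_v in zip(scores_a, scores_v)
              for a, b in zip(row_a, row_v))
    norm_a = math.sqrt(sum(a * a for row in scores_a for a in row))
    norm_v = math.sqrt(sum(b * b for row in scores_v for b in row))
    if norm_a == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_a * norm_v)


def predict_social_score(target_scores, friend_scores, slot, poi):
    """Similarity-weighted average of the friends' scores for (slot, poi).

    Assumption: the normalized weighted sum is the usual user-based
    collaborative filtering predictor, restricted here to friends.
    """
    sims = [cosine_similarity(target_scores, fs) for fs in friend_scores]
    total = sum(sims)
    if total == 0:
        return 0.0
    weighted = sum(s * fs[slot][poi] for s, fs in zip(sims, friend_scores))
    return weighted / total
```

Because only friends contribute, the loop runs over $FNum_a$ users instead of $NU$, which is where the efficiency gain described in Step 2 comes from.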

Step 3: Count the number of times each point of interest is accessed in each time slot, compare it with the total number of times that point of interest is accessed at all times and with the number of times all points of interest are accessed in that time slot, and calculate the time-aware dynamic popularity of each point of interest. The implementation steps are as follows:

Step 3-1: Count the number of times $Cnum_{l,t}$ that point of interest $l$ is accessed in time slot $t$ in the original check-in data set $C$:

$$Cnum_{l,t} = \sum_{u \in U} Cnum_{u,t,l} \qquad \text{(Equation 4)}$$

Step 3-2: Count the total number of times $Cnum_l$ that point of interest $l$ is accessed in the check-in data set $C$:

$$Cnum_l = \sum_{t \in T} \sum_{u \in U} Cnum_{u,t,l} \qquad \text{(Equation 5)}$$

Step 3-3: Count the number of all accesses $Cnum_t$ that occur within time slot $t$ in the check-in data set $C$:

$$Cnum_t = \sum_{l \in L} \sum_{u \in U} Cnum_{u,t,l} \qquad \text{(Equation 6)}$$

In Equations 4, 5 and 6, $U$ is the set of all users, $L$ is the set of all addresses, $T$ is the time slot set, and $Cnum_{u,t,l}$ is the number of times user $u$ accesses point of interest $l$ in time slot $t$ in the check-in data set $C$.

Step 3-4: Obtain the longitudinal popularity of point of interest $l$ by calculating the ratio of the number of times $l$ is accessed in time slot $t$ to the total number of times $l$ is accessed at all times:

$$popu1(l,t) = \frac{Cnum_{l,t}}{Cnum_l} \qquad \text{(Equation 7)}$$

where $Cnum_{l,t}$ is the number of times point of interest $l$ is accessed in time slot $t$, and $Cnum_l$ is the total number of times point of interest $l$ is accessed at all times.

Step 3-5: Obtain the transverse popularity of point of interest $l$ by calculating the ratio of the number of times $l$ is accessed in time slot $t$ to the number of times all points of interest are accessed in time slot $t$:

$$popu2(l,t) = \frac{Cnum_{l,t}}{Cnum_t} \qquad \text{(Equation 8)}$$

where $Cnum_{l,t}$ is the number of times point of interest $l$ is accessed in time slot $t$, and $Cnum_t$ is the number of times all points of interest are accessed in time slot $t$.

Step 3-6: Combine the results of the longitudinal comparison and the transverse comparison to obtain the time-aware dynamic popularity of point of interest $l$ in time slot $t$:

$$popu(l,t) = popu1(l,t) \times popu2(l,t) \qquad \text{(Equation 9)}$$

where $popu1(l,t)$ is the longitudinal popularity of point of interest $l$ in time slot $t$, and $popu2(l,t)$ is the transverse popularity of point of interest $l$ in time slot $t$.

The time-aware dynamic popularity of every point of interest in every time slot is collected into a two-dimensional POI-time popularity matrix $P$ with $NL$ rows and 24 columns.
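The counting and ratio steps of Step 3 can be sketched in a few lines of Python. The sketch assumes check-ins are available as `(user, slot, poi)` tuples with one tuple per check-in, and the function name `dynamic_popularity` is hypothetical; pairs that were never observed are simply absent from the result and have popularity 0.

```python
from collections import Counter


def dynamic_popularity(checkins):
    """Return {(poi, slot): popularity} from (user, slot, poi) check-ins.

    popularity = (visits to poi in slot / all visits to poi)
               * (visits to poi in slot / all visits in slot)
    """
    per_poi_slot = Counter()  # visits to a POI within one slot
    per_poi = Counter()       # visits to a POI over all slots
    per_slot = Counter()      # visits to all POIs within one slot
    for _user, slot, poi in checkins:
        per_poi_slot[(poi, slot)] += 1
        per_poi[poi] += 1
        per_slot[slot] += 1

    popularity = {}
    for (poi, slot), count in per_poi_slot.items():
        longitudinal = count / per_poi[poi]   # same POI, other time slots
        transverse = count / per_slot[slot]   # same slot, other POIs
        popularity[(poi, slot)] = longitudinal * transverse
    return popularity
```

For example, with two café check-ins and one bar check-in at slot 12 and three bar check-ins at slot 21, the café scores (2/2) x (2/3) at slot 12, while the bar scores (1/4) x (1/3) at slot 12 and (3/4) x (3/3) at slot 21, which reflects that the bar is mainly an evening venue.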

Step 4: Calculate the geographic distances between different points of interest according to the longitude and latitude of the locations in the check-in data set, and mine the influence of geographic features on the access probability of points of interest based on a power-law distribution model. The implementation steps are as follows:

Step 4-1: Suppose the set of points of interest that the target user $u_a$ has accessed is $L\_u_a$, and let $l' \in L\_u_a$. Obtain the geographic longitude $Lng_{l'}$ and latitude $Lat_{l'}$ of $l'$ from the check-in data set $C$ and write $l' = \langle Lng_{l'}, Lat_{l'} \rangle$. Let $l$ be a point of interest that the target user has never visited, $l = \langle Lng_l, Lat_l \rangle$. Calculate the geographic distance $dist(l, l')$ between points of interest $l$ and $l'$:

where $R$ is the radius of the Earth, $R = 6371$ km.
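The distance formula itself is not reproduced in this text. A common way to compute the great-circle distance between two longitude/latitude pairs with an Earth radius of 6371 km is the haversine formula; the sketch below uses it as an assumption, and the function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Earth radius stated in Step 4-1


def haversine_km(lng1, lat1, lng2, lat2):
    """Great-circle distance in km between two points given in degrees.

    Assumption: the haversine formula stands in for the distance
    formula that is missing from the text.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
```

The argument order (longitude first, then latitude) follows the $\langle Lng, Lat \rangle$ pairs used in Step 4-1.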

Step 4-2: The probability that users check in at different places follows a power-law distribution. Construct a power-law function of geographic distance:

$$pr(dist(l,l')) = a \times dist(l,l')^{b} \qquad \text{(Equation 11)}$$

where $a$ and $b$ are the two parameters of the power-law function, whose values can be obtained by maximum likelihood estimation.
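The text does not spell out the estimator beyond naming maximum likelihood. One simple way to obtain $a$ and $b$ is to take logarithms, which turns the power law into the line $\log pr = \log a + b \log dist$, and to fit that line by ordinary least squares; this coincides with maximum likelihood only under the assumption of Gaussian noise on the log-probabilities. The sketch below follows that assumption, and `fit_power_law` is a hypothetical name. Its inputs are positive distances and the empirically observed check-in probabilities at those distances.

```python
import math


def fit_power_law(distances, probabilities):
    """Fit pr = a * dist**b by least squares in log-log space.

    Assumption: all distances and probabilities are strictly positive,
    and noise is Gaussian on log(pr), so least squares equals MLE.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)   # intercept mapped back from log space
    return a, b
```

A negative fitted $b$ corresponds to the expected behavior that nearby places are more likely to be visited than distant ones.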

Step 4-3: Calculate the conditional probability $pr(l \mid l')$ that the user accesses the candidate point of interest $l$ when the user is currently at point of interest $l'$:

where $L$ is the set of all addresses.

Step 4-4: Using the naive Bayes method, calculate the probability that the target user $u_a$ accesses the candidate point of interest $l$:

where $L\_u_a$ is the set of points of interest that the target user $u_a$ has accessed.
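The formulas of Steps 4-3 and 4-4 are not reproduced in this text. The following Python sketch shows one plausible reading, stated as an assumption: the conditional probability is the power-law weight of the candidate normalized over all other addresses, and the naive Bayes step multiplies the conditional probabilities over the addresses the user has visited. The function names are hypothetical, `dist` is any distance function supplied by the caller, and all pairwise distances between distinct addresses are assumed to be positive.

```python
import math


def power_law(distance, a, b):
    """Power-law weight a * distance**b of Step 4-2."""
    return a * distance ** b


def conditional_prob(candidate, current, pois, dist, a, b):
    """Assumed form of pr(l | l'): normalized power-law weight.

    The normalization runs over every address except the current one,
    because the power law is undefined at distance 0 when b < 0.
    """
    norm = sum(power_law(dist(other, current), a, b)
               for other in pois if other != current)
    return power_law(dist(candidate, current), a, b) / norm


def visit_prob(candidate, visited, pois, dist, a, b):
    """Assumed naive Bayes combination: product over visited addresses.

    Logs are summed instead of multiplying directly to limit underflow
    when the user has visited many addresses.
    """
    log_p = sum(math.log(conditional_prob(candidate, v, pois, dist, a, b))
                for v in visited)
    return math.exp(log_p)
```

The candidate is expected to be an unvisited address, so it never coincides with an element of `visited`.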

Step 5: Taking into account the influence of the user's social relationships, the dynamic popularity of locations and geographic distance on the user's access behavior, fuse the friendship-based predicted score, the dynamic popularity of the point of interest and the distance-based access probability to generate, for the target user, a final predicted score for each unvisited address. Sort all unvisited addresses by final predicted score and provide the target user with a recommendation list composed of the top-ranked addresses. The implementation steps are as follows:

Step 5-1: Determine a target user u_a in the location-based social network as the recommendation service object, and convert the current recommendation time time_r to the time slot t_r.

Step 5-2: Taking into account the influence of user social relationships, location dynamic popularity, and geographic distance on user visiting behavior, calculate the predicted score for target user u_a visiting point of interest l at time t_r:

where u_a is the target user currently served by the recommendation system; t_r is the time slot corresponding to the current recommendation time; l is a point of interest in the location-based social network that the target user has not visited; the friendship-based term is the predicted score, calculated from social relationships, for target user u_a visiting point of interest l at time t_r; popu(l, t_r) is the real-time popularity of point of interest l at time t_r; pr(l | L_{u_a}) is the predicted visit probability obtained by mining the influence of geographic distance with the power-law distribution model; and L_{u_a} is the set of points of interest that target user u_a has visited.

Step 5-3: Sort all addresses that target user u_a has not visited by predicted score, form a recommendation list TopNList_a from the N top-ranked addresses, and return it to the target user.
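The fusion equation in step 5-2 is not recoverable from this text. As a rough illustration of steps 5-2 and 5-3 only, the sketch below assumes the three components are combined by multiplication, which may not match the original formula. The function name `top_n` and the dictionary inputs are hypothetical.

```python
def top_n(candidates, friend_score, popularity, geo_prob, n=10):
    """Rank unvisited POIs by a fused score and return the n best.

    Assumption: product fusion of the friendship-based score, the dynamic
    popularity at t_r, and the distance-based visit probability.
    Each argument after `candidates` maps a POI id to its component value.
    """
    fused = {l: friend_score[l] * popularity[l] * geo_prob[l] for l in candidates}
    return sorted(fused, key=fused.get, reverse=True)[:n]
```

With a product, a candidate whose component is zero (for example, a point of interest never visited in slot t_r) drops to the bottom regardless of the other two terms; a weighted sum would behave differently.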

Step 6: Precision, Recall, and the combined measure F1 are used as the accuracy evaluation metrics of the recommendation system. The applicability and effectiveness of the proposed technology are evaluated by comparing the prediction accuracy of the proposed recommendation method with that of other related classical recommendation methods. The implementation steps are as follows:

Step 6-1: Randomly select NU × 20% of the users as the target user set TestU, where NU is the total number of users. Run each recommendation method for each target user in TestU and generate a recommendation list.

Step 6-2: Calculate the precision and recall of the recommendation method in time slot t:

where TestU is the set of all target users, R(u, t) is the recommendation list that the recommendation method provides to target user u at time t, and Like(u, t) is the set of points of interest that user u actually visits at time t.

Step 6-3: Calculate the overall precision and recall of the recommendation method as the averages of the corresponding per-time-slot values:

where T is the set of time slots, and precision(t) and recall(t) are the precision and recall of the recommendation method in time slot t.

Step 6-4: Calculate the combined measure F1 of the recommendation system:

where precision and recall are the overall precision and recall, respectively, of one run of the recommendation method.
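The equations for steps 6-2 to 6-4 are missing from this text, so the following is a minimal sketch using the standard definitions. It assumes that hits are pooled over all target users before dividing (one common convention, which may differ from the original equations) and that F1 is the harmonic mean of overall precision and recall. Function names are illustrative.

```python
def precision_recall_at_t(recommended, liked):
    """Precision and recall for one time slot t.

    recommended: dict user -> set R(u, t); liked: dict user -> set Like(u, t).
    Assumption: counts are summed over all target users before dividing.
    """
    hits = sum(len(recommended[u] & liked.get(u, set())) for u in recommended)
    n_rec = sum(len(r) for r in recommended.values())
    n_like = sum(len(liked.get(u, set())) for u in recommended)
    precision = hits / n_rec if n_rec else 0.0
    recall = hits / n_like if n_like else 0.0
    return precision, recall


def overall_metrics(per_slot):
    """per_slot: list of (precision(t), recall(t)); returns (precision, recall, F1)."""
    p = sum(x for x, _ in per_slot) / len(per_slot)
    r = sum(y for _, y in per_slot) / len(per_slot)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

Averaging the three returned values over repeated random draws of TestU would then give the final figures described in step 6-5.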

Step 6-5: Repeat steps 6-1 to 6-4 Ntimes times. The final prediction accuracy of the recommendation method (the values of Precision, Recall, and F1) is the average of the corresponding results over the Ntimes runs.

Step 6-6: Compare and analyze the results for each metric. If the Precision of the point-of-interest recommendation method based on social-relation fusion of location dynamic popularity and geographic features is higher than that of the other recommendation methods, the proposed technology helps users find addresses of interest more accurately. If its Recall is higher than that of the other recommendation methods, the proposed technology covers the addresses of interest to the user more completely. If its F1 value is higher than that of the other recommendation methods, the proposed technology has stronger overall prediction accuracy.

With reference to FIGS. 3 to 9, a specific location-based social network is taken as an example to describe in detail how the point-of-interest recommendation method based on social-relation fusion of location dynamic popularity and geographic features works.

The SNAP laboratory at Stanford University in the United States collected user check-in records over a period of time from Gowalla, a popular location-based social networking site, forming the well-known public Gowalla data set. The Gowalla data set contains 6442892 check-ins made by 196591 users at 1256379 addresses between February 2009 and October 2010. The 196591 users form 950327 social relations. Travis County, which has the richest check-in data in the Gowalla data set, is selected as the example for this embodiment.

Step 1-1: Collect and organize the check-in data and social data for Travis County in the example Gowalla data set, obtaining 200817 check-in records of 3280 users at 3335 addresses, which form the check-in data set C = {c_1, c_2, …, c_200817}. Each check-in record consists of a user ID, a check-in time, a geographic latitude, a geographic longitude, and a point-of-interest ID, as shown in FIG. 3. The users form 36050 social relations; a schematic diagram of the users' social relation records in the Gowalla network is shown in FIG. 4.

All users in the check-in data set C form a user set U, all points of interest form an address set L, the number of users NU is 3280, and the number of points of interest NL is 3335.

The check-in time in each check-in record is truncated to the hour, converting the discrete check-in times into 24 time slots. For example, the check-in time 00:01:25 corresponds to time slot t = 0, and the check-in time 23:11:56 corresponds to time slot t = 23. The set of time slots is T = {0, 1, 2, …, 23}.

Step 1-2: In the check-in data set C, count the number of times user u visits address l in time slot t. If the number of check-ins is 0, the score r_{u,t,l} of user u for address l in time slot t is 0; otherwise r_{u,t,l} = 1. All scores together form the user-time-point-of-interest three-dimensional scoring matrix R = {r_{u,t,l}}, u ∈ [1,3280], t ∈ [0,23], l ∈ [1,3335].
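The time-slot conversion and the binary scoring rule above can be sketched as follows. This is an illustrative sketch rather than the patented implementation: it assumes check-in times arrive as "HH:MM:SS" strings and that users and addresses have already been mapped to zero-based integer indices; the function names are hypothetical.

```python
import numpy as np


def time_slot(check_in_time):
    """Map an 'HH:MM:SS' string to its hour slot, e.g. '23:11:56' -> 23."""
    return int(check_in_time.split(":")[0])


def build_scoring_matrix(check_ins, num_users, num_pois):
    """Build the binary user-time-POI matrix R of shape (num_users, 24, num_pois).

    check_ins: iterable of (user_idx, 'HH:MM:SS', poi_idx) with zero-based indices
    (the index mapping is an assumption of this sketch).
    """
    R = np.zeros((num_users, 24, num_pois), dtype=np.uint8)
    for u, t_str, l in check_ins:
        # Any number of check-ins >= 1 yields a score of 1.
        R[u, time_slot(t_str), l] = 1
    return R
```

For the example data set the dense array has 3280 × 24 × 3335 entries, so a sparse representation may be preferable in practice.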

Step 1-3: Organize the original social relation data set F to obtain 36050 social relation records, denoted F = {f_1, f_2, …, f_36050}. Formalize each social relation record as a pair <user u_x, user u_y>, x ∈ [1,3280], y ∈ [1,3280].

Step 1-4: Construct the two-dimensional user social relation matrix S = {s_{xy}}, with 3280 rows and 3280 columns, x ∈ [1,3280], y ∈ [1,3280]. Its element s_{xy} indicates whether a social relationship exists between users u_x and u_y:

$$s_{xy} = \begin{cases} 1, & \text{if } u_x \text{ and } u_y \text{ have a social relationship} \\ 0, & \text{otherwise} \end{cases}$$

Step 2-1: Determine a target user u_a in the location-based social network as the recommendation service object. Find the row of target user u_a in the social relation matrix S and collect the column numbers (user IDs) whose element value in that row is 1 to obtain the friend set F_a of target user u_a; the number of elements in F_a is denoted FNum_a. At the same time, record the column numbers (user IDs) whose element value in that row is 0 to form the set UnF_a of users who have no social relationship with target user u_a.

In the user-time-point-of-interest three-dimensional scoring matrix R, if user u_i ∈ UnF_a, delete the scoring information of user u_i from R. This yields the scoring matrix R_1 = {r_{i',t,j}} with the users who have no social relationship with target user u_a (the weakly related rows) removed, i' ∈ [1, FNum_a], t ∈ [0,23], j ∈ [1,3335], where i' is the user number, t is the time slot, j is the address number, FNum_a is the number of friends of target user u_a, and r_{i',t,j} is the score of user u_{i'} for address l_j in time slot t.

Screening by social relationship effectively reduces the row scale of the original scoring matrix R: |i'| = FNum_a << 3280.

If address l_j ∈ unvisit_F_a, delete the scoring information of l_j from R_1. This yields the scoring matrix R_2 = {r_{i',t,j'}} with the irrelevant addresses (the weakly related columns) removed, i' ∈ [1, FNum_a], t ∈ [0,23], j' ∈ [1, 3335 − |unvisit_F_a|], where i' is the user number, t is the time slot, j' is the address number, FNum_a is the number of friends of target user u_a, |unvisit_F_a| is the number of addresses that none of the target user's friends has visited, and r_{i',t,j'} is the score of user u_{i'} for address l_{j'} in time slot t.

Deleting irrelevant addresses further reduces the column scale relative to the scoring matrix R_1: |j'| = 3335 − |unvisit_F_a| << 3335.
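The row and column pruning described above can be sketched with array slicing. This is a minimal illustration that assumes R and S are dense NumPy arrays with zero-based user and address indices; the function name `prune_for_target` and its return layout are hypothetical.

```python
import numpy as np


def prune_for_target(R, S, a):
    """Return (R2, friend_ids, kept_poi_ids) for the target user with index a.

    R: (NU, 24, NL) binary scoring matrix; S: (NU, NU) binary social relation matrix.
    """
    friend_ids = np.flatnonzero(S[a] == 1)       # F_a: columns equal to 1 in row a
    R1 = R[friend_ids]                           # drop weakly related rows (users in UnF_a)
    visited = R1.sum(axis=(0, 1)) > 0            # True where some friend visited at some time
    kept_poi_ids = np.flatnonzero(visited)       # complement of unvisit_F_a
    R2 = R1[:, :, kept_poi_ids]                  # drop weakly related columns
    return R2, friend_ids, kept_poi_ids
```

Returning `friend_ids` and `kept_poi_ids` alongside R2 keeps the mapping from the compressed indices i' and j' back to the original user and address numbers.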

Step 2-3: Based on the preprocessed scoring matrix R_2, calculate the scoring similarity between target user u_a and each of its friends. If user v ∈ F_a, the scoring similarity between target user u_a and user v is:

where u_a is the target user currently served by the recommendation system, v is a friend of target user u_a, t is a time slot, unvisit_F_a is the set of addresses that none of the friends of target user u_a has visited at any time, |unvisit_F_a| is the number of addresses in unvisit_F_a, and r_{u_a,t,l} and r_{v,t,l} are the scores of user u_a and user v, respectively, for point of interest l at time t.

Step 3-1: In the check-in data set C, count the number of times Cnum_{l,t} that point of interest l is visited in time slot t:

$$\mathrm{Cnum}_{l,t} = \sum_{u \in [1,3280]} \mathrm{Cnum}_{u,t,l} \tag{23}$$

$$\mathrm{Cnum}_{l} = \sum_{t \in [0,23]} \sum_{u \in [1,3280]} \mathrm{Cnum}_{u,t,l} \tag{24}$$

FIG. 5 shows the per-time-slot visit probability of the three most-visited points of interest in this embodiment. It illustrates that the visit probability of a point of interest varies widely across time slots. It is therefore necessary to examine the dynamic popularity of points of interest in each time slot; mining this time-aware dynamic popularity can effectively improve the accuracy of point-of-interest recommendation.

$$\mathrm{Cnum}_{t} = \sum_{l \in [1,3335]} \sum_{u \in [1,3280]} \mathrm{Cnum}_{u,t,l} \tag{25}$$

In Equations 23, 24, and 25, Cnum_{u,t,l} is the number of times user u visits point of interest l in time slot t in the check-in data set C.

where Cnum_{l,t} is the number of times point of interest l is visited in time slot t, and Cnum_l is the total number of times point of interest l is visited across all time slots.

where Cnum_{l,t} is the number of times point of interest l is visited in time slot t, and Cnum_t is the number of times all points of interest are visited in time slot t.

$$\mathrm{popu}(l, t) = \mathrm{popu1}(l, t) \times \mathrm{popu2}(l, t) \tag{28}$$

The time-aware dynamic popularity values of every point of interest in every time slot are collected into a point-of-interest-time two-dimensional popularity matrix P with 3335 rows and 24 columns.
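The counts in Equations 23 to 25 and the product in Equation 28 can be sketched as array reductions. The equations defining popu1 and popu2 are missing from this text; the sketch assumes popu1(l, t) = Cnum_{l,t} / Cnum_l and popu2(l, t) = Cnum_{l,t} / Cnum_t, which matches the accompanying variable descriptions but is still an assumption. The function name is illustrative.

```python
import numpy as np


def dynamic_popularity(cnum):
    """Compute the POI-time popularity matrix P of shape (NL, 24).

    cnum: (NU, 24, NL) array of check-in counts Cnum_{u,t,l}.
    """
    cnum_lt = cnum.sum(axis=0).T.astype(float)      # Equation 23, shape (NL, 24)
    cnum_l = cnum_lt.sum(axis=1, keepdims=True)     # Equation 24, shape (NL, 1)
    cnum_t = cnum_lt.sum(axis=0, keepdims=True)     # Equation 25, shape (1, 24)
    # Assumed ratio forms; a zero denominator yields zero popularity.
    popu1 = np.divide(cnum_lt, cnum_l, out=np.zeros_like(cnum_lt), where=cnum_l > 0)
    popu2 = np.divide(cnum_lt, cnum_t, out=np.zeros_like(cnum_lt), where=cnum_t > 0)
    return popu1 * popu2                            # Equation 28
```

Under these assumed ratios, a point of interest scores highly in a slot only if that slot holds a large share of its own visits and it also holds a large share of all visits made in that slot.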

where R is the Earth's radius, R = 6371 km.

Step 4-2: FIG. 6 shows the probability distribution of the geographic distances between consecutive points of interest visited by all users on the same day in this embodiment. It can be seen that the probability of a user checking in at different places follows a power-law distribution. Construct a power-law function based on geographic distance:

$$\mathrm{pr}(\mathrm{dist}(l, l')) = a \times \mathrm{dist}(l, l')^{b} \tag{30}$$

where a and b are the two parameters of the power-law function, whose values can be obtained by maximum likelihood estimation:

First, take the logarithm of both sides of Equation 30:

$$\log_2(\mathrm{pr}(\mathrm{dist}(l, l'))) = \log_2(a) + b \times \log_2(\mathrm{dist}(l, l')) \tag{31}$$

Let y = log_2(pr(dist(l, l'))), x = log_2(dist(l, l')), ω_0 = log_2(a), ω_1 = b, and ω = (ω_0, ω_1), which gives the linear regression:

$$y(\omega, x) = \omega_0 + \omega_1 x \tag{32}$$

To learn the parameter ω, the least squares method is used to solve the linear curve-fitting problem, and the loss function of the linear regression model is defined as:

where x_n is the true value corresponding to x'_n, λ is the regularization term, and M is the number of input data points. Taking minimization of the loss function as the optimization objective, the corresponding value of ω is found, which in turn gives the power-law parameters a and b. In this embodiment, for example, the power-law parameters computed from the distances between addresses visited by one group of test users are a = 0.145587 and b = −0.985544.
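The log-log fit in Equations 31 and 32 can be sketched with an ordinary least-squares line fit. This sketch omits the regularization term λ, since the loss function is not recoverable from this text, so it is a simplification rather than the exact procedure; the function name is illustrative, and both inputs are assumed to be strictly positive.

```python
import numpy as np


def fit_power_law(distances, probabilities):
    """Fit pr = a * dist**b by least squares on the log2-log2 line.

    Assumption: unregularized fit (lambda = 0); inputs must be strictly positive.
    """
    x = np.log2(np.asarray(distances, dtype=float))
    y = np.log2(np.asarray(probabilities, dtype=float))
    omega1, omega0 = np.polyfit(x, y, 1)  # slope, then intercept
    return 2.0 ** omega0, omega1          # a = 2**omega0, b = omega1
```

Feeding in points generated exactly from a = 0.145587 and b = −0.985544 recovers those two values up to floating-point error, which is a convenient check that the log base and the parameter mapping are consistent.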

where L is the set of all addresses.

where u_a is the target user currently served by the recommendation system; t_r is the time slot corresponding to the current recommendation time; l is a point of interest in the location-based social network that the target user has not visited; the friendship-based term is the predicted score, calculated from social relationships, for target user u_a visiting point of interest l at time t_r; popu(l, t_r) is the real-time popularity of point of interest l at time t_r; pr(l | L_{u_a}) is the predicted visit probability obtained by mining the influence of geographic distance with the power-law distribution model; and L_{u_a} is the set of points of interest that target user u_a has visited.

Step 5-3: Sort all addresses that target user u_a has not visited by predicted score, form a recommendation list TopNList_a from the N top-ranked addresses, and return it to the target user. N may be a multiple of 10; typical values are N = 10, 20, 30, 40, and 50.

Step 6-1: Randomly select 656 users as the target user set TestU. For these 656 target users, run the point-of-interest recommendation method based on social-relation fusion of location dynamic popularity and geographic features, the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD, and the kernel-density-estimation-based visit probability prediction method KDE, each generating a recommendation list.

Step 6-2: Calculate the precision and recall of each recommendation method in time slot t:


where precision(t) and recall(t) are the precision and recall of the recommendation method in time slot t, respectively.

Step 6-5: Repeat steps 6-1 to 6-4 100 times. The final prediction accuracy of each recommendation method (the values of Precision, Recall, and F1) is the average of the corresponding results over the 100 runs.

When N is 10, 20, 30, 40, and 50, the Precision, Recall, and F1 results of each recommendation method are shown in Tables 1, 2, and 3, respectively, where the bold value in each row is the maximum of that row:

Table 1. Precision values of the different recommendation methods

Table 2. Recall values of the different recommendation methods

Table 3. F1 values of the different recommendation methods

In this example, histograms comparing the Precision, Recall, and F1 of the proposed recommendation method with those of the classical user-based collaborative filtering method UBCF, the social-relationship-based collaborative filtering method SCF, the power-law-distribution-based visit probability prediction method PLD, and the kernel-density-estimation-based visit probability prediction method KDE are shown in FIG. 7, FIG. 8, and FIG. 9, respectively.

Step 6-6: Compare and analyze the results for each metric. The Precision of the point-of-interest recommendation method based on social-relation fusion of location dynamic popularity and geographic features is higher than that of all other methods, which shows that the proposed technology helps users find addresses of interest more accurately. Its Recall is higher than that of the other recommendation methods, which shows that the proposed technology covers the addresses of interest to the user more completely. Its F1 value is higher than that of the other recommendation methods, which shows that the proposed technology has stronger overall prediction accuracy.

Unlike conventional point-of-interest recommendation methods, the present method aims to build a real-time, accurate, and dynamic point-of-interest recommendation system that takes into account the influence of user social relationships, location dynamic popularity, and geographic distance on user check-in behavior. It uses social relationships to compress the data, generating a personalized scoring matrix for each user to improve the operating efficiency of the recommendation system. It also provides a method for calculating the dynamic popularity of points of interest, which improves the effectiveness of score prediction by extracting the differing popularity of points of interest in different time periods and thereby strengthens the service quality of the recommendation system. The proposed technology has broad application prospects and is expected to be widely used in location-based social network markets at home and abroad.

The above technical process is only a preferred embodiment of the present invention and does not represent all of its details. Any modification, equivalent replacement, or improvement made by those skilled in the art within the scope of this disclosure and within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A point-of-interest recommendation method based on social-relation fusion of location dynamic popularity and geographic features, characterized by comprising the following steps:

step 1: collecting and sorting the original check-in data set and the social relation data set, and respectively converting the original check-in data set and the social relation data set into a user-time-interest point three-dimensional scoring matrix and a two-dimensional relation matrix, wherein the method comprises the following steps of:

step 1-1: organizing the original check-in data set C to obtain n check-in records, denoted C = {c_1, c_2, …, c_n}, wherein all users in the check-in data set C form a user set U, all points of interest form an address set L, and the number of users and the number of points of interest are denoted NU and NL, respectively; truncating the check-in time in each check-in record to the hour so that the discrete check-in times are converted into 24 time slots, the set of time slots being T = {0, 1, 2, …, 23};

step 1-2: counting, in the check-in data set C, the number of times user u visits address l in time slot t, wherein if the number of check-ins is 0, the score r_{u,t,l} of user u for address l in time slot t is 0, and otherwise r_{u,t,l} = 1; and collecting all scores to form the user-time-point-of-interest three-dimensional scoring matrix R = {r_{u,t,l}}, u ∈ U, t ∈ [0,23], l ∈ L, where U and L are the user set and the point-of-interest set, respectively;

step 1-3: organizing the original social relation data set F to obtain m social relation records, denoted F = {f_1, f_2, …, f_m}, and formalizing each social relation record as a pair <user u_x, user u_y>, x ∈ [1, NU], y ∈ [1, NU], where NU is the number of all users in the data set;

step 1-4: constructing a two-dimensional user social relation matrix S = {s_{xy}}:

wherein the number of rows and the number of columns of the matrix both equal the number of users NU, x ∈ [1, NU], y ∈ [1, NU], and the element s_{xy} indicates whether a social relationship exists between users u_x and u_y;

step 2: selecting an active user in the location-based social network as the recommendation service object, deleting the weakly related rows and weakly related columns of the scoring matrix for that target user, improving the traditional user-based collaborative filtering algorithm, and generating a friendship-based predicted score for the target user, comprising the following steps:

step 2-1: determining a target user u_a in the location-based social network as the recommendation service object; finding the row of target user u_a in the social relation matrix S and collecting the column numbers whose element value in that row is 1 to obtain the friend set F_a of target user u_a, the number of elements in F_a being denoted FNum_a; and at the same time recording the column numbers whose element value in that row is 0 to form the set UnF_a of users who have no social relationship with target user u_a;

in the user-time-point-of-interest three-dimensional scoring matrix R, if user u_i ∈ UnF_a, deleting the scoring information of user u_i from R to obtain the scoring matrix R_1 = {r_{i',t,j}} with the users who have no social relationship with target user u_a, namely the weakly related rows, removed, i' ∈ [1, FNum_a], t ∈ [0,23], j ∈ [1, NL], where i' is the user number, t is the time slot, j is the address number, FNum_a is the number of friends of target user u_a, NL is the total number of points of interest, and r_{i',t,j} is the score of user u_{i'} for address l_j in time slot t;

through the screening by social relationship, the row scale of the original scoring matrix R is effectively reduced, namely |i'| = FNum_a << NU, where NU is the total number of users;

step 2-2: in the scoring matrix R_1, calculating, address by address, the sum of the scores of each address over all time slots, wherein a sum equal to 0 indicates that none of the friends of target user u_a has visited that address at any time, and adding such an address to the set unvisit_F_a;

if address l_j ∈ unvisit_F_a, deleting the scoring information of l_j from R_1 to obtain the scoring matrix R_2 = {r_{i',t,j'}} with the irrelevant addresses, namely the weakly related columns, removed, i' ∈ [1, FNum_a], t ∈ [0,23], j' ∈ [1, NL − |unvisit_F_a|], where i' is the user number, t is the time slot, j' is the address number, FNum_a is the number of friends of target user u_a, NL is the total number of points of interest, |unvisit_F_a| is the number of addresses that none of the target user's friends has visited, and r_{i',t,j'} is the score of user u_{i'} for address l_{j'} in time slot t;

by deleting irrelevant addresses, the column scale is further reduced relative to the scoring matrix R_1, namely |j'| = NL − |unvisit_F_a| << NL, where NL is the total number of points of interest;

step 2-3: based on the preprocessed scoring matrix R_2, calculating the scoring similarity between target user u_a and each of its friends, wherein if user v ∈ F_a, the scoring similarity between target user u_a and user v is:

wherein u_a is the target user currently served by the recommendation system, v is a friend of target user u_a, t is a time slot, T is the set of time slots, L is the set of all points of interest, unvisit_F_a is the set of addresses that none of the friends of target user u_a has visited at any time, and r_{u_a,t,l} and r_{v,t,l} are the scores of user u_a and user v, respectively, for point of interest l at time t;

wherein u_a is the target user currently served by the recommendation system, t_r is the time slot corresponding to the current recommendation time, l is a point of interest in the location-based social network that the target user has not visited, unvisit_F_a is the set of addresses that none of the friends of target user u_a has visited at any time, v is a friend of target user u_a, F_a is the friend set of target user u_a, sim(u_a, v) is the scoring similarity between user u_a and user v obtained in step 2-3, and r_{v,t_r,l} is the score of user v for point of interest l at time t_r;

step 3: counting the number of times each point of interest is visited in each time slot, comparing it with the total number of times that point of interest is visited across all time slots and with the number of times all points of interest are visited in that time slot, and calculating the time-aware dynamic popularity of each point of interest, comprising the following steps:

$$\mathrm{Cnum}_{l,t} = \sum_{u \in U} \mathrm{Cnum}_{u,t,l} \tag{4}$$

$$\mathrm{Cnum}_{l} = \sum_{t \in T} \sum_{u \in U} \mathrm{Cnum}_{u,t,l} \tag{5}$$

$$\mathrm{Cnum}_{t} = \sum_{l \in L} \sum_{u \in U} \mathrm{Cnum}_{u,t,l} \tag{6}$$

In Equations 4, 5, and 6, U is the set of all users, L is the set of all addresses, T is the set of time slots, and Cnum_{u,t,l} is the number of times user u visits point of interest l in time slot t in the check-in data set C;

wherein Cnum_{l,t} is the number of times point of interest l is visited in time slot t, and Cnum_l is the total number of times point of interest l is visited across all time slots;

wherein Cnum_{l,t} is the number of times point of interest l is visited in time slot t, and Cnum_t is the number of times all points of interest are visited in time slot t;

step 3-6: combining the result of the longitudinal comparison in step 3-4 with the result of the lateral comparison in step 3-5 to obtain the time-aware dynamic popularity of point of interest l in time slot t:

$$\mathrm{popu}(l, t) = \mathrm{popu1}(l, t) \times \mathrm{popu2}(l, t) \tag{9}$$

wherein popu1(l, t) is the longitudinal popularity of point of interest l in time slot t, and popu2(l, t) is the lateral popularity of point of interest l in time slot t;

collecting the time-aware dynamic popularity values of every point of interest in every time slot to form a point-of-interest-time two-dimensional popularity matrix P with NL rows and 24 columns;

step 5: taking into account the influence of user social relationships, location dynamic popularity, and geographic distance on user visiting behavior, fusing the friendship-based predicted score, the dynamic popularity of the point of interest, and the distance-based visit probability to generate, for the target user, a final predicted score for each unvisited address, sorting all unvisited addresses by final predicted score, and providing the target user with a recommendation list consisting of the top-ranked addresses;

step 6: using Precision, Recall, and the combined measure F1 as the accuracy evaluation metrics of the recommendation system, and evaluating the applicability and effectiveness of the proposed technology by comparing the prediction accuracy of the recommendation method with that of other related classical recommendation methods.

2. The method for recommending points of interest based on a fusion of dynamic popularity and geographic features of social relationships according to claim 1, wherein said step 4 comprises:

step 4-1: letting L_{u_a} be the set of points of interest that target user u_a has visited, with l' ∈ L_{u_a}; obtaining the longitude Lng_{l'} and latitude Lat_{l'} of l' from the check-in data set C, with l' = <Lng_{l'}, Lat_{l'}>; letting l be a point of interest that the target user has never visited, l = <Lng_l, Lat_l>; and calculating the geographic distance dist(l, l') between points of interest l and l':

where R is the Earth's radius, R = 6371 km;

$$\mathrm{pr}(\mathrm{dist}(l, l')) = a \times \mathrm{dist}(l, l')^{b} \tag{11}$$

wherein L_{u_a} is the set of points of interest that target user u_a has visited, l' ∈ L_{u_a}; the longitude Lng_{l'} and latitude Lat_{l'} of l' are obtained from the check-in data set C, with l' = <Lng_{l'}, Lat_{l'}>; l is a point of interest that the target user has never visited, l = <Lng_l, Lat_l>; dist(l, l') is the geographic distance between points of interest l and l'; and a and b are the two parameters of the power-law function, whose values can be obtained by maximum likelihood estimation;

wherein L_{u_a} is the set of points of interest that target user u_a has visited, l' ∈ L_{u_a}; the longitude Lng_{l'} and latitude Lat_{l'} of l' are obtained from the check-in data set C, with l' = <Lng_{l'}, Lat_{l'}>; l is a point of interest that the target user has never visited, l = <Lng_l, Lat_l>; dist(l, l') is the geographic distance between points of interest l and l'; and L is the set of all addresses;

wherein L_{u_a} is the set of points of interest that target user u_a has visited, l' ∈ L_{u_a}; the longitude Lng_{l'} and latitude Lat_{l'} of l' are obtained from the check-in data set C, with l' = <Lng_{l'}, Lat_{l'}>; l is a point of interest that the target user has never visited, l = <Lng_l, Lat_l>; and dist(l, l') is the geographic distance between points of interest l and l'.

3. The method for recommending points of interest based on a fusion of dynamic popularity and geographic features of social relationships according to claim 1, wherein said step 5 comprises:

step 5-1: determining a target user u_a in the location-based social network as the recommendation service object, and converting the current recommendation time time_r to the time slot t_r;

wherein u_a is the target user currently served by the recommendation system; t_r is the time slot corresponding to the current recommendation time; l is a point of interest in the location-based social network that the target user has not visited; the friendship-based term is the predicted score, calculated from social relationships, for target user u_a visiting point of interest l at time t_r; popu(l, t_r) is the real-time popularity of point of interest l at time t_r; pr(l | L_{u_a}) is the predicted visit probability obtained by mining the influence of geographic distance with the power-law distribution model; and L_{u_a} is the set of points of interest that target user u_a has visited;

4. The method for recommending points of interest based on a fusion of dynamic popularity and geographic features of social relationships according to claim 1, wherein said step 6 comprises:

step 6-1: randomly selecting NU × 20% of the users as the target user set TestU, where NU is the total number of users, and running each recommendation method for each target user in TestU to generate a recommendation list;

wherein TestU is the set of all target users, R(u, t) is the recommendation list that the recommendation method provides to target user u at time t, and Like(u, t) is the set of points of interest that user u actually visits at time t;

wherein T is the set of time slots, and precision(t) and recall(t) are the precision and recall, respectively, of the recommendation method in time slot t;

wherein precision and recall are the overall precision and recall, respectively, of one run of the recommendation method;

step 6-5: repeating steps 6-1 to 6-4 Ntimes times, wherein the final prediction accuracy of the recommendation method, comprising the values of Precision, Recall, and F1, is the average of the corresponding results over the Ntimes runs;

step 6-6: comparing and analyzing the results for each metric: if the Precision of the method is higher than that of the other recommendation methods, the proposed technology helps users find addresses of interest more accurately; if the Recall of the method is higher than that of the other recommendation methods, the proposed technology covers the addresses of interest to the user more completely; and if the F1 value of the method is higher than that of the other recommendation methods, the proposed technology has stronger overall prediction accuracy.