CN114528480A - Time-aware adaptive interest point recommendation method based on K-means clustering

Publication number
CN114528480A
Authority
CN
China
Prior art keywords
time
user
interest
check
time slot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210071029.3A
Other languages
Chinese (zh)
Inventor
朱俊
梁太波
韩立新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Vocational University of Industry Technology NUIT
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202210071029.3A (CN114528480A)
Publication of CN114528480A
Priority to CN202211571570.7A (CN116166878A)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G06Q30/0271 Personalized advertisement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Probability & Statistics with Applications (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a time-aware adaptive interest point recommendation method based on K-means clustering, comprising the following steps: first, converting a check-in data set into a three-dimensional scoring matrix; second, counting the number of check-in users, the number of visited interest points and the number of check-ins in each time slot, and constructing a three-dimensional check-in feature vector for each time slot; third, performing K-means clustering on the time slots and calculating the time similarity between time slots in the same cluster; fourth, calculating the user similarity at the current time using the scoring information in the other time slots of the same time cluster; fifth, improving the traditional user-based collaborative filtering method with the time clustering result and the intra-cluster time similarity, so that interest point prediction scores are generated adaptively according to the current recommendation time; and sixth, comparing the recommendation precision of the proposed recommendation system with that of other classical recommendation systems to evaluate the accuracy and effectiveness of the proposed technique.

Description

Time-aware adaptive interest point recommendation method based on K-means clustering
Technical Field
The invention relates to a time-aware adaptive interest point recommendation method based on K-means clustering for location-based social networks, and belongs to the technical field of artificial intelligence and machine learning.
Background
In recent years, communication, positioning and mobile internet technologies have developed rapidly, and Location-based Social Networks (LBSNs) have become a new medium through which people share and transmit information, providing a platform that closely connects the online virtual network with the offline real world. Many mature location-based social network platforms now exist in China and abroad, such as Facebook, YouTube, Twitter, Weibo, Douban, Dianping, Meituan and WeChat Moments. In a location-based social network, a user can establish complex social relationships, such as those with friends, coworkers and relatives; use geographic tags to view places of interest (called "interest points" for short), such as restaurants, shops and movie theaters; and use a mobile device to check in when visiting points of interest (POIs), publish their geographic location, and share suggestions and comments about them. LBSNs bring convenience to users and help merchants understand the real users behind the network, so that personalized services can be tailored to the needs of different users, which makes them highly practical.
As more and more users communicate in LBSNs, these networks store and accumulate abundant usable information, such as check-in records, social relations, spatio-temporal data, and text, image and video content. Although this gives users rich data resources, it also causes the problem of information overload and makes it harder for users to find target items accurately. Recommendation systems that address information overload have therefore attracted more and more researchers. For example, Amazon uses a recommendation system to recommend products to users, which raises click-through rates and turnover for merchants, and the movie recommendation website Netflix attracted many research teams to work on recommendation accuracy by holding a recommendation system contest. As a special kind of information filtering system, a recommendation system does not require the user to supply explicit keywords. Instead, it models the user's interests by analyzing the user's historical behavior, mines the user's latent preferences, and then actively recommends products and services that meet the user's needs. Using large amounts of user, friend and location information, researchers have built LBSN applications such as friend recommendation, expert discovery, point-of-interest recommendation, activity recommendation and path recommendation. Among them, point-of-interest recommendation (POI recommendation), a natural product of the joint development of traditional recommendation systems and location-based social networks, has become a research hotspot.
Because point-of-interest recommendation is an important branch of recommendation systems, and both its development history and its key techniques descend from traditional recommendation systems, part of the point-of-interest recommendation research treats locations as ordinary items like movies or music and generates recommendations with traditional methods. By design strategy, conventional recommendation algorithms mainly include collaborative filtering algorithms, content-based recommendation algorithms, and hybrid recommendation algorithms. Collaborative filtering algorithms in turn include memory-based algorithms (e.g., user-based collaborative filtering and item-based collaborative filtering) and model-based algorithms (e.g., singular value decomposition, clustering models and probabilistic latent semantic analysis). The content-based point-of-interest recommendation technique extracts relevant information from visited places, such as tags, categories and user comments; it extracts user preferences from the user's profile and then matches them against the location profile to obtain recommendations. The user-based collaborative filtering (UBCF) technique converts the user's check-in behavior into a user-interest point scoring matrix, searches for users similar to the current active user using the existing check-in records, predicts the active user's scores for unvisited places from the preferences of the similar users, and recommends the interest points with the highest predicted scores to the current user. The item-based collaborative filtering (IBCF) technique rests on the assumption that a user always prefers locations highly similar to places he or she liked before. The IBCF technique therefore first calculates the similarity between points of interest and recommends to the active user the places most similar to the POIs that user has visited. Singular value decomposition (SVD) is a classical representative of matrix factorization, whose main task is to generate low-rank approximations. The low-dimensional orthogonal matrices produced by SVD reduce the noise in the original matrix and can reveal latent associations between users and items more effectively. Among the various recommendation techniques, collaborative filtering does not require much domain-specific knowledge, avoids complex information collection and content analysis, is easy to implement in engineering, and can be applied to products conveniently. Collaborative filtering has therefore become the most widely used recommendation technique in the traditional recommendation field.
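As an illustration of the user-based collaborative filtering idea described above, the following is a minimal sketch rather than the method of any particular system. It assumes binary user-interest point scores, cosine similarity between users, and a similarity-weighted average as the prediction rule; the function names and the toy data are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ubcf_scores(ratings, target):
    """Predict scores for the POIs that the target user has not visited.

    ratings: dict user -> list of 0/1 scores, one entry per POI
             (hypothetical layout chosen for this sketch).
    Returns a dict poi_index -> predicted score.
    """
    target_vec = ratings[target]
    # Similarity between the target user and every other user.
    sims = {v: cosine_similarity(target_vec, vec)
            for v, vec in ratings.items() if v != target}
    total = sum(sims.values())
    scores = {}
    for j, visited in enumerate(target_vec):
        if visited:
            continue  # only unvisited POIs are candidates
        weighted = sum(sims[v] * ratings[v][j] for v in sims)
        scores[j] = weighted / total if total > 0 else 0.0
    return scores


# Toy data: three users, four POIs (illustrative only).
ratings = {
    "alice": [1, 1, 0, 0],
    "bob":   [1, 1, 1, 0],
    "carol": [0, 0, 0, 1],
}
predicted = ubcf_scores(ratings, "alice")
top = max(predicted, key=predicted.get)
print(top, predicted)
```

In this toy example only "bob" overlaps with "alice", so the POI that "bob" visited and "alice" did not receives the highest predicted score.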
The conventional recommendation techniques above ignore the influence of temporal context on users' check-in behavior. In fact, time is very important context information in point-of-interest recommendation, and users' check-in habits are closely tied to it. From a macroscopic perspective, a user's liking for points of interest is influenced by the broader time of year; for example, the Meituan platform recommends dumpling restaurants to users at the winter solstice, and Ctrip recommends water parks in summer. More importantly, user preferences may drift over time; for example, a user who used to like KTV venues and movie theaters may recently prefer bookstores and coffee shops. Beyond these macroscopic features, fine-grained temporal influence reflects the user's check-in preference in a specific time period; for example, catering interest points are visited most often at about 12:00 and 18:00, and the popularity of bars begins to rise from 21:00. How to introduce time information into a recommendation algorithm and provide a suitable point-of-interest recommendation list for a user in a specific time period has become an urgent requirement of social application platforms.
At present, some recommendation systems integrate temporal context into the point-of-interest recommendation problem, but existing time-aware point-of-interest recommendation systems still have shortcomings, summarized as follows:
(1) Compared with recommendation techniques that consider social relationships, geographic features and other kinds of context, research on point-of-interest recommendation based on temporal features is still relatively scarce. Most point-of-interest recommendation techniques are not good at handling dynamically changing user needs, can hardly support correcting and adjusting user preferences over time, and cannot give, in real time, the recommendation result that best fits the current time.
(2) The time-dimension dynamics of user similarity are ignored. Existing research does not consider how user similarity varies over time when calculating it, and the same similarity matrix is shared across different time periods. In reality, user similarity changes over time. For example, at noon on a weekday a user often visits a restaurant near the workplace together with colleagues, so the similarity between the user and colleagues is higher than that between the user and family members; after work, the user often visits a supermarket near home together with family members, so the similarity between the user and family members is higher. Using a global user similarity at all times therefore does not match reality.
(3) The data sparsity problem of the user-time-interest point three-dimensional matrix. The number of places a user has visited is very small compared with the thousands of geographic locations in a location-based social network, so the scoring matrix itself is sparse. In a point-of-interest recommendation system that considers spatio-temporal context, the sparsity problem is more pronounced: to explore the user's behavior pattern in the target time period, the already sparse check-in data set must be further divided into several subsets along the time axis, which aggravates the sparsity of the scoring matrix. Methods that alleviate data sparsity must therefore be explored to improve the accuracy and reliability of recommendations for a given time period.
These shortcomings of existing time-aware point-of-interest recommendation techniques cause difficulties in the design, development, deployment and operation of location-based social network platforms. In particular, on a network platform with massive item information they reduce the service quality of the recommendation system and thereby affect the sales performance of an e-commerce system.
Disclosure of Invention
Aiming at the shortcomings of the prior art, and with the goal of an interest point recommendation system that generates an interest point list in real time for a given time point with accurate results, the invention provides a time-aware adaptive interest point recommendation method based on K-means clustering. Considering both the differences and the correlations among users' check-in data features in different time slots, the invention proposes analyzing the distance from a time point to a cluster center, uses K-means clustering to mine the correlations among time slots, mitigates the sparsity of high-dimensional check-in data through time clustering, improves the effectiveness of score prediction, and strengthens the service quality of the recommendation system.
The technical scheme adopted by the invention to solve the technical problem is as follows: divide one day into 24 time slots; according to the time labels, count the number of check-in users, the number of visited interest points and the number of check-ins in each time slot, and perform K-means clustering on the time slots based on these three-dimensional data features; calculate the user similarity in different time slots according to the time clustering result and the users' historical check-in information; improve the scoring method of the traditional UBCF algorithm using the time clustering, so that interest point prediction scores are generated adaptively for each time slot; and sort the prediction scores of all unvisited places and recommend the top-ranked places to the user (as shown in FIG. 1).
The specific process of the method comprises the following steps:
Step 1. Collect and sort out the users' original check-in data set, and convert it into a user-time-interest point three-dimensional scoring matrix.
Step 2. Count the number of check-in users, the number of visited interest points and the number of check-ins in each time slot. Based on the statistics, construct a three-dimensional check-in feature vector for each time slot to form the check-in data feature set of the time slots.
Step 3. Cluster the time slots with the K-means method based on the statistics of Step 2, and calculate the time similarity between the time slots in the same cluster.
Step 4. Following the basic principle of high intra-cluster similarity and low inter-cluster similarity, use the scoring information in the other time slots of the same time cluster to calculate the user similarity at the current recommendation time.
Step 5. Improve the scoring method of the traditional user-based collaborative filtering algorithm using the time clustering result and the intra-cluster time similarity, so that interest point prediction scores are generated adaptively according to the current recommendation time, and recommend to the user the top-ranked unvisited places for the current time.
Step 6. Evaluate the recommendation quality with recommendation precision metrics, and assess the accuracy and effectiveness of the proposed technique by comparing the recommendation precision of the proposed recommendation system with that of other classical recommendation systems.
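The evaluation in the last step can be illustrated with the standard top-N definitions of precision, recall and F1. The sketch below assumes those textbook definitions (this part of the text does not spell out the formulas), and the function name and sample identifiers are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Standard top-N metrics for one user.

    recommended: POI ids returned by the recommender.
    relevant:    POI ids the user actually visited in the held-out data.
    """
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative ids only: 2 of the 4 recommendations are among the 3 relevant POIs.
p, r, f1 = precision_recall_f1(["l1", "l2", "l3", "l4"], ["l2", "l4", "l7"])
print(p, r, f1)
```

Averaging these per-user values over all test users gives one figure per algorithm, which is the kind of quantity a comparison across recommenders would report.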
Advantageous effects:
(1) The time-aware adaptive interest point recommendation method based on K-means clustering can generate a real-time interest point recommendation list for a user at any time, according to the user's current behavior habits and the current popularity trend of the interest points. It can also help merchants push advertisements to users quickly and accurately, attracting more potential consumers.
(2) The invention clusters time, mines the time-dimension dynamics of user similarity, and finds different similar groups for the user at different times. This time-varying neighbor search better matches how user preferences change in reality, which improves user satisfaction with the social network platform and increases the accuracy and interpretability of the recommendation system.
(3) The invention clusters time with the K-means method, shares the scoring data of the time slots within a cluster, fully mines the similarity among time slots, and mitigates the data sparsity problem of the high-order scoring matrix. The method is general and portable: it can be applied to interest point recommendation systems and is also suitable for personalized recommendation of other traditional items, so it has broad prospects for industrial application.
Drawings
FIG. 1 is a flow chart of a time-aware adaptive interest point recommendation method based on K-means clustering according to the present invention.
FIG. 2 is a flowchart of specific steps of the time-aware adaptive interest point recommendation method based on K-means clustering according to the present invention.
FIG. 3 is a diagram illustrating check-in records of a user in a location social network in accordance with an embodiment of the present invention.
FIG. 4 is a diagram illustrating statistics of the number of check-in users, the number of visited points of interest, and the number of check-in times for each time slot in an embodiment of the present invention.
FIG. 5 is a diagram illustrating the K-means clustering results for all time slots in an embodiment of the present invention.
FIG. 6 is a bar chart comparing the Precision of the proposed recommendation algorithm with that of the classical user-based collaborative filtering (UBCF) and social-relationship-based collaborative filtering (SCF) algorithms in an embodiment of the present invention.
FIG. 7 is a bar chart comparing the Recall of the proposed recommendation algorithm with that of the classical user-based collaborative filtering (UBCF) and social-relationship-based collaborative filtering (SCF) algorithms in an embodiment of the present invention.
FIG. 8 is a bar chart comparing the comprehensive accuracy index F1 of the proposed recommendation algorithm with that of the classical user-based collaborative filtering (UBCF) and social-relationship-based collaborative filtering (SCF) algorithms in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific examples.
The specific flow of the design and implementation of the invention is shown in FIG. 2, and the main variables and parameters used in the process are listed in Table 1.
TABLE 1 Functions of the principal variables and parameters
[Table 1 appears as an image (BDA0003482089650000051) in the original publication.]
First step: collect and sort out the users' original check-in data set, and convert it into a user-time-interest point three-dimensional scoring matrix. The operation steps are as follows:
(1.a) Sort out the users' original check-in data set C to obtain n check-in records, denoted $C = \{c_1, c_2, \dots, c_n\}$. Each check-in record is a five-tuple of user ID, check-in time, geographic latitude, geographic longitude and interest point ID. The set of all users in the check-in data set is denoted U and the set of all interest points is denoted L; NU and NL are the numbers of users and interest points, respectively.
(1.b) The day is divided into 24 discrete time slots, and the set of time slots is denoted $T = \{0, 1, 2, \dots, 23\}$. The check-in time in each check-in record is rounded to obtain the value of the corresponding time slot $t$ ($t \in [0, 23]$).
(1.c) Count the check-ins over the set of check-in five-tuples and generate a four-tuple $(u_i, t, l_j, n_{i,t,j})$ for each user-time-interest point combination, where $u_i$ is the $i$-th user ($i \in [1, NU]$), $l_j$ is the $j$-th point of interest ($j \in [1, NL]$), $t$ is the time slot obtained by rounding the time point in the check-in record ($t \in [0, 23]$), and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$.
(1.d) Convert the number of check-ins $n_{i,t,j}$ of user $u_i$ at point of interest $l_j$ in time slot $t$ into the score $r_{i,t,j}$ of user $u_i$ for point of interest $l_j$ in time slot $t$. If user $u_i$ has visited point of interest $l_j$ in time slot $t$, then $r_{i,t,j} = 1$; otherwise $r_{i,t,j} = 0$:
$$r_{i,t,j} = \begin{cases} 1, & n_{i,t,j} > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
where $r_{i,t,j}$ represents the score of user $u_i$ for address $l_j$ at time slot $t$, and $n_{i,t,j}$ represents the number of times user $u_i$ visited point of interest $l_j$ at time slot $t$.
All scores are collected to form the user-time-interest point three-dimensional scoring matrix $R = \{r_{i,t,j}\}$, $i \in [1, NU]$, $t \in [0, 23]$, $j \in [1, NL]$, where $i$ is the user number, $t$ is the value of the time slot, $j$ is the address number, NU is the total number of users, NL is the total number of points of interest, and $r_{i,t,j}$ represents the score of user $u_i$ for address $l_j$ at time slot $t$.
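The conversion in the first step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that "rounding" the check-in time means truncating it to the hour, stores the sparse scoring matrix as a dictionary keyed by (user, time slot, interest point), and uses made-up records; `build_scores` is a hypothetical name.

```python
from collections import Counter
from datetime import datetime


def build_scores(records):
    """Turn check-in five-tuples into visit counts n and binary scores r.

    records: iterable of (user_id, checkin_time, latitude, longitude, poi_id),
             with checkin_time a datetime (layout assumed for this sketch).
    """
    counts = Counter()
    for user, checkin_time, _lat, _lon, poi in records:
        slot = checkin_time.hour  # assumption: rounding = truncating to the hour
        counts[(user, slot, poi)] += 1
    # Binary score: 1 if the user visited the POI in that slot at least once.
    scores = {key: 1 for key, n in counts.items() if n > 0}
    return counts, scores


# Made-up check-in records for illustration.
records = [
    ("u1", datetime(2022, 1, 3, 12, 15), 32.05, 118.78, "l1"),
    ("u1", datetime(2022, 1, 4, 12, 40), 32.05, 118.78, "l1"),
    ("u2", datetime(2022, 1, 3, 21, 5), 32.06, 118.80, "l2"),
]
counts, scores = build_scores(records)
print(counts[("u1", 12, "l1")], scores[("u1", 12, "l1")])
print(scores.get(("u2", 12, "l1"), 0))
```

Entries missing from `scores` are treated as 0, which keeps the three-dimensional matrix sparse in memory.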
Second step: count the number of check-in users, the number of visited interest points and the number of check-ins in each time slot. Based on the statistics, construct a three-dimensional check-in feature vector for each time slot to form the check-in data feature set of the time slots. The specific operation steps are as follows:
(2.a) Count the number of users $Unum_t$ in the check-in data set who have check-in behavior in time slot $t$:
$$Unum_t = \sum_{u \in U} isCheck(u, t) \qquad (2)$$
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in data set, and the function $isCheck$ indicates whether user $u$ has check-in behavior within time slot $t$:
$$isCheck(u, t) = \begin{cases} 1, & \sum_{l \in L} r_{u,t,l} > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in data set, and $r_{u,t,l}$ is the score of user $u$ for address $l$ at time slot $t$.
(2.b) Count the number of interest points $Pnum_t$ in the check-in data set that are visited in time slot $t$:
$$Pnum_t = \sum_{l \in L} isChecked(l, t) \qquad (4)$$
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in data set, and the function $isChecked$ indicates whether point of interest $l$ is visited within time slot $t$:
$$isChecked(l, t) = \begin{cases} 1, & \sum_{u \in U} r_{u,t,l} > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in data set, and $r_{u,t,l}$ is the score of user $u$ for address $l$ at time slot $t$.
(2.c) Count the total number of check-ins $Cnum_t$ in the check-in data set that occur in time slot $t$:
$$Cnum_t = \sum_{i=1}^{n} isTime(c_i, t) \qquad (6)$$
where $n$ is the number of check-in records in the check-in data set $C$, and the function $isTime$ indicates whether the $i$-th check-in record $c_i$ occurs within time slot $t$:
$$isTime(c_i, t) = \begin{cases} 1, & time_i \text{ in } t \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$
where $time_i \text{ in } t$ means that the time slot corresponding to the check-in time $time_i$ of the $i$-th check-in record $c_i$ is $t$.
(2.d) Based on the statistics, construct the three-dimensional check-in feature vector $x_t = \{Unum_t, Pnum_t, Cnum_t\}$ of each time slot $t$, forming the time slot check-in data feature set $X = \{x_0, x_1, \dots, x_{23}\}$, where $t \in [0, 23]$, $Unum_t$ is the number of users who checked in at time slot $t$, $Pnum_t$ is the number of points of interest visited at time slot $t$, and $Cnum_t$ is the total number of check-ins that occur at time slot $t$.
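The per-slot statistics of the second step can be computed in a single pass over the visit counts. The sketch below is illustrative only; it assumes the counts are held in a dictionary keyed by (user, time slot, interest point), and `slot_features` and the sample data are hypothetical.

```python
def slot_features(counts, num_slots=24):
    """Build x_t = (Unum_t, Pnum_t, Cnum_t) for every time slot t.

    counts: dict (user, slot, poi) -> number of check-ins
            (layout assumed for this sketch).
    """
    users = [set() for _ in range(num_slots)]
    pois = [set() for _ in range(num_slots)]
    totals = [0] * num_slots
    for (user, slot, poi), n in counts.items():
        if n > 0:
            users[slot].add(user)   # distinct users active in the slot
            pois[slot].add(poi)     # distinct POIs visited in the slot
            totals[slot] += n       # all check-ins in the slot
    return [(len(users[t]), len(pois[t]), totals[t]) for t in range(num_slots)]


# Made-up counts for illustration.
counts = {
    ("u1", 12, "l1"): 2,
    ("u2", 12, "l1"): 1,
    ("u2", 12, "l3"): 1,
    ("u2", 21, "l2"): 1,
}
X = slot_features(counts)
print(X[12], X[21], X[0])
```

Using sets for users and interest points counts each of them once per slot, matching the 0/1 indicator functions, while the check-in total sums the raw counts.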
Step three: based on the statistics from step two, cluster the time slots with the K-means method and compute the time similarity between the time slots within each cluster. The implementation steps are as follows:
(3.a) Cluster the 24 time slots with the K-means method, which is algorithmically simple and converges quickly, to generate $nc$ cluster centers $Cen=\{cen_1,cen_2,\dots,cen_{nc}\}$ ($nc\in[2,24]$).
(3.b) For any two time slots $t$ and $t'$ in the same time cluster, compute the time similarity between them:
Figure BDA0003482089650000074
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$, $r_{u,t',l}$ is the score of user $u$ for point of interest $l$ in time slot $t'$, and $NU$ is the total number of users in the check-in dataset.
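The clustering of step (3.a) can be illustrated with a plain implementation of Lloyd's K-means iteration. This is a minimal sketch, not the patent's implementation: the text does not specify initialization, distance measure, or feature scaling, so random initial centers with a fixed seed and squared Euclidean distance are assumptions made here, and the names `kmeans` and `squared_distance` are illustrative.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # assumed initialization
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy feature vectors [Unum_t, Pnum_t, Cnum_t] forming two obvious groups.
toy = [[0, 0, 0], [1, 1, 1], [0, 1, 0],
       [100, 100, 100], [101, 100, 99], [99, 101, 100]]
labels, centers = kmeans(toy, k=2)
print(labels)
```

In the method itself, `points` would be the 24 vectors of the feature set X and `k` would be the chosen number of clusters nc.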
Step four: following the basic principle of high intra-cluster similarity and low inter-cluster similarity, use the scoring information from the other time slots in the same time cluster to compute the user similarity at the current recommendation time. The implementation steps are as follows:
(4.a) Select a target user $u_t$ in the location-based social network as the recommendation service object, and convert the current recommendation time $time_r$ into its time slot $t_r$.
(4.b) From the clustering result, determine the cluster $cen_j$ to which time slot $t_r$ belongs and the number of time slots $nj$ in that cluster, written $cen_j=\{t_r,t_2,t_3,\dots,t_{nj}\}$. Compute the user similarity between the target user $u_t$ and another user $v$ at time slot $t_r$:
Figure BDA0003482089650000081
where $u_t$ is the target user currently served by the recommendation system, $v$ is another user in the location-based social network, $t_r$ is the time slot corresponding to the current recommendation time, $nj$ is the number of time slots in the cluster $cen_j$ to which $t_r$ belongs, $NL$ is the total number of points of interest in the check-in dataset, $r_{u_t,cen_j[a],l}$ is the score of the target user $u_t$ for point of interest $l$ in time slot $cen_j[a]$ of cluster $cen_j$, $r_{v,cen_j[b],l}$ is the score of user $v$ for point of interest $l$ in time slot $cen_j[b]$ of cluster $cen_j$, and $a\in[1,nj]$, $b\in[1,nj]$.
Step five: use the time clustering result and the intra-cluster time similarity to improve the scoring method of traditional user-based collaborative filtering, so that predicted scores for points of interest adapt to the current recommendation time, and recommend to the user the top-ranked unvisited points of interest for the current time. The implementation steps are as follows:
(5.a) Determine a target user $u_t$ in the location-based social network as the recommendation service object, and convert the current recommendation time $time_r$ into its time slot $t_r$.
(5.b) From the clustering result, determine the cluster $cen_j$ to which time slot $t_r$ belongs and the number of time slots $nj$ in that cluster, written $cen_j=\{t_r,t_2,t_3,\dots,t_{nj}\}$.
(5.c) Compute the predicted score of the target user $u_t$ for point of interest $l$ at time slot $t_r$:
Figure BDA0003482089650000083
where $u_t$ is the target user currently served by the recommendation system, $t_r$ is the time slot corresponding to the current recommendation time, $l$ is a point of interest in the location-based social network that the target user has not yet visited, $v$ is another user in the location-based social network, $U$ is the set of all users, $sim(u_t,v,t_r)$ is the user similarity between user $u_t$ and user $v$ at time slot $t_r$, $nj$ is the number of time slots in the cluster $cen_j$ to which $t_r$ belongs, $r_{v,cen_j[i],l}$ is the score of user $v$ for point of interest $l$ in time slot $cen_j[i]$, $i\in[1,nj]$, and $timesimi(t_r,cen_j[i])$ is the similarity between the current time slot $t_r$ and time slot $cen_j[i]$.
(5.d) Sort all points of interest that the target user $u_t$ has not visited by predicted score, form a recommendation list from the $N$ top-ranked points of interest, and return this list, $TopNList_t$, to the target user.
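Step (5.d) is a filter-and-sort over predicted scores. The sketch below assumes the scores of step (5.c) are already available in a dictionary; the function name `top_n` and the tie-breaking rule (alphabetical by identifier) are illustrative choices, not part of the patent.

```python
def top_n(pred_scores, visited, n):
    """Return the n unvisited points of interest with the highest scores.

    pred_scores: dict mapping poi_id -> predicted score for the target user
    visited:     set of poi_ids the target user has already visited
    """
    candidates = [(poi, s) for poi, s in pred_scores.items() if poi not in visited]
    # Sort by descending score; ties are broken by poi_id for determinism.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [poi for poi, _ in candidates[:n]]

scores = {"p1": 0.9, "p2": 0.4, "p3": 0.7, "p4": 0.1}
print(top_n(scores, visited={"p1"}, n=2))  # ['p3', 'p2']
```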
Step six: evaluate recommendation quality with recommendation accuracy metrics, and assess the accuracy and effectiveness of the proposed technique by comparing its recommendation accuracy with that of other classical recommendation systems. The implementation steps are as follows:
(6.a) Randomly select NU × 10% of the users from the target dataset as the target user set AU, and run each recommendation algorithm for every target user in the set to generate a recommendation list, where NU is the total number of users in the check-in dataset.
(6.b) Evaluate the accuracy of each recommendation system with the accuracy metrics: for one run over the target user set AU, the Precision, Recall, and comprehensive accuracy index F1 of each algorithm are the averages of the corresponding metrics over all users in AU.
(6.c) Repeat steps (6.a) and (6.b) Ntimes times, that is, run all algorithms independently Ntimes times.
(6.d) Set the Precision, Recall, and F1 values of each recommendation algorithm to the averages over the Ntimes runs.
(6.e) Compare and analyze the results for each metric: if the Precision of the time-aware adaptive point-of-interest recommendation algorithm based on K-means clustering is greater than that of the other recommendation algorithms, the proposed technique hits the user's preferred items more accurately; if its Recall is greater than that of the other algorithms, its retrieval capability is stronger; if its F1 value is greater than that of the other algorithms, its overall recommendation accuracy is stronger.
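The text does not spell out the formulas for Precision, Recall, and F1, so the sketch below uses their standard set-based definitions as an assumption: hits divided by list length, hits divided by the number of relevant items, and the harmonic mean of the two. The function name is illustrative.

```python
def precision_recall_f1(recommended, relevant):
    """Standard per-user metrics for one recommendation list.

    recommended: list of recommended poi_ids (the Top-N list)
    relevant:    collection of poi_ids the user actually visited in the test data
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(["a", "b", "c", "d", "e"], {"a", "c", "x"})
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.4 0.667 0.5
```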
The operation of the time-aware adaptive point-of-interest recommendation method based on K-means clustering is described in detail below, taking a specific location-based social network as an example.
Gowalla is a location-based social networking service in which users share their locations by checking in. The Gowalla dataset contains the social relationships and check-in information of 196591 users on the website from February 2009 to October 2010. It covers 1256379 points of interest, 6442892 user check-in records at those points of interest, and 950327 social relationships among users. The Gowalla dataset has become one of the test datasets most commonly used by recommendation system researchers.
The invention uses the check-in data of five popular areas in the Gowalla dataset, namely Los Angeles, San Francisco, New York, Maricopa, and King, as the worked example.
Step one: collect and organize the users' original check-in dataset and convert it into a user-time-POI three-dimensional scoring matrix. The operation steps are as follows:
(1.a) Collect and organize the user check-in data for the Los Angeles, San Francisco, New York, Maricopa, and King areas of the example Gowalla dataset to obtain a check-in dataset $C$ consisting of 50007 historical visit records of 1572 users at 1420 points of interest, written $C=\{c_1,c_2,\dots,c_{50007}\}$. A schematic diagram of the historical visit records of users in the location-based social network in the Gowalla dataset is shown in Fig. 3. There are 13864 social relationships among the users; on average each user has 31.81 check-in records and 8.82 social relationships, and each point of interest is visited 35.22 times.
Each check-in record is a quintuple of user ID, check-in time, geographic latitude, geographic longitude, and point-of-interest ID. The set of all users in the check-in dataset is denoted $U$ and the set of all points of interest is denoted $L$; the number of users is $NU=1572$ and the number of points of interest is $NL=1420$.
(1.b) Divide the day into 24 discrete time slots, with the set of time slots denoted $T=\{0,1,2,\dots,23\}$. Truncate the check-in time in each check-in record to its hour to obtain the corresponding time slot $t$ ($t\in[0,23]$). For example, the check-in time 15:13:23 corresponds to time slot $t=15$, and the check-in time 00:11:20 corresponds to time slot $t=0$.
(1.c) Count check-ins over the set of check-in quintuples and generate a quadruple $(u_i,t,l_j,n_{i,t,j})$ for each user-time-POI triple, where $u_i$ is the $i$-th user ($i\in[1,1572]$), $l_j$ is the $j$-th point of interest ($j\in[1,1420]$), $t$ is the time slot obtained from the check-in time ($t\in[0,23]$), and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$.
(1.d) Convert the check-in count $n_{i,t,j}$ of user $u_i$ at point of interest $l_j$ in time slot $t$ into the score $r_{i,t,j}$ of user $u_i$ for point of interest $l_j$ in time slot $t$. If user $u_i$ visited point of interest $l_j$ in time slot $t$, then $r_{i,t,j}=1$; otherwise $r_{i,t,j}=0$:
$$r_{i,t,j}=\begin{cases}1, & n_{i,t,j}>0\\ 0, & \text{otherwise}\end{cases} \tag{11}$$
where $r_{i,t,j}$ is the score of user $u_i$ for point of interest $l_j$ in time slot $t$, and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$.
All scores together form the user-time-POI three-dimensional scoring matrix $R=\{r_{i,t,j}\}$, $i\in[1,1572]$, $t\in[0,23]$, $j\in[1,1420]$, where $i$ is the user index, $t$ is the time slot, $j$ is the point-of-interest index, and $r_{i,t,j}$ is the score of user $u_i$ for point of interest $l_j$ in time slot $t$.
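Steps (1.b) to (1.d) can be sketched as follows. The sketch assumes check-in times arrive as `HH:MM:SS` strings, as in the examples above, and stores the scoring matrix sparsely as a dictionary keyed by `(user, slot, poi)`, with absent keys standing for a score of 0. The function names and the sample coordinates are illustrative.

```python
from collections import Counter

def time_to_slot(hhmmss):
    """Map an 'HH:MM:SS' check-in time to its hour slot t in 0..23."""
    return int(hhmmss.split(":")[0])

def build_scores(records):
    """Turn check-in quintuples into visit counts n and binary scores r.

    records: iterable of (user_id, 'HH:MM:SS', latitude, longitude, poi_id)
    """
    counts = Counter(
        (user, time_to_slot(ts), poi) for user, ts, _lat, _lon, poi in records
    )
    # r = 1 wherever at least one check-in exists; missing keys mean r = 0.
    scores = {key: 1 for key in counts}
    return counts, scores

records = [
    ("u1", "15:13:23", 34.05, -118.24, "p1"),
    ("u1", "15:40:02", 34.05, -118.24, "p1"),
    ("u2", "00:11:20", 40.71, -74.00, "p2"),
]
counts, scores = build_scores(records)
print(time_to_slot("15:13:23"))          # 15
print(counts[("u1", 15, "p1")])          # 2
print(scores.get(("u1", 15, "p1"), 0))   # 1
print(scores.get(("u2", 15, "p1"), 0))   # 0
```

A sparse dictionary is used here because a dense 1572 × 24 × 1420 array would be almost entirely zeros for 50007 check-in records.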
Step two: count the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot, and construct a three-dimensional check-in feature vector for each time slot from these statistics to form the time-slot check-in feature set. The specific operation steps are as follows:
(2.a) Count the number of users $Unum_t$ who checked in during time slot $t$ in the check-in dataset:
$$Unum_t=\sum_{u\in U} isCheck(u,t) \tag{12}$$
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, and $isCheck(u,t)$ indicates whether user $u$ checked in during time slot $t$:
$$isCheck(u,t)=\begin{cases}1, & \sum_{l\in L} r_{u,t,l}>0\\ 0, & \text{otherwise}\end{cases} \tag{13}$$
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, and $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$.
(2.b) Count the number of points of interest $Pnum_t$ visited during time slot $t$ in the check-in dataset:
$$Pnum_t=\sum_{l\in L} isChecked(l,t) \tag{14}$$
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, and $isChecked(l,t)$ indicates whether point of interest $l$ was visited during time slot $t$:
$$isChecked(l,t)=\begin{cases}1, & \sum_{u\in U} r_{u,t,l}>0\\ 0, & \text{otherwise}\end{cases} \tag{15}$$
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, and $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$.
(2.c) Count the total number of check-ins $Cnum_t$ that occurred during time slot $t$ in the check-in dataset:
$$Cnum_t=\sum_{i=1}^{n} isTime(c_i,t) \tag{16}$$
where $n$ is the number of check-in records in the check-in dataset $C$, and $isTime(c_i,t)$ indicates whether the $i$-th check-in record $c_i$ occurred during time slot $t$:
$$isTime(c_i,t)=\begin{cases}1, & time_i \text{ in } t\\ 0, & \text{otherwise}\end{cases} \tag{17}$$
where "$time_i$ in $t$" means that the time slot corresponding to the check-in time $time_i$ of the $i$-th check-in record $c_i$ is $t$.
Fig. 4 shows the statistics of the number of check-in users, the number of visited points of interest, and the number of check-ins for each time slot.
(2.d) Based on these statistics, construct a three-dimensional check-in feature vector $x_t=\{Unum_t,Pnum_t,Cnum_t\}$ for each time slot $t$, forming the time-slot check-in feature set $X=\{x_0,x_1,\dots,x_{23}\}$, where $t\in[0,23]$, $Unum_t$ is the number of users who checked in during time slot $t$, $Pnum_t$ is the number of points of interest visited during time slot $t$, and $Cnum_t$ is the total number of check-ins that occurred during time slot $t$.
Step three: based on the statistics from step two, cluster the time slots with the K-means method and compute the time similarity between the time slots within each cluster. The implementation steps are as follows:
(3.a) Cluster the 24 time slots with the K-means method, which is algorithmically simple and converges quickly, to generate 3 clusters, $Cen=\{cen_1,cen_2,cen_3\}$. The time slots of the first cluster are {7,8,9,10,11,12,13}, those of the second cluster are {0,1,2,3,16,17,18,19,20,21,22,23}, and those of the third cluster are {4,5,6,14,15}. The K-means clustering result for the 24 time slots is shown in Fig. 5.
(3.b) For any two time slots $t$ and $t'$ in the same one of the three time clusters, compute the time similarity between them:
Figure BDA0003482089650000121
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$, and $r_{u,t',l}$ is the score of user $u$ for point of interest $l$ in time slot $t'$.
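The time-similarity formula itself survives only as an image reference, so it cannot be reproduced here. Purely as an illustration, the sketch below assumes one form that is consistent with the symbols the text lists (the scores $r_{u,t,l}$ and $r_{u,t',l}$ and the user count NU): the cosine similarity between a user's score vectors in the two time slots, averaged over all NU users. This is an assumption, not a confirmed reading of the patent's formula, and the function name is hypothetical.

```python
import math

def time_similarity(scores, users, pois, t1, t2):
    """ASSUMED form: mean over users of cosine(r[u, t1, :], r[u, t2, :]).

    scores: dict mapping (user, slot, poi) -> 1; missing keys count as 0
    """
    total = 0.0
    for u in users:
        a = [scores.get((u, t1, l), 0) for l in pois]
        b = [scores.get((u, t2, l), 0) for l in pois]
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a and norm_b:  # users inactive in either slot contribute 0
            total += dot / (norm_a * norm_b)
    return total / len(users)  # divide by NU, the total number of users

scores = {("u1", 8, "p1"): 1, ("u1", 9, "p1"): 1, ("u1", 9, "p2"): 1,
          ("u2", 8, "p2"): 1}
print(round(time_similarity(scores, ["u1", "u2"], ["p1", "p2"], 8, 9), 4))  # 0.3536
```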
Step four: following the basic principle of high intra-cluster similarity and low inter-cluster similarity, use the scoring information from the other time slots in the same time cluster to compute the user similarity at the current recommendation time. The implementation steps are as follows:
(4.a) Select a target user $u_t$ in the location-based social network as the recommendation service object, and convert the current recommendation time $time_r$ into its time slot $t_r$. Suppose the current time $time_r$ is 20:14:13; the corresponding time slot $t_r$ is then 20.
(4.b) From the clustering result, determine the cluster $cen_j$ to which time slot $t_r$ belongs and the number of time slots $nj$ in that cluster, written $cen_j=\{t_r,t_2,t_3,\dots,t_{nj}\}$. For example, when the recommendation time slot $t_r$ is 20, the cluster it belongs to is $cen_j=\{20,0,1,2,3,16,17,18,19,21,22,23\}$, which contains 12 time slots ($nj=12$).
Compute the user similarity between the target user $u_t$ and another user $v$ at time slot $t_r$:
Figure BDA0003482089650000122
where $u_t$ is the target user currently served by the recommendation system, $v$ is another user in the location-based social network, $t_r$ is the time slot corresponding to the current recommendation time, $nj$ is the number of time slots in the cluster $cen_j$ to which $t_r$ belongs, $r_{u_t,cen_j[a],l}$ is the score of the target user $u_t$ for point of interest $l$ in time slot $cen_j[a]$ of cluster $cen_j$, $r_{v,cen_j[b],l}$ is the score of user $v$ for point of interest $l$ in time slot $cen_j[b]$ of cluster $cen_j$, and $a\in[1,nj]$, $b\in[1,nj]$.
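The cluster lookup of step (4.b) can be sketched with the three clusters reported for this example. The function name `cluster_of` is illustrative; it returns the cluster with the recommendation slot listed first, matching the notation $cen_j=\{t_r,t_2,\dots,t_{nj}\}$.

```python
# Time-slot clusters reported for the Gowalla example in step (3.a).
CLUSTERS = [
    [7, 8, 9, 10, 11, 12, 13],
    [0, 1, 2, 3, 16, 17, 18, 19, 20, 21, 22, 23],
    [4, 5, 6, 14, 15],
]

def cluster_of(slot, clusters=CLUSTERS):
    """Return the cluster containing `slot`, with `slot` placed first."""
    for members in clusters:
        if slot in members:
            return [slot] + [t for t in members if t != slot]
    raise ValueError(f"time slot {slot} is not in any cluster")

cen_j = cluster_of(20)
print(cen_j)       # [20, 0, 1, 2, 3, 16, 17, 18, 19, 21, 22, 23]
print(len(cen_j))  # 12
```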
Step five: use the time clustering result and the intra-cluster time similarity to improve the scoring method of traditional user-based collaborative filtering, so that predicted scores for points of interest adapt to the current recommendation time, and recommend to the user the top-ranked unvisited points of interest for the current time. The implementation steps are as follows:
(5.a) Determine a target user $u_t$ in the location-based social network as the recommendation service object, and convert the current recommendation time $time_r$ into its time slot $t_r$.
(5.b) From the clustering result, determine the cluster $cen_j$ to which time slot $t_r$ belongs and the number of time slots $nj$ in that cluster, written $cen_j=\{t_r,t_2,t_3,\dots,t_{nj}\}$.
(5.c) Compute the predicted score of the target user $u_t$ for point of interest $l$ at time slot $t_r$:
Figure BDA0003482089650000131
where $u_t$ is the target user currently served by the recommendation system, $t_r$ is the time slot corresponding to the current recommendation time, $l$ is a point of interest in the location-based social network that the target user has not yet visited, $v$ is another user in the location-based social network, $U$ is the set of all users, $sim(u_t,v,t_r)$ is the user similarity between user $u_t$ and user $v$ at time slot $t_r$, $nj$ is the number of time slots in the cluster $cen_j$ to which $t_r$ belongs, $r_{v,cen_j[i],l}$ is the score of user $v$ for point of interest $l$ in time slot $cen_j[i]$, $i\in[1,nj]$, and $timesimi(t_r,cen_j[i])$ is the similarity between the current time slot $t_r$ and time slot $cen_j[i]$.
(5.d) Sort all points of interest that the target user $u_t$ has not visited by predicted score, form a recommendation list from the $N$ top-ranked points of interest, and return this list, $TopNList_t$, to the target user ($N$ may be a multiple of 5, and typically $5\le N\le 50$).
Step six: evaluate recommendation quality with recommendation accuracy metrics, and assess the accuracy and effectiveness of the proposed technique by comparing its recommendation accuracy with that of other classical recommendation systems. The implementation steps are as follows:
(6.a) Randomly select 157 users from the target dataset as the target user set AU, and for every target user in the set run the time-aware adaptive point-of-interest recommendation algorithm, the classical user-based collaborative filtering algorithm UBCF, and the social-relationship-based collaborative filtering algorithm SCF to generate recommendation lists.
(6.b) Evaluate the accuracy of each recommendation system with the accuracy metrics: for one run over the target user set AU, the Precision, Recall, and comprehensive accuracy index F1 of each algorithm are the averages of the corresponding metrics over all users in AU.
(6.c) Repeat steps (6.a) and (6.b) 100 times, that is, run all algorithms independently 100 times.
(6.d) Set the Precision, Recall, and F1 values of the proposed recommendation algorithm and of the UBCF and SCF algorithms to the averages over the 100 runs. For different values of $N$, the Precision, Recall, and F1 results of the algorithms are shown in Table 2, Table 3, and Table 4, respectively, where the bold value in each row is the maximum of that row:
TABLE 2 Precision index values for different recommendation algorithms
Figure BDA0003482089650000141
TABLE 3 Recall ratio Recall index values for different recommendation algorithms
Figure BDA0003482089650000142
TABLE 4 recommendation accuracy F1 index values for different recommendation algorithms
Figure BDA0003482089650000143
Histograms comparing the Precision, Recall, and comprehensive accuracy index F1 of the proposed recommendation algorithm with those of the classical UBCF and SCF algorithms are shown in Figs. 6, 7, and 8, respectively.
(6.e) Compare and analyze the results for each metric: the Precision of the time-aware adaptive point-of-interest recommendation algorithm based on K-means clustering is greater than that of the other recommendation algorithms, showing that the proposed technique hits the user's preferred items more accurately; its Recall is greater than that of the other algorithms, showing that its retrieval capability is stronger; and its F1 value is greater than that of the other algorithms, showing that its overall recommendation accuracy is stronger.
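The evaluation protocol of steps (6.a) to (6.d) can be sketched as a repeated-sampling loop. The sketch assumes standard set-based Precision, Recall, and F1 (the text does not give their formulas), and the names `evaluate`, `recommend`, and `ground_truth` are hypothetical placeholders for an algorithm under test and its held-out visits.

```python
import random

def prf(recommended, relevant):
    """Standard per-user precision, recall and F1 (assumed definitions)."""
    hits = len(set(recommended) & set(relevant))
    p = hits / len(recommended) if recommended else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def evaluate(users, recommend, ground_truth, n_runs=100, frac=0.10, seed=0):
    """Average [precision, recall, F1] over n_runs random target-user sets."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(users) * frac))  # NU x 10% users per run
    run_means = []
    for _ in range(n_runs):
        au = rng.sample(users, sample_size)        # target user set AU
        per_user = [prf(recommend(u), ground_truth[u]) for u in au]
        # Step (6.b): average each metric over all users in AU.
        run_means.append([sum(col) / len(per_user) for col in zip(*per_user)])
    # Step (6.d): average each metric over all runs.
    return [sum(col) / n_runs for col in zip(*run_means)]

print(int(1572 * 0.10))  # 157
```

With 1572 users, the 10% sample gives the 157 target users used in this example.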
Unlike conventional point-of-interest recommendation algorithms, the method aims to build a point-of-interest recommendation system that generates a list of points of interest in real time for a given time point and produces accurate recommendations. It emphasizes both the differences and the correlations in users' check-in characteristics across time slots, proposes analyzing the distance from time points to cluster centers, and uses K-means clustering to mine the correlations among time slots. Time clustering alleviates the sparsity of high-dimensional check-in data, improves the accuracy and effectiveness of score prediction, and strengthens the service quality of the recommendation system. The proposed technique has broad application prospects and is expected to be widely used in location-based social networks.
The process described above is only a preferred embodiment of the invention and does not represent all of its details. Any modification, equivalent replacement, or improvement made by those skilled in the art within the technical scope, spirit, and principle of the present disclosure shall fall within its scope of protection.

Claims (7)

1. A time-aware adaptive point-of-interest recommendation method based on K-means clustering, characterized by comprising the following steps:
step 1: collecting and organizing the users' original check-in dataset and converting it into a user-time-POI three-dimensional scoring matrix;
step 2: counting the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot, and constructing a three-dimensional check-in feature vector for each time slot from these statistics to form the time-slot check-in feature set;
step 3: based on the statistics of step 2, clustering the time slots with the K-means method and computing the time similarity between the time slots within each cluster;
step 4: following the basic principle of high intra-cluster similarity and low inter-cluster similarity, using the scoring information from the other time slots in the same time cluster to compute the user similarity at the current recommendation time;
step 5: using the time clustering result and the intra-cluster time similarity to improve the scoring method of traditional user-based collaborative filtering, so that predicted scores for points of interest adapt to the current recommendation time, and recommending to the user the top-ranked unvisited points of interest for the current time;
step 6: evaluating recommendation quality with recommendation accuracy metrics, comparing the result with the recommendation accuracy of other classical recommendation systems, and assessing the accuracy and effectiveness of the proposed technique.
2. The time-aware adaptive point-of-interest recommendation method based on K-means clustering according to claim 1, characterized in that step 1 comprises:
step 11: organizing the users' original check-in dataset $C$ to obtain $n$ check-in records, written $C=\{c_1,c_2,\dots,c_n\}$; each check-in record is a quintuple of user ID, check-in time, geographic latitude, geographic longitude, and point-of-interest ID; the set of all users in the check-in dataset is denoted $U$, the set of all points of interest is denoted $L$, and $NU$ and $NL$ are the numbers of users and points of interest, respectively;
step 12: dividing the day into 24 discrete time slots, with the set of time slots denoted $T=\{0,1,2,\dots,23\}$; truncating the check-in time in each check-in record to its hour to obtain the corresponding time slot $t$, $t\in[0,23]$;
step 13: counting check-ins over the set of check-in quintuples and generating a quadruple $(u_i,t,l_j,n_{i,t,j})$ for each user-time-POI triple, where $u_i$ is the $i$-th user, $i\in[1,NU]$, $l_j$ is the $j$-th point of interest, $j\in[1,NL]$, $t$ is the time slot obtained from the check-in time, $t\in[0,23]$, and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$;
step 14: converting the check-in count $n_{i,t,j}$ of user $u_i$ at point of interest $l_j$ in time slot $t$ into the score $r_{i,t,j}$ of user $u_i$ for point of interest $l_j$ in time slot $t$; if user $u_i$ visited point of interest $l_j$ in time slot $t$, then $r_{i,t,j}=1$; otherwise $r_{i,t,j}=0$:
$$r_{i,t,j}=\begin{cases}1, & n_{i,t,j}>0\\ 0, & \text{otherwise}\end{cases} \tag{1}$$
where $r_{i,t,j}$ is the score of user $u_i$ for point of interest $l_j$ in time slot $t$, and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$;
all scores together form the user-time-POI three-dimensional scoring matrix $R=\{r_{i,t,j}\}$, $i\in[1,NU]$, $t\in[0,23]$, $j\in[1,NL]$, where $i$ is the user index, $t$ is the time slot, $j$ is the point-of-interest index, $NU$ is the total number of users, $NL$ is the total number of points of interest, and $r_{i,t,j}$ is the score of user $u_i$ for point of interest $l_j$ in time slot $t$.
3. The time-aware adaptive point-of-interest recommendation method based on K-means clustering according to claim 1, characterized in that step 2 comprises:
step 21: counting the number of users $Unum_t$ who checked in during time slot $t$ in the check-in dataset:
$$Unum_t=\sum_{u\in U} isCheck(u,t) \tag{2}$$
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, and $isCheck(u,t)$ indicates whether user $u$ checked in during time slot $t$:
$$isCheck(u,t)=\begin{cases}1, & \sum_{l\in L} r_{u,t,l}>0\\ 0, & \text{otherwise}\end{cases} \tag{3}$$
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, and $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$;
step 22: counting the number of points of interest $Pnum_t$ visited during time slot $t$ in the check-in dataset:
$$Pnum_t=\sum_{l\in L} isChecked(l,t) \tag{4}$$
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, and $isChecked(l,t)$ indicates whether point of interest $l$ was visited during time slot $t$:
$$isChecked(l,t)=\begin{cases}1, & \sum_{u\in U} r_{u,t,l}>0\\ 0, & \text{otherwise}\end{cases} \tag{5}$$
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, and $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$;
step 23: counting the total number of check-ins $Cnum_t$ that occurred during time slot $t$ in the check-in dataset:
$$Cnum_t=\sum_{i=1}^{n} isTime(c_i,t) \tag{6}$$
where $n$ is the number of check-in records in the check-in dataset $C$, and $isTime(c_i,t)$ indicates whether the $i$-th check-in record $c_i$ occurred during time slot $t$:
$$isTime(c_i,t)=\begin{cases}1, & time_i \text{ in } t\\ 0, & \text{otherwise}\end{cases} \tag{7}$$
where "$time_i$ in $t$" means that the time slot corresponding to the check-in time $time_i$ of the $i$-th check-in record $c_i$ is $t$;
step 24: based on these statistics, constructing a three-dimensional check-in feature vector $x_t=\{Unum_t,Pnum_t,Cnum_t\}$ for each time slot $t$, forming the time-slot check-in feature set $X=\{x_0,x_1,\dots,x_{23}\}$, where $t\in[0,23]$, $Unum_t$ is the number of users who checked in during time slot $t$, $Pnum_t$ is the number of points of interest visited during time slot $t$, and $Cnum_t$ is the total number of check-ins that occurred during time slot $t$.
4. The time-aware adaptive point-of-interest recommendation method based on K-means clustering according to claim 1, characterized in that step 3 comprises:
step 31: clustering the 24 time slots with the K-means method, which is algorithmically simple and converges quickly, to generate $nc$ cluster centers $Cen=\{cen_1,cen_2,\dots,cen_{nc}\}$ ($nc\in[2,24]$);
step 32: for any two time slots $t$ and $t'$ in the same time cluster, computing the time similarity between them:
Figure FDA0003482089640000031
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in dataset, $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in dataset, $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$, $r_{u,t',l}$ is the score of user $u$ for point of interest $l$ in time slot $t'$, and $NU$ is the total number of users in the check-in dataset.
5. The time-aware adaptive point-of-interest recommendation method based on K-means clustering according to claim 1, characterized in that step 4 comprises:
step 41: selecting a target user $u_t$ in the location-based social network as the recommendation service object, and converting the current recommendation time $time_r$ into its time slot $t_r$;
step 42: from the clustering result, determining the cluster $cen_j$ to which time slot $t_r$ belongs and the number of time slots $nj$ in that cluster, written $cen_j=\{t_r,t_2,t_3,\dots,t_{nj}\}$; computing the user similarity between the target user $u_t$ and another user $v$ at time slot $t_r$:
Figure FDA0003482089640000032
where $u_t$ is the target user currently served by the recommendation system, $v$ is another user in the location-based social network, $t_r$ is the time slot corresponding to the current recommendation time, $nj$ is the number of time slots in the cluster $cen_j$ to which $t_r$ belongs, $NL$ is the total number of points of interest in the check-in dataset, $r_{u_t,cen_j[a],l}$ is the score of the target user $u_t$ for point of interest $l$ in time slot $cen_j[a]$ of cluster $cen_j$, $r_{v,cen_j[b],l}$ is the score of user $v$ for point of interest $l$ in time slot $cen_j[b]$ of cluster $cen_j$, and $a\in[1,nj]$, $b\in[1,nj]$.
6. The method for recommending time-aware adaptive interest points based on K-means clustering according to claim 1, wherein the step 5 of the method comprises:
step 51: determining a target user u in a location social networktAs a recommended service object, the current recommended time is takenrConversion to time slot tr
Step 52: determining time slot t according to clustering resultrTo which cluster cenjAnd the number of time slots in the cluster, nj, noted cenj={tr,t2,t3,…,tnj};
Step 53: calculating target user utAt trPrediction score of point of interest/:
Figure FDA0003482089640000041
wherein u istIs the target object of the current service of the recommendation system, trIs a time slot corresponding to the current recommended time, l is an interest point which has not been visited by the target user in the location social network, v is another user in the location social network, U represents a set of all users, sim (U)t,v,tr) Representing user utAnd user v is in time slot trUser similarity of time, nj being time slot trThe cluster cen to which it belongsjThe number of time slots in (a) is,
Figure FDA0003482089640000042
represents the score of user v for point of interest l at time cen_j[i], i ∈ [1, nj], and timesimi(t_r, cen_j[i]) represents the similarity between the current time t_r and another time cen_j[i];
Step 54: sorting all points of interest that target user u_t has not visited by prediction score, forming a recommendation list TopNList_t from the N top-ranked locations, and returning it to the target user.
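Steps 53 and 54 can be sketched as follows. The prediction formula is given only as a figure, so this is a hedged illustration and not the claimed formula: it assumes an unnormalized sum over other users v of sim(u_t, v, t_r) multiplied by v's time-weighted scores across the slots of cen_j. Any normalizing factor that does not depend on l would leave the ranking in step 54 unchanged. The function names and the `scores` dictionary layout are hypothetical; `sim` and `timesimi` are passed in as callables.

```python
def predict_scores(target, users, cluster_slots, t_r, scores, sim, timesimi,
                   candidates):
    """Return {poi: predicted score} for the target user at time slot t_r.

    scores: dict mapping (user, time_slot, poi) -> score.
    sim: callable (target, v, t_r) -> user similarity.
    timesimi: callable (t_r, slot) -> time-slot similarity.
    candidates: points of interest the target user has not visited.
    """
    predicted = {}
    for poi in candidates:
        total = 0.0
        for v in users:
            if v == target:
                continue  # v ranges over the *other* users only
            # v's scores for this POI, weighted by how similar each slot is to t_r
            weighted = sum(scores.get((v, slot, poi), 0.0) * timesimi(t_r, slot)
                           for slot in cluster_slots)
            total += sim(target, v, t_r) * weighted
        predicted[poi] = total
    return predicted


def top_n(predicted, n):
    """Sort by descending score (ties broken by POI id) and keep the first n."""
    return sorted(predicted, key=lambda poi: (-predicted[poi], poi))[:n]
```

Restricting the inner sum to the slots of cen_j is the adaptive part: only check-ins from time slots clustered together with t_r contribute, each discounted by its similarity to t_r.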
7. The time-aware adaptive point-of-interest recommendation method based on K-means clustering according to claim 1, wherein step 6 comprises:
Step 61: randomly selecting NU × 10% of the users from the target dataset as the target user set AU, and running each recommendation algorithm for each target user in the set to generate a recommendation list, wherein NU represents the total number of users in the check-in dataset;
Step 62: evaluating the accuracy of each recommendation algorithm using accuracy metrics, wherein the Precision, Recall, and comprehensive accuracy index F1 of each algorithm for one run over the target user set AU are the averages of the corresponding metrics over all users in AU;
Step 63: repeating steps 61 and 62 Ntimes times, i.e., every algorithm is run independently Ntimes times;
Step 64: taking the Precision, Recall, and comprehensive accuracy index F1 of each recommendation algorithm as the averages over the Ntimes runs;
Step 65: comparing and analyzing the results of all metrics: if the Precision of the time-aware adaptive point-of-interest recommendation algorithm based on K-means clustering is greater than that of the other recommendation algorithms, the method hits users' preferred items more accurately; if its Recall is greater than that of the other recommendation algorithms, the method has stronger recall capability; and if its comprehensive accuracy index F1 is greater than that of the other recommendation algorithms, the method has stronger overall recommendation accuracy.
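The evaluation protocol of steps 61 to 64 can be sketched as below, using the standard definitions of Precision, Recall, and F1 for a top-N list. The claim does not say how a user's relevant items are obtained, so treating them as a held-out set of check-ins is an assumption, as are the function names, the `recommend` callable, and the fixed random seed.

```python
import random


def precision_recall_f1(recommended, relevant):
    """Standard top-N metrics for one user."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


def evaluate(recommend, ground_truth, n_times, sample_ratio=0.1, seed=0):
    """Average (Precision, Recall, F1) over n_times independent runs.

    recommend: callable user -> recommendation list (one algorithm under test).
    ground_truth: dict user -> set of held-out points of interest (assumption).
    """
    rng = random.Random(seed)
    users = sorted(ground_truth)
    sample_size = max(1, int(len(users) * sample_ratio))  # NU x 10% by default
    run_means = []
    for _ in range(n_times):
        au = rng.sample(users, sample_size)  # step 61: target user set AU
        per_user = [precision_recall_f1(recommend(u), ground_truth[u]) for u in au]
        # step 62: one run's value is the mean over all users in AU
        run_means.append(tuple(sum(m[k] for m in per_user) / len(per_user)
                               for k in range(3)))
    # step 64: final value is the mean over the n_times runs
    return tuple(sum(r[k] for r in run_means) / n_times for k in range(3))
```

Calling `evaluate` once per algorithm with the same `ground_truth` and `n_times` yields the triples that step 65 compares. Note that F1 is averaged per user here, which follows the wording of step 62, rather than being recomputed from the averaged Precision and Recall.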
CN202210071029.3A 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering Pending CN114528480A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210071029.3A CN114528480A (en) 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering
CN202211571570.7A CN116166878A (en) 2022-01-21 2022-12-08 Time perception self-adaptive interest point recommendation method based on K-means clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210071029.3A CN114528480A (en) 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering

Publications (1)

Publication Number Publication Date
CN114528480A true CN114528480A (en) 2022-05-24

Family

ID=81620186

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210071029.3A Pending CN114528480A (en) 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering
CN202211571570.7A Pending CN116166878A (en) 2022-01-21 2022-12-08 Time perception self-adaptive interest point recommendation method based on K-means clustering

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202211571570.7A Pending CN116166878A (en) 2022-01-21 2022-12-08 Time perception self-adaptive interest point recommendation method based on K-means clustering

Country Status (1)

Country Link
CN (2) CN114528480A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408618A (en) * 2022-09-26 2022-11-29 南京工业职业技术大学 Interest point recommendation method based on social relationship fusion position dynamic popularity and geographic features
CN115687801A (en) * 2022-09-27 2023-02-03 南京工业职业技术大学 Position recommendation method based on position timeliness characteristics and time perception dynamic similarity

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117635237A (en) * 2023-12-22 2024-03-01 广州方块网络技术有限公司 Advertisement management system based on SaaS information flow and cross-platform crowd data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657015B (en) * 2017-09-26 2021-03-19 北京邮电大学 Interest point recommendation method and device, electronic equipment and storage medium
CN111104607A (en) * 2018-10-25 2020-05-05 中国电子科技集团公司电子科学研究院 Location recommendation method and device based on sign-in data
CN114036376A (en) * 2021-10-26 2022-02-11 南京理工大学紫金学院 Time-aware self-adaptive interest point recommendation method based on K-means clustering

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408618A (en) * 2022-09-26 2022-11-29 南京工业职业技术大学 Interest point recommendation method based on social relationship fusion position dynamic popularity and geographic features
CN115408618B (en) * 2022-09-26 2023-10-20 南京工业职业技术大学 Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
CN115687801A (en) * 2022-09-27 2023-02-03 南京工业职业技术大学 Position recommendation method based on position timeliness characteristics and time perception dynamic similarity
CN115687801B (en) * 2022-09-27 2024-01-19 南京工业职业技术大学 Position recommendation method based on position aging characteristics and time perception dynamic similarity

Also Published As

Publication number Publication date
CN116166878A (en) 2023-05-26

Similar Documents

Publication Publication Date Title
Logesh et al. Efficient user profiling based intelligent travel recommender system for individual and group of users
Christensen et al. Social group recommendation in the tourism domain
Xie et al. Learning graph-based poi embedding for location-based recommendation
Guo et al. Combining geographical and social influences with deep learning for personalized point-of-interest recommendation
Isinkaye et al. Recommendation systems: Principles, methods and evaluation
CN114036376A (en) Time-aware self-adaptive interest point recommendation method based on K-means clustering
Xing et al. Points-of-interest recommendation based on convolution matrix factorization
CN114528480A (en) Time-sensing self-adaptive interest point recommendation method based on K-means clustering
Mo et al. Event recommendation in social networks based on reverse random walk and participant scale control
Li et al. A multi-dimensional context-aware recommendation approach based on improved random forest algorithm
Lyu et al. iMCRec: A multi-criteria framework for personalized point-of-interest recommendations
CN111475744B (en) Personalized position recommendation method based on ensemble learning
Xia et al. Vrer: context-based venue recommendation using embedded space ranking SVM in location-based social network
CN116244513B (en) Random group POI recommendation method, system, equipment and storage medium
Kang et al. A personalized point-of-interest recommendation system for O2O commerce
Dong et al. Exploiting category-level multiple characteristics for POI recommendation
Yin et al. A fuzzy clustering based collaborative filtering algorithm for time-aware POI recommendation
Lang et al. POI recommendation based on a multiple bipartite graph network model
Cacheda et al. Click through rate prediction for local search results
Yin et al. A tensor decomposition based collaborative filtering algorithm for time-aware POI recommendation in LBSN
Xu et al. Deep convolutional recurrent model for region recommendation with spatial and temporal contexts
CN115408618B (en) Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
Yoshida et al. New performance index “attractiveness factor” for evaluating websites via obtaining transition of users’ interests
Cao et al. Local experts finding using user comments in location‐based social networks
CN114417166A (en) Continuous interest point recommendation method based on behavior sequence and dynamic social influence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221130

Address after: 210000 No.1, Yangshan North Road, Xianlin University Town, Qixia District, Nanjing City, Jiangsu Province

Applicant after: Nanjing Vocational University of Industry Technology

Address before: 210046 room 205, building 73, nandaheyuan community, 168 Xianlin Avenue, Nanjing, Jiangsu

Applicant before: Zhu Jun

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220524
