CN116166878A - Time perception self-adaptive interest point recommendation method based on K-means clustering

Publication number: CN116166878A
Application number: CN202211571570.7A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (as listed by Google Patents; an assumption, not a legal conclusion)
Inventors: 朱俊, 梁太波, 韩立新
Current Assignee: Nanjing Vocational University of Industry Technology NUIT (as listed by Google Patents; may be inaccurate)
Original Assignee: Nanjing Vocational University of Industry Technology NUIT
Application filed by: Nanjing Vocational University of Industry Technology NUIT
Prior art keywords: time, user, interest, time slot, check

Classifications

    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/2462 Approximate or statistical queries
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G06Q10/06395 Quality analysis or management
    • G06Q30/0271 Personalized advertisement
    • G06Q50/01 Social networking
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a time-aware adaptive point-of-interest (POI) recommendation method based on K-means clustering, comprising the following steps: first, converting a check-in data set into a three-dimensional scoring matrix; second, counting the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot, and constructing a three-dimensional check-in feature vector for each time slot; third, applying K-means clustering to the time slots and calculating the temporal similarity between time slots in the same cluster; fourth, calculating user similarity at the current time using scoring information from the other time slots in the same time cluster; fifth, improving traditional user-based collaborative filtering with the time clustering result and the intra-cluster temporal similarity, so that the method adaptively generates predicted point-of-interest scores for the current recommendation time; and sixth, comparing the recommendation accuracy of the proposed recommendation system with that of other classical recommendation systems to evaluate the accuracy and effectiveness of the proposed technique.

Description

Time perception self-adaptive interest point recommendation method based on K-means clustering
Technical Field
The invention relates to a time-aware adaptive point-of-interest recommendation method based on K-means clustering for location-based social networks, and belongs to the technical field of artificial intelligence and machine learning.
Background
In recent years, communication, positioning, and mobile internet technologies have developed rapidly, and location-based social networks (LBSNs) have become a new medium through which people share and transfer information, providing a platform that closely connects online virtual networks with the offline real world. A large number of mature social network platforms with location features now exist in China and abroad, such as Facebook, YouTube, Twitter, Weibo, Douban, Dianping, Meituan, and WeChat Moments. In a location-based social network, users can establish complex social relationships (friends, colleagues, relatives, and so on); browse geotagged places of interest (called "points of interest", or POIs) such as restaurants, shops, and movie theatres; check in with a mobile device when visiting a point of interest; publish geographic location information; and share suggestions and comments about points of interest. LBSNs bring convenience to users and also help merchants understand the real users behind the network, so that personalized services can be tailored to the needs of different users. LBSNs are therefore both practical and forward-looking.
As the number of users active in LBSNs grows, LBSNs store and accumulate rich information such as check-in records, social relationships, spatiotemporal data, and text, images, and video. This massive volume of information gives users abundant data resources, but it also causes information overload and makes it harder for users to find the items they want. Recommendation systems, which address the information overload problem, have therefore attracted growing attention from researchers. For example, Amazon uses a recommendation system to recommend goods to users, improving click-through rate and turnover for merchants, and the movie recommendation website Netflix has attracted many research teams to work on improving recommendation accuracy by hosting recommendation system competitions. As a special kind of information filtering system, a recommendation system does not require users to supply explicit keywords. Instead, it models users' interests by analyzing their historical behavior and discovers their latent preferences, so that it can proactively recommend goods and services that meet their needs. Using large amounts of user, friend, and location information, researchers have built applications for LBSNs such as friend recommendation, expert discovery, point-of-interest recommendation, activity recommendation, and route recommendation. Point-of-interest recommendation, a natural product of the joint development of traditional recommendation systems and location-based social networks, has become a research hotspot.
Point-of-interest recommendation is an important branch of recommendation systems, and both its development history and its key techniques derive directly from traditional recommendation systems. Some point-of-interest recommendation research therefore treats a location as an ordinary item, like a film or a piece of music, and generates recommendations with traditional methods. By design strategy, traditional recommendation algorithms fall into collaborative filtering algorithms, content-based recommendation algorithms, and hybrid recommendation algorithms. Collaborative filtering algorithms in turn include memory-based algorithms (e.g., user-based collaborative filtering and item-based collaborative filtering) and model-based algorithms (e.g., singular value decomposition, clustering models, and probabilistic latent semantic analysis). Content-based point-of-interest recommendation extracts relevant information, such as tags, categories, and user reviews, from visited locations; it extracts user preferences from the user's profile and matches them against location profiles to produce recommendations. User-based collaborative filtering (UBCF) converts users' check-in behavior into a user-POI scoring matrix, finds users similar to the current active user from existing check-in records, predicts the active user's scores for places not yet checked in from the preferences of those similar users, and recommends the points of interest with the highest predicted scores. Item-based collaborative filtering (IBCF) rests on one assumption: users prefer locations that are highly similar to the places they liked before.
IBCF therefore first calculates the similarity between points of interest and recommends to the active user the locations most similar to the POIs that the user has visited. Singular value decomposition (SVD) is a classical form of matrix factorization whose main task is to generate low-rank approximations. The low-dimensional orthogonal matrices produced by SVD reduce noise relative to the original matrix and reveal latent associations between users and items more effectively. Among recommendation techniques, collaborative filtering requires little domain-specific knowledge, avoids complex information collection and content analysis, is easy to engineer, and can be applied to products conveniently. Collaborative filtering has thus become the most widely used and most popular technique in traditional recommendation.
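The UBCF procedure described above (build a user-POI scoring matrix, find similar users, predict scores for unvisited places) can be illustrated with a minimal sketch. The text does not state which similarity measure classical UBCF uses here, so cosine similarity and a similarity-weighted average are assumptions; the function names `cosine` and `ubcf_scores` and the sample data are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length score vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def ubcf_scores(ratings, target):
    """Predict scores for POIs the target user has not visited.

    ratings: dict mapping user -> list of 0/1 scores, one entry per POI.
    Returns a dict mapping POI index -> similarity-weighted average score.
    """
    sims = {v: cosine(ratings[target], r) for v, r in ratings.items() if v != target}
    total = sum(sims.values())
    preds = {}
    for j, visited in enumerate(ratings[target]):
        if visited:
            continue  # only unvisited POIs are candidates
        weighted = sum(sims[v] * ratings[v][j] for v in sims)
        preds[j] = weighted / total if total else 0.0
    return preds

# Made-up example data: three users, four POIs.
ratings = {
    "alice": [1, 1, 0, 0],
    "bob":   [1, 1, 1, 0],
    "carol": [0, 0, 0, 1],
}
preds = ubcf_scores(ratings, "alice")
ranked = sorted(preds, key=preds.get, reverse=True)  # best candidate first
```

In this toy data, bob shares two visited POIs with alice while carol shares none, so bob's extra POI ranks ahead of carol's.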
The conventional recommendation techniques above ignore the influence of temporal context on users' check-in behavior. In practice, time is a very important piece of context in point-of-interest recommendation, and users' check-in habits are closely tied to it. At the macro level, users' preferences for points of interest are affected by the broader season or period: for example, a dining platform recommends dumpling restaurants in winter, and a travel site recommends water parks in summer. More importantly, user preferences drift over time: a user who used to prefer KTV venues and movie theatres may recently prefer bookstores and coffee shops. Beyond these macro features, fine-grained temporal effects better reflect users' check-in preferences in specific periods: for example, catering POIs are visited most at around 12:00 and 18:00, and the popularity of bars rises from 21:00 onwards. How to bring time information into a recommendation algorithm and provide a suitable point-of-interest recommendation list for a specific time period has become an urgent need for social application platforms.
Some recommendation systems already integrate temporal context into point-of-interest recommendation, but existing time-aware point-of-interest recommendation systems still have shortcomings, summarized as follows:
(1) Research on point-of-interest recommendation based on temporal features remains relatively scarce compared with recommendation techniques that consider other kinds of context, such as social relationships and geographic features. Most point-of-interest recommendation techniques handle dynamically changing user needs poorly, have difficulty correcting and adjusting for user preferences that shift over time, and cannot produce, in real time, the recommendations that best fit the current moment.
(2) The temporal dynamics of user similarity are ignored. Existing research computes user similarity without considering how it varies over time, and shares a single similarity matrix across all time periods. In reality, user similarity changes over time. For example, at noon on a workday a user often visits a restaurant near the workplace with colleagues, so the user is more similar to colleagues than to family members at that time; after work, the user often visits a supermarket near home with family members, so the user is more similar to family members then. Using one global user similarity at all times therefore does not match reality.
(3) The user-time-POI three-dimensional matrix suffers from data sparsity. The number of locations a user has visited is very small compared with the thousands of geographic locations in a location-based social network, so the scoring matrix is very sparse to begin with. The sparsity problem is more pronounced in point-of-interest recommendation systems that consider spatiotemporal context, because exploring users' behavior in a target period requires splitting the already sparse check-in data set into several subsets along the time axis, which further aggravates the sparsity of the scoring matrix. A method that alleviates data sparsity is therefore needed to improve the accuracy and reliability of recommendations within a given period.
These shortcomings of existing time-aware point-of-interest recommendation techniques create serious difficulties for the design, development, deployment, and operation of location-based social network platforms. In particular, on platforms with massive item information they reduce the service quality of the recommendation system and thereby affect the sales performance of e-commerce systems.
Disclosure of Invention
To address the shortcomings of the prior art, the invention aims to build a point-of-interest recommendation system with accurate results that can generate a point-of-interest list in real time for a given time point, and provides a time-aware adaptive point-of-interest recommendation method based on K-means clustering. Considering both the differences and the correlations among users' check-in data in different time slots, the invention proposes an analysis based on the distance from a time point to a cluster center, uses K-means clustering to mine the correlations between time slots, alleviates the sparsity of high-dimensional check-in data through time clustering, improves the effectiveness of score prediction, and strengthens the service quality of the recommendation system.
The technical scheme adopted to solve the technical problem is as follows: divide a day into 24 time slots; using the time tags, count the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot; apply K-means clustering to the time slots based on these three-dimensional data features; calculate user similarity in different time slots from the time clustering result and users' historical check-in information; improve the scoring method of the traditional UBCF algorithm with the time clusters, so that it adaptively generates predicted point-of-interest scores per time slot; and rank the predicted scores of all unvisited locations and recommend the top-ranked ones to the user (as shown in Fig. 1).
The specific process is as follows:
Step 1: Collect and organize the users' original check-in data set and convert it into a three-dimensional user-time-POI scoring matrix.
Step 2: Count the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot. From these statistics, construct a three-dimensional check-in feature vector for each time slot to form the time-slot check-in feature set.
Step 3: Based on the statistics from Step 2, cluster the time slots with the K-means method, and calculate the temporal similarity between time slots in the same cluster.
Step 4: Following the basic principle of high similarity within clusters and low similarity between clusters, calculate user similarity at the current recommendation time using the scoring information of the other time slots in the same time cluster.
Step 5: Improve the scoring method of traditional user-based collaborative filtering with the time clustering result and the intra-cluster temporal similarity, so that it adaptively generates predicted point-of-interest scores for the current recommendation time, and recommend to the user the top-ranked unvisited locations for the current time.
Step 6: Evaluate recommendation quality with recommendation accuracy metrics, and compare the recommendation accuracy of the proposed recommendation system with that of other classical recommendation systems to evaluate the accuracy and effectiveness of the proposed technique.
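Step 6 does not define its accuracy metrics at this point in the text; the drawing descriptions name Precision, Recall, and F1. The following sketch assumes the standard top-N definitions of these three metrics, which may differ in detail from the ones used in the embodiment. The function name `precision_recall_f1` and the sample lists are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Top-N metrics under the standard definitions (an assumption).

    recommended: list of POI ids recommended to a user.
    relevant: list of POI ids the user actually visited in the test data.
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up example: 4 recommendations, 3 held-out visits, 2 of them hit.
p, r, f1 = precision_recall_f1(["l1", "l2", "l3", "l4"], ["l2", "l4", "l9"])
```

Averaging these per-user values over all test users would give system-level figures comparable across algorithms.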
The beneficial effects are that:
(1) The time-aware adaptive point-of-interest recommendation method based on K-means clustering can generate a real-time point-of-interest recommendation list for a user at any time, based on the user's current behavior habits and the current popularity of points of interest. It can also help merchants push advertisements to users accurately and thereby attract more potential consumers.
(2) The method clusters time and mines the temporal dynamics of user similarity, finding different groups of similar users for a user at different times. This time-varying neighbor search matches how user preferences actually change, which improves user satisfaction with the social network platform, increases the accuracy and interpretability of the recommendation system, and is of significant practical value.
(3) Clustering time with the K-means method lets all time slots in a cluster share scoring data, fully exploits the similarity between time slots, and alleviates the data sparsity of the high-order scoring matrix. The method is general and portable: it can be applied not only to point-of-interest recommendation systems but also to personalized recommendation of other traditional items, and has broad prospects for industrial application.
Drawings
FIG. 1 is a flowchart of the time-aware adaptive point-of-interest recommendation method based on K-means clustering.
FIG. 2 is a flowchart of the specific steps of the time-aware adaptive point-of-interest recommendation method based on K-means clustering.
FIG. 3 is a schematic diagram of a user's check-in records in a location-based social network in an embodiment of the invention.
FIG. 4 is a schematic diagram of the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot in an embodiment of the invention.
FIG. 5 shows the K-means clustering results for all time slots in an embodiment of the invention.
FIG. 6 is a bar chart comparing the Precision of the proposed recommendation algorithm with that of the classical user-based collaborative filtering (UBCF) and social-relationship-based collaborative filtering (SCF) algorithms in an embodiment of the invention.
FIG. 7 is a bar chart comparing the Recall of the proposed recommendation algorithm with that of the classical user-based collaborative filtering (UBCF) and social-relationship-based collaborative filtering (SCF) algorithms in an embodiment of the invention.
FIG. 8 is a bar chart comparing the F1 value (the combined accuracy metric) of the proposed recommendation algorithm with that of the classical user-based collaborative filtering (UBCF) and social-relationship-based collaborative filtering (SCF) algorithms in an embodiment of the invention.
Detailed Description
The invention will now be described in further detail with reference to the accompanying drawings and specific examples.
The specific design and implementation flow of the invention is shown in Fig. 2, and the main variables and parameters used in the process are listed in Table 1.
TABLE 1 Functions of the main variables and parameters
[Table 1 is provided as an image in the source and is not reproduced here.]
First, the users' original check-in data set is collected, organized, and converted into a three-dimensional user-time-POI scoring matrix. The operation steps are as follows:
(1.a) Organize the users' original check-in data set C to obtain n check-in records, denoted $C = \{c_1, c_2, \ldots, c_n\}$. Each check-in record is a five-tuple of user ID, check-in time, geographic latitude, geographic longitude, and point-of-interest ID. The set of all users in the check-in data set is denoted U and the set of all points of interest is denoted L; NU and NL are the numbers of users and points of interest, respectively.
(1.b) Divide the day into 24 discrete time slots, with the set of time slots denoted $T = \{0, 1, 2, \ldots, 23\}$. Round the check-in time in each check-in record to obtain the corresponding time slot value $t \in [0, 23]$.
(1.c) Count check-ins over the set of five-tuple check-in records and generate, for each user-time-POI combination, a four-tuple $(u_i, t, l_j, n_{i,t,j})$, where $u_i$ is the i-th user ($i \in [1, NU]$), $l_j$ is the j-th point of interest ($j \in [1, NL]$), $t$ is the time slot value obtained by rounding the time in the check-in record ($t \in [0, 23]$), and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$.
(1.d) Convert the number of check-ins $n_{i,t,j}$ of user $u_i$ at point of interest $l_j$ in time slot $t$ into the score $r_{i,t,j}$ of user $u_i$ for point of interest $l_j$ in time slot $t$. If user $u_i$ has visited point of interest $l_j$ in time slot $t$, then $r_{i,t,j} = 1$; otherwise $r_{i,t,j} = 0$:
\[ r_{i,t,j} = \begin{cases} 1, & n_{i,t,j} > 0 \\ 0, & n_{i,t,j} = 0 \end{cases} \]
where $r_{i,t,j}$ is the score of user $u_i$ for point of interest $l_j$ in time slot $t$, and $n_{i,t,j}$ is the number of times user $u_i$ visited point of interest $l_j$ in time slot $t$.
All scores together form the three-dimensional user-time-POI scoring matrix $R = \{r_{i,t,j}\}$, $i \in [1, NU]$, $t \in [0, 23]$, $j \in [1, NL]$, where $i$ is the user index, $t$ is the time slot value, $j$ is the point-of-interest index, NU is the total number of users, NL is the total number of points of interest, and $r_{i,t,j}$ is the score of user $u_i$ for point of interest $l_j$ in time slot $t$.
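As an illustration of this first step, the sketch below turns check-in five-tuples into per-(user, slot, POI) counts and binary scores. Several details are assumptions not given in the text: the timestamp format, the use of truncation to the hour for "rounding" (rounding 23:40 to the nearest hour would fall outside [0, 23]), and a sparse dictionary in place of a dense three-dimensional array. The names `to_time_slot` and `build_scores` and the sample records are hypothetical.

```python
from collections import Counter
from datetime import datetime

def to_time_slot(timestamp):
    """Map a check-in timestamp to an hourly slot in 0..23.

    Assumption: 'rounding' is implemented as truncation to the hour.
    """
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour

def build_scores(records):
    """records: iterable of (user_id, time, latitude, longitude, poi_id).

    Returns (counts, scores), where counts[(u, t, l)] is the check-in
    count n and scores[(u, t, l)] is 1 for every visited triple.
    Triples that never occur are simply absent (score 0).
    """
    counts = Counter(
        (user, to_time_slot(ts), poi) for user, ts, _lat, _lon, poi in records
    )
    scores = {key: 1 for key in counts}
    return counts, scores

# Made-up sample records for illustration only.
records = [
    ("u1", "2022-12-08 12:10:00", 32.05, 118.78, "l1"),
    ("u1", "2022-12-09 12:45:00", 32.05, 118.78, "l1"),
    ("u2", "2022-12-08 18:30:00", 32.06, 118.80, "l2"),
]
counts, scores = build_scores(records)
```

Keeping only the non-zero entries suits the sparsity the background section describes; a dense NU x 24 x NL array would be mostly zeros.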
Second, count the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot. From these statistics, construct a three-dimensional check-in feature vector for each time slot to form the time-slot check-in feature set. The specific operation steps are as follows:
(2.a) Count the number of users $\mathrm{Unum}_t$ in the check-in data set who checked in during time slot $t$:
\[ \mathrm{Unum}_t = \sum_{u \in U} \mathrm{isCheck}(u, t) \]
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in data set, and the isCheck function indicates whether user $u$ checked in during time slot $t$:
\[ \mathrm{isCheck}(u, t) = \begin{cases} 1, & \sum_{l \in L} r_{u,t,l} > 0 \\ 0, & \text{otherwise} \end{cases} \]
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in data set, and $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$.
(2.b) Count the number of points of interest $\mathrm{Pnum}_t$ in the check-in data set that were visited during time slot $t$:
\[ \mathrm{Pnum}_t = \sum_{l \in L} \mathrm{isChecked}(l, t) \]
where $l$ is a point of interest in the location-based social network, $L$ is the set of all points of interest in the check-in data set, and the isChecked function indicates whether point of interest $l$ was visited during time slot $t$:
\[ \mathrm{isChecked}(l, t) = \begin{cases} 1, & \sum_{u \in U} r_{u,t,l} > 0 \\ 0, & \text{otherwise} \end{cases} \]
where $u$ is a user in the location-based social network, $U$ is the set of all users in the check-in data set, and $r_{u,t,l}$ is the score of user $u$ for point of interest $l$ in time slot $t$.
(2.c) Count the total number of check-ins $\mathrm{Cnum}_t$ in the check-in data set that occurred during time slot $t$:
\[ \mathrm{Cnum}_t = \sum_{i=1}^{n} \mathrm{isTime}(c_i, t) \]
where $n$ is the number of check-in records in the check-in data set $C$, and the isTime function indicates whether the i-th check-in record $c_i$ occurred during time slot $t$:
\[ \mathrm{isTime}(c_i, t) = \begin{cases} 1, & \mathrm{time}_i \in t \\ 0, & \text{otherwise} \end{cases} \]
where $\mathrm{time}_i \in t$ means that the check-in time $\mathrm{time}_i$ of the i-th check-in record $c_i$ falls in time slot $t$.
(2.d) From the statistics above, construct the three-dimensional check-in feature vector $x_t = \{\mathrm{Unum}_t, \mathrm{Pnum}_t, \mathrm{Cnum}_t\}$ for each time slot $t$, forming the time-slot check-in feature set $X = \{x_0, x_1, \ldots, x_{23}\}$, where $t \in [0, 23]$, $\mathrm{Unum}_t$ is the number of users who checked in during time slot $t$, $\mathrm{Pnum}_t$ is the number of points of interest visited during time slot $t$, and $\mathrm{Cnum}_t$ is the total number of check-ins that occurred during time slot $t$.
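The three per-slot statistics can be computed in one pass over sparse score and count tables. This is a sketch under stated assumptions: the inputs are dictionaries keyed by (user, slot, POI), as one possible representation of the scoring matrix and check-in counts, and the function name `slot_features` and the sample data are hypothetical.

```python
def slot_features(scores, counts, n_slots=24):
    """Build the feature vector (Unum_t, Pnum_t, Cnum_t) for each time slot.

    scores: dict (user, slot, poi) -> 0/1 score.
    counts: dict (user, slot, poi) -> number of check-ins.
    Returns a list X of n_slots tuples, indexed by slot.
    """
    users = [set() for _ in range(n_slots)]
    pois = [set() for _ in range(n_slots)]
    totals = [0] * n_slots
    for (user, slot, poi), score in scores.items():
        if score:
            users[slot].add(user)   # distinct users active in this slot
            pois[slot].add(poi)     # distinct POIs visited in this slot
    for (_user, slot, _poi), n in counts.items():
        totals[slot] += n           # every check-in record counts once
    return [(len(users[t]), len(pois[t]), totals[t]) for t in range(n_slots)]

# Made-up sample data for illustration only.
scores = {("u1", 12, "l1"): 1, ("u2", 12, "l1"): 1,
          ("u2", 12, "l2"): 1, ("u3", 18, "l3"): 1}
counts = {("u1", 12, "l1"): 2, ("u2", 12, "l1"): 1,
          ("u2", 12, "l2"): 3, ("u3", 18, "l3"): 1}
X = slot_features(scores, counts)
```

Using sets for users and POIs gives the distinct counts that the isCheck and isChecked indicator sums describe, while the check-in total sums raw counts.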
Third step: based on the statistics from the second step, the time slots are clustered with the K-means method, and the time similarity between time slots in the same cluster is calculated. The implementation steps are as follows:
(3.a) Cluster the 24 time slots with the K-means method, which is algorithmically simple and converges quickly, to generate nc cluster centers Cen = {cen_1, cen_2, …, cen_nc} (nc ∈ [2,24]).
(3.b) For any two time slots t and t' in the same time cluster, calculate the time similarity between them:
Figure BDA0003987727820000075
Where u is a user in the location social network, U is the set of all users in the check-in dataset, l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, r_{u,t,l} is the score of user u for address l in time slot t, r_{u,t',l} is the score of user u for address l in time slot t', and NU is the total number of users in the check-in dataset.
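Step (3.a) can be illustrated with a small, dependency-free version of Lloyd's K-means applied to per-slot feature vectors. This is a sketch under stated assumptions: the deterministic choice of initial centers, the absence of feature rescaling (the text does not mention any), and the toy vectors are all choices made for the example rather than details from the source.

```python
def kmeans(points, k, max_iters=100):
    """Lloyd's K-means over a list of equal-length numeric tuples.

    Initial centers are k points spread evenly over the sorted distinct
    inputs (a deterministic choice made for this sketch, not taken from
    the source). Returns (labels, centers).
    """
    distinct = sorted(set(points))
    step = max(k - 1, 1)
    centers = [list(distinct[i * (len(distinct) - 1) // step]) for i in range(k)]
    labels = None
    for _ in range(max_iters):
        # Assignment: each point joins its nearest center (squared Euclidean).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Made-up (Unum, Pnum, Cnum) vectors for eight slots: quiet, medium, busy.
points = [
    (1, 1, 1), (2, 1, 2), (1, 2, 1),
    (50, 40, 90), (52, 41, 95), (49, 39, 88),
    (200, 150, 400), (210, 155, 410),
]
labels, centers = kmeans(points, k=3)
```

Because Cnum_t is typically the largest component, unscaled Euclidean distance is dominated by check-in volume; whether to rescale the features first is a design decision the text leaves open.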
Fourth step: following the basic principle of high similarity within clusters and low similarity between clusters, the user similarity at the current recommendation time is calculated using the scoring information from the other time slots in the same time cluster. The implementation steps are as follows:
(4.a) Select a target user u_t in the location social network as the recommendation service object, and convert the current recommendation time time_r into its time slot t_r.
(4.b) From the clustering result, determine the cluster cen_j to which time slot t_r belongs and the number of time slots nj in that cluster, written cen_j = {t_r, t_2, t_3, …, t_nj}. Compute the user similarity between the target user u_t and every other user v in time slot t_r:
Figure BDA0003987727820000081
wherein u is t Is the target object of the current service of the recommendation system, v is one other user in the location social network, t r Is the time slot corresponding to the current recommended time, and nj is the time slot t r The cluster cen j In the data set, NL represents the total number of points of interest in the check-in data set, r ut,cenj[a],l Representing target user u t At cluster cen j Other time slots cen j [a]The point of interest i is scored at the time,
Figure BDA0003987727820000082
representing that user v is clustered in cen j Other time slots cen j [b]Scoring the interest point l, a E [1, nj],b∈[1,nj]。
Fifth step: using the time clustering result and the intra-cluster time similarity, the scoring method of the traditional user-based collaborative filtering algorithm is improved so that it adaptively generates point-of-interest prediction scores for the current recommendation time, and the top-ranked unvisited addresses at the current time are recommended to the user. The implementation steps are as follows:
(5.a) Determine a target user u_t in the location social network as the recommendation service object, and convert the current recommendation time time_r into its time slot t_r.
(5.b) From the clustering result, determine the cluster cen_j to which time slot t_r belongs and the number of time slots nj in that cluster, written cen_j = {t_r, t_2, t_3, …, t_nj}.
(5.c) Calculate the predicted score of target user u_t for visiting point of interest l at time t_r:
Figure BDA0003987727820000083
wherein u is t Is a target object of the current service of the recommendation system, t r Is the time slot corresponding to the current recommended time, l is an interest point which is not visited by the target user in the location social network, v is one other user in the location social network, U represents all user sets, sim (U) t ,v,t r ) Representing user u t And user v in time slot t r User similarity at time, nj is time slot t r The cluster cen j In the number of time slots in (a),
Figure BDA0003987727820000091
representing that user v is at time cen j [i]Scoring the interest point l, i E [1, nj],timesimi(t r ,cen j [i]) Representing the current time t r With other times cen j [i]Similarity between them.
(5.d) Sort all addresses that target user u_t has not visited by predicted score, form the recommendation list TopNList_t from the N top-ranked locations, and return it to the target user.
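The ranking in step (5.d) is independent of how the predicted scores are produced. The sketch below assumes the predictions are already available as a dictionary from point-of-interest ID to score; the function name `top_n`, the tie-breaking rule, and the sample scores are illustrative assumptions.

```python
def top_n(predicted, visited, n):
    """Return the n unvisited points of interest with the highest predicted score."""
    candidates = [(score, poi) for poi, score in predicted.items() if poi not in visited]
    # Sort by descending score; ties are broken by POI id so runs are repeatable.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [poi for _, poi in candidates[:n]]


# Made-up prediction scores; l1 is excluded because the user already visited it.
predicted = {"l1": 0.9, "l2": 0.4, "l3": 0.7, "l4": 0.7, "l5": 0.1}
top_n_list = top_n(predicted, visited={"l1"}, n=3)
```

Filtering before sorting keeps already-visited addresses from occupying list positions, which matches the requirement that only unvisited addresses are recommended.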
Sixth step: the recommendation quality is evaluated with recommendation accuracy metrics, and the accuracy of the recommendation system proposed by the invention is compared with that of other classical recommendation systems to assess the accuracy and effectiveness of the proposed technique. The implementation steps are as follows:
(6.a) Randomly select NU×10% of the users in the target dataset as the target user set AU, and run each recommendation algorithm for every target user in the set to generate a recommendation list. Here NU is the total number of users in the check-in dataset.
(6.b) Evaluate the accuracy of each recommendation system with the accuracy metrics: for a single run of an algorithm on the target user set AU, the values of Precision, Recall, and the combined accuracy metric F1 are the averages of those metrics over all users in AU.
(6.c) Repeat steps (6.a) and (6.b) Ntimes times, i.e., every algorithm is run independently Ntimes times.
(6.d) The final Precision, Recall, and F1 values of each recommendation algorithm are the averages over the Ntimes runs.
(6.e) Compare the metrics: if the Precision of the K-means-based time-aware adaptive point-of-interest recommendation algorithm is higher than that of the other recommendation algorithms, the proposed technique hits users' preferred items more accurately; if its Recall is higher than that of the other algorithms, the proposed technique has a stronger query capability; if its F1 value is higher than that of the other algorithms, the proposed technique has stronger overall recommendation accuracy.
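The per-user metrics and the averaging in steps (6.b) and (6.d) can be sketched as follows. The text does not spell out the metric formulas, so this example assumes the usual top-N definitions (hits divided by list length for Precision, hits divided by the number of relevant items for Recall, and their harmonic mean for F1) and assumes that "relevant" means the points of interest a user actually visited in held-out data. All names and sample lists are illustrative.

```python
def precision_recall_f1(recommended, relevant):
    """Top-N metrics for one user.

    `recommended` is the returned list; `relevant` is the set of points of
    interest the user actually visited in held-out data (an assumed
    ground-truth definition).
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def mean_metrics(rows):
    """Average (precision, recall, f1) triples, e.g. over users or over runs."""
    return tuple(sum(row[i] for row in rows) / len(rows) for i in range(3))


# Two made-up users: the first has 2 hits in a list of 5, the second has none.
per_user = [
    precision_recall_f1(["l3", "l4", "l2", "l7", "l9"], {"l3", "l9", "l8"}),
    precision_recall_f1(["l1", "l2", "l3", "l4", "l5"], {"l6"}),
]
run_average = mean_metrics(per_user)
```

The same `mean_metrics` helper serves both averaging levels: once over the users in AU within a run, and once over the Ntimes runs.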
The following uses a specific location-based social network as an example to describe in detail how the K-means-based time-aware adaptive point-of-interest recommendation method of the present invention operates.
Gowalla is a location-based social networking service in which users share their locations by checking in. The Gowalla dataset contains the social relationships and check-in information of 196591 users on the website from February 2009 through October 2010. It covers 1256379 points of interest, 6442892 user check-in records at those points of interest, and 950327 social relations among users. The Gowalla dataset has become one of the test datasets most commonly used by recommendation system researchers.
The invention uses the check-in data of five popular regions in the Gowalla dataset, namely Los Angeles, San Francisco, New York, Maricopa and King, as the example for illustration.
First step: the users' original check-in dataset is collected and organized, and converted into a user-time-point-of-interest three-dimensional scoring matrix. The operation steps are as follows:
(1.a) Collect and organize the user check-in data for the Los Angeles, San Francisco, New York, Maricopa and King regions in the example Gowalla dataset, obtaining a check-in dataset C of 50007 historical visit records from 1572 users at 1420 addresses, denoted C = {c_1, c_2, …, c_50007}. A schematic diagram of the historical visit records of users in a location social network in the Gowalla dataset is shown in FIG. 3. There are 13864 social relations among the users; each user has on average 31.81 check-in records and 8.82 social relations, and each point of interest is visited 35.22 times on average.
Each check-in record is a five-tuple of user ID, check-in time, geographic latitude, geographic longitude, and point-of-interest ID. The set of all users in the check-in dataset is denoted U and the set of all points of interest is denoted L; the number of users NU is 1572 and the number of points of interest NL is 1420.
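The three averages quoted in (1.a) follow directly from the dataset totals, which the short check below reproduces. The variable names are illustrative; the four totals are the ones stated in the text.

```python
# Totals stated for the five-region Gowalla subset.
records, users, pois, relations = 50007, 1572, 1420, 13864

avg_checkins_per_user = round(records / users, 2)     # check-in records per user
avg_relations_per_user = round(relations / users, 2)  # social relations per user
avg_visits_per_poi = round(records / pois, 2)         # visits per point of interest
```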
(1.b) Divide the day into 24 discrete time slots, with the set of time slots denoted T = {0,1,2,…,23}. Take the integer hour of the check-in time in each check-in record to obtain the corresponding time slot t (t ∈ [0,23]). For example, the check-in time 15:13:23 corresponds to time slot t = 15, and the check-in time 00:11:20 corresponds to time slot t = 0.
(1.c) Count the check-ins in the set of check-in five-tuples, generating for each user-time-point-of-interest combination a four-tuple (u_i, t, l_j, n_{i,t,j}), where u_i is the i-th user (i ∈ [1,1572]), l_j is the j-th point of interest (j ∈ [1,1420]), t is the time slot obtained from the check-in time in the record (t ∈ [0,23]), and n_{i,t,j} is the number of times user u_i visits point of interest l_j in time slot t.
(1.d) Convert the number of check-ins n_{i,t,j} of user u_i at point of interest l_j in time slot t into the score r_{i,t,j} of user u_i for point of interest l_j in time slot t. If user u_i has visited point of interest l_j in time slot t, then r_{i,t,j} = 1; otherwise r_{i,t,j} = 0:
$$r_{i,t,j} = \begin{cases} 1, & n_{i,t,j} > 0 \\ 0, & n_{i,t,j} = 0 \end{cases}$$
Where r_{i,t,j} is the score of user u_i for address l_j in time slot t, and n_{i,t,j} is the number of times user u_i visits point of interest l_j in time slot t.
All scores are collected into the user-time-point-of-interest three-dimensional scoring matrix R = {r_{i,t,j}}, i ∈ [1,1572], t ∈ [0,23], j ∈ [1,1420], where i is the user index, t is the time slot value, j is the address index, and r_{i,t,j} is the score of user u_i for address l_j in time slot t.
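Steps (1.b) through (1.d) can be sketched as a short pipeline from raw five-tuples to binary scores. The example assumes check-in times are given as `HH:MM:SS` strings and that the slot is the hour with minutes and seconds dropped, which is consistent with the two examples in (1.b) and keeps t within [0,23]. The function names, the sparse dictionary representation of R, and the sample records (including the coordinates) are made up for illustration.

```python
from collections import Counter


def to_slot(check_in_time):
    """Map an 'HH:MM:SS' string to its hour slot (0..23), dropping minutes and seconds."""
    return int(check_in_time.split(":")[0])


def build_scores(records):
    """Turn (user, time, lat, lon, poi) records into visit counts and binary scores.

    Returns (counts, scores): counts[(user, slot, poi)] is n_{i,t,j}, and
    scores holds r_{i,t,j} = 1 for visited cells; absent keys mean 0.
    """
    counts = Counter((user, to_slot(time), poi) for user, time, _lat, _lon, poi in records)
    scores = {key: 1 for key, n in counts.items() if n > 0}
    return counts, scores


# Illustrative records; the coordinates are placeholders, not dataset values.
records = [
    ("u1", "15:13:23", 34.05, -118.24, "l1"),
    ("u1", "15:47:02", 34.05, -118.24, "l1"),
    ("u2", "00:11:20", 37.77, -122.42, "l2"),
]
counts, scores = build_scores(records)
```

Storing only the visited cells is a practical choice here: a dense 1572 × 24 × 1420 matrix would be almost entirely zeros given 50007 check-in records.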
Second step: the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot are counted. A three-dimensional check-in feature vector is then constructed for each time slot from these statistics, forming the time-slot check-in feature set. The specific operation steps are as follows:
(2.a) Count the number of users Unum_t in the check-in dataset who check in during time slot t:
$$Unum_t = \sum_{u \in U} isCheck(u, t)$$
Where u is a user in the location social network, U is the set of all users in the check-in dataset, and the function isCheck indicates whether user u has a check-in in time slot t:
Figure BDA0003987727820000112
where l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, and r_{u,t,l} is the score of user u for address l in time slot t.
(2.b) Count the number of points of interest Pnum_t in the check-in dataset that are visited in time slot t:
$$Pnum_t = \sum_{l \in L} isChecked(l, t)$$
Where l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, and the function isChecked indicates whether point of interest l is visited within time slot t:
Figure BDA0003987727820000114
Where u is a user in the location social network, U is the set of all users in the check-in dataset, and r_{u,t,l} is the score of user u for address l in time slot t.
(2.c) Count the total number of check-ins Cnum_t that occur in time slot t in the check-in dataset:
Figure BDA0003987727820000115
Where n is the number of check-in records in the check-in dataset C, and the function isTime indicates whether the i-th check-in record c_i occurs within time slot t:
Figure BDA0003987727820000116
where time_i ∈ t means that the check-in time time_i of the i-th check-in record c_i falls in time slot t.
The number of check-in users, the number of visited points of interest, and the number of check-ins for each time slot are shown in FIG. 4.
(2.d) Based on the above statistics, construct the three-dimensional check-in feature vector x_t = {Unum_t, Pnum_t, Cnum_t} for each time slot t, forming the time-slot check-in feature set X = {x_0, x_1, …, x_23}. Here t ∈ [0,23], Unum_t is the number of users who checked in during time slot t, Pnum_t is the number of points of interest visited in time slot t, and Cnum_t is the total number of check-ins that occur in time slot t.
Third step: based on the statistics from the second step, the time slots are clustered with the K-means method, and the time similarity between time slots in the same cluster is calculated. The implementation steps are as follows:
(3.a) Cluster the 24 time slots with the K-means method, which is algorithmically simple and converges quickly, to generate 3 clusters, Cen = {cen_1, cen_2, cen_3}. The first cluster contains time slots {7,8,9,10,11,12,13}, the second contains {0,1,2,3,16,17,18,19,20,21,22,23}, and the third contains {4,5,6,14,15}. The K-means clustering result for the 24 time slots is shown in FIG. 5.
(3.b) For any two time slots t and t' within each of the three time clusters, calculate the time similarity between them:
Figure BDA0003987727820000121
where u is a user in the location social network, U is the set of all users in the check-in dataset, l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, r_{u,t,l} is the score of user u for address l in time slot t, and r_{u,t',l} is the score of user u for address l in time slot t'.
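The time similarity formula itself appears only as a figure that is not reproduced in this text, so it cannot be implemented faithfully here. Purely to illustrate the inputs it consumes (the scores r_{u,t,l} and r_{u,t',l} of two slots in the same cluster), the sketch below substitutes a cosine similarity over the binary user × point-of-interest slices of the two slots. This is an assumed stand-in, not the patent's formula, and the function name and toy scores are hypothetical.

```python
import math


def slot_cosine(scores, t1, t2):
    """Cosine similarity between the (user, poi) score slices of two time slots.

    `scores` maps (user, slot, poi) -> 1 for visited cells; absent cells are 0.
    For binary vectors the cosine reduces to |A & B| / sqrt(|A| * |B|).
    NOTE: a stand-in measure chosen for illustration only.
    """
    a = {(u, l) for (u, t, l) in scores if t == t1}
    b = {(u, l) for (u, t, l) in scores if t == t2}
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Toy scores: slots 20 and 21 share one (user, poi) pair out of two each.
scores = {
    ("u1", 20, "l1"): 1, ("u2", 20, "l2"): 1,
    ("u1", 21, "l1"): 1, ("u2", 21, "l3"): 1,
}
similarity = slot_cosine(scores, 20, 21)
```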
Fourth step: following the basic principle of high similarity within clusters and low similarity between clusters, the user similarity at the current recommendation time is calculated using the scoring information from the other time slots in the same time cluster. The implementation steps are as follows:
(4.a) Select a target user u_t in the location social network as the recommendation service object, and convert the current recommendation time time_r into its time slot t_r. Assume the current time time_r is 20:14:13; the corresponding time slot is then t_r = 20.
(4.b) From the clustering result, determine the cluster cen_j to which time slot t_r belongs and the number of time slots nj in that cluster, written cen_j = {t_r, t_2, t_3, …, t_nj}. For example, when the recommendation time slot t_r is 20, its cluster is cen_j = {20,0,1,2,3,16,17,18,19,21,22,23}, and the number of time slots in this cluster is 12 (nj = 12).
Compute the user similarity between the target user u_t and every other user v in time slot t_r:
Figure BDA0003987727820000122
wherein u is t Is the target object of the current service of the recommendation system, v is one other user in the location social network, t r Is the time slot corresponding to the current recommended time, and nj is the time slot t r The cluster cen j In the number of time slots in (a),
Figure BDA0003987727820000123
representing target user u t At cluster cen j Other time slots cen j [a]Scoring the interest point l at the time, +.>
Figure BDA0003987727820000131
Representing that user v is clustered in cen j Other time slots cen j [b]Scoring the interest point l, a E [1, nj],b∈[1,nj]。
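The cluster lookup in step (4.b) can be reproduced from the three clusters reported in (3.a). The sketch below builds a slot-to-cluster index and returns the cluster with t_r listed first, matching the notation cen_j = {t_r, t_2, …, t_nj}; the helper name `cluster_of` is illustrative.

```python
# The three time clusters reported for the example dataset in step (3.a).
clusters = [
    [7, 8, 9, 10, 11, 12, 13],
    [0, 1, 2, 3, 16, 17, 18, 19, 20, 21, 22, 23],
    [4, 5, 6, 14, 15],
]
# Reverse index: time slot -> position of its cluster in `clusters`.
slot_to_cluster = {t: j for j, members in enumerate(clusters) for t in members}


def cluster_of(t_r):
    """Return (cen_j with t_r listed first, nj) for the cluster containing t_r."""
    members = clusters[slot_to_cluster[t_r]]
    ordered = [t_r] + [t for t in members if t != t_r]
    return ordered, len(ordered)


cen_j, nj = cluster_of(20)
```

For t_r = 20 this yields the 12-slot evening and night cluster given in the example, so scores from slots such as 19 or 21 can be borrowed when slot 20 itself is sparse.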
Fifth step: using the time clustering result and the intra-cluster time similarity, the scoring method of the traditional user-based collaborative filtering algorithm is improved so that it adaptively generates point-of-interest prediction scores for the current recommendation time, and the top-ranked unvisited addresses at the current time are recommended to the user. The implementation steps are as follows:
(5.a) Determine a target user u_t in the location social network as the recommendation service object, and convert the current recommendation time time_r into its time slot t_r.
(5.b) From the clustering result, determine the cluster cen_j to which time slot t_r belongs and the number of time slots nj in that cluster, written cen_j = {t_r, t_2, t_3, …, t_nj}.
(5.c) Calculate the predicted score of target user u_t for visiting point of interest l at time t_r:
Figure BDA0003987727820000132
wherein u is t Is a target object of the current service of the recommendation system, t r Is the time slot corresponding to the current recommended time, l is an interest point which is not visited by the target user in the location social network, v is one other user in the location social network, U represents all user sets, sim (U) t ,v,t r ) Representing user u t And user v in time slot t r User similarity at time, nj is time slot t r The cluster cen j In the number of time slots in (a),
Figure BDA0003987727820000133
representing that user v is at time cen j [i]Scoring the interest point l, i E [1, nj],timesimi(t r ,cen j [i]) Representing the current time t r With other times cen j [i]Similarity between them.
(5.d) Sort all addresses that target user u_t has not visited by predicted score, form the recommendation list TopNList_t from the N top-ranked locations, and return it to the target user (N may be a multiple of 5, typically with 5 ≤ N ≤ 50).
Sixth step: the recommendation quality is evaluated with recommendation accuracy metrics, and the accuracy of the recommendation system proposed by the invention is compared with that of other classical recommendation systems to assess the accuracy and effectiveness of the proposed technique. The implementation steps are as follows:
(6.a) Randomly select 157 users from the target dataset as the target user set AU, and for each target user in the set run the time-aware adaptive point-of-interest recommendation algorithm, the classical user-based collaborative filtering algorithm UBCF, and the social-relationship-based collaborative filtering algorithm SCF to generate recommendation lists.
(6.b) Evaluate the accuracy of each recommendation system with the accuracy metrics: for a single run of an algorithm on the target user set AU, the values of Precision, Recall, and the combined accuracy metric F1 are the averages of those metrics over all users in AU.
(6.c) Repeat steps (6.a) and (6.b) 100 times, i.e., every algorithm is run independently 100 times.
(6.d) The Precision, Recall, and F1 values of the recommendation algorithm proposed by the invention and of the UBCF and SCF algorithms are the averages over the 100 runs. For different values of N, the Precision, Recall, and F1 results of each recommendation algorithm are shown in Tables 2, 3, and 4, respectively, where the bold value in each row is the maximum for that row:
Table 2. Precision values of the different recommendation algorithms
Figure BDA0003987727820000141
Table 3. Recall values of the different recommendation algorithms
Figure BDA0003987727820000142
Table 4. F1 values of the different recommendation algorithms
Figure BDA0003987727820000143
Histograms comparing the Precision, Recall, and F1 of the proposed recommendation algorithm with those of the classical UBCF and SCF algorithms in this example are shown in FIG. 6, FIG. 7, and FIG. 8, respectively.
(6.e) Comparison of the metrics: the Precision of the K-means-based time-aware adaptive point-of-interest recommendation algorithm is higher than that of the other recommendation algorithms, indicating that the proposed technique hits users' preferred items more accurately; its Recall is higher than that of the other algorithms, indicating a stronger query capability; and its F1 value is higher than that of the other algorithms, indicating stronger overall recommendation accuracy.
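The evaluation protocol of the sixth step (a 10% user sample, 100 independent runs, two levels of averaging) can be sketched as a small loop. The callable `score_user` is a hypothetical placeholder for "run one recommender for this user and return its Precision, Recall, and F1"; the constant values passed in the usage line are dummies that only exercise the loop and are not experimental results.

```python
import random


def evaluate(users, score_user, runs=100, seed=0):
    """Average (precision, recall, f1) over `runs` independent 10% user samples.

    `score_user` is a hypothetical callable that runs one recommender for a
    user and returns that user's (precision, recall, f1) triple.
    """
    sample_size = len(users) // 10  # 1572 users -> 157 target users
    if sample_size == 0:
        raise ValueError("need at least 10 users to draw a 10% sample")
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    run_means = []
    for _ in range(runs):
        targets = rng.sample(users, sample_size)
        rows = [score_user(u) for u in targets]
        # First level: average over the users in this run's target set AU.
        run_means.append([sum(r[i] for r in rows) / sample_size for i in range(3)])
    # Second level: average the per-run means over all runs.
    return tuple(sum(m[i] for m in run_means) / runs for i in range(3))


users = list(range(1572))
precision, recall, f1 = evaluate(users, lambda u: (0.5, 0.25, 0.125))
```

Drawing a fresh sample in every run, rather than reusing one fixed target set, is what makes the 100 runs independent in the sense of step (6.c).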
Unlike conventional point-of-interest recommendation algorithms, the present invention aims to build a point-of-interest recommendation system that generates a point-of-interest list in real time for a given time point and produces accurate recommendations. It takes into account both the differences and the correlations in user check-in characteristics across time slots, introduces an analysis based on the distance from time points to cluster centers, and uses K-means clustering to mine the correlations between time slots. Time clustering alleviates the sparsity of high-dimensional check-in data, improves the accuracy and effectiveness of score prediction, and strengthens the service quality of the recommendation system. The proposed technique has broad application prospects and is expected to be widely used in the location-based social network market.
The above technical process is only a preferred embodiment of the present invention and does not represent all of its details. Any modification, equivalent replacement, or improvement made by those skilled in the art within the scope of this disclosure and within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (7)

1. A K-means clustering-based time-aware adaptive point-of-interest recommendation method, characterized by comprising the following steps:
Step 1: collecting and organizing the users' original check-in dataset, and converting it into a user-time-point-of-interest three-dimensional scoring matrix;
Step 2: counting the number of check-in users, the number of visited points of interest, and the number of check-ins in each time slot; constructing a three-dimensional check-in feature vector for each time slot from these statistics to form a time-slot check-in feature set;
Step 3: based on the statistics from step 2, clustering the time slots with the K-means method, and calculating the time similarity between time slots in the same cluster;
Step 4: following the basic principle of high similarity within clusters and low similarity between clusters, calculating the user similarity at the current recommendation time using the scoring information from the other time slots in the same time cluster;
Step 5: using the time clustering result and the intra-cluster time similarity to improve the scoring method of the traditional user-based collaborative filtering algorithm, so that it adaptively generates point-of-interest prediction scores for the current recommendation time and recommends to the user the top-ranked unvisited addresses at the current time;
Step 6: evaluating the recommendation quality with recommendation accuracy metrics and comparing the recommendation accuracy with that of other classical recommendation systems, to assess the accuracy and effectiveness of the proposed technique.
2. The K-means clustering-based time-aware adaptive interest point recommendation method according to claim 1, wherein step 1 of the method comprises:
Step 11: organizing the users' original check-in dataset C to obtain n check-in records, denoted C = {c_1, c_2, …, c_n}; each check-in record is a five-tuple of user ID, check-in time, geographic latitude, geographic longitude, and point-of-interest ID; the set of all users in the check-in dataset is denoted U and the set of all points of interest is denoted L, with NU and NL being the numbers of users and points of interest respectively;
Step 12: dividing the day into 24 discrete time slots, the set of time slots being denoted T = {0,1,2,…,23}; taking the integer hour of the check-in time in each check-in record to obtain the corresponding time slot t, t ∈ [0,23];
Step 13: counting the check-ins in the set of check-in five-tuples, and generating for each user-time-point-of-interest combination a four-tuple (u_i, t, l_j, n_{i,t,j}), where u_i is the i-th user, i ∈ [1, NU], l_j is the j-th point of interest, j ∈ [1, NL], t is the time slot obtained from the check-in time in the record, t ∈ [0,23], and n_{i,t,j} is the number of times user u_i visits point of interest l_j in time slot t;
Step 14: converting the number of check-ins n_{i,t,j} of user u_i at point of interest l_j in time slot t into the score r_{i,t,j} of user u_i for point of interest l_j in time slot t; if user u_i has visited point of interest l_j in time slot t, r_{i,t,j} = 1; otherwise r_{i,t,j} = 0:
$$r_{i,t,j} = \begin{cases} 1, & n_{i,t,j} > 0 \\ 0, & n_{i,t,j} = 0 \end{cases}$$
where r_{i,t,j} is the score of user u_i for address l_j in time slot t, and n_{i,t,j} is the number of times user u_i visits point of interest l_j in time slot t;
collecting all scores into the user-time-point-of-interest three-dimensional scoring matrix R = {r_{i,t,j}}, i ∈ [1, NU], t ∈ [0,23], j ∈ [1, NL], where i is the user index, t is the time slot value, j is the address index, NU is the total number of users, NL is the total number of points of interest, and r_{i,t,j} is the score of user u_i for address l_j in time slot t.
3. The K-means clustering-based time-aware adaptive interest point recommendation method according to claim 1, wherein step 2 of the method comprises:
Step 21: counting the number of users Unum_t in the check-in dataset who check in during time slot t:
$$Unum_t = \sum_{u \in U} isCheck(u, t) \qquad (2)$$
where u is a user in the location social network, U is the set of all users in the check-in dataset, and the function isCheck indicates whether user u has a check-in in time slot t:
Figure FDA0003987727810000021
where l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, and r_{u,t,l} is the score of user u for address l in time slot t;
Step 22: counting the number of points of interest Pnum_t in the check-in dataset that are visited in time slot t:
$$Pnum_t = \sum_{l \in L} isChecked(l, t) \qquad (4)$$
where l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, and the function isChecked indicates whether point of interest l is visited within time slot t:
Figure FDA0003987727810000022
where u is a user in the location social network, U is the set of all users in the check-in dataset, and r_{u,t,l} is the score of user u for address l in time slot t;
Step 23: counting the total number of check-ins Cnum_t that occur in time slot t in the check-in dataset:
Figure FDA0003987727810000023
where n is the number of check-in records in the check-in dataset C, and the function isTime indicates whether the i-th check-in record c_i occurs within time slot t:
Figure FDA0003987727810000024
where time_i ∈ t means that the check-in time time_i of the i-th check-in record c_i falls in time slot t;
Step 24: based on the above statistics, constructing the three-dimensional check-in feature vector x_t = {Unum_t, Pnum_t, Cnum_t} for each time slot t, forming the time-slot check-in feature set X = {x_0, x_1, …, x_23}, where t ∈ [0,23], Unum_t is the number of users who checked in during time slot t, Pnum_t is the number of points of interest visited in time slot t, and Cnum_t is the total number of check-ins that occur in time slot t.
4. The K-means clustering-based time-aware adaptive interest point recommendation method according to claim 1, wherein the step 3 comprises:
Step 31: clustering the 24 time slots with the K-means method, which is algorithmically simple and converges quickly, to generate nc cluster centers Cen = {cen_1, cen_2, …, cen_nc} (nc ∈ [2,24]);
Step 32: for any two time slots t and t' in the same time cluster, calculating the time similarity between them:
Figure FDA0003987727810000031
where u is a user in the location social network, U is the set of all users in the check-in dataset, l is a point of interest in the location social network, L is the set of all points of interest in the check-in dataset, r_{u,t,l} is the score of user u for address l in time slot t, r_{u,t',l} is the score of user u for address l in time slot t', and NU is the total number of users in the check-in dataset.
5. The K-means clustering-based time-aware adaptive interest point recommendation method according to claim 1, wherein step 4 of the method comprises:
Step 41: selecting a target user u_t in the location social network as the recommendation service object, and converting the current recommendation time time_r into its time slot t_r;
Step 42: determining from the clustering result the cluster cen_j to which time slot t_r belongs and the number of time slots nj in that cluster, written cen_j = {t_r, t_2, t_3, …, t_nj}, and computing the user similarity between the target user u_t and every other user v in time slot t_r:
Figure FDA0003987727810000032
wherein u is t Is the target object of the current service of the recommendation system, v is one other user in the location social network, t r Is the time slot corresponding to the current recommended time, and nj is the time slot t r The cluster cen j NL denotes the total number of points of interest in the check-in dataset,
Figure FDA0003987727810000033
representing target user u t At cluster cen j Other time slots cen j [a]Scoring the interest point l at the time, +.>
Figure FDA0003987727810000034
Representing that user v is clustered in cen j Other time slots cen j [b]Scoring the interest point l, a E [1, nj],b∈[1,nj]。
6. The K-means clustering-based time-aware adaptive interest point recommendation method according to claim 1, wherein step 5 of the method comprises:
Step 51: determining a target user u_t in the location social network as the recommendation service object, and converting the current recommendation time time_r into its time slot t_r;
Step 52: determining from the clustering result the cluster cen_j to which time slot t_r belongs and the number of time slots nj in that cluster, written cen_j = {t_r, t_2, t_3, …, t_nj};
Step 53: calculating the predicted score of target user u_t for visiting point of interest l at time t_r:
Figure FDA0003987727810000041
wherein u is t Is a target object of the current service of the recommendation system, t r Is the time slot corresponding to the current recommended time, l is an interest point which is not visited by the target user in the location social network, v is one other user in the location social network, U represents all user sets, sim (U) t ,v,t r ) Representing user u t And user v in time slot t r User similarity at time, nj is time slot t r The cluster cen j In the number of time slots in (a),
Figure FDA0003987727810000042
representing that user v is at time cen j [i]Scoring the interest point l, i E [1 ],nj],timesimi(t r ,cen j [i]) Representing the current time t r With other times cen j [i]Similarity between;
Step 54: sorting all addresses that target user u_t has not visited by predicted score, forming the recommendation list TopNList_t from the N top-ranked locations, and returning it to the target user.
7. The K-means clustering-based time-aware adaptive interest point recommendation method according to claim 1, wherein said step 6 comprises:
Step 61: randomly selecting NU × 10% of the users in the target dataset as the target user set AU, and running each recommendation algorithm for every target user in the set to generate a recommendation list, where NU denotes the total number of users in the check-in dataset;
Step 62: evaluating each recommendation algorithm with the Precision, Recall and comprehensive precision (F1) indices, where the value of each index for one run of an algorithm over the target user set AU is the average of that index over all users in AU;
Step 63: repeating steps 61 and 62 Ntimes times, i.e., running every algorithm independently Ntimes times;
Step 64: taking the Precision, Recall and F1 values of a recommendation algorithm to be the averages over the Ntimes runs;
Step 65: comparing and analyzing the results for each index: if the Precision of the K-means clustering-based time-aware adaptive interest point recommendation algorithm is greater than that of the other recommendation algorithms, the method hits user preferences more accurately; if its Recall is greater than that of the other recommendation algorithms, the method retrieves a larger share of the relevant points of interest; and if its F1 value is greater than that of the other recommendation algorithms, the method has better overall recommendation accuracy.
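The evaluation protocol of steps 61 to 64 can be sketched as follows. The claim does not spell out the index formulas, so the sketch assumes the standard definitions: Precision is hits divided by list length, Recall is hits divided by the number of relevant points of interest, and F1 is their harmonic mean. The names prf and evaluate, the seed parameter, and the rounding of NU × 10% are hypothetical choices made for illustration.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence, Set, Tuple


def prf(recommended: Sequence[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Precision, Recall and F1 for one user (standard definitions, assumed)."""
    hits = len(set(recommended) & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def evaluate(recommend: Callable[[str], List[str]],
             ground_truth: Dict[str, Set[str]],
             n_times: int, fraction: float = 0.10,
             seed: int = 0) -> Tuple[float, ...]:
    """Average (Precision, Recall, F1) over n_times independent runs."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    users = sorted(ground_truth)
    sample_size = max(1, round(len(users) * fraction))  # NU x 10%, at least 1
    runs = []
    for _ in range(n_times):
        au = rng.sample(users, sample_size)            # step 61: target set AU
        per_user = [prf(recommend(u), ground_truth[u]) for u in au]
        # Step 62: each index of one run is the mean over all users in AU.
        runs.append(tuple(mean(col) for col in zip(*per_user)))
    # Step 64: final value of each index is the mean over the n_times runs.
    return tuple(mean(col) for col in zip(*runs))
```

For step 65, the tuples returned by evaluate for the different algorithms would be compared index by index, with every algorithm evaluated on the same dataset and the same number of runs.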
CN202211571570.7A 2022-01-21 2022-12-08 Time perception self-adaptive interest point recommendation method based on K-means clustering Pending CN116166878A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210071029.3A CN114528480A (en) 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering
CN2022100710293 2022-01-21

Publications (1)

Publication Number Publication Date
CN116166878A true CN116166878A (en) 2023-05-26

Family

ID=81620186

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210071029.3A Pending CN114528480A (en) 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering
CN202211571570.7A Pending CN116166878A (en) 2022-01-21 2022-12-08 Time perception self-adaptive interest point recommendation method based on K-means clustering

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202210071029.3A Pending CN114528480A (en) 2022-01-21 2022-01-21 Time-sensing self-adaptive interest point recommendation method based on K-means clustering

Country Status (1)

Country Link
CN (2) CN114528480A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117635237A (en) * 2023-12-22 2024-03-01 广州方块网络技术有限公司 Advertisement management system based on SaaS information flow and cross-platform crowd data

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408618B (en) * 2022-09-26 2023-10-20 南京工业职业技术大学 Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
CN115687801B (en) * 2022-09-27 2024-01-19 南京工业职业技术大学 Position recommendation method based on position aging characteristics and time perception dynamic similarity

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657015A (en) * 2017-09-26 2018-02-02 北京邮电大学 A kind of point of interest recommends method, apparatus, electronic equipment and storage medium
CN111104607A (en) * 2018-10-25 2020-05-05 中国电子科技集团公司电子科学研究院 Location recommendation method and device based on sign-in data
CN114036376A (en) * 2021-10-26 2022-02-11 南京理工大学紫金学院 Time-aware self-adaptive interest point recommendation method based on K-means clustering

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
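Si Yali: "Research on adaptive point-of-interest recommendation methods based on user check-in behavior", China Doctoral Dissertations Full-text Database *
Tao Yongcai et al.: "A group point-of-interest recommendation model combining time-factor clustering", Journal of Chinese Computer Systems, vol. 42, no. 02 *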

Also Published As

Publication number Publication date
CN114528480A (en) 2022-05-24

Similar Documents

Publication Publication Date Title
Christensen et al. Social group recommendation in the tourism domain
Chu et al. A hybrid recommendation system considering visual information for predicting favorite restaurants
Isinkaye et al. Recommendation systems: Principles, methods and evaluation
Mao et al. Multiobjective e-commerce recommendations based on hypergraph ranking
CN116166878A (en) Time perception self-adaptive interest point recommendation method based on K-means clustering
US20090259606A1 (en) Diversified, self-organizing map system and method
US20120185481A1 (en) Method and Apparatus for Executing a Recommendation
US20140280548A1 (en) Method and system for discovery of user unknown interests
CN114036376A (en) Time-aware self-adaptive interest point recommendation method based on K-means clustering
TW201447797A (en) Method and system for multi-phase ranking for content personalization
EP2353103A2 (en) Method and system for determining topical relatedness of domain names
CN115408618B (en) Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
Xia et al. Vrer: context-based venue recommendation using embedded space ranking SVM in location-based social network
CN111475744B (en) Personalized position recommendation method based on ensemble learning
Liang et al. Collaborative filtering based on information-theoretic co-clustering
Yang et al. Design and application of handicraft recommendation system based on improved hybrid algorithm
Yin et al. A fuzzy clustering based collaborative filtering algorithm for time-aware POI recommendation
Mohamed et al. Sparsity and cold start recommendation system challenges solved by hybrid feedback
Chen et al. A restaurant recommendation approach with the contextual information
Wen-ying et al. A new framework of a personalized location-based restaurant recommendation system in mobile application
Haruna et al. Location-aware recommender system: a review of application domains and current developmental processes
Liu et al. Using contextual information for service recommendation
CN115687801B (en) Position recommendation method based on position aging characteristics and time perception dynamic similarity
Nath et al. A pragmatic review on different approaches used in e-learning recommender systems
Karmakar A Context-Aware Approach To Restaurant Recommendations: System Algorithm and Case Study

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination