CN107133277B - Tourist attraction recommendation method based on a dynamic topic model and matrix decomposition - Google Patents

Tourist attraction recommendation method based on a dynamic topic model and matrix decomposition

Publication number
CN107133277B
Authority
CN
China
Prior art keywords
user
photo
matrix
information
tourist
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710237404.6A
Other languages
Chinese (zh)
Other versions
CN107133277A (en)
Inventor
陈岭
徐振兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201710237404.6A
Publication of CN107133277A
Application granted
Publication of CN107133277B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a tourist attraction recommendation method based on a dynamic topic model and matrix decomposition. By analyzing a user's travel history in different time intervals, the method captures changes in the user's travel preferences and provides the user with a fine-grained travel recommendation service. First, the method obtains data sets from social networks. Second, it mines the implicit features of users and attractions from user history using a dynamic topic model. Third, it mines the explicit features of users and attractions by analyzing the data sets and combines them with the implicit features to obtain user-user and attraction-attraction similarity information. Finally, it fuses the user-user and attraction-attraction similarity information using a matrix decomposition method with joint regularization terms. The method can capture changes in a user's travel preferences and recommend suitable tourist attractions to the user.

Description

Tourist attraction recommendation method based on a dynamic topic model and matrix decomposition
Technical Field
The invention relates to the technical field of information recommendation, and in particular to a tourist attraction recommendation method based on a dynamic topic model and matrix decomposition.
Background
In recent years, with the rapid development of the mobile internet, smartphones, and photo-sharing websites (e.g., Flickr, Panoramio, and Instagram), a large volume of geotagged photo data has appeared on the internet, and the number of geotagged photos contributed by users is increasing sharply. Geotagged photos can be used to mine the tourist attractions in a city, to obtain the travel routes that are popular with tourists in the city, and to analyze tourists' travel preferences, and thus to provide personalized attraction or route recommendation services for users.
Current attraction recommendation methods based on mining geotagged photos generally derive user-user similarity directly from how often users visit attractions, and then recommend attractions with user-based collaborative filtering. However, because of limits on travel time or budget, a user usually visits only a few attractions in a city, which causes a data sparsity problem when the recommendation system is modeled on the user-attraction matrix.
To address data sparsity in attraction recommendation, methods based on dimensionality reduction have been proposed, for example the static topic model. This model is a popular method in text mining for obtaining the latent topics of documents. Here the user's travel history is treated as a document and tourist attractions are treated as words, and the model yields topic probability distributions for users and attractions. However, when deriving the topic distribution of a user's travel preferences, the static topic model treats the user's travel history over all time periods (for example, all years) as a single document and ignores how the user's travel preferences change across time periods.
The dynamic topic model extends the static topic model to capture changes in document topics. It divides the document set into sub-document sets by time period and assumes that the topics of consecutive sub-document sets depend on one another, so that topics evolve over time. The model yields the topic probability distributions of documents and words in each time period, and the topics of different periods reflect how the topics of documents and words evolve. It therefore offers a way to overcome the static topic model's inability to capture changes in user travel preferences in travel recommendation.
Disclosure of Invention
The problem addressed by the invention is how to obtain information on changes in a user's travel preferences through a dynamic topic model and thereby provide a fine-grained tourist attraction recommendation service for the user.
A tourist attraction recommendation method based on a dynamic topic model and matrix decomposition comprises the following stages:
Data set information acquisition stage: obtain a photo data set $D_{\text{photo}}$ from a social network and denoise it to obtain a travel photo data set $D_{\text{photo-travel}}$; then obtain check-in data $D_{\text{check-in}}$ from a social network and extract the category information in $D_{\text{check-in}}$ to obtain the category data set $D_{\text{category}}$ of the check-in places;
User travel preference acquisition stage: use a dynamic topic model to extract the implicit features of users and attractions from the travel photo data set $D_{\text{photo-travel}}$; then obtain the explicit features of users and attractions by computing statistics over $D_{\text{photo-travel}}$, and combine them with the implicit features to obtain user-user and attraction-attraction similarity information; finally, fuse the user-user and attraction-attraction similarity information using a matrix decomposition method with joint regularization terms, and complete the sparse user-attraction matrix Y to obtain a matrix Y' containing user travel preference information;
Tourist attraction recommendation stage: score the attractions in the candidate set using the matrix Y', and recommend the N top-scoring tourist attractions to the user.
The specific steps of the data set information acquisition stage are as follows:
(1-1) download geotagged photo data in a tourist city using the public API of a photo-sharing website to form the photo data set $D_{\text{photo}}$;
(1-2) filter the non-travel photos out of $D_{\text{photo}}$ using an entropy-based mobility method, removing the noise photos to obtain the travel photo data set $D_{\text{photo-travel}}$;
(1-3) download the user check-in data $D_{\text{check-in}}$ in the tourist city using the public API of a location-based social media website;
(1-4) extract the category information in the check-in data $D_{\text{check-in}}$ and compile it into the category data set $D_{\text{category}}$ of the check-in places.
The photo data in the photo data set $D_{\text{photo}}$ contains: the identification information of the photo, the time the photo was taken, the latitude and longitude of the place where it was taken, the text description the user added to the photo, and the identification information of the photo's uploader.
The check-in data in $D_{\text{check-in}}$ contains: the identification information of the check-in action, the check-in time, the latitude and longitude of the check-in place, the category information of the check-in place, and the identification information of the user who checked in.
The specific steps of the user travel preference acquisition stage are as follows:
(2-1) perform spatial clustering on the photos in the travel photo data set $D_{\text{photo-travel}}$ using a density-based clustering method to obtain the tourist attraction set L;
(2-2) count, from $D_{\text{photo-travel}}$ and the attractions in the attraction set L, the number of times each user visited each attraction, and construct the user-attraction matrix Y;
(2-3) treating a user's travel history as a document and tourist attractions as words, infer the latent topic probability distributions of users and attractions in different time periods using the dynamic topic model, and combine the topic probability distributions of all time periods to obtain all implicit features of the users and all implicit features of the attractions;
(2-4) extract the explicit features of users and the explicit features of attractions from the travel photo data set $D_{\text{photo-travel}}$ and the category data set $D_{\text{category}}$ of the check-in places; then combine them with all implicit features of the users and attractions to build the user profile $F_{\text{user}}$ and the attraction profile $F_{\text{location}}$, and construct an m × m user-user similarity matrix A and an n × n attraction-attraction similarity matrix B using the cosine function;
(2-5) from Y, A and B, construct a matrix decomposition model with joint regularization terms that uses the user-user and attraction-attraction similarity relations, and complete the decomposition of Y;
(2-6) complete the sparse user-attraction matrix Y according to the result of the matrix decomposition with joint regularization terms to obtain the matrix Y' containing user travel preference information.
The specific steps of step (2-2) are as follows:
(2-2-1) extract all visit records v = (l, u, t) from the travel photo data set $D_{\text{photo-travel}}$, where v denotes that user u visited tourist attraction l at time t;
(2-2-2) from all visit records v = (l, u, t), count the number of times each user visited each attraction and the total number m of users in $D_{\text{photo-travel}}$;
(2-2-3) construct the user-attraction matrix Y from the number of times each user visited each attraction, the total number of users m, and the total number of attractions n in the attraction set L, where $Y \in \mathbb{R}^{m \times n}$ and the value at position (i, j) of Y is the number of times the i-th user visited the j-th attraction.
The specific steps of step (2-3) are as follows:
(2-3-1) slice all visit records v = (l, u, t) in the travel photo data set $D_{\text{photo-travel}}$ into time periods of equal length to obtain M sub data sets, one per time period;
(2-3-2) use the sub data sets as input to the dynamic topic model and obtain, through training, the topic probability distributions of users and attractions in each time period: for the T-th time period, a topic probability distribution of the user and a topic probability distribution of the attraction, where k is the number of topics;
(2-3-3) concatenate the topic probability distributions of the user over all time periods in chronological order to form all implicit features of the user, and concatenate the topic probability distributions of the attraction over all time periods in chronological order to form all implicit features of the attraction.
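The slicing in step (2-3-1) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes one-year slices, and the function name and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def slice_visits_by_year(visits):
    """Group visit records (l, u, t) into sub data sets, one per calendar year."""
    slices = defaultdict(list)
    for attraction, user, when in visits:
        slices[when.year].append((attraction, user, when))
    # Return the sub data sets in chronological order.
    return [slices[year] for year in sorted(slices)]


# Hypothetical visit records v = (l, u, t).
visits = [
    ("l1", "u1", datetime(2014, 5, 1)),
    ("l2", "u1", datetime(2015, 7, 9)),
    ("l1", "u2", datetime(2014, 8, 3)),
]
sub_datasets = slice_visits_by_year(visits)
```

Each element of `sub_datasets` would then be passed to the dynamic topic model as the documents of one time period. A year without any visit produces no sub data set in this sketch.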
The specific steps of step (2-4) are as follows:
(2-4-1) compute statistics over the visit records (l, u, t) of the travel photo data set $D_{\text{photo-travel}}$ and over the category data set $D_{\text{category}}$ of the check-in places to obtain the explicit features of users and the explicit features of attractions, where r is the total number of explicit user features and s is the total number of explicit attraction features;
(2-4-2) combine the explicit and implicit features of a user to construct the user profile, and combine the explicit and implicit features of an attraction to construct the attraction profile;
(2-4-3) obtain the user-user similarity matrix A using the cosine formula, where $f_{pi}$ and $f_{qi}$ denote the i-th explicit feature of users p and q, respectively;
the attraction-attraction similarity matrix B is likewise obtained using the cosine formula, where $f_{xi}$ and $f_{yi}$ denote the i-th explicit feature of attractions x and y, respectively.
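A sketch of the cosine formulas referred to in step (2-4-3), assuming the standard definition of cosine similarity. The entry names $A_{pq}$ and $B_{xy}$ and the feature counts $d_u$ and $d_l$ are notation introduced here for illustration only.

```latex
A_{pq} = \frac{\sum_{i=1}^{d_u} f_{pi}\, f_{qi}}
              {\sqrt{\sum_{i=1}^{d_u} f_{pi}^{2}}\;\sqrt{\sum_{i=1}^{d_u} f_{qi}^{2}}},
\qquad
B_{xy} = \frac{\sum_{i=1}^{d_l} f_{xi}\, f_{yi}}
              {\sqrt{\sum_{i=1}^{d_l} f_{xi}^{2}}\;\sqrt{\sum_{i=1}^{d_l} f_{yi}^{2}}}
```

Both matrices are symmetric under this definition, and every entry is non-negative when all features are non-negative.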
In step (2-5), during the decomposition of the matrix Y, the similarity information in A and B is used as additional regularization terms to constrain the decomposition of Y. In the objective function, $R_{ij}$ is the value at position (i, j) of the matrix Y; $I_{ij}$ is an indicator of whether user i has visited attraction j, equal to 1 if user i has visited attraction j and 0 otherwise; $Sim_{ig}$ denotes the similarity between user i and user g, and $Sim_{jq}$ denotes the similarity between attraction j and attraction q; $U_i$ and $U_g$ are the latent feature vectors of users i and g, and $L_j$ and $L_q$ are the latent feature vectors of attractions j and q; the user regularization term measures the distance between the latent feature vector of user i and that of user g; G(i) is the group of users similar to user i, and Q(j) is the group of attractions similar to attraction j; U holds the latent feature vectors of the users after the decomposition of Y and has dimension d × m; L holds the latent feature vectors of the attractions after the decomposition of Y and has dimension d × n; m, n and d denote the number of users, the number of attractions and the dimension of the latent feature vectors, respectively;
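One form of the objective that is consistent with the symbols defined above can be written as follows. It is a sketch under assumptions rather than the formula of the source: the pairing of the parameters $\lambda_1$ and $\lambda_2$ with the user and attraction terms, the factors of 1/2, and the squared norm are all assumed.

```latex
\min_{U,\,L}\;
\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n} I_{ij} \left( R_{ij} - U_i^{\top} L_j \right)^{2}
+ \frac{\lambda_1}{2} \sum_{i=1}^{m} \sum_{g \in G(i)} Sim_{ig} \left\lVert U_i - U_g \right\rVert_F^{2}
+ \frac{\lambda_2}{2} \sum_{j=1}^{n} \sum_{q \in Q(j)} Sim_{jq} \left\lVert L_j - L_q \right\rVert_F^{2}
```

The first term fits the observed visit counts, the second pulls the latent vectors of similar users together, and the third does the same for similar attractions.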
The specific steps of decomposing Y with the matrix decomposition model with joint regularization terms are as follows:
(a) randomly initialize the parameters U and L, and set the learning rate α, the error threshold δ, and the parameters $\lambda_1$ and $\lambda_2$;
(b) for each non-zero value $R_{ij}$ in Y, compute the estimate $X_{ij}$ of $R_{ij}$, then compute the error between $X_{ij}$ and the true value $R_{ij}$, and finally compute the total error θ over all non-zero values, where w is the number of non-zero values;
(c) judge whether the total error θ is greater than the error threshold δ; if so, execute step (d); otherwise, end the iteration, at which point the decomposition of the matrix Y is complete and the current U and L are the optimal values;
(d) update the values of U and L by gradient descent, and then return to step (b).
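For illustration, assume an objective made of half the squared error on the visited entries plus user and attraction similarity terms weighted by $\lambda_1/2$ and $\lambda_2/2$, and assume symmetric similarities and neighbor groups ($g \in G(i)$ exactly when $i \in G(g)$). Under those assumptions, which the source does not confirm, one gradient descent step with learning rate α would read:

```latex
\begin{aligned}
U_i &\leftarrow U_i - \alpha \left[ -\sum_{j=1}^{n} I_{ij} \left( R_{ij} - U_i^{\top} L_j \right) L_j
      + 2 \lambda_1 \sum_{g \in G(i)} Sim_{ig} \left( U_i - U_g \right) \right], \\
L_j &\leftarrow L_j - \alpha \left[ -\sum_{i=1}^{m} I_{ij} \left( R_{ij} - U_i^{\top} L_j \right) U_i
      + 2 \lambda_2 \sum_{q \in Q(j)} Sim_{jq} \left( L_j - L_q \right) \right].
\end{aligned}
```

The factor 2 arises because, with symmetric neighbor groups, each pair of similar users (or attractions) appears twice in the corresponding double sum.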
In step (2-6), the missing values in Y are filled in according to the decomposition result, and the filled values represent the user's travel preferences.
The specific steps of the tourist attraction recommendation stage are as follows:
(3-1) obtain the set of attractions in the user's tourist city according to the information entered by the user; these attractions form the recommendation candidate set;
(3-2) obtain the user's scores for the attractions in the candidate set from the matrix Y', giving the attractions the user prefers;
(3-3) sort the preferred attractions by score in descending order, and recommend the N top-ranked tourist attractions to the user.
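Steps (3-1) to (3-3) amount to a top-N selection over one row of Y'. The sketch below assumes that row is available as a mapping from attraction identifier to predicted score; the function name and the sample values are hypothetical.

```python
def recommend_top_n(scores_for_user, candidate_ids, n):
    """Return the n candidate attractions with the highest predicted scores."""
    ranked = sorted(candidate_ids, key=lambda a: scores_for_user[a], reverse=True)
    return ranked[:n]


# Hypothetical row of Y' for one user: attraction id -> predicted preference.
y_prime_row = {"l1": 0.2, "l2": 1.7, "l3": 0.9, "l4": 1.1}
# Candidate set: attractions in the city the user asked about.
candidates = ["l1", "l3", "l4"]
top2 = recommend_top_n(y_prime_row, candidates, 2)
```

Attraction `l2` has the highest score overall but is not recommended here because it is outside the candidate set.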
To address the inability of the traditional static topic model to capture how the topics of a user's travel preferences evolve, the invention provides a tourist attraction recommendation method based on a dynamic topic model and matrix decomposition. Compared with existing methods, it has the following advantages:
(1) The dynamic topic model yields topic change information (implicit feature information) about the user's travel preferences.
(2) Analysis of the data sets yields a large amount of explicit feature information about users and attractions; together with the implicit feature information, it describes the characteristics of users and attractions comprehensively.
(3) The matrix decomposition method with joint regularization terms fuses the user-user and attraction-attraction similarity information, so that the latent feature vectors of users and attractions are constrained simultaneously while the user-attraction matrix is decomposed, and the user-attraction matrix is completed accurately.
Drawings
FIG. 1 is a flow chart of the tourist attraction recommendation method based on a dynamic topic model and matrix decomposition according to the invention;
FIG. 2 is a flow chart of the data set information acquisition stage;
FIG. 3 is a flow chart of the user travel preference acquisition stage;
FIG. 4 is a schematic diagram of document generation based on a dynamic topic model;
FIG. 5 is a flow chart of the tourist attraction recommendation stage.
Detailed Description
To describe the invention more specifically, the technical solution of the invention is described in detail below with reference to the drawings and specific embodiments.
As shown in FIG. 1, the tourist attraction recommendation method based on a dynamic topic model and matrix decomposition is divided into three stages: acquiring data set information, obtaining user travel preferences, and recommending tourist attractions.
Data set information acquisition stage
The flow chart of this stage is shown in FIG. 2, and the steps are as follows:
S1-1, download the geotagged photo data set $D_{\text{photo}}$ in a tourist city using the public API of a photo-sharing website.
The specific steps of acquiring the photo data include:
S1-1-1, download, city by city, the photos taken in the city and their metadata through the public API provided by a photo-sharing website (such as Flickr). A geotagged photo p can be represented as $p = (p_{id}, p_t, p_g, p_x, p_u)$, where $p_{id}$, $p_t$, $p_g$, $p_x$ and $p_u$ denote, respectively, the unique identification number of the photo, the time the user took the photo, the latitude and longitude of the photo, the text description the user added to the photo, and the unique identification number of the photo's uploader;
S1-1-2, for the collected photo set, obtain all photo information of each user according to the unique identification information of the photos, $H_i = \{p_1, p_2, \ldots, p_e\}$, where e is the number of photos taken by user i.
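The photo record and the per-user grouping of steps S1-1-1 and S1-1-2 could be represented as below. This is a sketch only: the field names are hypothetical, and grouping by the uploader identifier $p_u$ is an assumption about how $H_i$ is built.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Photo:
    photo_id: str        # p_id, unique identification number of the photo
    taken_at: datetime   # p_t, time the photo was taken
    lat_lon: tuple       # p_g, (latitude, longitude)
    text: str            # p_x, text description added by the user
    uploader_id: str     # p_u, unique identification number of the uploader


def photos_by_user(photos):
    """Build H_i for every user: that user's photos, ordered by time taken."""
    history = defaultdict(list)
    for photo in photos:
        history[photo.uploader_id].append(photo)
    for user_photos in history.values():
        user_photos.sort(key=lambda p: p.taken_at)
    return dict(history)
```

Sorting by time is not required by the step itself; it is done here because later steps slice the history by time period.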
S1-2, filter the non-travel photos out of the photo data set $D_{\text{photo}}$ using an entropy-based mobility method, removing the noise photos to obtain the travel photo data set $D_{\text{photo-travel}}$.
The specific steps for removing the non-travel photos are as follows:
S1-2-1, manually label a small number of travel photos and non-travel photos in the users' photo sets according to photo content and experience;
S1-2-2, divide the city into x' × y' grid cells (each cell is denoted $(x_i, y_j)$, i = 1, 2, …, x'; j = 1, 2, …, y'), count the photos in each cell, compute the proportion a of each cell's photos in the total number of photos, and compute the mobility entropy $H_{mobility}$ of a user's photo set according to the principle of information entropy.
Photo sets are classified according to the criterion $H_{mobility} > \epsilon$. The value of ε is varied from 0 to 1 in steps of 0.1, the classification accuracy on the manually labeled photo set is computed for each value, and an ε with high accuracy is selected to classify the photos in the whole data set and remove the non-travel photos.
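The mobility entropy of step S1-2-2 can be sketched as follows, assuming the usual Shannon form H = -Σ a·log(a) over the occupied grid cells. The natural logarithm, the bounding-box parameter and the function name are assumptions of this sketch; whether the source normalizes the entropy (for example so that thresholds between 0 and 1 are comparable across users) is not known.

```python
import math
from collections import Counter


def mobility_entropy(points, bbox, nx, ny):
    """Shannon entropy of one user's photos over an nx-by-ny grid on the city.

    points: list of (lat, lon); bbox: (min_lat, min_lon, max_lat, max_lon).
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    cells = Counter()
    for lat, lon in points:
        # Clamp so that points on the upper edge fall into the last cell.
        i = min(int((lat - min_lat) / (max_lat - min_lat) * nx), nx - 1)
        j = min(int((lon - min_lon) / (max_lon - min_lon) * ny), ny - 1)
        cells[(i, j)] += 1
    total = sum(cells.values())
    # c / total is the share a of the user's photos in each occupied cell.
    return -sum((c / total) * math.log(c / total) for c in cells.values())
```

A user whose photos all fall into one cell gets entropy 0, and a user whose photos are spread evenly over two cells gets log 2.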
S1-3, download the check-in data $D_{\text{check-in}}$ of users in the tourist city using the public API of a location-based social media website.
The specific steps of obtaining the check-in data include:
S1-3-1, download, city by city, the users' check-in data in the city through the public API provided by a location-based social media website (e.g., Sina Weibo). A single check-in action q of a user can be represented as $q = (q_{id}, q_t, q_g, q_c, q_u)$, where $q_{id}$, $q_t$, $q_g$, $q_c$ and $q_u$ denote, respectively, the unique identification number of the check-in action, the check-in time, the latitude and longitude of the point of interest, the category information of the point of interest, and the unique identification number of the user who checked in;
S1-3-2, extract all check-in data of each user according to the user identification number in the check-in data, $Q_i = \{q_1, q_2, \ldots, q_o\}$, where o is the number of check-ins of user i in the city.
S1-4, extract the category information in the check-in data $D_{\text{check-in}}$ and compile it into the category data set $D_{\text{category}}$ of the check-in places.
The check-in records include category information added when users visit points of interest. Compiling this information yields all the categories of each point of interest, which can be expressed as $C_{POI} = (c_1, c_2, \ldots, c_z)$, where z is the number of categories of the point of interest.
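Step S1-4 is a grouping of category labels by check-in place. A minimal sketch, assuming each check-in has already been reduced to a (place key, category) pair taken from $q_g$ and $q_c$; the function name and the way a place key is derived from coordinates are hypothetical.

```python
from collections import defaultdict


def categories_by_poi(checkins):
    """Collect the distinct categories seen for every point of interest.

    checkins: iterable of (poi_key, category) pairs.
    """
    categories = defaultdict(set)
    for poi_key, category in checkins:
        categories[poi_key].add(category)
    # Sorted tuples give a stable C_POI = (c_1, ..., c_z) for each place.
    return {poi: tuple(sorted(cats)) for poi, cats in categories.items()}
```

For each place, the length of the resulting tuple is the number z of categories.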
User travel preference acquisition stage
The flow chart of this stage is shown in FIG. 3, and the steps are as follows:
S2-1, perform spatial clustering on the geotagged photos using a density-based clustering method to obtain the tourist attraction set L in the tourist city.
Users typically take photos at places that interest them, and if a large number of users take photos at a place, that place can be regarded as a tourist attraction. A density-based clustering method (such as P-DBSCAN) is applied to the large set of geotagged photos; each resulting cluster represents a tourist attraction, and the cluster center is the attraction's location. This process yields the tourist attraction set $L = \{l_1, l_2, \ldots, l_n\}$, where $l = \{P_l, g_l\}$, $P_l$ is the set of all photos belonging to the attraction, and $g_l$ is the latitude and longitude of the attraction.
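The clustering step can be illustrated with plain DBSCAN on photo coordinates. This is a stand-in, not P-DBSCAN: the owner-based density that P-DBSCAN uses is not reproduced, and the radius `eps_m`, the threshold `min_pts` and all function names are assumptions of this sketch.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))


def dbscan(points, eps_m, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    neighbours = [[j for j in range(n) if haversine_m(points[i], points[j]) <= eps_m]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point, keep expanding
    return labels


def cluster_centres(points, labels):
    """Mean latitude and longitude of each cluster, used as the position g_l."""
    groups = {}
    for point, label in zip(points, labels):
        if label >= 0:
            groups.setdefault(label, []).append(point)
    return {label: (sum(p[0] for p in pts) / len(pts),
                    sum(p[1] for p in pts) / len(pts))
            for label, pts in groups.items()}
```

The neighbour search is quadratic in the number of photos, which is acceptable for a sketch; a spatial index would be needed for a city-scale photo set.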
S2-2, count how often users visit the attractions, based on the attractions mined from the photos and the users' visit histories at the attractions, and construct the user-attraction matrix Y.
The specific steps of constructing the user-attraction matrix include:
S2-2-1, extract the users' historical visit records v = (l, u, t) from the photo set, where l, u and t are, respectively, the tourist attraction visited by the user, the user's identifier, and the time at which the user visited the attraction.
S2-2-2, count the number of times each user visited each attraction. The resulting user-attraction matrix can be written as $Y \in \mathbb{R}^{m \times n}$, where m and n are the numbers of users and attractions, respectively, and each value in the matrix is the number of times the user visited the attraction.
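Building Y from the visit records is a counting exercise. The sketch below derives the user and attraction indices from the records themselves, which is an assumption of this example; in the method described above, the columns would come from the full attraction set L.

```python
from collections import Counter


def build_user_attraction_matrix(visits):
    """Return (Y, users, attractions); Y[i][j] counts visits of user i to attraction j."""
    counts = Counter((user, attraction) for attraction, user, _time in visits)
    users = sorted({user for user, _ in counts})
    attractions = sorted({attraction for _, attraction in counts})
    row = {u: i for i, u in enumerate(users)}
    col = {a: j for j, a in enumerate(attractions)}
    Y = [[0] * len(attractions) for _ in users]
    for (user, attraction), c in counts.items():
        Y[row[user]][col[attraction]] = c
    return Y, users, attractions
```

Zeros in the returned matrix mark user-attraction pairs with no recorded visit, which are the entries the later decomposition fills in.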
S2-3, treating a user's travel history as a document and tourist attractions as words, infer the latent topic probability distributions of users and attractions in different time periods using the dynamic topic model, and combine the topic probability distributions of all time periods to obtain all implicit features of the users and all implicit features of the attractions.
The document set is divided into several sub-document sets by time. The topics of documents and words in different time periods depend on one another sequentially: the topic distribution of the next time period evolves from the topics of the previous time period. Here α is the parameter controlling the topic distribution of a sub-document set, θ is the parameter controlling the topic distribution of a single document, β is the parameter controlling the probability distribution of topics over words, z denotes a topic, k the number of topics, w a word, N a document, and A a sub-document set.
In travel recommendation based on mining geotagged photos, a user's travel history can be viewed as a mixture of several topics, each of which is a probability distribution over attractions. In short, when the dynamic topic model is used, documents represent users' travel histories and words represent tourist attractions.
The specific steps of obtaining the implicit features of users and attractions include:
S2-3-1, divide the users' travel history information into sub data sets by time slice (for example, by year).
S2-3-2, use the sub data sets as input to the dynamic topic model and obtain, through training, the topic probability distributions of users and attractions in each time period: for the T-th time period, a topic probability distribution of the user and a topic probability distribution of the attraction, where k is the number of topics;
S2-3-3, concatenate the topic probability distributions of the user over all time periods in chronological order to form all implicit features of the user, and concatenate the topic probability distributions of the attraction over all time periods in chronological order to form all implicit features of the attraction.
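The concatenation of step S2-3-3 can be sketched as follows, taking the per-period topic distributions as given (the dynamic topic model itself is not implemented here). Filling periods in which a user or attraction does not appear with zeros is an assumption of this sketch, as is the function name.

```python
def concatenate_over_time(distributions_per_period, k):
    """Concatenate per-period topic distributions into one implicit feature vector.

    distributions_per_period: list, ordered by time, of dicts mapping an entity id
    (user or attraction) to a list of k topic probabilities.
    Entities absent from a period get k zeros for that period.
    """
    entities = sorted({e for period in distributions_per_period for e in period})
    features = {}
    for entity in entities:
        vector = []
        for period in distributions_per_period:
            vector.extend(period.get(entity, [0.0] * k))
        features[entity] = vector
    return features
```

With M time periods and k topics, every resulting vector has length M × k, so users (or attractions) can be compared feature by feature.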
S2-4, extracting the user 'S explicit characteristics from the user' S travel historyExplicit characteristics of a attractionThen all implicit characteristics of the user are combinedAll implicit features of a harmony sceneCreating a user representation FuserScene image FlocationAnd an m × m user-user similarity matrix A and an n × n scenery spot-scenery spot similarity matrix B are constructed by utilizing a cosine function.
The specific steps for establishing A and B comprise:
s2-4-1, counting the historical information of the scenic spot visited by the user (such as the total number of scenic spots visited by the user, the total number of users visited by a scenic spot, scenic spot category information and the like), obtaining a large amount of display information to describe the characteristics of the user and the scenic spot, and respectively representing:andwherein r and s represent the total number of user and scenery display features, respectively, each specific feature being as in Table 1And table 2. Some of these features are acquired using third-party web services, as described in detail below:
gender and age information: the information of the tourists is obtained by analyzing the content of the photos through a third-party network service. For example: www.alchemyapi.com, when the web service gets an uploaded picture, it calls its API function (i.e., the Alchemy Vision Face Detection and Recognition API) to analyze the picture and then returns the gender and age information of the Face appearing in the picture to the uploader. And counting the gender and age information of the human face in all the photos shot in the scenic spot to obtain the distribution of the gender and age information of the scenic spot. Similarly, by counting the sex and age information of the face in all the photos taken by a user, the distribution of the sex and age information of the face in the photos taken by the user is obtained.
Weather information: the weather information when the photo was taken can be obtained based on the third party weather web service, the latitude and longitude information of the photo, and the time of the photo taken. For example: com, through the API function of the web service, weather information at different time points at each location can be obtained. The photos taken by the user under different weather conditions are counted, so that the popularity of the scenic spot among tourists under different weather conditions can be obtained.
S2-4-2, combining the explicit characteristics and the implicit characteristics of the user and the scenery spot together to construct the portrait of the user and the scenery spot, namely:and
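S2-4-3, computing the user-user similarity matrix A (m × m) with the cosine formula:

$$A_{pq} = \frac{\sum_{i=1}^{r} f_{pi}\, f_{qi}}{\sqrt{\sum_{i=1}^{r} f_{pi}^{2}}\;\sqrt{\sum_{i=1}^{r} f_{qi}^{2}}}$$

where f_pi and f_qi denote the i-th explicit feature of users p and q, respectively.
The scenic spot-scenic spot similarity matrix B (n × n) is obtained with the cosine formula in the same way:

$$B_{xy} = \frac{\sum_{i=1}^{s} f'_{xi}\, f'_{yi}}{\sqrt{\sum_{i=1}^{s} f'^{\,2}_{xi}}\;\sqrt{\sum_{i=1}^{s} f'^{\,2}_{yi}}}$$

where f'_xi and f'_yi denote the i-th explicit feature of scenic spots x and y, respectively.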
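The construction of a profile and of a cosine similarity matrix can be sketched as below. This is an illustrative sketch only: the feature values are made up, the helper name `cosine_similarity_matrix` is hypothetical, and the handling of all-zero rows is an added assumption that the patent does not discuss.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows of `features`."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0          # assumption: all-zero rows get similarity 0
    unit = features / norms
    return unit @ unit.T

# Made-up explicit and implicit features for three users (one row per user).
explicit_user = np.array([[3.0, 1.0], [6.0, 2.0], [0.0, 5.0]])
implicit_user = np.array([[0.2, 0.8], [0.4, 1.6], [0.9, 0.1]])

# The profile is the concatenation of explicit and implicit features.
F_user = np.hstack([explicit_user, implicit_user])
A = cosine_similarity_matrix(F_user)   # m x m user-user similarity
```

The scenic spot-scenic spot matrix B follows from the same function applied to the scenic spot profiles.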
S2-5, constructing, from the established Y, A and B, a matrix decomposition model with joint regularization terms that exploits the user-user and scenic spot-scenic spot similarity relations, and completing the decomposition of Y.
In the decomposition of Y, the similarity information in A and B is used as additional regularization terms to constrain the decomposition. The specific objective function is as follows:
In this formula, Sim_ig denotes the similarity between user u_i and user u_g, and Sim_jq denotes the similarity between scenic spot l_j and scenic spot l_q. U (d × m) and L (d × n) are the latent vector representations of the users and the scenic spots, respectively, after the decomposition of Y, where m, n, and d denote the number of users, the number of scenic spots, and the number of latent features, respectively.
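For orientation, one objective of this kind that is consistent with the symbols defined in this document can be written as follows. It is a hedged reconstruction, not the formula of the patent: the factors of 1/2, the squared Euclidean norm, and the absence of further norm penalties on U and L are assumptions. Here I_ij is 1 if user i visited scenic spot j and 0 otherwise, and G(i) and Q(j) are the groups of users similar to user i and of scenic spots similar to scenic spot j, as defined in claim 6.

```latex
\min_{U,L}\;
\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n} I_{ij}\left(R_{ij}-U_i^{T}L_j\right)^{2}
+\frac{\lambda_1}{2}\sum_{i=1}^{m}\sum_{g\in G(i)} Sim_{ig}\,\lVert U_i-U_g\rVert^{2}
+\frac{\lambda_2}{2}\sum_{j=1}^{n}\sum_{q\in Q(j)} Sim_{jq}\,\lVert L_j-L_q\rVert^{2}
```

The first term fits the observed visit counts, while the second and third terms pull the latent vectors of similar users and of similar scenic spots toward each other, weighted by λ_1 and λ_2.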
The specific steps of decomposing Y with the matrix decomposition model with joint regularization terms are as follows:
(a) Randomly initialize the parameters U and L, and set the learning rate α, the error threshold δ, and the parameters λ_1 and λ_2.
(b) For each non-zero value R_ij in Y, compute the estimate X_ij of R_ij, then compute the error between X_ij and the true value R_ij, and finally compute the total error θ over all non-zero values, where w is the number of non-zero values.
(c) Determine whether the total error θ is greater than the error threshold δ. If so, execute step (d); otherwise end the iteration: the decomposition of Y is complete, and the current U and L are the optimal values.
(d) Update the values of U and L by gradient descent, then return to step (b). The gradient descent update formulas are as follows:
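Steps (a) to (d) can be sketched with NumPy as below. This is a sketch under stated assumptions rather than the patented update rule: the estimate is taken as X_ij = U_i^T L_j, the total error θ is taken as the root-mean-square error over the w non-zero entries, the similar-user and similar-spot groups are replaced by the full symmetric matrices A and B, and the gradients are those of a squared-error objective with similarity-weighted distance penalties. The function names, the iteration cap `max_iter`, and the toy data are hypothetical.

```python
import numpy as np

def total_error(Y, U, L):
    """Root-mean-square error over the non-zero entries of Y (assumed error measure)."""
    mask = Y != 0
    X = U.T @ L                          # estimates X_ij = U_i^T L_j
    return float(np.sqrt(np.mean((Y[mask] - X[mask]) ** 2)))

def factorize(Y, A, B, d=2, alpha=0.005, delta=0.05, lam1=0.05, lam2=0.05,
              max_iter=5000, seed=0):
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    # (a) random initialization of U (d x m) and L (d x n)
    U = rng.normal(scale=0.1, size=(d, m))
    L = rng.normal(scale=0.1, size=(d, n))
    I = (Y != 0).astype(float)           # 1 where a visit was observed
    deg_A, deg_B = A.sum(axis=1), B.sum(axis=1)
    for _ in range(max_iter):
        # (b) + (c): stop once the total error is within the threshold
        if total_error(Y, U, L) <= delta:
            break
        E = I * (Y - U.T @ L)            # residuals on observed entries only
        # (d) gradient step; the penalty gradients assume symmetric A and B
        grad_U = -L @ E.T + 2.0 * lam1 * (U * deg_A - U @ A)
        grad_L = -U @ E + 2.0 * lam2 * (L * deg_B - L @ B)
        U, L = U - alpha * grad_U, L - alpha * grad_L
    return U, L

# Toy visit counts for 4 users and 4 scenic spots (0 means no observed visit).
Y = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
A = np.full((4, 4), 0.5); np.fill_diagonal(A, 1.0)   # made-up user similarities
B = np.full((4, 4), 0.5); np.fill_diagonal(B, 1.0)   # made-up spot similarities
U, L = factorize(Y, A, B)
```

The iteration cap is a practical safeguard for the case where the regularization keeps the error above δ; the text itself only specifies the threshold test.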
S2-6, completing the sparse matrix Y according to the result of the matrix decomposition with joint regularization terms, to obtain the users' travel preferences.
The missing values in Y are filled in with the estimates computed from the decomposition result; the filled values represent the users' travel preferences.
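A minimal sketch of the completion step follows, assuming the estimate for a missing entry is the inner product of the corresponding latent vectors and that observed counts are left unchanged. The function name `complete_matrix` and the small matrices are illustrative only.

```python
import numpy as np

def complete_matrix(Y, U, L):
    """Fill the zero (missing) entries of Y with the estimates U^T L."""
    X = U.T @ L                      # m x n matrix of estimated preferences
    return np.where(Y != 0, Y, X)    # keep observed values, fill the rest

# Illustrative latent factors: d = 2, m = 2 users, n = 3 scenic spots.
U = np.array([[1.0, 2.0],
              [0.0, 1.0]])
L = np.array([[1.0, 0.0, 2.0],
              [3.0, 1.0, 0.0]])
Y = np.array([[4.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
Y_completed = complete_matrix(Y, U, L)
```

Whether observed entries should be kept or also replaced by their estimates is a design choice; keeping them follows the wording that only missing values are filled in.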
Table 1. Explicit feature information of users
Table 2. Explicit feature information of scenic spots
Stage of recommending tourist attractions
The process of recommending tourist attractions is shown in Fig. 5 and mainly comprises the following steps:
S3-1, obtaining, according to the information input by the user, the set of scenic spots in the city the user is touring (the destination); these scenic spots serve as the recommendation candidate set.
The completed user-scenic spot matrix is searched according to the user's ID and the touring city c (the destination) to obtain the set L' of scenic spots in city c.
S3-2, obtaining the user's scores for the scenic spots, i.e., the user's preference for them, according to the result of S3-1 and the completed user-scenic spot matrix.
Each value in the completed user-scenic spot matrix represents a user's preference score for a scenic spot, so the user's preferences for the different scenic spots in the touring city can be obtained from the matrix and L'.
S3-3, ranking the scenic spots according to the scores from S3-2 and recommending the top N tourist attractions to the user.
The scenic spots in the touring city are sorted in descending order of the user's scores, and the first N scenic spots are recommended to the user.
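The recommendation stage can be sketched as below, assuming users and scenic spots are identified by row and column indices of the completed matrix and that the candidate set L' is given as a list of column indices. The function name `recommend_top_n` and the example values are hypothetical.

```python
import numpy as np

def recommend_top_n(Y_completed, user_index, candidate_spots, n_top):
    """Return the n_top candidate spots with the highest scores for one user."""
    scores = Y_completed[user_index, candidate_spots]    # S3-2: look up scores
    order = np.argsort(-scores, kind="stable")[:n_top]   # S3-3: descending order
    return [(candidate_spots[k], float(scores[k])) for k in order]

# Illustrative completed matrix: 2 users x 4 scenic spots.
Y_completed = np.array([[4.0, 0.0, 2.0, 3.5],
                        [5.0, 2.0, 4.0, 1.0]])
candidates = [1, 2, 3]        # S3-1: scenic spots located in the touring city
top_spots = recommend_top_n(Y_completed, user_index=0,
                            candidate_spots=candidates, n_top=2)
```

With these values, user 0 is recommended scenic spot 3 first and scenic spot 2 second.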
The embodiments described above illustrate the technical solutions and advantages of the present invention. It should be understood that they are only the most preferred embodiments of the present invention and are not intended to limit it; any modifications, additions, equivalents, and the like made within the scope of the principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A tourist attraction recommendation method based on a dynamic topic model and matrix decomposition, comprising the following stages:
a stage of acquiring data set information: obtaining a photo dataset D_photo from a social network and de-noising it to obtain a tourist photo dataset D_photo-travel; then obtaining check-in data D_check-in from the social network and extracting from it the category information of the check-in places to obtain the check-in place category dataset D_category;
a stage of obtaining user travel preferences, with the following specific steps:
(2-1) spatially clustering the photos in the tourist photo dataset D_photo-travel using a density-based clustering method to obtain the tourist attraction set L;
(2-2) constructing the user-scenic spot matrix Y from the number of times each user visited each scenic spot, counted from the tourist photo dataset D_photo-travel, and the scenic spots in the set L;
(2-3) treating each user's travel history as a document and each tourist attraction as a word, inferring the latent topic probability distributions of users and scenic spots in different time periods with a dynamic topic model, and combining the topic probability distributions of all time periods to obtain all implicit features of the user and all implicit features of the scenic spot;
(2-4) extracting the explicit features of the user and the explicit features of the scenic spot from the tourist photo dataset D_photo-travel and the check-in place category dataset D_category, then combining them with all implicit features of the user and all implicit features of the scenic spot to create the user profile F_user and the scenic spot profile F_location, and constructing an m × m user-user similarity matrix A and an n × n scenic spot-scenic spot similarity matrix B with the cosine function;
(2-5) constructing, from Y, A and B, a matrix decomposition model with joint regularization terms that exploits the user-user and scenic spot-scenic spot similarity relations, and completing the decomposition of Y;
(2-6) completing the sparse user-scenic spot matrix Y according to the result of the matrix decomposition with joint regularization terms to obtain a matrix Y' containing the users' travel preference information;
a stage of recommending tourist attractions: scoring the scenic spots in the candidate set using the matrix Y' and recommending the N top-scoring tourist attractions to the user.
2. The tourist attraction recommendation method based on a dynamic topic model and matrix decomposition according to claim 1, wherein the specific steps of the stage of acquiring data set information are as follows:
(1-1) downloading photo data with geographic location information in the tourist city using the public API of a photo-sharing website to form the photo dataset D_photo;
(1-2) filtering the non-tourist photos out of the photo dataset D_photo using an entropy-based fluidity method, thereby removing the noise photos and obtaining the tourist photo dataset D_photo-travel;
(1-3) downloading the user check-in data D_check-in in the tourist city using the public API of a location-based social media website;
(1-4) extracting the category information of the check-in places from the check-in data D_check-in and compiling it into the check-in place category dataset D_category.
3. The tourist attraction recommendation method based on a dynamic topic model and matrix decomposition according to claim 1, wherein the specific steps of step (2-2) are as follows:
(2-2-1) extracting access information v = (l, u, t) from the tourist photo dataset D_photo-travel, where v indicates that user u visited tourist attraction l at time t;
(2-2-2) counting, from all access information v = (l, u, t), the number of times each user visited each scenic spot and the total number m of users in the tourist photo dataset D_photo-travel;
(2-2-3) constructing the user-scenic spot matrix Y from the number of times each user visited each scenic spot, the total number m of users, and the total number n of scenic spots in the set L, where Y ∈ R^(m×n) and the value at position (i, j) of Y is the number of times the i-th user visited the j-th scenic spot.
4. The tourist attraction recommendation method based on a dynamic topic model and matrix decomposition according to claim 1, wherein step (2-3) specifically comprises:
(2-3-1) slicing all access information v = (l, u, t) in the tourist photo dataset D_photo-travel into time periods of equal length to obtain M sub-datasets, one per time period;
(2-3-2) using the sub-datasets as input to the dynamic topic model and obtaining, through training, the topic probability distributions of users and scenic spots in the different time periods,
where, for the T-th time period, the topic probability distribution of the user and the topic probability distribution of the scenic spot each have k components, k being the number of topics, and the k-th component is the probability of the k-th topic for the user (respectively, the scenic spot) in the T-th time period;
(2-3-3) concatenating the topic probability distributions of the user over all time periods in chronological order to form all implicit features of the user, and concatenating the topic probability distributions of the scenic spot over all time periods in chronological order to form all implicit features of the scenic spot.
5. The method of claim 4, wherein step (2-4) comprises the following steps:
(2-4-1) compiling statistics on the access information v = (l, u, t) in the tourist photo dataset D_photo-travel and on the check-in place category dataset D_category to obtain the explicit features of the user and the explicit features of the scenic spot, where r is the total number of explicit features of the user and s is the total number of explicit features of the scenic spot;
(2-4-2) combining the explicit features and the implicit features of the user to construct the user profile F_user, and combining the explicit features and the implicit features of the scenic spot to construct the scenic spot profile F_location;
(2-4-3) computing the user-user similarity matrix A with the cosine formula:

$$A_{pq} = \cos(u_p, u_q) = \frac{\sum_{i=1}^{r} f_{pi}\, f_{qi}}{\sqrt{\sum_{i=1}^{r} f_{pi}^{2}}\;\sqrt{\sum_{i=1}^{r} f_{qi}^{2}}}$$

where u_p and u_q denote user p and user q, f_pi and f_qi denote the i-th explicit feature of u_p and u_q, respectively, and r is the total number of explicit features;
the scenic spot-scenic spot similarity matrix B is obtained with the cosine formula in the same way:

$$B_{xy} = \cos(l_x, l_y) = \frac{\sum_{i=1}^{s} f_{xi}\, f_{yi}}{\sqrt{\sum_{i=1}^{s} f_{xi}^{2}}\;\sqrt{\sum_{i=1}^{s} f_{yi}^{2}}}$$

where l_x and l_y denote scenic spot x and scenic spot y, f_xi and f_yi denote the i-th explicit feature of l_x and l_y, respectively, and s is the number of explicit features.
6. The method as claimed in claim 5, wherein in step (2-5), during the decomposition of the matrix Y, the similarity information in A and B is used as additional regularization terms to constrain the decomposition of Y, the specific objective function being:
where R_ij is the value at position (i, j) of the matrix Y; I_ij is an indicator of whether user i visited scenic spot j, equal to 1 if user i visited scenic spot j and 0 otherwise; Sim_ig denotes the similarity between user i and user g, and Sim_jq denotes the similarity between scenic spot j and scenic spot q; U_i, U_g, L_j and L_q denote the latent feature vectors of user i, user g, scenic spot j and scenic spot q, respectively; the distance term represents the distance between the latent feature vector of user i and that of user g; G(i) denotes the group of users similar to user i, and Q(j) denotes the group of scenic spots similar to scenic spot j; U holds the latent feature vectors of the users after the decomposition of Y and has dimension d × m; L holds the latent feature vectors of the scenic spots after the decomposition of Y and has dimension d × n; and m, n and d denote the number of users, the number of scenic spots and the number of latent features, respectively;
the specific steps of decomposing Y with the matrix decomposition model with joint regularization terms are as follows:
(a) randomly initializing U and L, and setting the learning rate α, the error threshold δ, and the parameters λ_1 and λ_2;
(b) for each non-zero value R_ij in Y, computing the estimate X_ij of R_ij, then computing the error between X_ij and the true value R_ij, and finally computing the total error θ over all non-zero values, where w is the number of non-zero values;
(c) determining whether the total error θ is greater than the error threshold δ; if so, executing step (d); otherwise ending the iteration, the decomposition of the matrix Y being complete and the current U and L being the optimal values;
(d) updating the values of U and L by gradient descent, then returning to step (b), the gradient descent update formulas being:
7. The method of claim 1, wherein the stage of recommending tourist attractions comprises the following specific steps:
(3-1) obtaining, according to the information input by the user, the set of scenic spots in the city the user is touring, these scenic spots serving as the recommendation candidate set;
(3-2) obtaining the user's scores for the scenic spots in the recommendation candidate set from the matrix Y', thereby obtaining the user's preference for those scenic spots;
(3-3) sorting the scenic spots in descending order of score and recommending the N top-ranked tourist attractions to the user.
CN201710237404.6A 2017-04-12 2017-04-12 A kind of tourist attractions recommended method based on Dynamic Theme model and matrix decomposition Expired - Fee Related CN107133277B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710237404.6A CN107133277B (en) 2017-04-12 2017-04-12 A kind of tourist attractions recommended method based on Dynamic Theme model and matrix decomposition

Publications (2)

Publication Number Publication Date
CN107133277A CN107133277A (en) 2017-09-05
CN107133277B true CN107133277B (en) 2019-09-06

Family

ID=59716372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710237404.6A Expired - Fee Related CN107133277B (en) 2017-04-12 2017-04-12 A kind of tourist attractions recommended method based on Dynamic Theme model and matrix decomposition

Country Status (1)

Country Link
CN (1) CN107133277B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110892232A (en) * 2017-11-10 2020-03-17 宝马股份公司 Method and apparatus for intelligently managing multiple potential travel destinations for a user
CN110119822B (en) * 2018-02-06 2024-03-15 阿里巴巴集团控股有限公司 Scenic spot management, journey planning method, client and server
CN108537691A (en) * 2018-06-08 2018-09-14 延晋 A kind of region visit intelligent management system and method
CN109754305A (en) * 2018-11-13 2019-05-14 北京码牛科技有限公司 The preference method of excavation and device based on matrix decomposition algorithm
CN110263256B (en) * 2019-06-21 2022-12-02 西安电子科技大学 Personalized recommendation method based on multi-mode heterogeneous information
CN110348968B (en) * 2019-07-15 2022-02-15 辽宁工程技术大学 Recommendation system and method based on user and project coupling relation analysis
CN110569447B (en) * 2019-09-12 2022-03-15 腾讯音乐娱乐科技(深圳)有限公司 Network resource recommendation method and device and storage medium
US11402223B1 (en) 2020-02-19 2022-08-02 BlueOwl, LLC Systems and methods for generating scenic routes
US11378410B1 (en) 2020-02-19 2022-07-05 BlueOwl, LLC Systems and methods for generating calm or quiet routes
CN112257517B (en) * 2020-09-30 2023-04-21 中国地质大学(武汉) Tourist attraction recommendation system based on attraction clustering and group emotion recognition
CN112348291B (en) * 2020-12-07 2022-08-26 福州灵和晞科技有限公司 Travel information management method
US11477603B2 (en) 2021-03-03 2022-10-18 International Business Machines Corporation Recommending targeted locations and optimal experience time
CN113505311B (en) * 2021-07-12 2022-03-11 中国科学院地理科学与资源研究所 Scenic spot interaction recommendation method based on' potential semantic space
CN114139052B (en) * 2021-11-19 2022-10-21 北京百度网讯科技有限公司 Ranking model training method for intelligent recommendation, intelligent recommendation method and device
CN117575125A (en) * 2024-01-17 2024-02-20 巢湖学院 Path optimization method based on matrix complement collaborative filtering and quantum approximation optimization

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045865A (en) * 2015-07-13 2015-11-11 电子科技大学 Kernel-based collaborative theme regression tag recommendation method
CN106055713A (en) * 2016-07-01 2016-10-26 华南理工大学 Social network user recommendation method based on extraction of user interest and social topic

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9542477B2 (en) * 2013-12-02 2017-01-10 Qbase, LLC Method of automated discovery of topics relatedness

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045865A (en) * 2015-07-13 2015-11-11 电子科技大学 Kernel-based collaborative theme regression tag recommendation method
CN106055713A (en) * 2016-07-01 2016-10-26 华南理工大学 Social network user recommendation method based on extraction of user interest and social topic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Trip similarity computation for context-aware travel recommendation exploiting geotagged photos;Zhengxing Xu;《2014 IEEE 30th International Conference on Data Engineering Workshops》;20140519;第1-5页
基于主题模型的矩阵分解推荐算法;林晓勇等;《计算机应用》;20151215;第122-125页

Also Published As

Publication number Publication date
CN107133277A (en) 2017-09-05

Similar Documents

Publication Publication Date Title
CN107133277B (en) A kind of tourist attractions recommended method based on Dynamic Theme model and matrix decomposition
CN106997389B (en) Scenic spot recommendation method based on multi-dataset and collaborative tensor decomposition
Jiang et al. Author topic model-based collaborative filtering for personalized POI recommendations
CN110555112B (en) Interest point recommendation method based on user positive and negative preference learning
Wang et al. What your images reveal: Exploiting visual contents for point-of-interest recommendation
Luan et al. Partition-based collaborative tensor factorization for POI recommendation
Zheng et al. GeoLife: A collaborative social networking service among user, location and trajectory.
Sun et al. Tour recommendations by mining photo sharing social media
Wan et al. A hybrid ensemble learning method for tourist route recommendations based on geo-tagged social networks
Zhao et al. Photo2Trip: Exploiting visual contents in geo-tagged photos for personalized tour recommendation
CN107133262A (en) A kind of personalized POI embedded based on many influences recommends method
Xu et al. A dynamic topic model and matrix factorization-based travel recommendation method exploiting ubiquitous data
Li et al. Where you instagram? associating your instagram photos with points of interest
Peng et al. Perceiving Beijing’s “city image” across different groups based on geotagged social media data
CN109062962A (en) A kind of gating cycle neural network point of interest recommended method merging Weather information
Liu et al. Where your photo is taken: Geolocation prediction for social images
CN108897750A (en) Merge the personalized location recommendation method and equipment of polynary contextual information
CN103399900A (en) Image recommending method based on location service
Lee et al. Public bike trip purpose inference using point-of-interest data
Li et al. [Retracted] Research on the Recommendation Algorithm of Rural Tourism Routes Based on the Fusion Model of Multiple Data Sources
CN105069003B (en) A kind of user's perpetual object based on forwarding chain similarity recommends computational methods
CN112015937B (en) Picture geographic positioning method and system
Chen et al. Exploiting aesthetic features in visual contents for movie recommendation
Lian et al. Mining check-in history for personalized location naming
Liao et al. An integrated model based on deep multimodal and rank learning for point-of-interest recommendation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20190906)