CN111882381A

CN111882381A - Travel recommendation method based on collaborative memory network

Info

Publication number: CN111882381A
Application number: CN202010618323.2A
Authority: CN
Inventors: 古天龙; 陈红亮; 宾辰忠
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2020-06-30
Filing date: 2020-06-30
Publication date: 2020-11-03
Anticipated expiration: 2040-06-30
Also published as: CN111882381B

Abstract

The invention discloses a travel recommendation method based on a collaborative memory network. First, a neural network is used to vectorize users and scenic spots. A memory network that uses an attention mechanism is then introduced. Finally, when predicting a user's score for a scenic spot, the global structure information of the latent factor model is combined with the local neighborhood structure information of the memory network. The invention considers the local neighborhood structure information of both users and scenic spots and fuses it with the collaborative filtering method of the latent factor model, yielding a travel recommendation method based on a collaborative memory network that aims at highly accurate and highly personalized travel recommendation.

Description

A travel recommendation method based on collaborative memory network

Technical Field

The invention relates to the technical field of intelligent recommendation, and in particular to a travel recommendation method based on a collaborative memory network.

Background

With the development of society and rising living standards, more and more people choose to travel. The rapid growth of tourism and the spread of Internet technology in recent years have overloaded mainstream travel information service platforms with travel information. Demand for personalized scenic spot recommendation keeps growing, and recommending suitable scenic spots to travelers has become particularly important.

Traditional deep learning-based travel recommendation methods mainly build a deep learning framework on collaborative filtering with a latent factor model and use it to make recommendations. These methods consider only the global structure information of the latent factor model and ignore local neighborhood structure information, which leads to scenic spot recommendations with low accuracy and insufficient personalization.

Summary of the Invention

The invention addresses the low accuracy and insufficient personalization of scenic spot recommendations in existing deep learning-based travel recommendation methods, and provides a travel recommendation method based on a collaborative memory network.

To solve the above problems, the invention is implemented through the following technical solution:

A travel recommendation method based on a collaborative memory network, comprising the following steps:

Step 1. Use a crawler tool to collect users' travel notes from travel websites, extract the scenic spots visited by each user, and number all users and all scenic spots separately;

Step 2. Take all scenic spots visited by each user as one set to construct the scenic spot neighborhood set of that user; likewise, take all users who have visited each scenic spot as one set to construct the user neighborhood set of that scenic spot;

Step 3. Use a neural network to map all users and all scenic spots into two feature vector spaces, one consisting of the user vector matrix and the scenic spot vector matrix, and the other consisting of the user external memory matrix and the scenic spot external memory matrix;

Step 4. For each user $u$, based on the user vector matrix and the scenic spot vector matrix obtained in step 3, compute the similarity $q_{uiv}$ between user $u$ and each user $v$ in the user neighborhood set of scenic spot $i$, then normalize these similarities to obtain the weight coefficients $p_{uiv}$ of user $u$, where:

$$q_{uiv} = m_u^{T} m_v + e_i^{T} m_v, \quad \forall v \in N(i)$$

$$p_{uiv} = \frac{\exp(q_{uiv})}{\sum_{k \in N(i)} \exp(q_{uik})}, \quad \forall v \in N(i)$$

Step 5. Use the weight coefficients $p_{uiv}$ of user $u$ as the attention weights and compute the weighted sum of the user feature vectors, in the user external memory matrix, of the users $v$ in the user neighborhood set of scenic spot $i$, obtaining the neighborhood structure information $o_{ui}$ of user $u$;

Step 6. For each scenic spot $i$, based on the user vector matrix and the scenic spot vector matrix obtained in step 3, compute the similarity $q_{iuj}$ between scenic spot $i$ and each scenic spot $j$ in the scenic spot neighborhood set of user $u$, then normalize these similarities to obtain the weight coefficients $p_{iuj}$ of scenic spot $i$, where:

$$q_{iuj} = e_i^{T} e_j + m_u^{T} e_j, \quad \forall j \in S(u)$$

$$p_{iuj} = \frac{\exp(q_{iuj})}{\sum_{k \in S(u)} \exp(q_{iuk})}, \quad \forall j \in S(u)$$

Step 7. Use the weight coefficients $p_{iuj}$ of scenic spot $i$ as the attention weights and compute the weighted sum of the scenic spot feature vectors, in the scenic spot external memory matrix, of the scenic spots $j$ in the scenic spot neighborhood set of user $u$, obtaining the neighborhood structure information $o_{iu}$ of scenic spot $i$;

Step 8. Take the element-wise product of the user feature vector of user $u$ in the user vector matrix and the scenic spot feature vector of scenic spot $i$ in the scenic spot vector matrix, obtaining the global structure information of user $u$ with respect to scenic spot $i$;

Step 9. Concatenate the neighborhood structure information $o_{ui}$ of user $u$ from step 5, the neighborhood structure information $o_{iu}$ of scenic spot $i$ from step 7, and the global structure information of user $u$ with respect to scenic spot $i$ from step 8, and feed the concatenated vector into a multi-layer neural network to obtain the final score of user $u$ for scenic spot $i$;

Step 10. When scenic spots need to be recommended to a target user, sort that user's final scores for all scenic spots and select the K top-ranked scenic spots, which gives the Top-K scenic spot recommendation for the target user;

In the above, $m_u$ denotes the user feature vector of user $u$ in the user vector matrix, and $m_v$ denotes the user feature vector of user $v$ in the user vector matrix; $e_i$ denotes the scenic spot feature vector of scenic spot $i$ in the scenic spot vector matrix, and $e_j$ denotes the scenic spot feature vector of scenic spot $j$ in the scenic spot vector matrix; $N(i)$ denotes the user neighborhood set of scenic spot $i$, formed by all users who have visited scenic spot $i$; $S(u)$ denotes the scenic spot neighborhood set of user $u$, formed by all scenic spots user $u$ has visited; $u = 1, 2, \ldots, n$, where $n$ is the number of users; $i = 1, 2, \ldots, m$, where $m$ is the number of scenic spots; and $K$ is the preset number of scenic spots to recommend.

In step 3 above, the user vector matrix and the user external memory matrix both have dimensions n×d, and the scenic spot vector matrix and the scenic spot external memory matrix both have dimensions m×d, where n is the number of users, m is the number of scenic spots, and d is the given dimension of each feature vector.

Step 3 above further includes pre-training and updating the user vector matrix and the scenic spot vector matrix with the generalized matrix factorization method.

In step 5 above, the neighborhood structure information $o_{ui}$ of user $u$ is:

$$o_{ui} = \sum_{v \in N(i)} p_{uiv}\, c_v$$

where $p_{uiv}$ denotes the weight coefficient of user $u$, $c_v$ denotes the user feature vector, in the user external memory matrix, of each user $v$ in the user neighborhood set of scenic spot $i$, $N(i)$ denotes the user neighborhood set formed by all users who have visited scenic spot $i$, $u = 1, 2, \ldots, n$ with $n$ the number of users, and $i = 1, 2, \ldots, m$ with $m$ the number of scenic spots.

In step 7 above, the neighborhood structure information $o_{iu}$ of scenic spot $i$ is:

$$o_{iu} = \sum_{j \in S(u)} p_{iuj}\, y_j$$

where $p_{iuj}$ denotes the weight coefficient of scenic spot $i$, $y_j$ denotes the scenic spot feature vector, in the scenic spot external memory matrix, of each scenic spot $j$ in the scenic spot neighborhood set of user $u$, $S(u)$ denotes the scenic spot neighborhood set formed by all scenic spots user $u$ has visited, $u = 1, 2, \ldots, n$ with $n$ the number of users, and $i = 1, 2, \ldots, m$ with $m$ the number of scenic spots.

By using neighborhood-based (memory-based) collaborative filtering, the invention considers the local neighborhood structure information of both users and scenic spots and fuses it with the collaborative filtering method of the latent factor model, yielding a travel recommendation method based on a collaborative memory network that aims at highly accurate and highly personalized travel recommendation.

Compared with the prior art, the invention has the following features:

1. The invention uses a neural network to vectorize users and scenic spots, that is, it maps users and scenic spots into a low-dimensional vector space and represents them as vectors. Representing natural language as simple vectors preserves the meaning of the original data and greatly simplifies computation, so that the representations of users and scenic spots combine more accurately and reasonably with travel recommendation;

2. The invention introduces a memory network that uses an attention mechanism. It considers both the similarity between a user and the users in that user's neighborhood and the similarity between a scenic spot and the scenic spots in that scenic spot's neighborhood, and so obtains local neighborhood structure information covering both user neighborhoods and scenic spot neighborhoods, which improves the personalization and accuracy of travel recommendation;

3. When predicting a user's score for a scenic spot, the invention combines the global structure information of the latent factor model with the local neighborhood structure information of the memory network, taking both global and local information into account and thereby ensuring the effectiveness of travel recommendation.

Detailed Description

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to specific examples.

A travel recommendation method based on a collaborative memory network specifically includes the following steps:

Step 1. Use a crawler tool to collect users' travel notes from travel websites, extract the scenic spots visited by each user, and number all users and all scenic spots separately.

Using existing web crawling techniques, collect users' travel notes from travel websites such as Ctrip and Mafengwo. From the semi-structured data in each travel note that records the sequence of scenic spots the user visited, extract the scenic spots each user has visited, and number all collected users and all scenic spots separately, which completes data preprocessing.

For the scenic spots a user visited, either keep every scenic spot mentioned in the travel notes, or prune scenic spots according to the user's ratings so that only the higher-rated scenic spots the user visited are kept. For example, when the extracted overall ratings are 2.8, 3.5, 4.5, and so on, that is, on a five-point scale, scenic spots rated 3 or higher are kept.
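The optional rating-based pruning described above can be sketched as follows. This is a minimal illustration, assuming each extracted visit carries a rating on a five-point scale; the function name `keep_well_rated` and the sample spot names are hypothetical.

```python
def keep_well_rated(spot_ratings, threshold=3.0):
    """Keep only the scenic spots whose rating is at least `threshold`."""
    return [spot for spot, rating in spot_ratings if rating >= threshold]


# Hypothetical ratings extracted from one user's travel notes.
rated = [("spot_a", 2.8), ("spot_b", 3.5), ("spot_c", 4.5)]
kept = keep_well_rated(rated)
print(kept)
```

With the threshold of 3 from the example in the text, the 2.8-rated spot is dropped and the other two are kept.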

Because the raw travel note data cannot be used directly in later computation, all collected users and scenic spots are numbered uniformly and represented by unique ID values. This simplifies later processing and also anonymizes user information. For example, the first user "03202001" gets ID 0 and the second user "03202002" gets ID 1; the first scenic spot, 七星景区 (Seven Star Scenic Area), gets ID 0 and the second scenic spot, 象山景区 (Xiangshan Scenic Area), gets ID 1; and so on, with each subsequent ID incremented by 1.
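A minimal sketch of the uniform numbering, assuming the crawled data has already been reduced to (user account, scenic spot name) pairs. The helper `assign_ids` and the record layout are illustrative assumptions, not part of the described method.

```python
def assign_ids(names):
    """Map each distinct name to a consecutive integer ID in first-seen order."""
    ids = {}
    for name in names:
        if name not in ids:
            ids[name] = len(ids)
    return ids


# Hypothetical (user account, scenic spot) pairs extracted from travel notes.
records = [
    ("03202001", "Seven Star Scenic Area"),
    ("03202001", "Xiangshan Scenic Area"),
    ("03202002", "Seven Star Scenic Area"),
]
user_ids = assign_ids(user for user, _ in records)
spot_ids = assign_ids(spot for _, spot in records)

# Replace the raw names by their IDs; later steps only see integers.
visits = [(user_ids[user], spot_ids[spot]) for user, spot in records]
print(visits)
```

Because only the integer IDs are passed on, the original account names do not appear in later processing, which matches the anonymization remark above.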

Step 2. Take all scenic spots visited by each user as one set to construct the scenic spot neighborhood set of that user. Take all users who have visited each scenic spot as one set to construct the user neighborhood set of that scenic spot.

All scenic spots visited by the same user are represented as a set <P1, P2, …, Pn>, which forms the scenic spot neighborhood set of that user, where P1, P2, …, Pn are the IDs of the scenic spots that user has visited. For example, if user U1 has visited scenic spots P1, P2, …, Pn, the scenic spot neighborhood set of user U1 is <P1, P2, …, Pn>.

All users who have visited the same scenic spot are represented as a set <U1, U2, …, Un>, which forms the user neighborhood set of that scenic spot, where U1, U2, …, Un are the IDs of the users who have visited that scenic spot. For example, if users U1, U2, …, Un have all visited scenic spot P1, the user neighborhood set of scenic spot P1 is <U1, U2, …, Un>.
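Both neighborhood sets can be built in one pass over the visit records. The sketch below assumes the visits are available as (user ID, scenic spot ID) pairs; the names `build_neighborhoods`, `S`, and `N` are chosen here to mirror S(u) and N(i) and are otherwise hypothetical.

```python
from collections import defaultdict


def build_neighborhoods(visits):
    """Return (S, N): S[u] is the set of spots user u visited,
    N[i] is the set of users who visited spot i."""
    spots_of_user = defaultdict(set)
    users_of_spot = defaultdict(set)
    for user, spot in visits:
        spots_of_user[user].add(spot)
        users_of_spot[spot].add(user)
    return dict(spots_of_user), dict(users_of_spot)


# Hypothetical visit records as (user ID, scenic spot ID) pairs.
visits = [(0, 0), (0, 1), (1, 0), (2, 1)]
S, N = build_neighborhoods(visits)
print(S)
print(N)
```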

Step 3. Use a neural network to map all users and scenic spots into two feature vector spaces, one comprising the user vector matrix and the scenic spot vector matrix, and the other comprising the user external memory matrix and the scenic spot external memory matrix. The user vector matrix and the user external memory matrix have dimensions n×d, and the scenic spot vector matrix and the scenic spot external memory matrix have dimensions m×d, where n and m are the numbers of users and scenic spots respectively and d is the given vector dimension. In this embodiment, d = 50.

The number of users n and the number of scenic spots m are fed as parameters into a neural network module, which outputs an n×d vector matrix for the users and an m×d vector matrix for the scenic spots. Because every value in these two matrices is produced by a user-chosen initialization method (for example, a normal distribution that randomly generates a number between 0 and 1), feeding n and m into the neural network twice gives different results: the first pass yields the n×d user vector matrix and the m×d scenic spot vector matrix, and the second pass yields the n×d user external memory matrix and the m×d scenic spot external memory matrix.
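The four matrices can be pictured as independently initialized lookup tables. The sketch below uses plain Python lists instead of a neural network embedding layer; the Gaussian initialization with standard deviation 0.1, the fixed seed, and all variable names are assumptions made for illustration only.

```python
import random


def init_matrix(rows, d, rng):
    """Create a rows x d matrix of small random values (one row per entity)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(rows)]


n, m, d = 4, 3, 50          # users, scenic spots, vector dimension
rng = random.Random(0)      # fixed seed so the sketch is repeatable

# First pass: the vector matrices (rows m_u and e_i).
user_vectors = init_matrix(n, d, rng)
spot_vectors = init_matrix(m, d, rng)

# Second pass: the external memory matrices (rows c_v and y_j).
user_memory = init_matrix(n, d, rng)
spot_memory = init_matrix(m, d, rng)

print(len(user_vectors), len(user_vectors[0]))
print(len(spot_memory), len(spot_memory[0]))
```

The two passes draw different random values, so the vector matrices and the external memory matrices share a shape but not their contents.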

In addition, so that a user's vector is more similar to the vectors of scenic spots the user has visited than to the vectors of scenic spots the user has not visited, with similarity measured by dot product, this embodiment also pre-trains and updates the user vector matrix and the scenic spot vector matrix obtained above using the generalized matrix factorization method (a deep learning model).
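The text does not spell out the pre-training procedure. As a rough sketch only, the code below assumes the common generalized matrix factorization form, in which the predicted visit probability is a sigmoid of a weighted element-wise product of the user and scenic spot vectors, trained with binary cross-entropy. The weight vector `h`, the learning rate, and the function names are assumptions.

```python
import math


def gmf_predict(p, q, h):
    """Sigmoid of the h-weighted element-wise product of vectors p and q."""
    z = sum(hk * pk * qk for hk, pk, qk in zip(h, p, q))
    return 1.0 / (1.0 + math.exp(-z))


def gmf_sgd_step(p, q, h, label, lr=0.1):
    """One gradient step on binary cross-entropy for a single (user, spot) pair."""
    g = gmf_predict(p, q, h) - label  # derivative of the loss w.r.t. z
    new_p = [pk - lr * g * hk * qk for pk, qk, hk in zip(p, q, h)]
    new_q = [qk - lr * g * hk * pk for pk, qk, hk in zip(p, q, h)]
    new_h = [hk - lr * g * pk * qk for pk, qk, hk in zip(p, q, h)]
    return new_p, new_q, new_h


# Toy user vector, scenic spot vector, and output weights.
p, q, h = [0.1, 0.2], [0.3, -0.1], [1.0, 1.0]
before = gmf_predict(p, q, h)
p, q, h = gmf_sgd_step(p, q, h, label=1.0)  # label 1.0: the user visited the spot
after = gmf_predict(p, q, h)
print(before, after)
```

A step on a visited pair raises the predicted probability, which is the effect the paragraph above asks of pre-training: visited scenic spots end up closer to the user than unvisited ones.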

Step 4. Compute the similarity between a user's vector in the user vector matrix and the vectors of the other users in the user neighborhood set, and use it as the weight coefficient. Then build the user memory network from the user vector matrix, the corresponding weight coefficients, and the user external memory matrix to obtain the neighborhood structure information of the user.

Select a user numbered $u$ from the uniformly numbered user set and a scenic spot numbered $i$ from the uniformly numbered scenic spot set; user $u$ and scenic spot $i$ are the user and scenic spot for which the memory network is currently built.

First, in the user vector matrix, take the dot product of the vector of user $u$ with each user vector in the user neighborhood set of scenic spot $i$, then add the dot product of the vector of scenic spot $i$ in the scenic spot vector matrix with that same user vector. This gives the similarity $q_{uiv}$ between user $u$ and each user in the user neighborhood set:

$$q_{uiv} = m_u^{T} m_v + e_i^{T} m_v, \quad \forall v \in N(i) \qquad (1)$$

where $q_{uiv}$ denotes the similarity, before normalization, between user $u$ and user $v$ in the user neighborhood set of scenic spot $i$; $m_u$ and $m_v$ denote the feature vectors of user $u$ and user $v$ in the user vector matrix; $e_i$ denotes the feature vector of scenic spot $i$ in the scenic spot vector matrix; and $N(i)$ denotes the user neighborhood set formed by all users who have visited scenic spot $i$.

Next, normalize the similarities $q_{uiv}$ between user $u$ and each user in the user neighborhood set, and use the normalized similarities as the weight coefficients $p_{uiv}$ of user $u$:

$$p_{uiv} = \frac{\exp(q_{uiv})}{\sum_{k \in N(i)} \exp(q_{uik})}, \quad \forall v \in N(i) \qquad (2)$$

where $\exp(x)$ is the exponential function with base $e$ and exponent $x$.
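The similarity and its normalization can be sketched together. The code assumes the similarity is the sum of the two dot products described in the text and that the normalization is a softmax over the neighborhood, consistent with the use of exp(x). Subtracting the maximum before exponentiating is an added numerical-stability detail, and the function names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def user_attention(m_u, e_i, neighbor_vectors):
    """Return the weights p_uiv for each neighbor vector m_v, v in N(i)."""
    q = [dot(m_u, m_v) + dot(e_i, m_v) for m_v in neighbor_vectors]
    top = max(q)  # softmax is unchanged by this shift; it avoids overflow
    exps = [math.exp(x - top) for x in q]
    total = sum(exps)
    return [x / total for x in exps]


m_u = [1.0, 0.0]                       # vector of user u
e_i = [0.0, 1.0]                       # vector of scenic spot i
neighbors = [[1.0, 1.0], [0.0, 0.0]]   # vectors m_v of the users in N(i)
p = user_attention(m_u, e_i, neighbors)
print(p)
```

The weights sum to 1, and the neighbor whose vector aligns with both the user and the scenic spot receives the larger weight.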

Finally, use the weight coefficients $p_{uiv}$ of user $u$ as the attention weights and compute the weighted sum of the feature vectors, in the user external memory matrix, of the users $v$ in the user neighborhood set of scenic spot $i$. User $u$ is thus represented through the external memory vectors of the users in the neighborhood set, which gives the vector representation $o_{ui}$ containing user neighborhood information, that is, the neighborhood structure information of user $u$:

$$o_{ui} = \sum_{v \in N(i)} p_{uiv}\, c_v \qquad (3)$$

where $p_{uiv}$ denotes the weight coefficient of user $u$, and $c_v$ denotes the feature vector, in the user external memory matrix, of each user $v$ in the user neighborhood set of scenic spot $i$.
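The weighted sum over the external memory vectors is a few lines of code. In this sketch the weights and memory rows are toy values, and `weighted_sum` is a hypothetical helper name.

```python
def weighted_sum(weights, memory_rows):
    """Sum the memory rows c_v, each scaled by its attention weight p_uiv."""
    d = len(memory_rows[0])
    return [
        sum(w * row[k] for w, row in zip(weights, memory_rows))
        for k in range(d)
    ]


p_uiv = [0.75, 0.25]             # attention weights over two neighbors
c = [[1.0, 0.0], [0.0, 2.0]]     # their rows in the user external memory matrix
o_ui = weighted_sum(p_uiv, c)
print(o_ui)
```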

Step 5. Compute the similarity between a scenic spot's vector in the scenic spot vector matrix and the vectors of the other scenic spots in the scenic spot neighborhood set, and use it as the weight coefficient. Then build the scenic spot memory network from the scenic spot vector matrix, the corresponding weight coefficients, and the scenic spot external memory matrix to obtain the neighborhood structure information of the scenic spot.

Select a user numbered $u$ from the uniformly numbered user set and a scenic spot numbered $i$ from the uniformly numbered scenic spot set; user $u$ and scenic spot $i$ are the user and scenic spot for which the memory network is currently built.

First, in the scenic spot vector matrix, take the dot product of the vector of scenic spot $i$ with each scenic spot vector in the scenic spot neighborhood set of user $u$, then add the dot product of the vector of user $u$ in the user vector matrix with that same scenic spot vector. This gives the similarity $q_{iuj}$ between scenic spot $i$ and each scenic spot in the scenic spot neighborhood set:

$$q_{iuj} = e_i^{T} e_j + m_u^{T} e_j, \quad \forall j \in S(u) \qquad (4)$$

where $q_{iuj}$ denotes the similarity, before normalization, between scenic spot $i$ and scenic spot $j$ in the scenic spot neighborhood set of user $u$; $e_i$ and $e_j$ denote the feature vectors of scenic spot $i$ and scenic spot $j$ in the scenic spot vector matrix; $m_u$ denotes the feature vector of user $u$ in the user vector matrix; and $S(u)$ denotes the scenic spot neighborhood set formed by all scenic spots user $u$ has visited.

Next, normalize the similarities $q_{iuj}$ between scenic spot $i$ and each scenic spot in the scenic spot neighborhood set, and use the normalized similarities as the weight coefficients $p_{iuj}$ of scenic spot $i$:

$$p_{iuj} = \frac{\exp(q_{iuj})}{\sum_{k \in S(u)} \exp(q_{iuk})}, \quad \forall j \in S(u) \qquad (5)$$

Finally, use the weight coefficients $p_{iuj}$ of scenic spot $i$ as the attention weights and compute the weighted sum of the feature vectors, in the scenic spot external memory matrix, of the scenic spots $j$ in the scenic spot neighborhood set of user $u$. Scenic spot $i$ is thus represented through the external memory vectors of the scenic spots in the neighborhood set, which gives the vector representation $o_{iu}$ containing scenic spot neighborhood information, that is, the neighborhood structure information of scenic spot $i$:

$$o_{iu} = \sum_{j \in S(u)} p_{iuj}\, y_j \qquad (6)$$

where $p_{iuj}$ denotes the weight coefficient of scenic spot $i$, and $y_j$ denotes the feature vector, in the scenic spot external memory matrix, of each scenic spot $j$ in the scenic spot neighborhood set of user $u$.
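The scenic spot side mirrors the user side, so the three sub-steps (similarity, normalization, weighted sum) can be sketched as one function. As before, the softmax normalization and the sum-of-dot-products similarity are read from the description, while the function name, argument layout, and toy data are assumptions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def spot_neighborhood_info(u, i, S, user_vectors, spot_vectors, spot_memory):
    """Compute o_iu: attention over the spots j in S(u), read from spot_memory."""
    neighbors = sorted(S[u])
    # Similarity q_iuj: spot-to-spot dot product plus user-to-spot dot product.
    q = [
        dot(spot_vectors[i], spot_vectors[j]) + dot(user_vectors[u], spot_vectors[j])
        for j in neighbors
    ]
    top = max(q)  # shift for numerical stability; the softmax is unchanged
    exps = [math.exp(x - top) for x in q]
    total = sum(exps)
    p = [x / total for x in exps]
    # Weighted sum of the external memory rows y_j.
    d = len(spot_memory[0])
    return [sum(w * spot_memory[j][k] for w, j in zip(p, neighbors)) for k in range(d)]


S = {0: {0, 1}}                              # user 0 visited spots 0 and 1
user_vectors = [[1.0, 0.0]]
spot_vectors = [[1.0, 0.0], [0.0, 1.0]]
spot_memory = [[1.0, 0.0], [0.0, 1.0]]
o_iu = spot_neighborhood_info(0, 0, S, user_vectors, spot_vectors, spot_memory)
print(o_iu)
```

With one-hot memory rows, as in this toy data, the result is simply the attention weights themselves, which makes the output easy to inspect.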

Step 6. Take the element-wise product of the feature vector of user $u$ in the user vector matrix and the feature vector of scenic spot $i$ in the scenic spot vector matrix. The result is the vector representation of the user's preference for the scenic spot, that is, the latent factor model's global structure information of user $u$ with respect to scenic spot $i$.
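The element-wise product keeps the vector dimension, unlike a dot product, which collapses it to a scalar. A minimal sketch with toy vectors follows; the helper name is hypothetical.

```python
def elementwise_product(a, b):
    """Multiply two equal-length vectors entry by entry."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x * y for x, y in zip(a, b)]


m_u = [0.5, -1.0, 2.0]   # feature vector of user u
e_i = [2.0, 3.0, 0.5]    # feature vector of scenic spot i
global_info = elementwise_product(m_u, e_i)
print(global_info)
```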

Step 7. In a nonlinear manner, concatenate the global structure information of the latent factor model with the vector representation containing user neighborhood information (the neighborhood structure information $o_{ui}$ of user $u$) and the vector representation containing scenic spot neighborhood information (the neighborhood structure information $o_{iu}$ of scenic spot $i$).

For example, if the global structure information of the latent factor model, the vector representation containing user neighborhood information, and the vector representation containing scenic spot neighborhood information are each 100-dimensional, the concatenated vector is 300-dimensional.
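The dimension arithmetic in the example can be checked directly. The placeholder vectors below are filled with zeros purely to show the shapes, and `concatenate` is a hypothetical helper.

```python
def concatenate(*vectors):
    """Join several vectors end to end into one longer vector."""
    joined = []
    for vector in vectors:
        joined.extend(vector)
    return joined


d = 100
global_info = [0.0] * d   # element-wise product of m_u and e_i
o_ui = [0.0] * d          # user neighborhood structure information
o_iu = [0.0] * d          # scenic spot neighborhood structure information
joined = concatenate(global_info, o_ui, o_iu)
print(len(joined))
```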

Step 8. Feed the concatenated vector into a multi-layer neural network to obtain the user's final score for the scenic spot. The lower the score, the less likely the user is to visit the scenic spot; the higher the score, the more likely. The formula is:

$$r_{ui} = v^{T}\,\phi\big(U(m_u \odot e_i) + W o_{ui} + W o_{iu} + b\big) \qquad (7)$$

where $r_{ui}$ denotes the user's score for the scenic spot; $U$, $W$, $v$, and $b$ are parameters learned by the neural network; $\odot$ denotes the element-wise product, in which corresponding elements of two vectors are multiplied to give a new vector of the same dimension; and $\phi(x)$ denotes a nonlinear activation function.
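Formula (7) can be traced with small hand-picked parameters. The sketch below assumes ReLU for the activation φ, which the text leaves unspecified, and uses identity matrices for U and W only so the arithmetic is easy to follow; in the method these parameters are learned.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Assumed activation: max(0, x) applied to every entry."""
    return [max(0.0, x) for x in vector]


def score(m_u, e_i, o_ui, o_iu, U, W, v, b):
    """r_ui = v^T phi(U(m_u * e_i) + W o_ui + W o_iu + b), with phi = ReLU."""
    product = [x * y for x, y in zip(m_u, e_i)]
    terms = zip(matvec(U, product), matvec(W, o_ui), matvec(W, o_iu), b)
    hidden = relu([t1 + t2 + t3 + t4 for t1, t2, t3, t4 in terms])
    return sum(vk * hk for vk, hk in zip(v, hidden))


identity = [[1.0, 0.0], [0.0, 1.0]]
r_ui = score(
    m_u=[1.0, 2.0], e_i=[3.0, 4.0],
    o_ui=[1.0, -10.0], o_iu=[0.0, 1.0],
    U=identity, W=identity, v=[1.0, 1.0], b=[0.0, 0.0],
)
print(r_ui)
```

Here the pre-activation vector is [4, -1]; ReLU zeroes the negative entry, so the score is 4.0.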

Step 9. Using the procedure above, compute the target user's scores for all scenic spots, sort the scores, and select the K top-ranked scenic spots, which gives the Top-K scenic spot recommendation for the target user.
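Selecting the K best-scored scenic spots does not require a full sort. A minimal sketch, assuming the scores for one target user are held in a dictionary keyed by scenic spot ID; the scores shown are made up.

```python
import heapq


def top_k(scores, k):
    """Return the k scenic spot IDs with the highest scores, best first."""
    return heapq.nlargest(k, scores, key=scores.get)


# Hypothetical final scores of one target user, keyed by scenic spot ID.
scores = {0: 0.2, 1: 0.9, 2: 0.5, 3: 0.7}
recommended = top_k(scores, 2)
print(recommended)
```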

The invention fuses collaborative filtering based on the latent factor model with neighborhood-based (memory-based) collaborative filtering, so it can exploit both the global structure information of the latent factor model and the local structure information of the neighborhood or memory, thereby achieving personalized travel recommendation.

It should be noted that although the embodiments described above are illustrative, they do not limit the invention, and the invention is therefore not restricted to the specific embodiments above. Any other embodiment that a person skilled in the art obtains under the teaching of the invention, without departing from its principles, is deemed to fall within the scope of protection of the invention.

Claims

1. A travel recommendation method based on a collaborative memory network is characterized by comprising the following steps:

step 1, collecting travel notes of users from a travel website by using a crawler tool, extracting scenic spots visited by each user from the travel notes, and numbering all the users and all the scenic spots respectively;

step 2, taking all scenic spots visited by each user as a set respectively, and constructing a scenic spot neighborhood set related to the user; meanwhile, all users visiting each scenic spot are respectively used as a set to construct a user neighborhood set about the scenic spot;

step 3, mapping all users and all scenic spots to two feature vector spaces respectively by utilizing a neural network, wherein one feature vector space is a user vector matrix and a scenic spot vector matrix, and the other feature vector space is a user external memory matrix and a scenic spot external memory matrix;

step 4, for each user u, calculating the similarity q between the user u and each user v in the user neighborhood set related to the scenic spot i based on the user vector matrix and the scenic spot vector matrix obtained in the step 3_uivAnd the similarity q between the user u and each user v in the user neighborhood set_uivObtaining the weight coefficient p of the user u after normalization processing_uiv(ii) a Wherein:

step 5, weighting coefficient p of user u_uivAs weights in the attention mechanism, and carrying out weighted summation on user feature vectors of each user v in the user neighborhood set of the sight spot i in the user external memory matrix to obtain neighborhood structure information o of the user u_ui；

Step 6, aiming at each sceneAnd (3) calculating the similarity q between the scenery i and each scenery j in the scenery neighborhood set related to the user u based on the user vector matrix and the scenery vector matrix obtained in the step (3)_iujAnd the similarity q of the scenic spot i and each scenic spot j in the scenic spot neighborhood set_iujObtaining the weight coefficient p of the sight spot i after normalization processing_iuj(ii) a Wherein:

Step 7, taking the weight coefficients p_iuj of scenic spot i as the weights in the attention mechanism, and performing a weighted summation of the scenic spot feature vectors, in the scenic spot external memory matrix, of each scenic spot j in the scenic spot neighborhood set of user u, to obtain the neighborhood structure information o_iu about scenic spot i;

Step 8, computing the element-wise product of the user feature vector of user u in the user vector matrix and the scenic spot feature vector of scenic spot i in the scenic spot vector matrix to obtain the global structure information of user u on scenic spot i;

Step 9, concatenating the neighborhood structure information o_ui about user u obtained in step 5, the neighborhood structure information o_iu about scenic spot i obtained in step 7, and the global structure information of user u on scenic spot i obtained in step 8, and inputting the concatenated vector into a multilayer neural network to obtain the final score of user u for scenic spot i;

Step 10, when scenic spots need to be recommended for a target user, sorting the target user's final scores for all scenic spots and selecting the K top-ranked scenic spots, thereby obtaining the Top-K scenic spot recommendation for the target user;

wherein m_u represents the user feature vector of user u in the user vector matrix, and m_v represents the user feature vector of user v in the user vector matrix; e_i represents the scenic spot feature vector of scenic spot i in the scenic spot vector matrix, and e_j represents the scenic spot feature vector of scenic spot j in the scenic spot vector matrix; v ∈ N(i), where N(i) represents the user neighborhood set of scenic spot i, formed by all users who have visited scenic spot i; j ∈ S(u), where S(u) represents the scenic spot neighborhood set of user u, formed by all scenic spots visited by user u; u = 1, 2, …, n, where n represents the number of users; i = 1, 2, …, m, where m represents the number of scenic spots; and K is the preset number of scenic spots to recommend.
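The bookkeeping around the attention steps (steps 2, 8, 9 and 10) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names `build_neighborhoods`, `fuse` and `top_k` are hypothetical, visits are assumed to arrive as (user number, scenic spot number) pairs, and the multilayer neural network of step 9 is left out, so `top_k` simply ranks whatever final scores it is given.

```python
from collections import defaultdict


def build_neighborhoods(visits):
    """Step 2: build S(u) and N(i) from (user, scenic spot) number pairs."""
    spots_of_user = defaultdict(set)  # S(u): scenic spots visited by user u
    users_of_spot = defaultdict(set)  # N(i): users who visited scenic spot i
    for u, i in visits:
        spots_of_user[u].add(i)
        users_of_spot[i].add(u)
    return spots_of_user, users_of_spot


def fuse(o_ui, o_iu, m_u, e_i):
    """Steps 8-9, input side: element-wise product, then concatenation.

    The concatenation order (o_ui, o_iu, global) follows the order in which
    step 9 lists the three parts; it is otherwise an assumption.
    """
    global_info = [a * b for a, b in zip(m_u, e_i)]
    return list(o_ui) + list(o_iu) + global_info


def top_k(scores, k):
    """Step 10: scores maps scenic spot number -> final score for one user."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

The vector returned by `fuse` is what a scoring network would take as input; its length is three times the feature dimension d when all three parts share that dimension.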

2. The method as claimed in claim 1, wherein in step 3, the user vector matrix and the user external memory matrix both have dimensions n × d, and the scenic spot vector matrix and the scenic spot external memory matrix both have dimensions m × d, wherein n represents the number of users, m represents the number of scenic spots, and d represents the dimension of the feature vectors.
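As a shape check for the four matrices of step 3, the sketch below allocates them as plain nested lists. The sizes, the small Gaussian initialization, the helper name `init_matrix`, and the matrix names `M`, `C`, `E`, `Y` (chosen so that rows correspond to m_u, c_v, e_i and y_j) are all assumptions made for illustration; the claim only fixes the dimensions.

```python
import random


def init_matrix(rows, d, seed):
    """Return a rows x d matrix of small random values (assumed initialization)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.01) for _ in range(d)] for _ in range(rows)]


n, m, d = 4, 6, 3  # hypothetical numbers of users, scenic spots, and dimension

M = init_matrix(n, d, seed=0)  # user vector matrix, n x d (rows m_u)
C = init_matrix(n, d, seed=1)  # user external memory matrix, n x d (rows c_v)
E = init_matrix(m, d, seed=2)  # scenic spot vector matrix, m x d (rows e_i)
Y = init_matrix(m, d, seed=3)  # scenic spot external memory matrix, m x d (rows y_j)
```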

3. The travel recommendation method based on the collaborative memory network as claimed in claim 1 or 2, wherein step 3 further comprises: pre-training and updating the user vector matrix and the scenic spot vector matrix by using a generalized matrix factorization method.

4. The travel recommendation method based on the collaborative memory network as claimed in claim 1 or 2, wherein in step 5, the neighborhood structure information o_ui about user u is:

$$o_{ui} = \sum_{v \in N(i)} p_{uiv}\, c_v$$

where p_uiv represents the weight coefficient of user u; c_v represents the user feature vector, in the user external memory matrix, of each user v in the user neighborhood set of scenic spot i; N(i) represents the user neighborhood set formed by all users who have visited scenic spot i; u = 1, 2, …, n, where n represents the number of users; and i = 1, 2, …, m, where m represents the number of scenic spots.
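The user-side attention of steps 4 and 5 can be sketched as follows. Two parts are assumptions rather than text from the claims: the similarity is taken as q_uiv = m_u·m_v + e_i·m_v, as in the standard collaborative memory network formulation, and the normalization is taken to be a softmax. The function and argument names are hypothetical, and the neighborhood set N(i) is assumed to be non-empty.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a non-empty list."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def user_neighborhood_info(u, i, M, E, C, users_of_spot):
    """Return (p_uiv for each v in N(i), o_ui) for user u and scenic spot i.

    M: user vector matrix, E: scenic spot vector matrix,
    C: user external memory matrix, users_of_spot: mapping i -> N(i).
    """
    neighbors = sorted(users_of_spot[i])
    # Assumed similarity: q_uiv = m_u . m_v + e_i . m_v
    q = [dot(M[u], M[v]) + dot(E[i], M[v]) for v in neighbors]
    p = softmax(q)  # assumed normalization
    d = len(C[0])
    # Weighted sum of the memory vectors c_v with the attention weights p_uiv
    o_ui = [sum(p_v * C[v][k] for p_v, v in zip(p, neighbors)) for k in range(d)]
    return p, o_ui
```

Swapping the roles of users and scenic spots (S(u) for N(i), the scenic spot external memory matrix for `C`) gives the scenic-spot-side computation of steps 6 and 7.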

5. The method as claimed in claim 1 or 2, wherein in step 7, the neighborhood structure information o_iu about scenic spot i is:

$$o_{iu} = \sum_{j \in S(u)} p_{iuj}\, y_j$$

where p_iuj represents the weight coefficient of scenic spot i; y_j represents the scenic spot feature vector, in the scenic spot external memory matrix, of each scenic spot j in the scenic spot neighborhood set of user u; S(u) represents the scenic spot neighborhood set formed by all scenic spots visited by user u; u = 1, 2, …, n, where n represents the number of users; and i = 1, 2, …, m, where m represents the number of scenic spots.
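For reference, one plausible form of the scenic-spot-side similarity and weight coefficient of step 6 is written out below. Both expressions are assumptions modeled on the standard collaborative memory network formulation, with the roles of users and scenic spots swapped; they use only the symbols m_u, e_i, e_j and S(u) defined in claim 1 and should not be read as the claimed formulas.

```latex
q_{iuj} = e_i^{\top} e_j + m_u^{\top} e_j,
\qquad
p_{iuj} = \frac{\exp(q_{iuj})}{\sum_{k \in S(u)} \exp(q_{iuk})}
```

Under this reading the weights p_iuj are positive and sum to 1 over S(u), so o_iu is a convex combination of the memory vectors y_j.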