CN113159892B - A product recommendation method based on multimodal product feature fusion - Google Patents

A product recommendation method based on multimodal product feature fusion

Info

Publication number
CN113159892B
CN113159892B (granted publication) · CN113159892A (application publication) · CN202110444726.4A (application number)
Authority
CN
China
Prior art keywords
commodity
user
representation
word
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202110444726.4A
Other languages
Chinese (zh)
Other versions
CN113159892A (en)
Inventor
蔡国永
宋亚飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202110444726.4A
Publication of CN113159892A
Application granted
Publication of CN113159892B

Classifications

    • G06Q 30/0631: Item recommendations (electronic shopping [e-shopping])
    • G06F 16/9535: Search customisation based on user profiles and personalisation
    • G06F 40/216: Parsing using statistical methods
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 7/01: Probabilistic graphical models, e.g. probabilistic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Accounting & Taxation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of commodity recommendation, and in particular relates to a commodity recommendation method based on multimodal commodity feature fusion. The method comprises: constructing a user-commodity bipartite graph from the sequence of commodities a user has purchased, and obtaining vector representations of user nodes and commodity nodes through graph convolution; extracting features from the review text received by each commodity with a convolutional neural network to obtain a vector representation of the commodity reviews; extracting features from the title and description of each commodity with a convolutional neural network to obtain a vector representation of the commodity content; concatenating the vector representations of the commodity node, the reviews and the content to obtain the final representation of the commodity, and taking the vector representation of the user node as the final representation of the user. By exploiting the multimodal features of commodities, the invention greatly alleviates the data sparsity problem in commodity recommendation and improves recommendation accuracy.

Description

A product recommendation method based on multimodal product feature fusion

Technical Field

The invention relates to a commodity recommendation method and belongs to the field of commodity recommendation.

Background Art

When modelling commodities, most current commodity recommendation methods use only the commodity id to extract the collaborative signal implicit in user-commodity interactions. This usually suffers from a severe data sparsity problem, which greatly limits the performance of the recommender system. Although some works incorporate review information to capture the commodity features contained in reviews and thereby alleviate data sparsity, the title and description of the commodity itself are rarely exploited. Review information, however, is written by users; because users differ in expression habits and points of interest, different reviews are usually of different informativeness and may even contain a great deal of noise. Unlike reviews, commodity titles and descriptions are usually written by merchants: they contain more, and more comprehensive, commodity features and are expressed more professionally and accurately. Therefore, when modelling commodities for recommendation, combining the commodity title and description with the commodity id and review information helps achieve better recommendation performance.

Summary of the Invention

In view of the above problems, the present invention proposes a commodity recommendation method based on multimodal commodity feature fusion. The method comprises:

S1: Construct a user-commodity bipartite graph from the sequence of commodities the user has purchased historically, and obtain vector representations of user nodes and commodity nodes through graph convolution;

S2: Obtain the review document of each commodity, and extract the vector representation of the commodity reviews with a convolutional neural network;

S3: Obtain the title and description of each commodity, and extract the vector representation of the commodity content with a convolutional neural network;

S4: Obtain the final representation of the user and the final representation of the commodity;

S5: Compute the similarity between the user and the commodity;

S6: Optimize the parameters of the proposed method with the Bayesian personalized ranking (BPR) loss.

Further, constructing the user-commodity bipartite graph in S1 comprises:

S11: Obtain the sequence of commodities historically purchased by the user from implicit or explicit feedback, construct the user-commodity bipartite graph from this purchase sequence, and represent it with the user-commodity adjacency matrix A ∈ R^((n_u+n_p)×(n_u+n_p)), where n_u and n_p are the numbers of users and commodities respectively, R ∈ R^(n_u×n_p) is the user-commodity interaction matrix and R^T is the transpose of R:

A = [[0, R], [R^T, 0]];

S12: To exploit the information of each node itself in the user-commodity bipartite graph, add an identity matrix I to A; meanwhile, to avoid vanishing or exploding gradients during training, normalize with the diagonal degree matrix D, whose diagonal entries are the degrees of the nodes in the user-commodity bipartite graph, thereby obtaining the normalized adjacency matrix Â = D^(-1/2)(A + I)D^(-1/2).

Further, obtaining the vector representations of the user nodes and the commodity nodes in S1 comprises:

S13: Perform neighbor propagation and aggregation on the user-commodity bipartite graph through graph convolution to obtain the vector representation of each user node and each commodity node.

Further, the specific steps of the graph convolution in S13 are:

S131: Convert the unique id of each user and each commodity into a dense vector through an embedding layer, obtaining the user feature vector e_u ∈ R^d and the commodity feature vector e_p ∈ R^d, where d is the dimension of the feature vectors;

S132: Build an embedding table E^(0) ∈ R^((n_u+n_p)×d) to represent the feature matrix of the user-commodity bipartite graph;

S133: Use t layers of graph convolution to aggregate the features of the node neighbors, where the propagation process is defined as E^(t) = σ(Â E^(t-1) W^(t)), with W^(t) a trainable weight matrix and σ the LeakyReLU activation;

S134: Through the t layers of graph convolution, obtain the t feature matrices E^(1) to E^(t); concatenate these t feature matrices to obtain the final feature matrix E = E^(1) ‖ E^(2) ‖ … ‖ E^(t); then split E into two parts E_u and E_p, which serve as the vector representations e_u of the user nodes and e_p of the commodity nodes, respectively.

Further, extracting the vector representation of the commodity reviews in S2 comprises:

S21: Integrate all reviews received by each commodity into a review document of that commodity, and preprocess the review document by word segmentation, lemmatization, removal of stop words, and removal of words that occur extremely frequently or extremely rarely;

S22: Extract features from the commodity review document with a text feature extractor to obtain the vector representation a_r of the commodity reviews.

Further, the specific steps of the text feature extractor in S22 are:

S221: Represent the word sequence of the input text as [w_1, w_2, …, w_l], where l is the length of the input text;

S222: Convert the word sequence of S221 into a sequence of word vectors [v_1, v_2, …, v_l], v_i ∈ R^(d_v), through a word embedding layer, where d_v is the word embedding dimension;

S223: Process the word vector sequence with a convolutional neural network to obtain the sequence of context word vectors [c_1, c_2, …, c_l], where the context representation c_i of the i-th word is computed as c_i = LeakyReLU(W_t × v_(i-k):(i+k) + b_t);

S224: Use an attention mechanism to compute a weight [α_1, α_2, …, α_l] for the context word vectors, then multiply the context word vectors by the corresponding weights and sum them to obtain the final representation of the input text, a = Σ_(i=1..l) α_i c_i, where α_i is computed from the context representation c_i using the trainable weight matrix W_a, the bias b_a and the attention query vector q.

Further, extracting the vector representation of the commodity content in S3 comprises:

S31: Obtain the title and description of the commodity, and preprocess this text by word segmentation, lemmatization, removal of stop words, and removal of words that occur extremely frequently or extremely rarely;

S32: Extract features from the commodity title and description with the same text feature extractor as in S22 to obtain the vector representation a_t of the commodity content.

Further, obtaining the final user representation and the final commodity representation in S4 comprises:

S41: Concatenate the vector representation e_p of the commodity node, the vector representation a_r of the reviews and the vector representation a_t of the content to obtain the final representation p of the commodity; take the vector representation e_u of the user node as the final representation u of the user.

Further, computing the similarity between the user and the commodity in S5 comprises:

Computing the similarity between the user and the commodity as the dot product of the final user representation and the final commodity representation: ŷ_(u,p) = u^T p.

Further, the Bayesian personalized ranking loss used in S6 to optimize the parameters of the proposed method is: L = Σ_((u,i,j)∈O) -ln σ(ŷ_(u,i) - ŷ_(u,j)) + λ‖θ‖².

Brief Description of the Drawings

FIG. 1 is a schematic flow chart of the commodity recommendation method of the present invention.

FIG. 2 is a schematic structural diagram of the commodity recommendation method of the present invention.

Detailed Description of the Embodiments

The present invention is further described below with reference to the accompanying drawings. The following embodiments are only intended to illustrate the technical solution of the present invention more clearly, and are not intended to limit the scope of protection of the present invention.

As shown in FIG. 1, the present invention provides a commodity recommendation method based on multimodal commodity feature fusion, comprising the following steps:

Step 1: Construct a user-commodity bipartite graph from the sequence of commodities the user has purchased historically, and obtain the vector representations of the user nodes and the commodity nodes through graph convolution.

Specifically, the graph convolution proceeds as follows:

First, the user-commodity bipartite graph is constructed from the historical interactions between users and commodities and is represented with the user-commodity adjacency matrix A ∈ R^((n_u+n_p)×(n_u+n_p)), where n_u and n_p are the numbers of users and commodities respectively, R ∈ R^(n_u×n_p) is the user-commodity interaction matrix and R^T is the transpose of R:

A = [[0, R], [R^T, 0]].

To exploit the information of each node itself in the user-commodity bipartite graph, an identity matrix I is added to A. Meanwhile, to avoid vanishing or exploding gradients during training, normalization is performed with the diagonal degree matrix D, whose diagonal entries are the degrees of the nodes in the user-commodity bipartite graph, yielding the normalized adjacency matrix Â = D^(-1/2)(A + I)D^(-1/2).
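As an illustrative aid (not part of the patent text), a minimal NumPy sketch of this construction step, assuming a dense 0/1 interaction matrix R and the variable names used below:

    import numpy as np

    def build_normalized_adjacency(R: np.ndarray) -> np.ndarray:
        """Build the normalized adjacency matrix of the user-commodity bipartite graph.

        R is the (n_u x n_p) user-commodity interaction matrix,
        R[u, p] = 1 if user u purchased commodity p, else 0.
        """
        n_u, n_p = R.shape
        # Block adjacency matrix A = [[0, R], [R^T, 0]].
        A = np.zeros((n_u + n_p, n_u + n_p))
        A[:n_u, n_u:] = R
        A[n_u:, :n_u] = R.T
        # Add the identity matrix so each node also uses its own information.
        A_tilde = A + np.eye(n_u + n_p)
        # Normalize with the diagonal degree matrix D to keep gradient scales stable.
        d_inv_sqrt = np.power(A_tilde.sum(axis=1), -0.5)
        D_inv_sqrt = np.diag(d_inv_sqrt)
        return D_inv_sqrt @ A_tilde @ D_inv_sqrt

    # Toy usage: 3 users and 4 commodities.
    R = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 0],
                  [1, 1, 0, 1]], dtype=float)
    A_hat = build_normalized_adjacency(R)
    print(A_hat.shape)  # (7, 7)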

Then, the unique id of each user and each commodity is converted into a dense vector through an embedding layer, yielding the user feature vector e_u ∈ R^d and the commodity feature vector e_p ∈ R^d, where d is the dimension of the feature vectors. The following embedding table E^(0) is built to represent the feature matrix of the user-commodity bipartite graph:

E^(0) = [e_(u_1), …, e_(u_(n_u)), e_(p_1), …, e_(p_(n_p))].

Then, t layers of graph convolution are used to aggregate the features of the node neighbors, where the propagation process is defined as:

E^(t) = σ(Â E^(t-1) W^(t)),

where W^(t) is a trainable weight matrix and σ is the LeakyReLU activation function.

Through the t layers of graph convolution, the t feature matrices E^(1) to E^(t) are obtained. These t feature matrices are concatenated to obtain the final feature matrix E = E^(1) ‖ E^(2) ‖ … ‖ E^(t), and E is then split into two parts, E_u and E_p, which serve as the vector representations e_u of the user nodes and e_p of the commodity nodes, respectively.
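An illustrative PyTorch sketch of the t-layer propagation and layer-wise concatenation described above; the module and variable names are assumptions, and Â is passed in as a dense tensor for simplicity:

    import torch
    import torch.nn as nn

    class GraphConvEncoder(nn.Module):
        """t-layer graph convolution over the user-commodity bipartite graph."""

        def __init__(self, n_users: int, n_items: int, dim: int, n_layers: int):
            super().__init__()
            self.n_users, self.n_items = n_users, n_items
            # E^(0): id embeddings of all users and commodities.
            self.embedding = nn.Embedding(n_users + n_items, dim)
            # One trainable weight matrix W^(t) per propagation layer.
            self.weights = nn.ModuleList(
                [nn.Linear(dim, dim, bias=False) for _ in range(n_layers)])
            self.act = nn.LeakyReLU()

        def forward(self, a_hat: torch.Tensor):
            e = self.embedding.weight                  # E^(0)
            layer_outputs = []
            for w in self.weights:
                e = self.act(a_hat @ w(e))             # E^(t) = LeakyReLU(A_hat E^(t-1) W^(t))
                layer_outputs.append(e)
            # Concatenate the t layer outputs, then split into user / commodity parts.
            e_all = torch.cat(layer_outputs, dim=1)
            return torch.split(e_all, [self.n_users, self.n_items], dim=0)

    # Toy usage: identity matrix as a stand-in for the real normalized adjacency.
    encoder = GraphConvEncoder(n_users=3, n_items=4, dim=8, n_layers=2)
    e_user, e_item = encoder(torch.eye(7))
    print(e_user.shape, e_item.shape)  # torch.Size([3, 16]) torch.Size([4, 16])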

Step 2: Integrate all reviews received by each commodity into a review document of that commodity; preprocess the review document by word segmentation, lemmatization, removal of stop words, and removal of words that occur extremely frequently or extremely rarely; then process the review document with the text feature extractor to obtain the vector representation a_r of the commodity reviews.
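An illustrative Python sketch of this preprocessing step, assuming English reviews, NLTK tools for tokenization, stop words and lemmatization, and example frequency thresholds that the patent does not specify:

    from collections import Counter

    import nltk
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer
    from nltk.tokenize import word_tokenize

    for pkg in ("punkt", "stopwords", "wordnet"):
        nltk.download(pkg, quiet=True)

    def preprocess_documents(docs, min_count=5, max_doc_freq=0.9):
        """Tokenize, lemmatize, drop stop words and very frequent / very rare words."""
        lemmatizer = WordNetLemmatizer()
        stops = set(stopwords.words("english"))
        tokenized = []
        for doc in docs:
            tokens = [lemmatizer.lemmatize(w.lower())
                      for w in word_tokenize(doc) if w.isalpha()]
            tokenized.append([w for w in tokens if w not in stops])
        # Frequency statistics used to filter extremely frequent / extremely rare words.
        counts = Counter(w for doc in tokenized for w in doc)
        doc_freq = Counter(w for doc in tokenized for w in set(doc))
        n_docs = max(len(tokenized), 1)
        keep = {w for w in counts
                if counts[w] >= min_count and doc_freq[w] / n_docs <= max_doc_freq}
        return [[w for w in doc if w in keep] for doc in tokenized]

    # Toy usage on two tiny review documents.
    docs = ["Great sound quality, the CD arrived quickly!",
            "The sound is great but the packaging was damaged."]
    print(preprocess_documents(docs, min_count=1, max_doc_freq=1.0))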

Specifically, the text feature extractor proceeds as follows:

First, word embedding: the word sequence of the input text is represented as [w_1, w_2, …, w_l], where l is the length of the input text, and is then converted through a word embedding layer into a sequence of word vectors [v_1, v_2, …, v_l], v_i ∈ R^(d_v), where d_v is the word embedding dimension.

To exploit the local context information in the input text, the word vector sequence is processed with a convolutional neural network, yielding the sequence of context word vectors [c_1, c_2, …, c_l], where the context representation c_i of the i-th word is computed as:

c_i = LeakyReLU(W_t × v_(i-k):(i+k) + b_t),

where v_(i-k):(i+k) is the concatenation of the word embeddings from the (i-k)-th word to the (i+k)-th word, and W_t and b_t are the convolution kernel and bias, respectively.

Considering that different words in the input text are of different informativeness, an attention mechanism is used to compute a weight [α_1, α_2, …, α_l] for the context word vectors; the context word vectors are then multiplied by the corresponding weights and summed to obtain the final representation of the input text, a = Σ_(i=1..l) α_i c_i, where the weight α_i is computed from the context representation c_i using the trainable weight matrix W_a, the bias b_a and the attention query vector q.
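An illustrative PyTorch sketch of this text feature extractor; since the exact attention scoring formula is shown only as a figure in the patent, a common tanh-based scoring with softmax normalization is assumed here:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TextFeatureExtractor(nn.Module):
        """Word embedding -> convolution over a (2k+1)-word window -> attention pooling."""

        def __init__(self, vocab_size: int, emb_dim: int, out_dim: int, k: int = 1):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
            # Convolution over a window of 2k+1 words yields the context vector c_i.
            self.conv = nn.Conv1d(emb_dim, out_dim, kernel_size=2 * k + 1, padding=k)
            # Attention parameters W_a, b_a (inside the Linear layer) and query vector q.
            self.attn_proj = nn.Linear(out_dim, out_dim)
            self.query = nn.Parameter(torch.randn(out_dim))

        def forward(self, word_ids: torch.Tensor) -> torch.Tensor:
            v = self.embedding(word_ids)                     # (batch, seq_len, emb_dim)
            c = F.leaky_relu(self.conv(v.transpose(1, 2)))   # (batch, out_dim, seq_len)
            c = c.transpose(1, 2)                            # (batch, seq_len, out_dim)
            # Attention weights alpha_i over the context word vectors (assumed tanh scoring).
            scores = torch.tanh(self.attn_proj(c)) @ self.query
            alpha = torch.softmax(scores, dim=1)             # (batch, seq_len)
            # Final text representation a = sum_i alpha_i * c_i.
            return (alpha.unsqueeze(-1) * c).sum(dim=1)      # (batch, out_dim)

    # Toy usage: a batch of two documents of 6 word ids each.
    extractor = TextFeatureExtractor(vocab_size=100, emb_dim=32, out_dim=16, k=1)
    a = extractor(torch.randint(1, 100, (2, 6)))
    print(a.shape)  # torch.Size([2, 16])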

Step 3: Obtain the title and description of each commodity; preprocess this text by word segmentation, lemmatization, removal of stop words, and removal of words that occur extremely frequently or extremely rarely; then process the title and description with the same text feature extractor as in Step 2 to obtain the vector representation a_t of the commodity content.

Step 4: Concatenate the vector representation e_p of the commodity node, the vector representation a_r of the reviews and the vector representation a_t of the content to obtain the final representation p of the commodity; take the vector representation e_u of the user node as the final representation u of the user.

Step 5: Compute the similarity between the user and the commodity as the dot product of the final user representation and the final commodity representation:

ŷ_(u,p) = u^T p.
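An illustrative sketch of Steps 4 and 5, concatenating the three commodity representations and scoring with a dot product; the dimensions are example values, and the user dimension must match the fused commodity dimension for the dot product to be defined:

    import torch

    def fuse_and_score(e_u, e_p, a_r, a_t):
        """p = [e_p ; a_r ; a_t], u = e_u, similarity y_hat = u^T p (computed per row)."""
        p = torch.cat([e_p, a_r, a_t], dim=-1)   # final commodity representation
        u = e_u                                   # final user representation
        return (u * p).sum(dim=-1)                # dot product for each user-commodity pair

    # Toy usage: 5 candidate pairs; the user vector matches the fused dimension 16 + 8 + 8.
    e_u = torch.randn(5, 32)
    e_p, a_r, a_t = torch.randn(5, 16), torch.randn(5, 8), torch.randn(5, 8)
    print(fuse_and_score(e_u, e_p, a_r, a_t).shape)  # torch.Size([5])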

Step 6: Optimize the parameters of the proposed method with the Bayesian personalized ranking (BPR) loss:

L = Σ_((u,i,j)∈O) -ln σ(ŷ_(u,i) - ŷ_(u,j)) + λ‖θ‖²,

where O = {(u, i, j) | i ∈ P_u^+, j ∈ P_u^-} denotes the pairwise training data, P_u^+ denotes the set of commodities purchased by user u and P_u^- denotes the set of commodities not purchased by user u; σ is the sigmoid function; θ denotes all trainable model parameters; and λ controls the L2 regularization strength to prevent overfitting.
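An illustrative PyTorch sketch of this BPR objective; the negative sampling of unpurchased commodities and the regularization weight are assumptions:

    import torch
    import torch.nn.functional as F

    def bpr_loss(pos_scores, neg_scores, params, reg_lambda=1e-4):
        """-ln sigma(y_ui - y_uj) summed over sampled (u, i, j) triples, plus L2 regularization."""
        ranking = -F.logsigmoid(pos_scores - neg_scores).sum()
        l2 = sum(p.pow(2).sum() for p in params)
        return ranking + reg_lambda * l2

    # Toy usage: scores for 4 sampled (user, purchased commodity, unpurchased commodity) triples.
    pos = torch.tensor([2.1, 0.3, 1.5, 0.8], requires_grad=True)
    neg = torch.tensor([1.0, 0.9, 0.2, 1.1], requires_grad=True)
    loss = bpr_loss(pos, neg, params=[pos, neg])
    loss.backward()
    print(loss.item())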

The experimental datasets are the CDs_and_Vinyl, Movies_and_TV and Books subsets of the Amazon review dataset. The following table describes the statistics of the three datasets:

[Table: statistics of the three datasets]

For each dataset, 70% of all interactions are used as the training set, 10% as the validation set and 20% as the test set.

Recall@K and NDCG@K are selected as the evaluation metrics; in the experiments, K = 20.
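An illustrative Python sketch of the two metrics for a single user, using the common binary-relevance definitions of Recall@K and NDCG@K, which the patent does not spell out:

    import math

    def recall_at_k(ranked_items, relevant_items, k=20):
        """Fraction of the user's held-out items that appear in the top-k recommendation list."""
        hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
        return hits / len(relevant_items) if relevant_items else 0.0

    def ndcg_at_k(ranked_items, relevant_items, k=20):
        """DCG of the top-k list divided by the ideal DCG."""
        dcg = sum(1.0 / math.log2(rank + 2)
                  for rank, item in enumerate(ranked_items[:k]) if item in relevant_items)
        idcg = sum(1.0 / math.log2(rank + 2) for rank in range(min(len(relevant_items), k)))
        return dcg / idcg if idcg > 0 else 0.0

    # Toy usage: items ranked by predicted score vs. the user's held-out test items.
    ranked = ["p3", "p7", "p1", "p9", "p2"]
    relevant = {"p1", "p2", "p8"}
    print(recall_at_k(ranked, relevant, k=5), ndcg_at_k(ranked, relevant, k=5))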

The selected comparison methods are BPRMF, NGCF and DeepCoNN. The following table shows the corresponding experimental results:

[Table: experimental results of the compared methods on the three datasets]

As can be seen from the experimental results, the method provided by the present invention outperforms the comparison methods on all three datasets.

Claims (7)

1. A commodity recommendation method based on multimodal commodity feature fusion, characterized by comprising the following steps:
1.1 constructing a user-commodity bipartite graph, and obtaining vector representations of user nodes and commodity nodes through graph convolution;
1.2 obtaining the review document of a commodity, and extracting the vector representation of the commodity reviews through a convolutional neural network, wherein the vector representation of the commodity reviews is extracted as follows: all reviews received by each commodity are integrated into a review document of that commodity, and the review document is preprocessed by word segmentation, lemmatization, removal of stop words, and removal of words occurring extremely frequently or extremely rarely; features are then extracted from the commodity review document with a text feature extractor to obtain the vector representation a_r of the commodity reviews; the text feature extractor comprises the following steps: first, the word sequence of the input text is represented as [w_1, w_2, …, w_l], where l is the length of the input text; the word sequence is then converted through a word embedding layer into a sequence of word vectors [v_1, v_2, …, v_l], v_i ∈ R^(d_v), where d_v is the word embedding dimension; the word vector sequence is then processed with a convolutional neural network to obtain the sequence of context word vectors [c_1, c_2, …, c_l], where the context representation c_i of the i-th word is computed as c_i = LeakyReLU(W_t × v_(i-k):(i+k) + b_t); finally, an attention mechanism is used to compute a weight [α_1, α_2, …, α_l] for the context word vectors, and the context word vectors are multiplied by the corresponding weights and summed to obtain the final representation of the input text a = Σ_(i=1..l) α_i c_i, where α_i is computed from the context representation c_i using the trainable weight matrix W_a, the bias b_a and the attention query vector q;
1.3 obtaining the title and description of the commodity, and extracting the vector representation of the commodity content through a convolutional neural network, wherein the vector representation of the commodity content is extracted as follows: first, the title and description of the commodity are obtained and preprocessed by word segmentation, lemmatization, removal of stop words, and removal of words occurring extremely frequently or extremely rarely; features are then extracted from the commodity title and description with the text feature extractor to obtain the vector representation a_t of the commodity content;
1.4 obtaining a final user representation and a final commodity representation;
1.5 computing the similarity between the user and the commodity;
1.6 optimizing the parameters of the method through the Bayesian personalized ranking loss.
2. The commodity recommendation method based on multimodal commodity feature fusion according to claim 1, wherein the vector representations of the user nodes and the commodity nodes in 1.1 are obtained as follows:
2.1 constructing a user-commodity bipartite graph from the historical records of commodities purchased by users;
2.2 performing neighbor propagation and aggregation on the user-commodity bipartite graph through graph convolution to obtain the vector representations of the user nodes and the commodity nodes.
3. The commodity recommendation method based on multimodal commodity feature fusion according to claim 2, wherein the user-commodity bipartite graph in 2.1 is constructed by the following steps:
3.1 constructing a user-commodity bipartite graph from the historical interaction records of users and commodities, represented with the user-commodity adjacency matrix A ∈ R^((n_u+n_p)×(n_u+n_p)), where n_u and n_p are the numbers of users and commodities, R ∈ R^(n_u×n_p) is the user-commodity interaction matrix, R^T is the transpose of R, and A = [[0, R], [R^T, 0]];
3.2 to exploit the information of the nodes themselves in the user-commodity bipartite graph, adding an identity matrix I to A; meanwhile, to avoid vanishing or exploding gradients during training, performing normalization with the diagonal degree matrix D, whose diagonal entries are the degrees of the nodes in the user-commodity bipartite graph, thereby obtaining Â = D^(-1/2)(A + I)D^(-1/2).
4. The commodity recommendation method based on multimodal commodity feature fusion according to claim 2, wherein the specific steps of the graph convolution in 2.2 are as follows:
4.1 converting the unique id of each user and each commodity into a dense vector through an embedding layer, obtaining the user feature vector e_u ∈ R^d and the commodity feature vector e_p ∈ R^d, where d is the dimension of the feature vectors;
4.2 building an embedding table E^(0) to represent the feature matrix of the user-commodity bipartite graph;
4.3 using t layers of graph convolution to aggregate the features of the node neighbors, wherein the propagation process is defined as E^(t) = σ(Â E^(t-1) W^(t));
4.4 through the t layers of graph convolution, obtaining the t feature matrices E^(1) to E^(t); concatenating these t feature matrices to obtain the final feature matrix E = E^(1) ‖ E^(2) ‖ … ‖ E^(t); then dividing E into two parts E_u and E_p, which serve as the vector representations e_u of the user nodes and e_p of the commodity nodes, respectively.
5. The commodity recommendation method based on multimodal commodity feature fusion according to claim 1, wherein the final user representation and the final commodity representation in 1.4 are obtained as follows:
5.1 concatenating the vector representation e_p of the commodity node, the vector representation a_r of the reviews and the vector representation a_t of the content to obtain the final representation p of the commodity; taking the vector representation e_u of the user node as the final representation u of the user.
6. The commodity recommendation method based on multimodal commodity feature fusion according to claim 1, wherein the similarity between the user and the commodity is computed as ŷ_(u,p) = u^T p, where u is the final representation of the user, u^T is the transpose of u, and p is the final representation of the commodity.
7. The commodity recommendation method based on multimodal commodity feature fusion according to claim 1, wherein the parameters of the method are optimized through the Bayesian personalized ranking loss: L = Σ_((u,i,j)∈O) -ln σ(ŷ_(u,i) - ŷ_(u,j)) + λ‖θ‖², where O = {(u, i, j) | i ∈ P_u^+, j ∈ P_u^-} denotes the pairwise training data, P_u^+ denotes the set of commodities purchased by user u, and P_u^- denotes the set of commodities not purchased by user u; σ is the sigmoid function; θ denotes all trainable model parameters; and λ controls the L2 regularization strength to prevent overfitting.
CN202110444726.4A 2021-04-24 2021-04-24 A product recommendation method based on multimodal product feature fusion Expired - Fee Related CN113159892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110444726.4A CN113159892B (en) 2021-04-24 2021-04-24 A product recommendation method based on multimodal product feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110444726.4A CN113159892B (en) 2021-04-24 2021-04-24 A product recommendation method based on multimodal product feature fusion

Publications (2)

Publication Number Publication Date
CN113159892A CN113159892A (en) 2021-07-23
CN113159892B true CN113159892B (en) 2022-05-06

Family

ID=76870143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110444726.4A Expired - Fee Related CN113159892B (en) 2021-04-24 2021-04-24 A product recommendation method based on multimodal product feature fusion

Country Status (1)

Country Link
CN (1) CN113159892B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155512A (en) * 2021-12-07 2022-03-08 南京理工大学 Fatigue detection method and system based on multi-feature fusion of 3D convolutional network
CN114936901B (en) * 2022-05-21 2024-05-28 山东大学 Visual perception recommendation method and system based on cross-modal semantic reasoning and fusion
CN114943588B (en) * 2022-06-15 2024-07-02 厦门大学 Commodity recommendation method based on neural network noise data
CN117786234B (en) * 2024-02-28 2024-04-26 云南师范大学 Multimode resource recommendation method based on two-stage comparison learning


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227815A (en) * 2015-07-22 2016-12-14 Tcl集团股份有限公司 The personalized application program function of a kind of multi-modal clue recommends method and system thereof
EP3557499A1 (en) * 2018-04-20 2019-10-23 Facebook, Inc. Assisting users with efficient information sharing among social connections
CN109559209A (en) * 2019-01-18 2019-04-02 深圳创新奇智科技有限公司 A kind of electric business clothes based on multi-modal information, which are worn, takes recommended method
CN110263256A (en) * 2019-06-21 2019-09-20 西安电子科技大学 Personalized recommendation method based on multi-modal heterogeneous information
CN111222332A (en) * 2020-01-06 2020-06-02 华南理工大学 A Product Recommendation Method Combining Attention Network and User Sentiment

Also Published As

Publication number Publication date
CN113159892A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN113159892B (en) A product recommendation method based on multimodal product feature fusion
CN110648163B (en) Recommendation algorithm based on user comments
CN113468227A (en) Information recommendation method, system, device and storage medium based on graph neural network
WO2023065859A1 (en) Item recommendation method and apparatus, and storage medium
CN109033294B (en) Mixed recommendation method for integrating content information
CN109241424A (en) A kind of recommended method
EP3300002A1 (en) Method for determining the similarity of digital images
CN108563755A (en) A kind of personalized recommendation system and method based on bidirectional circulating neural network
CN107357793A (en) Information recommendation method and device
CN112085525A (en) User network purchasing behavior prediction research method based on hybrid model
CN116150480A (en) User personalized demand prediction method integrating multi-mode comment information
CN114238758A (en) A user portrait prediction method based on multi-source cross-border data fusion
CN112734519B (en) Commodity recommendation method based on convolution self-encoder network
CN114330291A (en) Text recommendation system based on dual attention mechanism
CN113159891A (en) Commodity recommendation method based on fusion of multiple user representations
CN112818256B (en) A recommendation method based on neural collaborative filtering
CN111930926B (en) Personalized recommendation algorithm combined with comment text mining
Gu et al. Fashion coordinates recommendation based on user behavior and visual clothing style
CN117745371A (en) A fair recommendation method and system based on conditional diffusion model
CN110321565B (en) Real-time text emotion analysis method, device and equipment based on deep learning
CN115204967A (en) Recommendation method integrating implicit feedback of long-term and short-term interest representation of user
CN115344794A (en) A tourist attraction recommendation method based on knowledge graph semantic embedding
CN108984551A (en) A kind of recommended method and system based on the multi-class soft cluster of joint
CN114022233A (en) A Novel Product Recommendation Method
CN116738035B (en) Recommendation rearrangement method based on window sliding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210723

Assignee: Guangxi wisdom Valley Technology Co.,Ltd.

Assignor: GUILIN University OF ELECTRONIC TECHNOLOGY

Contract record no.: X2022450000202

Denomination of invention: A Product Recommendation Method Based on Multimodal Product Feature Fusion

Granted publication date: 20220506

License type: Common License

Record date: 20221125

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220506