CN117495511A

CN117495511A - A product recommendation system and method based on contrastive learning and community perception

Info

Publication number: CN117495511A
Application number: CN202311681385.8A
Authority: CN
Inventors: 郭昆; 李欣莹; 林家琪
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2023-12-08
Filing date: 2023-12-08
Publication date: 2024-02-02

Abstract

The present invention proposes a product recommendation system and method based on contrastive learning and community perception. First, an adaptive graph augmentation strategy is designed: when the original graph is augmented, the importance of nodes and edges is considered, highly important edges and node attributes are retained, and the augmented graph is required to differ significantly from the original graph. Second, an encoder based on a graph neural network and a multi-layer perceptron generates the representation vectors of the original graph and the augmented graph. Third, a contrastive pair selection strategy based on the relative distance between nodes is designed: for each node, the several nodes with the smallest relative distance are selected as its positive samples, and the remaining nodes serve as negative samples. Then, a clustering algorithm partitions the learned node representation vectors into communities. Finally, product recommendations are made based on the resulting community partition. The invention can improve the accuracy of the final personalized product recommendation while improving the cohesion of the generated community structure, thereby improving user satisfaction and product sales.

Description

A product recommendation system and method based on contrastive learning and community perception

Technical field

The present invention relates to the technical field of community detection based on representation learning, and in particular to a product recommendation system and method based on contrastive learning and community perception.

Background

With the rapid development of technology and the spread of the Internet, social and shopping activity has gradually moved online. People, events, objects and their relationships in the real world can be abstracted as complex networks, and community structure is an important property of such networks. A community is a group of closely connected, frequently interacting and highly similar individuals: connections inside a community are dense, while connections between communities are sparse. Studying community structure helps identify user groups with similar interests, occupations and other characteristics, helps enterprises understand user preferences, and lets them provide more personalized services and products, which improves user satisfaction. Current product recommendation methods based on representation-learning community detection still have the following shortcoming: random data augmentation strategies may disturb or even destroy key community structures in the graph, which lowers community detection accuracy and, in turn, the accuracy of product recommendation.

Summary of the invention

The present invention proposes a product recommendation system and method based on contrastive learning and community perception. It improves the accuracy of the final personalized product recommendation while improving the cohesion of the generated community structure, thereby improving user satisfaction and product sales, and therefore has considerable practical value.

The present invention adopts the following technical solutions.

A product recommendation system based on contrastive learning and community perception combines community detection results with product recommendation, helping enterprises obtain users' preferences, interests and needs more effectively and accurately and provide more personalized services and products. The system proceeds as follows:

First, an adaptive graph augmentation strategy is designed. When the original graph is augmented, the importance of nodes and edges is considered, and highly important edges and node attributes are retained. At the same time, the augmented graph is required to differ substantially from the original graph, which prevents the model from falling into a local optimum;

Second, an encoder based on a graph neural network and a multi-layer perceptron generates the representation vectors of the original graph and the augmented graph;

Third, a contrastive pair selection strategy based on the relative distance between nodes is designed. For each node, the several nodes with the smallest relative distance are selected as its positive samples and the remaining nodes serve as negative samples, which ensures that the generated community structure is highly cohesive;

Fourth, a clustering algorithm partitions the learned node representation vectors into communities;

Finally, based on the resulting community partition, the users' preferences, interests and needs are obtained, products are recommended both within and across communities, and users are provided with more personalized services.

A product recommendation system based on contrastive learning and community perception comprises an adaptive graph data augmentation module, a representation vector generation module, a contrastive pair selection module, a loss function calculation module, a community generation module and a product recommendation module;

The adaptive graph data augmentation module generates an augmented graph G′ that differs substantially from the original graph G. Its augmentation strategy takes node importance into account so that the community structure of the graph is not destroyed during augmentation. It applies topology-level augmentation to the original graph to remove unimportant edges of the social network, and attribute-level augmentation to mask the attributes of unimportant nodes. It also requires the augmented graph to differ substantially from the original graph, which prevents the model from falling into a local optimum;

The representation vector generation module encodes the original graph G and the augmented graph G′. An encoder composed of a graph convolutional network (GCN) and a multi-layer perceptron (MLP) encodes the two graphs and yields the node representation vectors z and z′, respectively;

The contrastive pair selection module selects positive and negative samples that help improve community detection accuracy and sharpen community boundaries. The several nodes with the smallest relative distance to the target node are used as its positive samples, and the remaining nodes as negative samples. The relative distance between nodes combines a topological distance and an attribute distance;

The loss function calculation module calculates the contrastive loss and optimizes the parameters of the GCN and the MLP through backpropagation. The contrastive loss is composed of the loss of node u in the original graph G and the loss of node u′ in the augmented graph G′;

The community generation module generates communities. It applies the KMeans clustering algorithm to the node representation vectors z learned by the encoder and obtains the community partition {C_i};

The product recommendation module makes personalized product recommendations to users based on the community partition. It can recommend products liked by other users in the same community, popular products in that community, and related products from other, strongly related communities.

A product recommendation method based on contrastive learning and community perception, using the product recommendation system described above, comprises the following steps:

Step S1: Construct a social network G = {V, E, A, X} from the users' social records, where V = {v₁, v₂, …, v_n} is the node set of the social network, E is its edge set, and e_ij = (v_i, v_j) ∈ E indicates that an edge exists between node v_i and node v_j. The matrix A ∈ ℝ^{n×n} is the adjacency matrix of the network: A_ij = 1 when e_ij ∈ E, and A_ij = 0 otherwise. X ∈ ℝ^{n×m} is the attribute matrix of the nodes, m is the dimension of the node attributes, and X_ij is the value of the j-th attribute of node i;

Step S2: Apply topology-level and attribute-level data augmentation to the input graph G, requiring the augmented graph to differ substantially from the original graph, and obtain the augmented graph G′;

Step S3: Use a pair of encoders, each composed of a graph convolutional network (GCN) and a multi-layer perceptron (MLP), to generate the node representation vectors z and z′ of the original graph G and the augmented graph G′, respectively;

Step S4: Calculate the relative distances between the nodes of the original graph G. For each node, select the several nodes with the smallest relative distance as its positive sample set and use the remaining nodes as its negative sample set;

Step S5: Calculate the contrastive loss from the selected positive and negative sample sets, and optimize the parameters of the GCN and the MLP through backpropagation;

Step S6: Apply the KMeans clustering algorithm to the node representation vectors z learned by the encoder, and treat the resulting clusters as communities to obtain the community partition {C_i};

Step S7: Provide product recommendation services based on the community detection results, including products that interest other users in the same community and popular products from other, strongly related communities, so that manufacturers better understand the needs of their target users and recommendations become more precise.
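The graph construction of step S1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that social ties are undirected (the text does not say), and the function name `build_social_graph` and the toy data are hypothetical.

```python
def build_social_graph(n, edges, attributes):
    """Return (A, X) for a graph on nodes 0..n-1.

    Assumption: social ties are undirected, so A is symmetric.
    """
    if len(attributes) != n:
        raise ValueError("attributes must contain one row per node")
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        if i != j:  # self-loops are ignored in this sketch
            A[i][j] = 1
            A[j][i] = 1
    X = [list(map(float, row)) for row in attributes]
    return A, X


# Hypothetical toy data: 5 users, 2 attributes per user.
edges = [(0, 1), (1, 2), (3, 4)]
attributes = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
A, X = build_social_graph(5, edges, attributes)
```

Here `A` is the n × n adjacency matrix and `X` the n × m attribute matrix of step S1, with n = 5 and m = 2.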

Step S2 is specifically as follows:

Step S21: Perform topology-level data augmentation on graph G. According to formula (1), edges of nodes with higher importance are sampled from the original edge set E to form the edge set E′ of the augmented graph;

The probability of sampling edge (u, v) is determined by the importance of the edge and is calculated according to formulas (2) and (3);

The degree centrality of node v equals the degree of the node, and the importance of edge (u, v) equals the average importance of the two nodes it connects. The formulas also use the maximum importance over all edges in the graph and η₁, a manually specified coefficient that controls the probability of edge removal;

Step S22: After the edge subset E′ has been sampled in step S21, convert it into the adjacency matrix A′ for use in the subsequent steps;

Step S23: Perform attribute-level data augmentation on graph G. First, for each node u in graph G, a value is sampled according to formula (4) from a Bernoulli distribution whose parameter is the probability of retaining the attributes of node u. The sampled values form an n-dimensional vector b ∈ {0, 1}ⁿ, where n is the number of nodes in graph G;

Nodes of high importance have a higher probability of being retained. The retention probability depends on the importance of node u, on the maximum importance over all nodes in the graph, and on η₂, a manually specified coefficient that controls the masking probability of node attributes;

Step S24: From the vector b obtained in step S23 and the original attribute matrix X, calculate the augmented attribute matrix X′ according to formula (5);

X′ = diag(b) · X    Formula (5)

where diag(·) expands a vector into a diagonal matrix, so that row u of X is kept when b_u = 1 and set to zero when b_u = 0.

Step S25: Obtain the augmented graph G′, which is represented by the augmented adjacency matrix A′ and the augmented attribute matrix X′, that is, G′ = (A′, X′);

Step S26: If the augmented graph is too similar to the original graph, the model may fall into a local optimum, which lowers the accuracy of the community detection task. The augmented graph G′ is therefore required to differ substantially from the original graph G. According to formula (6), calculate the sum g of the Euclidean distance between the adjacency matrices and the Euclidean distance between the attribute matrices of G′ and G;

where n is the number of nodes and m is the dimension of the node attributes;

Step S27: The graph data augmentation steps above are repeated until the sum g of the Euclidean distances of the adjacency matrices and attribute matrices exceeds the given threshold σ, or until the number of augmentation rounds reaches the maximum I_max.
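Steps S21 to S27 can be sketched as the loop below. Formulas (1), (2), (4) and (6) are not reproduced in this text, so the retention probability `1 - eta * (1 - w / w_max)` and the use of the Frobenius norm for the matrix distance are assumptions made for illustration only. Node importance as degree and edge importance as the mean of the endpoint importances follow the description. All function names are hypothetical.

```python
import math
import random


def keep_prob(w, w_max, eta):
    # ASSUMED form (formulas (2) and (4) are not available in the text):
    # the most important item is always kept; eta scales the removal chance.
    return 1.0 if w_max == 0 else 1.0 - eta * (1.0 - w / w_max)


def augment_once(A, X, eta1, eta2, rng):
    n = len(A)
    deg = [sum(row) for row in A]                      # node importance = degree
    edges = [(u, v) for u in range(n) for v in range(u + 1, n) if A[u][v]]
    w = {(u, v): (deg[u] + deg[v]) / 2 for u, v in edges}  # edge importance
    w_max = max(w.values(), default=0)
    A_aug = [[0] * n for _ in range(n)]
    for u, v in edges:                                 # S21/S22: sample E', build A'
        if rng.random() < keep_prob(w[(u, v)], w_max, eta1):
            A_aug[u][v] = A_aug[v][u] = 1
    d_max = max(deg, default=0)
    b = [1 if rng.random() < keep_prob(deg[u], d_max, eta2) else 0
         for u in range(n)]                            # S23: Bernoulli mask b
    X_aug = [[b[u] * x for x in X[u]] for u in range(n)]  # S24: diag(b) . X
    return A_aug, X_aug


def matrix_distance(M, N):
    # ASSUMED: Euclidean (Frobenius) distance between two matrices.
    return math.sqrt(sum((a - c) ** 2 for r, s in zip(M, N) for a, c in zip(r, s)))


def adaptive_augment(A, X, eta1=0.5, eta2=0.5, sigma=1.0, max_iters=10, seed=0):
    if max_iters < 1:
        raise ValueError("max_iters must be at least 1")
    rng = random.Random(seed)
    for _ in range(max_iters):                         # S27: repeat until g > sigma
        A_aug, X_aug = augment_once(A, X, eta1, eta2, rng)
        g = matrix_distance(A, A_aug) + matrix_distance(X, X_aug)  # S26
        if g > sigma:
            break
    return A_aug, X_aug, g
```

Calling `adaptive_augment(A, X, eta1=0.3, eta2=0.3, sigma=1.5, max_iters=20)` returns the last augmented pair (A′, X′) together with its distance g. If no round exceeds σ within the iteration budget, the last sample is returned, which mirrors the stopping rule of step S27.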

Step S3 is specifically as follows:

Step S31: Given a network, representation learning means learning a transformation function f(v): v → z, v ∈ V, with d ≪ |V|, where |V| is the number of nodes of network G. That is, the function f(v) converts each node of the network into a d-dimensional representation vector z;

Step S32: Use the encoder f(·), composed of a graph convolutional network (GCN) and a multi-layer perceptron (MLP), to encode the original graph G and the augmented graph G′, obtaining the node representation vectors z and z′ for use in the subsequent steps.
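The encoder of step S32 can be illustrated with a forward pass only. The text does not give the number of layers, the normalization or the activation, so this sketch assumes one GCN layer with the common symmetric normalization of A plus self-loops, a ReLU, and a single linear layer standing in for the MLP. The weight matrices `W_gcn` and `W_mlp` are hypothetical and randomly initialized; training is out of scope here.

```python
import math
import random


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def normalized_adjacency(A):
    """D^(-1/2) (A + I) D^(-1/2), the propagation matrix of a standard GCN layer."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    d = [sum(row) for row in A_hat]  # every degree is >= 1 because of the self-loop
    return [[A_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]


def relu(M):
    return [[max(0.0, x) for x in row] for row in M]


def init_weights(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(A, X, W_gcn, W_mlp):
    """One GCN layer followed by one linear layer (assumed architecture)."""
    H = relu(matmul(matmul(normalized_adjacency(A), X), W_gcn))
    return matmul(H, W_mlp)
```

With n nodes, m attributes and an output dimension d, `W_gcn` has shape m × h for some hidden width h and `W_mlp` has shape h × d. The sketch would be called once on (A, X) to obtain z and once on (A′, X′) to obtain z′; whether the two encoders of the pair share weights is not stated in the text.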

Step S4 is specifically as follows:

Step S41: Calculate the topological distance between the target node u and every other node. Suppose the shortest path from node v₁ to node v_n is P = (v₁, v₂, ..., v_n) ∈ V × V × ... × V. The distance between these two nodes is then the sum of the weights of the edges along P, which is the shortest distance between the two nodes. The weight mapping function f assigns a weight to each edge; here the unit weight f: E → {1} is used, so the distance equals the number of edges on the path. The shortest distance between nodes is taken as their topological distance;

Step S42: Calculate the attribute distance between nodes according to formula (7);

where x_u and x_v are the attribute vectors of node u and node v, respectively;

Step S43: From the topological distance obtained in step S41 and the attribute distance obtained in step S42, calculate the relative distance between nodes according to formula (8);

The formula uses the minimum and maximum topological distance between nodes in the graph and the minimum and maximum attribute distance between nodes in the graph. λ is a parameter that adjusts the relative weight of the topological distance and the attribute distance;

Step S44: Sort the other nodes in ascending order of their relative distance to the target node, as obtained in step S43, to produce the node sequence R_u;

Step S45: Select the leading nodes of the node sequence as the positive sample set P_u of the target node u, and use the other nodes as the negative sample set N_u of node u, as shown in formulas (9) and (10).
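The pair selection of steps S41 to S45 can be sketched as below. Formulas (7) to (10) are not reproduced in this text, so several choices are assumptions: the attribute distance is taken to be Euclidean, the relative distance is taken to be `lam * topo + (1 - lam) * attr` after min-max scaling over all node pairs, unreachable nodes receive the distance n, and the number of positives `k` is a hypothetical parameter.

```python
import math
from collections import deque


def hop_distances(A, src):
    """Shortest-path length in edges (unit weights) from src to every node."""
    n = len(A)
    dist = [math.inf] * n
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if A[u][v] and dist[v] == math.inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def scale(x, lo, hi):
    return 0.0 if hi == lo else (x - lo) / (hi - lo)


def relative_distances(A, X, lam=0.5):
    n = len(A)
    # Assumption: unreachable nodes get distance n, above any real path length.
    topo = [[n if d == math.inf else d for d in hop_distances(A, u)] for u in range(n)]
    # Assumption: the attribute distance is Euclidean.
    attr = [[math.dist(X[u], X[v]) for v in range(n)] for u in range(n)]
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    t_lo = min(topo[u][v] for u, v in pairs)
    t_hi = max(topo[u][v] for u, v in pairs)
    a_lo = min(attr[u][v] for u, v in pairs)
    a_hi = max(attr[u][v] for u, v in pairs)
    rel = [[0.0] * n for _ in range(n)]
    for u, v in pairs:
        rel[u][v] = (lam * scale(topo[u][v], t_lo, t_hi)
                     + (1 - lam) * scale(attr[u][v], a_lo, a_hi))
    return rel


def select_pairs(rel, u, k):
    """Return (positives, negatives) for node u: the k nearest nodes, then the rest."""
    ranked = sorted((v for v in range(len(rel)) if v != u),
                    key=lambda v: (rel[u][v], v))
    return ranked[:k], ranked[k:]


# Hypothetical toy graph: nodes 0-1-2 form a path, nodes 3-4 form a separate pair.
A = [[0, 1, 0, 0, 0], [1, 0, 1, 0, 0], [0, 1, 0, 0, 0],
     [0, 0, 0, 0, 1], [0, 0, 0, 1, 0]]
X = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
positives, negatives = select_pairs(relative_distances(A, X), u=0, k=2)
```

For node 0 the two nearest nodes by relative distance are its path neighbours 1 and 2, which become its positive samples, while nodes 3 and 4 become its negative samples.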

Step S5 is specifically as follows:

Step S51: Based on the InfoNCE loss function commonly used in contrastive learning, calculate the loss function of node u according to formula (11);

where z_u is the representation vector of node u, z_i and z′_i are the representation vectors of node i in the original graph G and the augmented graph G′, respectively, sim(z_u, z_v) is the cosine similarity between the representations of nodes u and v, and τ is a temperature parameter that scales the similarity between nodes;

Step S52: According to formula (12), calculate the total loss function of the model, which is the sum of the loss of node u in the original graph and the loss of node u′ in the augmented graph, averaged over all nodes.

Step S53: Adjust the parameters of the encoder through backpropagation, and repeat the training process until the model converges.
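The loss of steps S51 and S52 can be illustrated as follows. Formulas (11) and (12) are not reproduced in this text, so the exact form is an assumption: a multi-positive InfoNCE in which the positives of node u are its own representation in the other view plus the representations of the nodes in P_u in both views, and the negatives are the nodes in N_u in both views. Only the loss value is computed; backpropagation (step S53) would require an automatic differentiation library and is omitted.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0 or nb == 0 else dot / (na * nb)


def node_loss(u, anchor_view, other_view, pos, neg, tau):
    """ASSUMED multi-positive InfoNCE for node u, anchored in anchor_view."""
    def score(i):
        # Similarity of u to node i in both views, exponentiated with temperature.
        return (math.exp(cosine(anchor_view[u], anchor_view[i]) / tau)
                + math.exp(cosine(anchor_view[u], other_view[i]) / tau))

    own = math.exp(cosine(anchor_view[u], other_view[u]) / tau)
    positive = own + sum(score(i) for i in pos)
    negative = sum(score(j) for j in neg)
    return -math.log(positive / (positive + negative))


def total_loss(z, z_aug, pos_sets, neg_sets, tau=0.5):
    """Average over all nodes of the original-view loss plus the augmented-view loss."""
    total = 0.0
    for u in range(len(z)):
        total += node_loss(u, z, z_aug, pos_sets[u], neg_sets[u], tau)
        total += node_loss(u, z_aug, z, pos_sets[u], neg_sets[u], tau)
    return total / len(z)
```

Under this form the loss is small when each node is close to its positives and far from its negatives in representation space, which is the property the pair selection of step S4 is meant to exploit.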

Step S6 is specifically as follows:

Step S61: Randomly select k nodes as the initial cluster centers;

Step S62: Calculate the Euclidean distance from each node to every cluster center, and assign each node to the cluster whose center is nearest;

Step S63: Update the center of each cluster to the mean of the representation vectors of all nodes in that cluster;

Step S64: Repeat steps S62 and S63 until the cluster centers no longer change or the maximum number of iterations is reached, which yields the community partition {C_i}.
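Steps S61 to S64 describe the standard KMeans iteration, which can be sketched as below. The handling of an empty cluster (its center is left unchanged) and the tie-breaking rule (the lower cluster index wins) are assumptions, since the text does not cover them; the function name and defaults are hypothetical.

```python
import math
import random


def kmeans(Z, k, max_iters=100, seed=0):
    """Cluster the representation vectors Z into k communities."""
    rng = random.Random(seed)
    centers = [list(z) for z in rng.sample(Z, k)]          # S61: random initial centers
    labels = [0] * len(Z)
    for _ in range(max_iters):
        # S62: assign every node to its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(z, centers[c])) for z in Z]
        # S63: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [z for z, label in zip(Z, labels) if label == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])             # assumed: keep empty cluster's center
        if new_centers == centers:                         # S64: stop when centers are stable
            break
        centers = new_centers
    return labels, centers
```

The returned `labels[i]` is the community index of node i, so the community C_c is the set of nodes whose label equals c.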

Step S7 is specifically as follows:

Step S71: Step S6 places users with similar interests and behavior patterns in the same community. Based on the community detection results, recommend to a user the products liked by other, highly similar and strongly related users in the same community, which improves recommendation accuracy and user satisfaction;

Step S72: Based on the community detection results, identify the products that are particularly popular in a community and recommend these popular products to the other users of that community, which increases user engagement and helps manufacturers increase sales;

Step S73: When different communities share common interests or are otherwise related, use this relation to recommend to a user products from other, strongly related communities. Such cross-community recommendation helps users discover new areas of interest and enriches their shopping experience.
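Steps S71 to S73 can be sketched as a simple ranking over a community partition. The text does not define how the relatedness of two communities is measured, so this sketch assumes the Jaccard overlap of the item sets liked in each community. The `likes` mapping (user index to set of liked items), the function names and the toy data are hypothetical.

```python
from collections import Counter


def community_members(labels):
    groups = {}
    for user, c in enumerate(labels):
        groups.setdefault(c, []).append(user)
    return groups


def ranked_items(members, likes):
    """Items liked in a community, most popular first (ties broken by name)."""
    counts = Counter(item for u in members for item in likes[u])
    return [item for item, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def recommend(user, labels, likes, top_n=3):
    groups = community_members(labels)
    own = labels[user]
    seen = likes[user]
    # S71/S72: popular items of the user's own community that the user has not liked yet.
    own_ranked = ranked_items(groups[own], likes)
    in_community = [i for i in own_ranked if i not in seen][:top_n]
    # S73: pick the most related other community (ASSUMED: Jaccard overlap of item sets).
    own_items = set(own_ranked)
    best, best_score = None, 0.0
    for c, members in groups.items():
        if c == own:
            continue
        items = set(ranked_items(members, likes))
        union = own_items | items
        score = len(own_items & items) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = c, score
    cross = []
    if best is not None:
        cross = [i for i in ranked_items(groups[best], likes)
                 if i not in seen and i not in in_community][:top_n]
    return in_community, cross


# Hypothetical toy data: six users in three communities.
labels = [0, 0, 0, 1, 1, 2]
likes = {0: {"tea"}, 1: {"tea", "mug"}, 2: {"tea", "mug", "kettle"},
         3: {"mug", "coffee"}, 4: {"coffee", "grinder"}, 5: {"tent"}}
in_community, cross = recommend(0, labels, likes)
```

For user 0 the in-community list contains the items its community likes beyond "tea", and the cross-community list comes from community 1, the only other community that shares an item with community 0.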

The advantage of the present invention is that it avoids destroying the community structure of the graph during data augmentation and selects suitable positive and negative samples for contrastive learning, which improves the cohesion of the generated community structure and ultimately the accuracy of product recommendation.

Description of the drawings

The present invention is described in further detail below with reference to the accompanying drawing and specific embodiments:

Figure 1 is a schematic flow diagram of the present invention.

Detailed description

As shown in the figure, a product recommendation system based on contrastive learning and community perception combines community detection results with product recommendation, helping enterprises obtain users' preferences, interests and needs more effectively and accurately and provide more personalized services and products.


Claims

1. A product recommendation system based on contrastive learning and community perception, which combines community detection results with product recommendation to help enterprises obtain users' preferences, interests and needs more effectively and accurately and to provide more personalized services and products, characterized in that the system performs the following steps:

First, an adaptive graph augmentation strategy is designed. When the original graph is augmented, the importance of nodes and edges is considered, and highly important edges and node attributes are retained. At the same time, the augmented graph is required to differ substantially from the original graph, which prevents the model from falling into a local optimum;

Second, an encoder based on a graph neural network and a multi-layer perceptron generates the representation vectors of the original graph and the augmented graph;

Third, a contrastive pair selection strategy based on the relative distance between nodes is designed. For each node, the several nodes with the smallest relative distance are selected as its positive samples and the remaining nodes serve as negative samples, which ensures that the generated community structure is highly cohesive;

Fourth, a clustering algorithm partitions the learned node representation vectors into communities;

Finally, based on the resulting community partition, the users' preferences, interests and needs are obtained, products are recommended both within and across communities, and users are provided with more personalized services.

2. The product recommendation system based on contrastive learning and community perception according to claim 1, characterized in that it comprises an adaptive graph data augmentation module, a representation vector generation module, a contrastive pair selection module, a loss function calculation module, a community generation module and a product recommendation module;

The adaptive graph data augmentation module generates an augmented graph G′ that differs substantially from the original graph G. Its augmentation strategy takes node importance into account so that the community structure of the graph is not destroyed during augmentation. It applies topology-level augmentation to the original graph to remove unimportant edges of the social network, and attribute-level augmentation to mask the attributes of unimportant nodes. It also requires the augmented graph to differ substantially from the original graph, which prevents the model from falling into a local optimum;

The representation vector generation module encodes the original graph G and the augmented graph G′. An encoder composed of a graph convolutional network (GCN) and a multi-layer perceptron (MLP) encodes the two graphs and yields the node representation vectors z and z′, respectively;

The contrastive pair selection module selects positive and negative samples that help improve community detection accuracy and sharpen community boundaries. The several nodes with the smallest relative distance to the target node are used as its positive samples, and the remaining nodes as negative samples. The relative distance between nodes combines a topological distance and an attribute distance;

The loss function calculation module calculates the contrastive loss and optimizes the parameters of the GCN and the MLP through backpropagation. The contrastive loss is composed of the loss of node u in the original graph G and the loss of node u′ in the augmented graph G′;

The community generation module generates communities. It applies the KMeans clustering algorithm to the node representation vectors z learned by the encoder and obtains the community partition {C_i};

The product recommendation module makes personalized product recommendations to users based on the community partition. It can recommend products liked by other users in the same community, popular products in that community, and related products from other, strongly related communities.

3. A product recommendation method based on contrastive learning and community perception, using a product recommendation system based on contrastive learning and community perception, characterized in that it comprises the following steps:

Step S1: Construct a social network G = {V, E, A, X} from the users' social records, where V = {v₁, v₂, …, v_n} is the node set of the social network, E is its edge set, and e_ij = (v_i, v_j) ∈ E indicates that an edge exists between node v_i and node v_j. The matrix A ∈ ℝ^{n×n} is the adjacency matrix of the network: A_ij = 1 when e_ij ∈ E, and A_ij = 0 otherwise. X ∈ ℝ^{n×m} is the attribute matrix of the nodes, m is the dimension of the node attributes, and X_ij is the value of the j-th attribute of node i;

Step S2: Apply topology-level and attribute-level data augmentation to the input graph G, requiring the augmented graph to differ substantially from the original graph, and obtain the augmented graph G′;

Step S3: Use a pair of encoders, each composed of a graph convolutional network (GCN) and a multi-layer perceptron (MLP), to generate the node representation vectors z and z′ of the original graph G and the augmented graph G′, respectively;

Step S4: Calculate the relative distances between the nodes of the original graph G. For each node, select the several nodes with the smallest relative distance as its positive sample set and use the remaining nodes as its negative sample set;

Step S5: Calculate the contrastive loss from the selected positive and negative sample sets, and optimize the parameters of the GCN and the MLP through backpropagation;

Step S6: Apply the KMeans clustering algorithm to the node representation vectors z learned by the encoder, and treat the resulting clusters as communities to obtain the community partition {C_i};

Step S7: Provide product recommendation services based on the community detection results, including products that interest other users in the same community and popular products from other, strongly related communities, so that manufacturers better understand the needs of their target users and recommendations become more precise.

4. The product recommendation method based on contrastive learning and community perception according to claim 3, characterized in that step S2 is specifically:

Step S21: Perform topology-level data augmentation on graph G. According to formula (1), edges of nodes with higher importance are sampled from the original edge set E to form the edge set E′ of the augmented graph;

The probability of sampling edge (u, v) is determined by the importance of the edge and is calculated according to formulas (2) and (3);

The degree centrality of node v equals the degree of the node, and the importance of edge (u, v) equals the average importance of the two nodes it connects. The formulas also use the maximum importance over all edges in the graph and η₁, a manually specified coefficient that controls the probability of edge removal;

Step S22: After the edge subset E′ has been sampled in step S21, convert it into the adjacency matrix A′ for use in the subsequent steps;

Step S23: Perform attribute-level data enhancement on graph G. First, for each node u in graph G, sample a value from a Bernoulli distribution whose parameter is given by formula (4) below; the sampled values form an n-dimensional vector b ∈ {0, 1}ⁿ, where n is the number of nodes in graph G;

where the Bernoulli parameter is the probability of retaining the attributes of node u, so that nodes of higher importance are more likely to keep their attributes; it is determined by the importance of node u, the maximum importance over all nodes in the graph, and η₂, a manually specified coefficient that controls the masking probability of node attributes;

Step S24: Based on the vector b obtained in step S23 and the original feature matrix X, calculate the enhanced attribute matrix X′ according to formula (5);

X′ = diag(b)·X    Formula (5)

where diag(·) expands a vector into a diagonal matrix;

Step S25: Obtain the enhanced graph G′. The enhanced graph can be represented by the enhanced adjacency matrix A′ and attribute matrix X′, that is, G′ = (A′, X′);

Step S26: To prevent the model from falling into a local optimum because the enhanced graph is too similar to the original graph, which would harm the accuracy of the community discovery task, the enhanced graph G′ is required to differ significantly from the original graph G; according to formula (6) below, calculate the sum g of the Euclidean distance between the adjacency matrices and the Euclidean distance between the attribute matrices of G′ and G;

where n is the number of nodes and m is the dimension of the node attributes;

Step S27: Repeat the above graph data enhancement steps until the sum g of the Euclidean distances of the adjacency matrix and the attribute matrix exceeds the given threshold σ or the number of enhancement rounds reaches the maximum number of enhancements I_max.
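As an illustration of steps S21 to S27, the following is a minimal sketch in plain Python. Formulas (1) to (4) and (6) are not reproduced in the text, so several parts are assumptions rather than the claimed method: the helper `keep_prob` uses a hypothetical form (keep probability 1 − η·(max − importance)/max), node importance is taken to be degree centrality, g is computed as the sum of two Frobenius norms, and each retry re-augments the original graph. All function and parameter names are illustrative.

```python
import math
import random


def degree_centrality(n, edges):
    """Degree of every node; edges are undirected (u, v) pairs."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def keep_prob(importance, max_importance, eta):
    """ASSUMED form: the most important item is always kept; less
    important items are dropped with a probability that grows with eta."""
    if max_importance == 0:
        return 1.0
    return 1.0 - eta * (max_importance - importance) / max_importance


def augment_once(n, edges, X, eta1, eta2, rng):
    deg = degree_centrality(n, edges)
    # S21: edge importance = average importance of its two endpoints.
    w = {(u, v): (deg[u] + deg[v]) / 2 for u, v in edges}
    w_max = max(w.values(), default=0)
    kept = [e for e in edges if rng.random() < keep_prob(w[e], w_max, eta1)]
    # S22: convert the sampled edge subset E' into an adjacency matrix A'.
    A_new = [[0] * n for _ in range(n)]
    for u, v in kept:
        A_new[u][v] = A_new[v][u] = 1
    # S23: one Bernoulli draw per node gives the mask vector b.
    c_max = max(deg, default=0)
    b = [1 if rng.random() < keep_prob(deg[u], c_max, eta2) else 0
         for u in range(n)]
    # S24: X' = diag(b) . X, i.e. zero out the rows of masked nodes.
    X_new = [[b[u] * x for x in X[u]] for u in range(n)]
    return A_new, X_new


def frobenius(M, N):
    """Euclidean (Frobenius) distance between two equally sized matrices."""
    return math.sqrt(sum((a - c) ** 2
                         for r, s in zip(M, N) for a, c in zip(r, s)))


def adaptive_augment(n, edges, X, eta1=0.5, eta2=0.5,
                     sigma=1.0, i_max=10, seed=0):
    if i_max < 1:
        raise ValueError("i_max must be at least 1")
    rng = random.Random(seed)
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    # S26/S27: retry until G' is far enough from G or i_max is reached.
    for _ in range(i_max):
        A_new, X_new = augment_once(n, edges, X, eta1, eta2, rng)
        g = frobenius(A, A_new) + frobenius(X, X_new)
        if g > sigma:
            break
    return A_new, X_new, g
```

With η₁ = η₂ = 0 the assumed keep probability is 1 for every edge and node, so the sketch returns the original graph and g = 0; larger coefficients remove more low-importance edges and mask more low-importance nodes.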

5. A product recommendation method based on comparative learning and community perception according to claim 3, characterized in that: the step S3 is specifically:

Step S31: Given a network G, representation learning refers to learning a transformation function f(v): v → z with z ∈ ℝ^d, v ∈ V and d << |V|, where |V| is the number of nodes of network G; that is, the function f maps each node of the network to a d-dimensional representation vector z;

Step S32: Use the encoder f(·), composed of the graph convolutional neural network GCN and the multi-layer perceptron MLP, to encode the original graph G and the enhanced graph G′, obtaining the node representation vectors z and z′ respectively for use in the subsequent steps.
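The encoder of step S32 can be sketched as a forward pass in plain Python. The text does not state the number of layers, the activation, or the propagation rule, so this sketch assumes one GCN layer with the widely used symmetric normalization D^(-1/2)(A + I)D^(-1/2) and ReLU, followed by a two-layer MLP; the weights are random placeholders and training is omitted. All names are illustrative.

```python
import math
import random


def matmul(A, B):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def relu(M):
    return [[max(0.0, x) for x in row] for row in M]


def normalized_adjacency(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    d = [sum(row) for row in A_hat]  # every degree is >= 1 due to self-loops
    return [[A_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
            for i in range(n)]


def init_weights(rows, cols, rng):
    """Random placeholder weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


def encode(A, X, params):
    """f(.): one assumed GCN layer followed by an assumed two-layer MLP."""
    W_gcn, W1, W2 = params
    H = relu(matmul(matmul(normalized_adjacency(A), X), W_gcn))
    return matmul(relu(matmul(H, W1)), W2)
```

The same `params` would be passed for both graphs, for example `z = encode(A, X, params)` and `z_aug = encode(A_aug, X_aug, params)`, so that z and z′ come from one shared encoder.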

6. A product recommendation method based on comparative learning and community perception according to claim 3, characterized in that: the step S4 is specifically:

Step S41: Calculate the topological distance between the target node u and every other node. Suppose the shortest path from node v₁ to node v_n is P = (v₁, v₂, ..., v_n) ∈ V×V×...×V; the distance between the two nodes is then the total weight accumulated along P, that is, the shortest distance between them. The weight mapping function is taken to be the unit weight f: E→{1}, so the distance equals the number of edges on the shortest path; this shortest distance is used as the topological distance between nodes;

Step S42: Calculate the attribute distance between nodes according to formula (7);

where x_u and x_v are the attribute vectors of node u and node v, respectively;

Step S43: Based on the node topological distance obtained in step S41 and the node attribute distance obtained in step S42, calculate the relative distance between nodes according to formula (8);

where the two topological bounds are the minimum and maximum topological distance between nodes in the graph, the two attribute bounds are the minimum and maximum attribute distance between nodes in the graph, and λ is a parameter that adjusts the relative weight of the topological distance and the attribute distance;

Step S44: Sort the nodes in ascending order of the relative distance obtained in step S43 to obtain the node sequence R_u;

Step S45: Select a specified number of nodes from the front of the sequence R_u as the positive sample set P_u of the target node u, and take the remaining nodes as the negative sample set N_u of node u, as shown in formula (9) and formula (10).
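Steps S41 to S45 can be sketched as follows. Formulas (7) and (8) are not reproduced in the text, so two forms are assumed here: the attribute distance is taken to be the Euclidean distance between attribute vectors, and the relative distance is taken to be λ times the min-max normalized topological distance plus (1 − λ) times the min-max normalized attribute distance. Unreachable nodes are assigned the hop count n, which is also an assumption. The parameter `k` (number of positives) and all function names are illustrative.

```python
import math
from collections import deque


def hop_distances(adj, src):
    """S41: BFS shortest-path length under unit edge weights.
    Unreachable nodes keep the sentinel value n (an assumption)."""
    n = len(adj)
    dist = [n] * n
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == n:  # not visited yet
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def min_max(value, lo, hi):
    """Min-max normalization; returns 0 when all values coincide."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def select_samples(adj, X, u, k, lam=0.5):
    n = len(adj)
    topo = [hop_distances(adj, s) for s in range(n)]
    # S42 (assumed Euclidean): attribute distance between every node pair.
    attr = [[math.dist(X[a], X[b]) for b in range(n)] for a in range(n)]
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    t_lo = min(topo[a][b] for a, b in pairs)
    t_hi = max(topo[a][b] for a, b in pairs)
    a_lo = min(attr[a][b] for a, b in pairs)
    a_hi = max(attr[a][b] for a, b in pairs)
    # S43 (assumed form): weighted sum of the two normalized distances.
    rel = {v: lam * min_max(topo[u][v], t_lo, t_hi)
              + (1 - lam) * min_max(attr[u][v], a_lo, a_hi)
           for v in range(n) if v != u}
    # S44: ascending order of relative distance (ties broken by node id).
    R_u = sorted(rel, key=lambda v: (rel[v], v))
    # S45: the first k nodes are positives, the rest are negatives.
    return R_u[:k], R_u[k:]
```

For a path graph 0-1-2-3 with one-dimensional attributes 0, 1, 2, 3, the node nearest to node 0 in both senses is node 1, so `select_samples(adj, X, 0, 1)` yields node 1 as the only positive and nodes 2 and 3 as negatives.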

7. A product recommendation method based on comparative learning and community perception according to claim 3, characterized in that: the step S5 is specifically:

Step S51: Based on the InfoNCE loss function commonly used in contrastive learning, calculate the loss function of node u according to formula (11);

where z_u is the representation vector of node u, z_i and z′_i are the representation vectors of node i in the original graph G and the enhanced graph G′ respectively, sim(z_u, z_v) is the cosine similarity between the representation vectors of nodes u and v, and τ is the temperature parameter used to scale the similarity between nodes;

Step S52: According to formula (12), calculate the total loss function of the model, which is the average over all nodes of the sum of the two per-node loss terms;

Step S53: Adjust the parameters of the encoder through backpropagation, and repeat the training process until the model converges.
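The loss computation of steps S51 and S52 can be sketched as below. Formulas (11) and (12) are not reproduced in the text, so the exact arrangement of terms is an assumption: each node's own embedding in the other view and the embeddings of its positive set P_u (in both views) are treated as positives, the embeddings of its negative set N_u (in both views) as negatives, and the total loss is the mean over nodes of the sum of the losses anchored in the original and the enhanced view. Backpropagation (step S53) is not shown; in practice an automatic differentiation framework would handle it. All names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity; defined as 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def node_loss(u, Z, Z_other, pos, neg, tau):
    """InfoNCE-style loss for anchor Z[u] (assumed arrangement of terms).
    Z is the anchor's view, Z_other is the other view."""
    def score(v):
        return math.exp(cosine(Z[u], v) / tau)

    # Positives: the same node in the other view plus the set P_u.
    positive = score(Z_other[u])
    positive += sum(score(Z[i]) + score(Z_other[i]) for i in pos[u])
    # Negatives: the set N_u in both views.
    negative = sum(score(Z[i]) + score(Z_other[i]) for i in neg[u])
    return -math.log(positive / (positive + negative))


def total_loss(Z, Z_aug, pos, neg, tau=0.5):
    """Mean over nodes of the two view-wise losses (assumed normalization)."""
    n = len(Z)
    per_node = [node_loss(u, Z, Z_aug, pos, neg, tau)
                + node_loss(u, Z_aug, Z, pos, neg, tau)
                for u in range(n)]
    return sum(per_node) / n
```

Under this form the loss is small when each node's embedding is close to its positives and far from its negatives, and it grows when the two sets are swapped, which is the behavior the temperature-scaled cosine similarity is meant to produce.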

8. A product recommendation method based on comparative learning and community perception according to claim 3, characterized in that: the step S6 is specifically:

Step S61: Randomly select k nodes as initial cluster centers;

Step S62: Calculate the Euclidean distance between each node and the cluster center, and assign each node to the cluster with the nearest cluster center;

Step S63: Update the cluster center of each cluster to the average value of all nodes in the cluster;

Step S64: Repeat steps S62 and S63 until the cluster centers no longer change or the maximum number of iterations is reached, yielding the community division result {C_i}.
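Steps S61 to S64 describe the standard KMeans iteration, which can be sketched in plain Python as follows. The handling of an empty cluster (its center is left unchanged) is an assumption, since the text does not address it; the function and parameter names are illustrative.

```python
import math
import random


def kmeans(Z, k, max_iter=100, seed=0):
    """Cluster the representation vectors Z into k communities."""
    rng = random.Random(seed)
    # S61: pick k nodes at random as the initial cluster centers.
    centers = [list(z) for z in rng.sample(Z, k)]
    labels = [0] * len(Z)
    for _ in range(max_iter):
        # S62: assign each node to the cluster with the nearest center.
        labels = [min(range(k), key=lambda c: math.dist(z, centers[c]))
                  for z in Z]
        # S63: move each center to the mean of its assigned nodes.
        new_centers = []
        for c in range(k):
            members = [z for z, lab in zip(Z, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumed: keep empty center
        # S64: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    communities = [[i for i, lab in enumerate(labels) if lab == c]
                   for c in range(k)]
    return communities, centers
```

Each returned community is a list of node indices, so `communities[i]` plays the role of C_i in the claim.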

9. A product recommendation method based on comparative learning and community perception according to claim 3, characterized in that: the step S7 is specifically:

Step S71: According to step S6, users with similar interests and behavior patterns are divided into the same community; based on the community discovery results, recommend to the user products liked by other, strongly similar and correlated users in the same community, to improve recommendation accuracy and user satisfaction;

Step S72: Based on the community discovery results, identify products that are particularly popular in a certain community, and recommend popular products to other users in the community to increase user participation and help product manufacturers increase sales;

Step S73: When different communities share common points of interest or correlations, use this correlation to recommend to the user products from other strongly correlated communities; such cross-community recommendations help users discover new areas of interest and enrich their shopping experience.
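Steps S71 to S73 can be sketched as a simple ranking over community membership. The text does not define how the correlation between communities is measured, so this sketch assumes a Jaccard overlap between the sets of products liked in each community; the data layout (`communities` as lists of user ids, `likes` as a mapping from user to a set of product ids) and all names are illustrative.

```python
from collections import Counter


def recommend(user, communities, likes, top_n=3):
    """Return (in-community picks, cross-community picks) for one user."""
    own = next(c for c in communities if user in c)
    seen = likes.get(user, set())

    def popularity(members):
        # Count how many other users in the group like each product.
        return Counter(p for m in members if m != user
                       for p in likes.get(m, set()))

    def rank(counter):
        # Most popular first, ties broken by product id; skip known items.
        items = [(p, c) for p, c in counter.items() if p not in seen]
        ordered = sorted(items, key=lambda pc: (-pc[1], pc[0]))
        return [p for p, _ in ordered][:top_n]

    # S71/S72: products that are popular inside the user's own community.
    in_community = rank(popularity(own))

    own_products = set(popularity(own)) | seen

    def relatedness(other):
        # ASSUMED measure: Jaccard overlap of liked-product sets.
        other_products = set(popularity(other))
        union = own_products | other_products
        return len(own_products & other_products) / len(union) if union else 0.0

    # S73: popular products from the most closely related other community.
    others = [c for c in communities if c is not own]
    cross = []
    if others:
        closest = max(others, key=relatedness)
        cross = [p for p in rank(popularity(closest))
                 if p not in in_community]
    return in_community, cross
```

A call such as `recommend("a", communities, likes)` returns two lists, so a caller can present same-community picks first and use the cross-community picks to surface products outside the user's usual interests.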