CN117495511A - Commodity recommendation system and method based on contrast learning and community perception - Google Patents
- Publication number
- CN117495511A, CN202311681385.8A
- Authority
- CN
- China
- Prior art keywords
- node
- graph
- nodes
- community
- attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention provides a commodity recommendation system and method based on contrast learning and community perception. First, an adaptive graph enhancement strategy is designed: when data enhancement is performed on the original graph, the importance of nodes and edges is considered, edges and node attributes of high importance are retained, and the enhanced graph is required to differ substantially from the original graph. Second, an encoder based on a graph neural network and a multi-layer perceptron generates representation vectors of the original graph and the enhanced graph. Third, a contrast selection strategy based on the relative distance between nodes is designed: for each node, the several nodes closest to it in relative distance are selected as its positive samples, and the remaining nodes are taken as negative samples. Fourth, the learned node representation vectors are divided into communities by a clustering algorithm. Finally, commodities are recommended based on the obtained community division result. By improving the cohesiveness of the generated community structure, the method and system can improve the accuracy of the final personalized commodity recommendation, thereby improving user satisfaction and commodity sales.
Description
Technical Field
The invention relates to the technical field of community discovery based on representation learning, in particular to a commodity recommendation system and method based on contrast learning and community perception.
Background
With the rapid development of science and technology and the popularization of networks, people's social and shopping activities have gradually moved online. People, things, and the connections among them in the real world can be abstracted into complex networks, and community structure is an important characteristic of such networks. A community is a group formed by individuals with close connections, frequent interaction and high similarity: connections inside a community are dense, and connections between communities are sparse. Studying community structure helps to identify groups of users who share characteristics such as interests or professions, helps enterprises to better understand user preferences and provide more personalized services and commodities, and improves user satisfaction. Current research and technology on commodity recommendation based on representation-learning community discovery still have the following disadvantage: a random data enhancement strategy may interfere with or even destroy the key community structure in the graph, which reduces the accuracy of community discovery and in turn the accuracy of commodity recommendation.
Disclosure of Invention
The commodity recommendation system and method based on contrast learning and community perception provided by the invention improve the cohesiveness of the generated community structure and thereby the accuracy of the final personalized commodity recommendation, which in turn improves user satisfaction and commodity sales and gives the invention good practical value.
The invention adopts the following technical scheme.
The commodity recommendation system based on contrast learning and community perception recommends commodities in combination with community discovery results, helping enterprises to obtain users' preference, interest and demand information more effectively and accurately and to provide more personalized services and commodities; the system works through the following steps:
firstly, designing an adaptive graph enhancement strategy: when data enhancement is performed on the original graph, the importance of nodes and edges is considered and the edges and node attributes of high importance are retained, while the enhanced graph is also required to differ substantially from the original graph, to prevent the model from falling into a local optimum;
secondly, generating representation vectors of the original graph and the enhanced graph by an encoder based on a graph neural network and a multi-layer perceptron;
thirdly, designing a contrast selection strategy based on the relative distance between nodes: for each node, the several nodes closest to it in relative distance are selected as its positive samples and the remaining nodes are taken as its negative samples, so that the generated community structure has higher cohesiveness;
fourthly, dividing the learned node representation vectors into communities by a clustering algorithm;
and finally, obtaining the users' preference, interest and demand information based on the community division result, recommending commodities within and across communities, and providing more personalized service for the users.
A commodity recommendation system based on contrast learning and community perception comprises a self-adaptive graph data enhancement module, a representation vector generation module, a contrast selection module, a loss function calculation module, a community generation module and a commodity recommendation module;
the self-adaptive graph data enhancement module is used for generating an enhanced graph G' that differs substantially from the original graph G; it uses a data enhancement strategy that considers the importance of nodes, so as to avoid destroying the community structure in the graph during data enhancement; it performs topology-level data enhancement on the original graph to remove unimportant edges in the social network, and attribute-level data enhancement to mask the attributes of unimportant nodes in the social network; meanwhile, the enhanced graph is required to differ substantially from the original graph, so that the model is prevented from falling into a local optimum;
the representation vector generation module is used for encoding the original graph G and the enhanced graph G'; the original graph and the enhanced graph are encoded by an encoder consisting of a graph convolutional network (GCN) and a multi-layer perceptron (MLP) to obtain node representation vectors z and z', respectively;
the contrast selection module is used for selecting positive samples and negative samples that help to improve community discovery accuracy and make community boundaries clearer; the several nodes closest to the target node in relative distance are taken as positive samples, and the remaining nodes are taken as negative samples; the node relative distance consists of a topological distance and an attribute distance;
the loss function calculation module is used for calculating the contrast loss $\mathcal{L}$ and optimizing the parameters of the GCN and the MLP through back propagation; wherein the contrast loss $\mathcal{L}$ consists of the loss $\ell(u)$ of node u in the original graph G and the loss $\ell(u')$ of node u' in the enhanced graph G';
the community generation module is used for generating communities; the node representation vectors z learned by the encoder are clustered by the KMeans clustering algorithm to obtain a community division result $\{C_i\}$;
The commodity recommending module is used for recommending personalized commodities to the user according to the obtained community dividing result; the method can recommend commodities which are loved by other users in the same community to the user, can recommend popular commodities in the community, and can also recommend related commodities in other communities with strong relevance to the user.
A commodity recommendation method based on contrast learning and community perception adopts a commodity recommendation system based on contrast learning and community perception, comprising the following steps;
step S1: constructing a social network $G = \{V, E, A, X\}$ according to the social records of the users, wherein $V = \{v_1, v_2, \ldots, v_n\}$ is the node set of the social network, E is the edge set of the social network, and $e_{ij} = (v_i, v_j) \in E$ indicates that there is an edge between node $v_i$ and node $v_j$; the matrix $A \in \mathbb{R}^{n \times n}$ is the adjacency matrix of the network, with $A_{ij} = 1$ when $e_{ij} \in E$ and $A_{ij} = 0$ otherwise; $X \in \mathbb{R}^{n \times m}$ is the attribute matrix of the nodes in the social network, m is the dimension of the node attributes, and $X_{ij}$ is the value of the j-th attribute of node i;
step S2: performing topology-level data enhancement and attribute-level data enhancement on the input graph G, while requiring the enhanced graph to differ substantially from the original graph, to finally obtain an enhanced graph G';
step S3: generating node representation vectors z and z' of the original graph G and the enhanced graph G', respectively, by a pair of encoders each consisting of a graph convolutional network (GCN) and a multi-layer perceptron (MLP);
step S4: calculating the relative distances between the nodes of the original graph G, and, according to the calculation result, selecting for each node the several nodes closest to it in relative distance as its positive sample set and taking the remaining nodes as its negative sample set;
step S5: calculating the contrast loss $\mathcal{L}$ from the selected positive and negative sample sets, and optimizing the parameters of the GCN and the MLP through back propagation;
step S6: clustering the node representation vectors z learned by the encoder by the KMeans clustering algorithm, and taking the clusters generated by clustering as communities to obtain a community division result $\{C_i\}$;
Step S7: providing commodity recommendation services in combination with the community discovery results, including recommending commodities that other users in the same community are interested in and popular commodities in other strongly related communities, so that commodity producers can better understand the demands of target users and achieve more accurate commodity recommendation.
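As an illustration of step S1, the following minimal Python sketch builds the adjacency matrix A and the attribute matrix X from a list of social ties. The function name `build_social_graph`, the toy edges and attributes, and the choice to treat every tie as undirected are assumptions made for the example; the text does not specify them.

```python
import numpy as np

def build_social_graph(n_nodes, edges, attributes):
    """Return (A, X): n x n adjacency matrix and n x m attribute matrix."""
    A = np.zeros((n_nodes, n_nodes), dtype=np.float64)
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0  # assumption: social ties are undirected
    X = np.asarray(attributes, dtype=np.float64)
    if X.shape[0] != n_nodes:
        raise ValueError("attribute matrix must have one row per node")
    return A, X

# Toy data (hypothetical): 5 users, 2 attributes each.
edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
attributes = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]]
A, X = build_social_graph(5, edges, attributes)
```

Here `A[i, j] == 1` exactly when users i and j are connected, matching the definition of $A_{ij}$ in step S1.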
The step S2 specifically comprises the following steps:
step S21: performing topology-level data enhancement on the graph G; sampling edges from the original edge set E according to the following formula (1), so that edges connecting nodes of higher importance are more likely to be retained, to form the edge set E' of the enhanced graph;
wherein $p_{uv}$ is the probability of sampling the edge (u, v), which is determined by the importance of the edge and is calculated according to formulas (2) and (3);
wherein $\varphi_c(v)$ is the degree centrality of node v, whose value equals the degree of the node, i.e. $\varphi_c(v) = \deg(v)$; $w_{uv}$ is the importance of the edge (u, v), equal to the average of the importance of the two nodes it connects; $w_{\max}$ is the maximum importance over all edges in the graph; and $\eta_1$ is a manually specified coefficient controlling the edge removal probability;
step S22: after obtaining the edge subset E 'sampled in step S21, converting it into an adjacency matrix a' for subsequent flows;
step S23: performing attribute-level data enhancement on the graph G. First, for each node u in the graph G, a value $b_u$ is sampled from a Bernoulli distribution according to the following formula (4), and all sampled values form an n-dimensional vector $b \in \{0,1\}^n$, i.e. $b_u \sim \mathrm{Bernoulli}(p_u)$, where n is the number of nodes of graph G;
wherein $p_u$ is the probability of retaining the attributes of node u, and nodes of higher importance are retained with higher probability; $\varphi_c(u)$ is the importance of node u, $\varphi_{\max}$ is the maximum importance over all nodes in the graph, and $\eta_2$ is a manually specified coefficient controlling the masking probability of node attributes;
step S24: based on the vector b and the original feature matrix X obtained in the step S23, calculating an enhanced attribute matrix X' according to a formula (5);
$X' = \mathrm{diag}(b) \cdot X$    formula (5)
wherein diag(·) denotes expanding a vector into a diagonal matrix, so that row u of X' equals row u of X when $b_u = 1$ and is all zeros when $b_u = 0$.
Step S25: the enhanced graph G 'is obtained, and the enhanced graph may be represented by an enhanced adjacency matrix a' and an attribute matrix X ', i.e., G' = (a ', X');
step S26: in order to prevent the model from falling into a local optimal solution because the enhanced graph is too similar to the original graph, which would affect the accuracy of the community discovery task, the enhanced graph G' is required to differ substantially from the original graph G; the sum g of the Euclidean distance between the adjacency matrices of G' and G and the Euclidean distance between their attribute matrices is calculated according to the following formula (6);
wherein n is the number of nodes, and m is the dimension of the node attributes;
step S27: the above graph data enhancement steps are repeated until the sum g of the Euclidean distances of the adjacency matrices and the attribute matrices is greater than a given threshold σ or the number of enhancement rounds reaches the maximum enhancement number $I_{max}$.
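Steps S21 to S27 can be sketched in Python as follows. The exact forms of formulas (1), (2), (4) and (6) are not available in this text, so the keep probabilities `1 - eta * (max - value) / max`, the use of Frobenius norms for the graph difference, and the choice to resample from the original graph on every round are assumptions of this sketch, as are all function names. The sketch also assumes an undirected graph with at least one edge.

```python
import numpy as np

def augment_once(A, X, eta1, eta2, rng):
    """One round of topology-level and attribute-level enhancement."""
    deg = A.sum(axis=1)                      # degree centrality of every node
    rows, cols = np.triu_indices_from(A, k=1)
    is_edge = A[rows, cols] > 0
    rows, cols = rows[is_edge], cols[is_edge]
    w = (deg[rows] + deg[cols]) / 2.0        # edge importance: mean of endpoint importance
    # ASSUMED form of the edge keep probability: the most important edge is always kept.
    p_edge = 1.0 - eta1 * (w.max() - w) / w.max()
    keep = rng.random(w.shape[0]) < p_edge
    A_aug = np.zeros_like(A)
    A_aug[rows[keep], cols[keep]] = 1.0
    A_aug[cols[keep], rows[keep]] = 1.0
    # ASSUMED form of the attribute keep probability, analogous to the edge case.
    p_node = 1.0 - eta2 * (deg.max() - deg) / deg.max()
    b = (rng.random(A.shape[0]) < p_node).astype(X.dtype)
    X_aug = np.diag(b) @ X                   # formula (5): X' = diag(b) X
    return A_aug, X_aug

def graph_difference(A, X, A_aug, X_aug):
    # ASSUMED reading of formula (6): sum of two Frobenius (Euclidean) distances.
    return np.linalg.norm(A - A_aug) + np.linalg.norm(X - X_aug)

def adaptive_augment(A, X, eta1=0.5, eta2=0.5, sigma=1.0, max_iter=20, seed=0):
    """Repeat enhancement until the difference exceeds sigma or max_iter is reached."""
    rng = np.random.default_rng(seed)
    A_aug, X_aug = A, X
    for _ in range(max_iter):
        A_aug, X_aug = augment_once(A, X, eta1, eta2, rng)
        if graph_difference(A, X, A_aug, X_aug) > sigma:
            break
    return A_aug, X_aug

# Toy graph (hypothetical): node 2 has the highest degree.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
X = np.arange(10, dtype=float).reshape(5, 2) + 1.0
A_aug, X_aug = adaptive_augment(A, X)
```

Under these assumed probabilities, the attributes of the highest-degree node and the edges of maximum importance are never removed, which mirrors the stated goal of preserving important structure.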
The step S3 specifically comprises the following steps:
step S31: given a network, performing representation learning on the network means learning a mapping function $f: V \to \mathbb{R}^{d}$ with $d \ll |V|$, where $|V|$ denotes the number of nodes of the network G. That is, each node v in the network is converted by the function f(v) into a d-dimensional representation vector z;
step S32: the original graph G and the enhancement graph G 'are encoded using an encoder f (·) consisting of a graph convolutional neural network GCN and a multi-layer perceptron MLP, yielding node representation vectors z and z', respectively, for the subsequent flows.
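A minimal forward-pass sketch of such an encoder is given below. The text only states that the encoder combines a GCN and an MLP; the single GCN layer with symmetric normalization, the two-layer MLP, the ReLU activations, the random untrained weights, and the sharing of one set of weights across both graphs are assumptions of this example.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used by standard GCN layers."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degrees are >= 1 thanks to self-loops
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

class Encoder:
    """One GCN layer followed by a two-layer MLP (hypothetical layer sizes)."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W_gcn = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.W1 = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, out_dim))

    def __call__(self, A, X):
        H = np.maximum(normalize_adjacency(A) @ X @ self.W_gcn, 0.0)  # GCN layer + ReLU
        return np.maximum(H @ self.W1, 0.0) @ self.W2                 # MLP head

# Toy original graph and a hand-made enhanced graph (hypothetical data).
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
X = np.arange(15, dtype=float).reshape(5, 3)
A_aug = A.copy()
A_aug[3, 4] = A_aug[4, 3] = 0.0   # one edge removed
X_aug = X.copy()
X_aug[4] = 0.0                    # one node's attributes masked

encoder = Encoder(in_dim=3, hidden_dim=8, out_dim=4)
Z = encoder(A, X)              # representation vectors z of G
Z_aug = encoder(A_aug, X_aug)  # representation vectors z' of G'
```

In practice the weights would be trained with the contrast loss of step S5; the sketch only shows how both graphs pass through the same kind of encoder.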
The step S4 specifically includes:
step S41: calculating the topological distance $d^{t}_{uv}$ between the target node u and every other node v. Suppose the shortest path from node $v_1$ to node $v_n$ is $p = (v_1, v_2, \ldots, v_n) \in V \times V \times \cdots \times V$; the distance between these two nodes is $\sum_{i=1}^{n-1} f(e_{i,i+1})$, the shortest distance between the two nodes, wherein $f: E \to \mathbb{R}$ is an edge weight mapping function, and unit weight $f: E \to \{1\}$ is taken here; the shortest distance between nodes is taken as the topological distance between them;
step S42: calculating the attribute distance $d^{a}_{uv}$ between nodes according to formula (7);
wherein $x_u$ and $x_v$ are the attribute vectors of node u and node v, respectively;
step S43: based on the node topological distance $d^{t}_{uv}$ obtained in step S41 and the node attribute distance $d^{a}_{uv}$ obtained in step S42, calculating the node relative distance according to formula (8);
wherein $d^{t}_{\min}$ and $d^{t}_{\max}$ denote the minimum and maximum topological distance between nodes in the graph, respectively; $d^{a}_{\min}$ and $d^{a}_{\max}$ denote the minimum and maximum attribute distance between nodes in the graph, respectively; and λ is a parameter adjusting the proportion of the topological distance and the attribute distance;
step S44: based on the relative distances obtained in step S43, sorting the nodes in ascending order of their relative distance to node u to obtain a node sequence $R_u$;
step S45: selecting a preset number of nodes at the front of the node sequence $R_u$ as the positive sample set $P_u$ of the target node u, and taking the other nodes as the negative sample set $N_u$ of node u, as shown in formula (9) and formula (10).
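The selection procedure of steps S41 to S45 can be sketched as follows. Formulas (7) and (8) are not available in this text, so the Euclidean attribute distance, the combination `lam * topo + (1 - lam) * attr` of min-max normalized distances, and the convention of assigning distance n to unreachable node pairs are assumptions of this example, as are the function names and the parameter `k` for the number of positives.

```python
import numpy as np
from collections import deque

def topological_distances(A):
    """All-pairs shortest path lengths with unit edge weights (breadth-first search)."""
    n = A.shape[0]
    D = np.full((n, n), float(n))  # assumption: unreachable pairs get distance n
    for s in range(n):
        D[s, s] = 0.0
        visited = np.zeros(n, dtype=bool)
        visited[s] = True
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(A[u]):
                if not visited[v]:
                    visited[v] = True
                    D[s, v] = D[s, u] + 1.0
                    queue.append(int(v))
    return D

def min_max(D):
    """Min-max normalize using the off-diagonal (node-to-other-node) entries."""
    off = ~np.eye(D.shape[0], dtype=bool)
    lo, hi = D[off].min(), D[off].max()
    return (D - lo) / (hi - lo) if hi > lo else np.zeros_like(D)

def relative_distances(A, X, lam=0.5):
    Dt = topological_distances(A)
    Da = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # ASSUMED: Euclidean
    return lam * min_max(Dt) + (1.0 - lam) * min_max(Da)        # ASSUMED form of (8)

def select_samples(D, u, k):
    """Sort other nodes by ascending relative distance; first k are positives."""
    order = [int(v) for v in np.argsort(D[u], kind="stable") if v != u]
    return order[:k], order[k:]

# Toy path graph 0-1-2-3 with one attribute per node (hypothetical data).
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
X = np.array([[0.0], [0.0], [1.0], [1.0]])
D = relative_distances(A, X, lam=0.5)
positives, negatives = select_samples(D, u=2, k=1)
```

For node 2, nodes 1 and 3 are both one hop away, but node 3 has identical attributes, so it ranks first and becomes the positive sample. This shows how the attribute distance breaks ties that topology alone cannot.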
The step S5 specifically comprises the following steps:
step S51: based on the InfoNCE loss function commonly used in contrast learning, calculating a loss function of the node u according to a formula (11);
wherein $z_u$ denotes the representation vector of node u; $z_i$ and $z'_i$ are the representation vectors of node i in the original graph G and the enhanced graph G', respectively; $\mathrm{sim}(z_u, z_v)$ denotes the cosine similarity between the representation vectors of nodes u and v; and τ is a temperature parameter that scales the similarity between nodes;
step S52: according to formula (12), calculating the total loss function $\mathcal{L}$ of the model as the average of the sum of $\ell(u)$ and $\ell(u')$ over all nodes.
Step S53: the parameters of the encoder are adjusted by back propagation, and the training process is repeated until the model converges.
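One possible shape of such a loss is sketched below. Formulas (11) and (12) are not available in this text, so the sketch uses a common multi-positive InfoNCE variant: the numerator sums the exponentiated similarities to the selected positives in both graphs plus the node's own copy in the other graph, and the denominator adds the negatives. This specific form, the function names, and the boolean `pos_mask` encoding of the positive sets $P_u$ are assumptions. Representation vectors are assumed to be non-zero.

```python
import numpy as np

def cosine_similarity(Z1, Z2):
    Z1 = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2 = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    return Z1 @ Z2.T

def view_loss(Z_anchor, Z_other, pos_mask, tau):
    """Per-node loss for anchors taken from one graph (ASSUMED InfoNCE variant)."""
    n = Z_anchor.shape[0]
    eye = np.eye(n, dtype=bool)
    intra = np.exp(cosine_similarity(Z_anchor, Z_anchor) / tau)  # same graph
    inter = np.exp(cosine_similarity(Z_anchor, Z_other) / tau)   # other graph
    # Positives: selected nodes in both graphs, plus the node's own copy in the other graph.
    pos = (intra * pos_mask).sum(axis=1) + (inter * (pos_mask | eye)).sum(axis=1)
    # Denominator: positives plus all negatives (every other node in both graphs).
    total = (intra * ~eye).sum(axis=1) + inter.sum(axis=1)
    return -np.log(pos / total)

def contrastive_loss(Z, Z_aug, pos_mask, tau=0.5):
    """Average of l(u) and l(u') over all nodes."""
    l_orig = view_loss(Z, Z_aug, pos_mask, tau)
    l_aug = view_loss(Z_aug, Z, pos_mask, tau)
    return float((l_orig + l_aug).mean() / 2.0)

# Toy representations (hypothetical): nodes {0, 1} and {2, 3} are mutual positives.
rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 3))
Z_aug = Z + 0.1 * rng.normal(size=(4, 3))
pos_mask = np.zeros((4, 4), dtype=bool)
pos_mask[0, 1] = pos_mask[1, 0] = True
pos_mask[2, 3] = pos_mask[3, 2] = True
loss = contrastive_loss(Z, Z_aug, pos_mask, tau=0.5)
```

Because the positives are always a subset of the terms in the denominator, the loss is non-negative, and it reaches zero only when no node is treated as a negative. In a real training loop this value would be minimized by back propagation through the encoder, which requires an automatic differentiation framework rather than plain NumPy.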
The step S6 specifically includes:
step S61: randomly selecting k nodes as initial cluster centers;
step S62: calculating Euclidean distance from each node to the cluster center, and distributing each node to the cluster where the cluster center closest to the node is located;
step S63: updating the cluster center of each cluster to be the average value of all nodes of the cluster;
step S64: repeating the step S62 and the step S63 until the cluster center is not changed any more or the maximum iteration number is reached, and obtaining a community division result { C } i }。
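Steps S61 to S64 correspond to the standard KMeans iteration, sketched below in plain NumPy. The function name, the default iteration limit, and the rule of keeping the old center when a cluster becomes empty are choices made for this example.

```python
import numpy as np

def kmeans(Z, k, max_iter=100, seed=0):
    """Cluster the rows of Z into k communities; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # S61: randomly select k nodes as initial cluster centers.
    centers = Z[rng.choice(Z.shape[0], size=k, replace=False)]
    labels = np.zeros(Z.shape[0], dtype=int)
    for _ in range(max_iter):
        # S62: assign each node to the nearest center by Euclidean distance.
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # S63: move each center to the mean of its members
        # (an empty cluster keeps its previous center; a choice of this sketch).
        new_centers = np.array([
            Z[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        # S64: stop when the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy representation vectors (hypothetical): two well-separated groups.
Z = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centers = kmeans(Z, k=2)
```

Each distinct value in `labels` is one community $C_i$; which integer is assigned to which community depends on the random initialization.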
The step S7 specifically includes:
step S71: according to step S6, users with similar interests and behavior patterns are divided into the same community; based on the community discovery result, commodities favored by other users in the same community, which have strong similarity and relevance to the user, are recommended to the user, so as to improve the accuracy of recommendation and the satisfaction of the user;
step S72: based on the community discovery result, identifying a commodity which is particularly popular in a certain community, and improving the user participation degree and helping commodity manufacturers to improve sales by recommending hot commodities to other users in the community;
step S73: when some common interest points or relevance exists among different communities, the commodity in other communities with strong relevance is recommended to the user by utilizing the relevance, so that the user is helped to find a new interest field by means of community recommendation, and shopping experience is enriched.
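The in-community recommendation of steps S71 and S72 can be illustrated with a small sketch. The user-by-commodity matrix `purchases`, the popularity count used for ranking, and the function name are hypothetical; the text does not specify how commodity popularity is measured, and the cross-community case of step S73 is not covered here.

```python
import numpy as np

def recommend_in_community(purchases, labels, user, top_n=3):
    """Recommend commodities popular in the user's community that the user lacks."""
    members = np.flatnonzero(labels == labels[user])
    popularity = purchases[members].sum(axis=0).astype(float)
    popularity[purchases[user] > 0] = -np.inf   # skip commodities the user already has
    ranked = np.argsort(-popularity, kind="stable")
    # Keep only commodities that at least one community member has bought.
    return [int(i) for i in ranked[:top_n] if popularity[i] > 0]

# Toy data (hypothetical): rows are users, columns are commodities.
purchases = np.array([[1, 0, 0, 0],
                      [1, 1, 0, 0],
                      [0, 1, 1, 0],
                      [0, 0, 0, 1]])
labels = np.array([0, 0, 0, 1])  # community of each user, e.g. from KMeans
recommended = recommend_in_community(purchases, labels, user=0)
```

For user 0, commodity 1 (bought by two community members) is ranked ahead of commodity 2 (bought by one), and commodity 3 is excluded because nobody in the community bought it.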
The invention has the advantages that: the method and the device can avoid destroying the community structure in the graph in the data enhancement process, select proper positive and negative samples for comparison learning, improve the cohesiveness of the generated community structure, and finally improve the commodity recommendation accuracy.
Drawings
The invention is described in further detail below with reference to the attached drawings and detailed description:
fig. 1 is a schematic flow chart of the present invention.
Detailed Description
As shown in the figure, the commodity recommendation system based on contrast learning and community perception recommends commodities in combination with community discovery results, helping enterprises to obtain users' preference, interest and demand information more effectively and accurately and to provide more personalized services and commodities; the system works through the following steps:
firstly, designing an adaptive graph enhancement strategy: when data enhancement is performed on the original graph, the importance of nodes and edges is considered and the edges and node attributes of high importance are retained, while the enhanced graph is also required to differ substantially from the original graph, to prevent the model from falling into a local optimum;
secondly, generating representation vectors of the original graph and the enhanced graph by an encoder based on a graph neural network and a multi-layer perceptron;
thirdly, designing a contrast selection strategy based on the relative distance between nodes: for each node, the several nodes closest to it in relative distance are selected as its positive samples and the remaining nodes are taken as its negative samples, so that the generated community structure has higher cohesiveness;
fourthly, dividing the learned node representation vectors into communities by a clustering algorithm;
and finally, obtaining the users' preference, interest and demand information based on the community division result, recommending commodities within and across communities, and providing more personalized service for the users.
A commodity recommendation system based on contrast learning and community perception comprises a self-adaptive graph data enhancement module, a representation vector generation module, a contrast selection module, a loss function calculation module, a community generation module and a commodity recommendation module;
the self-adaptive graph data enhancement module is used for generating an enhanced graph G' that differs substantially from the original graph G; it uses a data enhancement strategy that considers the importance of nodes, so as to avoid destroying the community structure in the graph during data enhancement; it performs topology-level data enhancement on the original graph to remove unimportant edges in the social network, and attribute-level data enhancement to mask the attributes of unimportant nodes in the social network; meanwhile, the enhanced graph is required to differ substantially from the original graph, so that the model is prevented from falling into a local optimum;
the representation vector generation module is used for encoding the original graph G and the enhanced graph G'; the original graph and the enhanced graph are encoded by an encoder consisting of a graph convolutional network (GCN) and a multi-layer perceptron (MLP) to obtain node representation vectors z and z', respectively;
the contrast selection module is used for selecting positive samples and negative samples that help to improve community discovery accuracy and make community boundaries clearer; the several nodes closest to the target node in relative distance are taken as positive samples, and the remaining nodes are taken as negative samples; the node relative distance consists of a topological distance and an attribute distance;
the loss function calculation module is used for calculating the contrast loss $\mathcal{L}$ and optimizing the parameters of the GCN and the MLP through back propagation; wherein the contrast loss $\mathcal{L}$ consists of the loss $\ell(u)$ of node u in the original graph G and the loss $\ell(u')$ of node u' in the enhanced graph G';
the community generation module is used for generating communities; the node representation vectors z learned by the encoder are clustered by the KMeans clustering algorithm to obtain a community division result $\{C_i\}$;
The commodity recommending module is used for recommending personalized commodities to the user according to the obtained community dividing result; the method can recommend commodities which are loved by other users in the same community to the user, can recommend popular commodities in the community, and can also recommend related commodities in other communities with strong relevance to the user.
A commodity recommendation method based on contrast learning and community perception adopts a commodity recommendation system based on contrast learning and community perception, comprising the following steps;
step S1: constructing a social network $G = \{V, E, A, X\}$ according to the social records of the users, wherein $V = \{v_1, v_2, \ldots, v_n\}$ is the node set of the social network, E is the edge set of the social network, and $e_{ij} = (v_i, v_j) \in E$ indicates that there is an edge between node $v_i$ and node $v_j$; the matrix $A \in \mathbb{R}^{n \times n}$ is the adjacency matrix of the network, with $A_{ij} = 1$ when $e_{ij} \in E$ and $A_{ij} = 0$ otherwise; $X \in \mathbb{R}^{n \times m}$ is the attribute matrix of the nodes in the social network, m is the dimension of the node attributes, and $X_{ij}$ is the value of the j-th attribute of node i;
step S2: performing topology-level data enhancement and attribute-level data enhancement on the input graph G, while requiring the enhanced graph to differ substantially from the original graph, to finally obtain an enhanced graph G';
step S3: generating node representation vectors z and z' of the original graph G and the enhanced graph G', respectively, by a pair of encoders each consisting of a graph convolutional network (GCN) and a multi-layer perceptron (MLP);
step S4: calculating the relative distances between the nodes of the original graph G, and, according to the calculation result, selecting for each node the several nodes closest to it in relative distance as its positive sample set and taking the remaining nodes as its negative sample set;
step S5: calculating the contrast loss $\mathcal{L}$ from the selected positive and negative sample sets, and optimizing the parameters of the GCN and the MLP through back propagation;
step S6: clustering the node representation vectors z learned by the encoder by the KMeans clustering algorithm, and taking the clusters generated by clustering as communities to obtain a community division result $\{C_i\}$;
Step S7: providing commodity recommendation services in combination with the community discovery results, including recommending commodities that other users in the same community are interested in and popular commodities in other strongly related communities, so that commodity producers can better understand the demands of target users and achieve more accurate commodity recommendation.
The step S2 specifically comprises the following steps:
step S21: performing topology-level data enhancement on the graph G; sampling edges from the original edge set E according to the following formula (1), so that edges connecting nodes of higher importance are more likely to be retained, to form the edge set E' of the enhanced graph;
wherein $p_{uv}$ is the probability of sampling the edge (u, v), which is determined by the importance of the edge and is calculated according to formulas (2) and (3);
wherein $\varphi_c(v)$ is the degree centrality of node v, whose value equals the degree of the node, i.e. $\varphi_c(v) = \deg(v)$; $w_{uv}$ is the importance of the edge (u, v), equal to the average of the importance of the two nodes it connects; $w_{\max}$ is the maximum importance over all edges in the graph; and $\eta_1$ is a manually specified coefficient controlling the edge removal probability;
step S22: after obtaining the edge subset E 'sampled in step S21, converting it into an adjacency matrix a' for subsequent flows;
step S23: performing attribute-level data enhancement on the graph G. First, for each node u in the graph G, a value $b_u$ is sampled from a Bernoulli distribution according to the following formula (4), and all sampled values form an n-dimensional vector $b \in \{0,1\}^n$, i.e. $b_u \sim \mathrm{Bernoulli}(p_u)$, where n is the number of nodes of graph G;
wherein $p_u$ is the probability of retaining the attributes of node u, and nodes of higher importance are retained with higher probability; $\varphi_c(u)$ is the importance of node u, $\varphi_{\max}$ is the maximum importance over all nodes in the graph, and $\eta_2$ is a manually specified coefficient controlling the masking probability of node attributes;
step S24: based on the vector b and the original feature matrix X obtained in the step S23, calculating an enhanced attribute matrix X' according to a formula (5);
$X' = \mathrm{diag}(b) \cdot X$    formula (5)
where diag(·) expands a vector into a diagonal matrix.
Step S25: obtaining the enhanced graph G', which can be represented by the enhanced adjacency matrix A' and the enhanced attribute matrix X', i.e., G' = (A', X');
step S26: to prevent the model from falling into a local optimum because the enhanced graph is too similar to the original graph, which would reduce the accuracy of the community discovery task, the enhanced graph G' is required to differ substantially from the original graph G; the sum of the Euclidean distance between the adjacency matrices of G' and G and the Euclidean distance between their attribute matrices is calculated according to the following formula (6);
wherein n is the number of nodes, and m is the dimension of the node attribute;
step S27: repeating the above graph data enhancement steps until the sum g of the Euclidean distances of the adjacency matrices and the attribute matrices is greater than a given threshold σ, or the number of data enhancement executions reaches the maximum number of enhancements $I_{max}$.
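As a non-limiting illustration, steps S21 to S27 may be sketched in Python as follows. Formulas (1) to (4) and (6) are not reproduced in this text, so the keep probability `1 - eta * (1 - importance / max importance)` and the distance `g` (sum of two Frobenius norms) are assumptions chosen only to match the verbal description; the function names `adjacency`, `euclidean`, `augment_once`, and `adaptive_augment` are hypothetical. The sketch assumes an undirected graph with at least one edge.

```python
import math
import random


def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list (S22)."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A


def euclidean(P, Q):
    """Euclidean (Frobenius) distance between two equally shaped matrices."""
    return math.sqrt(sum((p - q) ** 2 for rp, rq in zip(P, Q) for p, q in zip(rp, rq)))


def augment_once(n, edges, X, eta1, eta2, rng):
    """One pass of topology-level (S21) and attribute-level (S23, S24) enhancement."""
    deg = [0] * n  # node importance = degree centrality = degree
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Edge importance = average importance of its two endpoints.
    w = {(u, v): (deg[u] + deg[v]) / 2 for u, v in edges}
    w_max, d_max = max(w.values()), max(deg)
    # ASSUMED keep probability: 1 - eta * (1 - importance / max importance).
    kept = [e for e in edges if rng.random() < 1 - eta1 * (1 - w[e] / w_max)]
    b = [1 if rng.random() < 1 - eta2 * (1 - deg[u] / d_max) else 0 for u in range(n)]
    X_aug = [[b[u] * x for x in X[u]] for u in range(n)]  # X' = diag(b) . X
    return kept, X_aug


def adaptive_augment(n, edges, X, eta1=0.5, eta2=0.5, sigma=1.0, i_max=10, seed=0):
    """Repeat the enhancement until g > sigma or i_max attempts are used (S26, S27)."""
    rng = random.Random(seed)
    A = adjacency(n, edges)
    for _ in range(i_max):
        kept, X_aug = augment_once(n, edges, X, eta1, eta2, rng)
        A_aug = adjacency(n, kept)
        # ASSUMED form of formula (6): sum of the two matrix distances.
        g = euclidean(A, A_aug) + euclidean(X, X_aug)
        if g > sigma:
            break
    return A_aug, X_aug, g
```

Under the assumed probability, the most important edge and the highest-degree node are always preserved, while setting both coefficients to zero returns the original graph unchanged.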
The step S3 specifically comprises the following steps:
step S31: given a network, performing representation learning on it means learning a mapping function $f(v): v \to z \in \mathbb{R}^d$, $v \in V$, with $d \ll |V|$, where $|V|$ denotes the number of nodes of the network G; that is, each node in the network is converted by the function f(v) into a d-dimensional representation vector z;
step S32: encoding the original graph G and the enhanced graph G' using an encoder f(·) consisting of a graph convolutional neural network GCN and a multi-layer perceptron MLP, yielding the node representation vectors z and z', respectively, for the subsequent flow.
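By way of illustration only, the encoder of step S32 may be written out as below. The text does not specify the propagation rule or the number of layers; the widely used GCN layer with self-loops and symmetric normalization is assumed here, and the symbols $\tilde{A}$, $\tilde{D}$, $H^{(l)}$, $W^{(l)}$, $L$ and $\sigma(\cdot)$ (an activation function, unrelated to the threshold σ of step S27) are introduced for this sketch.

```latex
\begin{align}
\tilde{A} &= A + I_n, \qquad \tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij} \\
H^{(l+1)} &= \sigma\!\left( \tilde{D}^{-1/2}\, \tilde{A}\, \tilde{D}^{-1/2}\, H^{(l)}\, W^{(l)} \right), \qquad H^{(0)} = X \\
z &= \mathrm{MLP}\!\left( H^{(L)} \right)
\end{align}
```

The vectors z' would be obtained by applying the same layers to the enhanced inputs A' and X' in place of A and X.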
The step S4 specifically includes:
step S41: calculating the topological distance between the target node u and the other nodes; suppose the shortest path from node $v_1$ to node $v_n$ is $p = (v_1, v_2, \ldots, v_n)$, a sequence of nodes in V; the distance between these two nodes is the total weight of the edges along p, which is the shortest distance between the two nodes; here f is an edge weight mapping function, and unit weights $f: E \to \{1\}$ are used; the shortest distance between nodes is taken as the topological distance between them;
step S42: calculating the attribute distance between nodes according to formula (7)
where $x_u$ and $x_v$ are the attribute vectors of node u and node v, respectively;
step S43: calculating the node relative distance according to formula (8), based on the node topological distance obtained in step S41 and the node attribute distance obtained in step S42;
where the normalizing terms are the minimum and maximum topological distances between nodes in the graph and the minimum and maximum attribute distances between nodes in the graph, respectively; λ is a parameter that adjusts the relative weight of the node topological distance and the attribute distance;
step S44: based on the node relative distances obtained in step S43, sorting the nodes in ascending order of relative distance to obtain a node sequence $R_u$;
Step S45: selecting the first several nodes in the node sequence $R_u$ as the positive sample set $P_u$ of the target node u, and taking the other nodes as the negative sample set $N_u$ of node u, as shown in formula (9) and formula (10).
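As a non-limiting illustration, steps S41 to S45 may be sketched in Python as follows. Formulas (7) to (10) are not reproduced in this text, so several choices here are assumptions: the attribute distance is taken to be Euclidean, the relative distance is taken to be a λ-weighted sum of the two min-max normalized distances, and normalization is done per target node rather than over the whole graph for brevity. The names `bfs_distances`, `min_max`, `select_samples`, and the parameter `k` (number of positive samples) are hypothetical.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source, i.e. shortest distances under unit edge weights (S41)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def min_max(values):
    """Min-max normalize a dict of distances to the range [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return {key: (val - lo) / span for key, val in values.items()}


def select_samples(adj, X, u, k, lam=0.5):
    """Return (positive set, negative set) of node u as lists ordered by relative distance."""
    topo = bfs_distances(adj, u)
    others = [v for v in adj if v != u]
    # Unreachable nodes get distance len(adj), larger than any real hop count (assumption).
    d_topo = {v: float(topo.get(v, len(adj))) for v in others}
    d_attr = {v: math.dist(X[u], X[v]) for v in others}  # ASSUMED Euclidean (S42)
    n_topo, n_attr = min_max(d_topo), min_max(d_attr)
    # ASSUMED form of formula (8): lambda-weighted sum of normalized distances (S43).
    rel = {v: lam * n_topo[v] + (1 - lam) * n_attr[v] for v in others}
    R_u = sorted(others, key=lambda v: (rel[v], v))  # ascending order (S44)
    return R_u[:k], R_u[k:]  # positives, negatives (S45)
```

With `lam=1.0` the ranking depends only on topology, and with `lam=0.0` only on attributes, which makes the effect of λ easy to inspect on a small graph.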
The step S5 specifically comprises the following steps:
step S51: based on the InfoNCE loss function commonly used in contrast learning, calculating a loss function of the node u according to a formula (11);
where $z_u$ is the representation vector of node u; $z_i$ and $z'_i$ are the representation vectors of node i in the original graph G and the enhanced graph G', respectively; $\mathrm{sim}(z_u, z_v)$ is the cosine similarity between the representation vectors of nodes u and v; and τ is a temperature parameter that scales the similarity between nodes;
step S52: calculating the total loss function of the model according to formula (12), as the average over all nodes of the sum of the loss of node u in the original graph G and the loss of the corresponding node u' in the enhanced graph G'.
Step S53: the parameters of the encoder are adjusted by back propagation, and the training process is repeated until the model converges.
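For illustration only, an InfoNCE-style loss over the positive and negative sample sets of step S4 may be sketched as below. Formulas (11) and (12) are not reproduced in this text, so the exact form used here (positives in the numerator, positives plus negatives in the denominator, both views scored, and a 1/(2n) average) is an assumption, and the names `cosine`, `node_loss`, and `total_loss` are hypothetical. Representation vectors are assumed to be non-zero.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def node_loss(u, z_anchor, z_other, positives, negatives, tau=0.5):
    """ASSUMED InfoNCE-style loss of node u, with the anchor taken from z_anchor."""
    def score(i):
        # Similarity of the anchor to node i in both views, scaled by the temperature.
        return (math.exp(cosine(z_anchor[u], z_anchor[i]) / tau)
                + math.exp(cosine(z_anchor[u], z_other[i]) / tau))

    pos = sum(score(i) for i in positives)
    neg = sum(score(j) for j in negatives)
    return -math.log(pos / (pos + neg))


def total_loss(z, z_aug, P, N, tau=0.5):
    """Average of the original-view and enhanced-view losses over all nodes."""
    n = len(z)
    total = 0.0
    for u in range(n):
        total += node_loss(u, z, z_aug, P[u], N[u], tau)  # node u in G
        total += node_loss(u, z_aug, z, P[u], N[u], tau)  # node u' in G'
    return total / (2 * n)
```

In an actual training loop this quantity would be computed with an automatic differentiation framework so that the GCN and MLP parameters can be updated by back propagation; the pure-Python version only shows how the sample sets enter the loss.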
The step S6 specifically includes:
step S61: randomly selecting k nodes as initial cluster centers;
step S62: calculating Euclidean distance from each node to the cluster center, and distributing each node to the cluster where the cluster center closest to the node is located;
step S63: updating the cluster center of each cluster to be the average value of all nodes of the cluster;
step S64: repeating step S62 and step S63 until the cluster centers no longer change or the maximum number of iterations is reached, and obtaining the community division result $\{C_i\}$.
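As a non-limiting illustration, steps S61 to S64 may be sketched in Python as follows. The function name `kmeans`, the parameters `max_iter` and `seed`, and the handling of an empty cluster (its previous center is kept) are assumptions not stated in the text.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster representation vectors into k communities; returns (labels, centers)."""
    rng = random.Random(seed)
    # S61: randomly select k nodes as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S62: assign each node to the cluster whose center is closest (Euclidean).
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # S63: update each center to the mean of the nodes assigned to it.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumption: keep an empty cluster's center
        # S64: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

The returned `labels` list gives the community index of each node, from which the division result $\{C_i\}$ can be read off by grouping nodes with equal labels.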
The step S7 specifically includes:
step S71: according to step S6, users with similar interests and behavior patterns are divided into the same community; based on the community discovery result, commodities that are liked by other users in the same community and that are highly similar or relevant are recommended to the user, so as to improve the accuracy of recommendation and the satisfaction of the user;
step S72: based on the community discovery result, identifying commodities that are particularly popular in a given community, and recommending these popular commodities to other users in that community, so as to increase user engagement and help commodity manufacturers increase sales;
step S73: when common points of interest or relevance exist between different communities, this relevance is used to recommend to the user commodities from other, strongly related communities, thereby helping the user discover new areas of interest through community-based recommendation and enriching the shopping experience.
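By way of illustration only, steps S71 to S73 may be sketched in Python as follows. The text does not specify how commodities are ranked or how relevance between communities is measured, so ranking by the number of community members who chose a commodity and the hypothetical `related` mapping (community id to a list of related community ids) are assumptions, as are the names `recommend`, `labels`, `purchases`, and `top_n`.

```python
from collections import Counter


def recommend(user, labels, purchases, related=None, top_n=3):
    """labels: user -> community id; purchases: user -> set of commodity ids."""
    community = labels[user]
    owned = purchases.get(user, set())
    related_ids = set((related or {}).get(community, []))

    in_community, cross_community = Counter(), Counter()
    for other, c in labels.items():
        if other == user:
            continue
        if c == community:
            # S71, S72: commodities chosen by other users of the same community.
            in_community.update(purchases.get(other, ()))
        elif c in related_ids:
            # S73: commodities chosen in strongly related communities.
            cross_community.update(purchases.get(other, ()))

    def ranked(counter):
        # Most popular first; ties broken alphabetically so results are deterministic.
        return [item for item, _ in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))]

    result = []
    for item in ranked(in_community) + ranked(cross_community):
        if item not in owned and item not in result:
            result.append(item)
    return result[:top_n]
```

In this sketch in-community commodities are always listed before cross-community ones; a deployed system could instead blend the two lists according to how strongly the communities are related.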
Claims (9)
1. A commodity recommendation system based on contrast learning and community perception, used for recommending commodities in combination with community discovery results, so as to help enterprises obtain the preference, interest, and demand information of users more effectively and accurately and to provide more personalized services and commodities, characterized in that the system performs the following steps:
firstly, designing an adaptive graph enhancement strategy which, when performing data enhancement on the original graph, considers the importance of nodes and edges and retains the edges and node attributes of high importance, while also requiring a large difference between the enhanced graph and the original graph to prevent the model from falling into a local optimum;
secondly, generating representation vectors of the original graph and the enhancement graph by adopting encoders based on the graph neural network and the multi-layer perceptron;
thirdly, designing a comparison selection strategy based on the relative distance of the nodes, selecting a plurality of nodes with the nearest relative distance to each node as positive samples of the nodes, and taking the rest nodes as negative samples of the nodes so as to ensure that the generated community structure has higher cohesiveness;
then, dividing the learned node representation vectors into communities using a clustering algorithm;
and finally, acquiring preference, interest and demand information of the user based on the obtained community division result, recommending commodities in and across communities, and providing more personalized service for the user.
2. The commodity recommendation system based on contrast learning and community perception according to claim 1, wherein: the system comprises an adaptive graph data enhancement module, a representation vector generation module, a comparison selection module, a loss function calculation module, a community generation module and a commodity recommendation module;
the adaptive graph data enhancement module is used for generating an enhanced graph G' that differs substantially from the original graph G; a data enhancement strategy that considers the importance of nodes is used to avoid destroying the community structure of the graph during data enhancement; this comprises topology-level data enhancement of the original graph, which removes unimportant edges in the social network, and attribute-level data enhancement, which masks the attributes of unimportant nodes in the social network; meanwhile, the enhanced graph is required to differ substantially from the original graph, so that the model is prevented from falling into a local optimum;
the representation vector generation module is used for encoding the original graph G and the enhanced graph G'; the original graph and the enhanced graph are encoded using an encoder consisting of a graph convolutional neural network GCN and a multi-layer perceptron MLP to obtain the node representation vectors z and z', respectively;
the comparison selection module is used for selecting positive samples and negative samples which are favorable for improving community discovery accuracy and enabling community boundaries to be clearer; taking a plurality of nodes which are closest to the target node in relative distance as positive samples, and taking the rest nodes as negative samples; the node relative distance consists of a topological distance and an attribute distance;
the loss function calculation module is used for calculating the contrast loss and optimizing the parameters of the GCN and the MLP through back propagation; the contrast loss is composed of the loss of node u in the original graph G and the loss of node u' in the enhanced graph G';
the community generation module is used for generating communities; the node representation vectors z learned by the encoder are clustered using the KMeans clustering algorithm to obtain a community division result $\{C_i\}$;
the commodity recommendation module is used for recommending personalized commodities to the user according to the obtained community division result; the module can recommend to the user commodities liked by other users in the same community, popular commodities in that community, and related commodities from other, strongly related communities.
3. A commodity recommendation method based on contrast learning and community perception, which adopts a commodity recommendation system based on contrast learning and community perception, characterized in that the method comprises the following steps:
step S1: constructing a social network G = {V, E, A, X} according to the social records of users, where $V = \{v_1, v_2, \ldots, v_n\}$ is the node set of the social network, E is the edge set of the social network, and $e_{ij} = (v_i, v_j) \in E$ indicates that there is an edge between node $v_i$ and node $v_j$; the matrix A is the adjacency matrix of the network, with $A_{ij} = 1$ when $e_{ij} \in E$ and $A_{ij} = 0$ otherwise; the matrix X is the attribute matrix of the nodes in the social network, m is the dimension of the node attributes, and $X_{ij}$ is the value of the j-th attribute of node i;
step S2: respectively carrying out data enhancement of a topology level and data enhancement of an attribute level on an input graph G, and simultaneously requiring large difference between the enhanced graph and an original graph to finally obtain an enhanced graph G';
step S3: generating the node representation vectors z and z' of the original graph G and the enhanced graph G', respectively, using a pair of encoders each consisting of a graph convolutional neural network GCN and a multi-layer perceptron MLP;
step S4: calculating the relative distance between the nodes of the original graph G, selecting a plurality of nodes with the nearest relative distance for each node as positive sample sets according to the calculation result, and taking the rest nodes as negative sample sets;
step S5: calculating the contrast loss from the selected positive and negative sample sets, and optimizing the parameters of the GCN and the MLP through back propagation;
step S6: clustering the node representation vectors z learned by the encoder using the KMeans clustering algorithm, and taking the resulting clusters as communities to generate a community division result $\{C_i\}$;
Step S7: providing commodity recommendation services based on the community discovery results, including recommending commodities that interest other users in the same community and popular commodities in other, strongly related communities, so that commodity production enterprises better understand the demands of target users and more accurate commodity recommendation is achieved.
4. A commodity recommendation method based on contrast learning and community perception according to claim 3, wherein: the step S2 specifically comprises the following steps:
step S21: performing topology-level data enhancement on the graph G; edges connecting nodes of higher importance are sampled from the original edge set E according to the following formula (1) to form the edge set E' of the enhanced graph;
where the probability of sampling the edge (u, v) is determined by the importance of the edge and is calculated according to formulas (2) and (3);
where the importance of node v is its degree centrality, which equals the degree of the node; the importance of an edge (u, v) equals the average of the importance of the two nodes it connects; the normalizing term is the maximum importance over all edges in the graph; and $\eta_1$ is a manually specified coefficient that controls the edge removal probability;
step S22: after obtaining the edge subset E' sampled in step S21, converting it into an adjacency matrix A' for the subsequent flow;
step S23: performing attribute-level data enhancement on the graph G; first, for each node u in the graph G, a value is sampled from a Bernoulli distribution according to the following formula (4), and all sampled values form an n-dimensional vector $b \in \{0,1\}^n$, where n is the number of nodes of the graph G;
where the Bernoulli parameter is the probability of preserving the attributes of node u, so that nodes of higher importance are more likely to keep their attributes; the formula uses the importance of node u and the maximum importance over all nodes in the graph; and $\eta_2$ is a manually specified coefficient that controls the masking probability of the node attributes;
step S24: based on the vector b and the original feature matrix X obtained in the step S23, calculating an enhanced attribute matrix X' according to a formula (5);
$X' = \mathrm{diag}(b) \cdot X$    formula (5)
where diag(·) expands a vector into a diagonal matrix.
Step S25: obtaining the enhanced graph G', which can be represented by the enhanced adjacency matrix A' and the enhanced attribute matrix X', i.e., G' = (A', X');
step S26: to prevent the model from falling into a local optimum because the enhanced graph is too similar to the original graph, which would reduce the accuracy of the community discovery task, the enhanced graph G' is required to differ substantially from the original graph G; the sum of the Euclidean distance between the adjacency matrices of G' and G and the Euclidean distance between their attribute matrices is calculated according to the following formula (6);
wherein n is the number of nodes, and m is the dimension of the node attribute;
step S27: repeating the above graph data enhancement steps until the sum g of the Euclidean distances of the adjacency matrices and the attribute matrices is greater than a given threshold σ, or the maximum number of enhancements $I_{max}$ is reached.
5. A commodity recommendation method based on contrast learning and community perception according to claim 3, wherein: the step S3 specifically comprises the following steps:
step S31: given a network, performing representation learning on it means learning a mapping function $f(v): v \to z \in \mathbb{R}^d$, $v \in V$, with $d \ll |V|$, where $|V|$ denotes the number of nodes of the network G; that is, each node in the network is converted by the function f(v) into a d-dimensional representation vector z;
step S32: encoding the original graph G and the enhanced graph G' using an encoder f(·) consisting of a graph convolutional neural network GCN and a multi-layer perceptron MLP, yielding the node representation vectors z and z', respectively, for the subsequent flow.
6. A commodity recommendation method based on contrast learning and community perception according to claim 3, wherein: the step S4 specifically includes:
step S41: calculating the topological distance between the target node u and the other nodes; suppose the shortest path from node $v_1$ to node $v_n$ is $p = (v_1, v_2, \ldots, v_n)$, a sequence of nodes in V; the distance between these two nodes is the total weight of the edges along p, which is the shortest distance between the two nodes; here f is an edge weight mapping function, and unit weights $f: E \to \{1\}$ are used; the shortest distance between nodes is taken as the topological distance between them;
step S42: calculating the attribute distance between nodes according to formula (7)
where $x_u$ and $x_v$ are the attribute vectors of node u and node v, respectively;
step S43: calculating the node relative distance according to formula (8), based on the node topological distance obtained in step S41 and the node attribute distance obtained in step S42;
where the normalizing terms are the minimum and maximum topological distances between nodes in the graph and the minimum and maximum attribute distances between nodes in the graph, respectively; λ is a parameter that adjusts the relative weight of the node topological distance and the attribute distance;
step S44: based on the node relative distances obtained in step S43, sorting the nodes in ascending order of relative distance to obtain a node sequence $R_u$;
Step S45: selecting the first several nodes in the node sequence $R_u$ as the positive sample set $P_u$ of the target node u, and taking the other nodes as the negative sample set $N_u$ of node u, as shown in formula (9) and formula (10).
7. A commodity recommendation method based on contrast learning and community perception according to claim 3, wherein: the step S5 specifically comprises the following steps:
step S51: based on the InfoNCE loss function commonly used in contrast learning, calculating a loss function of the node u according to a formula (11);
where $z_u$ is the representation vector of node u; $z_i$ and $z'_i$ are the representation vectors of node i in the original graph G and the enhanced graph G', respectively; $\mathrm{sim}(z_u, z_v)$ is the cosine similarity between the representation vectors of nodes u and v; and τ is a temperature parameter that scales the similarity between nodes;
step S52: calculating the total loss function of the model according to formula (12), as the average over all nodes of the sum of the loss of node u in the original graph G and the loss of the corresponding node u' in the enhanced graph G'.
Step S53: the parameters of the encoder are adjusted by back propagation, and the training process is repeated until the model converges.
8. A commodity recommendation method based on contrast learning and community perception according to claim 3, wherein: the step S6 specifically includes:
step S61: randomly selecting k nodes as initial cluster centers;
step S62: calculating Euclidean distance from each node to the cluster center, and distributing each node to the cluster where the cluster center closest to the node is located;
step S63: updating the cluster center of each cluster to be the average value of all nodes of the cluster;
step S64: repeating step S62 and step S63 until the cluster centers no longer change or the maximum number of iterations is reached, and obtaining the community division result $\{C_i\}$.
9. A commodity recommendation method based on contrast learning and community perception according to claim 3, wherein: the step S7 specifically includes:
step S71: according to step S6, users with similar interests and behavior patterns are divided into the same community; based on the community discovery result, commodities that are liked by other users in the same community and that are highly similar or relevant are recommended to the user, so as to improve the accuracy of recommendation and the satisfaction of the user;
step S72: based on the community discovery result, identifying commodities that are particularly popular in a given community, and recommending these popular commodities to other users in that community, so as to increase user engagement and help commodity manufacturers increase sales;
step S73: when common points of interest or relevance exist between different communities, this relevance is used to recommend to the user commodities from other, strongly related communities, thereby helping the user discover new areas of interest through community-based recommendation and enriching the shopping experience.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311681385.8A CN117495511A (en) | 2023-12-08 | 2023-12-08 | Commodity recommendation system and method based on contrast learning and community perception |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311681385.8A CN117495511A (en) | 2023-12-08 | 2023-12-08 | Commodity recommendation system and method based on contrast learning and community perception |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117495511A true CN117495511A (en) | 2024-02-02 |
Family
ID=89667436
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311681385.8A Pending CN117495511A (en) | 2023-12-08 | 2023-12-08 | Commodity recommendation system and method based on contrast learning and community perception |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117495511A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117808616A (en) * | 2024-02-28 | 2024-04-02 | 中国传媒大学 | Community discovery method and system based on graph embedding and node affinity |
-
2023
- 2023-12-08 CN CN202311681385.8A patent/CN117495511A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111428147B (en) | Social recommendation method of heterogeneous graph volume network combining social and interest information | |
CN109785062B (en) | Hybrid neural network recommendation system based on collaborative filtering model | |
CN110955834B (en) | Knowledge graph driven personalized accurate recommendation method | |
CN109582864B (en) | Course recommendation method and system based on big data science and dynamic weight adjustment | |
CN110263236B (en) | Social network user multi-label classification method based on dynamic multi-view learning model | |
CN113807422B (en) | Weighted graph convolutional neural network scoring prediction model integrating multi-feature information | |
WO2004017178A9 (en) | Statistical personalized recommendation system | |
CN103106279A (en) | Clustering method simultaneously based on node attribute and structural relationship similarity | |
CN111428127B (en) | Personalized event recommendation method and system integrating theme matching and bidirectional preference | |
CN117495511A (en) | Commodity recommendation system and method based on contrast learning and community perception | |
CN113609398A (en) | Social recommendation method based on heterogeneous graph neural network | |
CN115408621B (en) | Interest point recommendation method considering auxiliary information characteristic linear and nonlinear interaction | |
Vellaichamy et al. | Hybrid Collaborative Movie Recommender System Using Clustering and Bat Optimization. | |
CN112149000B (en) | Online social network user community discovery method based on network embedding | |
CN112100514B (en) | Friend recommendation method based on global attention mechanism representation learning | |
CN110321492A (en) | A kind of item recommendation method and system based on community information | |
CN116340646A (en) | Recommendation method for optimizing multi-element user representation based on hypergraph motif | |
CN113536097A (en) | Recommendation method and device based on automatic feature grouping | |
CN114064627A (en) | Knowledge graph link completion method and system for multiple relations | |
CN115270004A (en) | Education resource recommendation method based on field factor decomposition | |
CN112084418B (en) | Microblog user community discovery method based on neighbor information and attribute network characterization learning | |
CN117056763A (en) | Community discovery method based on variogram embedding | |
CN117078312A (en) | Advertisement putting management method and system based on artificial intelligence | |
CN117035059A (en) | Efficient privacy protection recommendation system and method for communication | |
CN116541592A (en) | Vector generation method, information recommendation method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||