CN116703529A - Contrast learning recommendation method based on feature space semantic enhancement

Info

Publication number: CN116703529A
Application number: CN202310959903.1A
Authority: CN
Inventors: 程志勇; 赵帅; 张宇; 刘帆; 卓涛; 高赞
Original assignee: Qilu University of Technology; Shandong Institute of Artificial Intelligence
Current assignee: Qilu University of Technology; Shandong Institute of Artificial Intelligence
Priority date: 2023-08-02
Filing date: 2023-08-02
Publication date: 2023-09-05
Anticipated expiration: 2043-08-02
Also published as: CN116703529B

Abstract

The invention relates to the technical field of recommendation systems and deep learning, and in particular to a contrast learning recommendation method based on feature space semantic enhancement. The method comprises the following steps: first, a data set of items purchased by users is preprocessed and an interaction graph is constructed, and noise and interference information in the interaction graph is removed to improve the accuracy of the model; next, spectral feature perturbation is applied to the learned feature matrix to generate two enhanced sample matrices, and a multi-task learning strategy is used to unify the original feature matrix with the enhanced feature samples, improving the robustness of the node embeddings; finally, the score between the user and item embeddings is calculated with an inner product, and items are recommended to the user by ranking the scores. Because the method perturbs the spectral features of the feature matrix, noise and redundant information in the original user and item embedded representations can be effectively filtered, and the generalization capability and accuracy of the recommendation system model are improved.

Description

Contrast learning recommendation method based on feature space semantic enhancement

Technical Field

The invention relates to the technical field of recommendation systems and deep learning, in particular to a contrast learning recommendation method based on feature space semantic enhancement.

Background

Recommendation systems play an important role in fields such as e-commerce and social networks. Collaborative filtering (CF) is one of the most common techniques in recommendation systems, and learning user preferences from implicit feedback with CF is an important topic in both academia and industry. In recent years, graph neural networks (GNN) have provided a new technical approach for developing CF methods, and GNN-based CF models have substantially improved recommendation performance. More recently, contrastive learning methods based on structure enhancement and feature enhancement have achieved significant performance improvements in recommendation systems. Structure enhancement methods apply different enhancement operations to the user-item interaction graph, such as random edge perturbation. However, structure enhancement is susceptible to data sparsity, and the dominant approach to feature enhancement is to inject Gaussian-distributed noise into the node embeddings in order to learn more robust feature representations.

Although these methods are successful, problems such as noise in the contrastive learning task may adversely affect downstream tasks. We therefore propose a simple and efficient data enhancement method aimed at feature enhancement, which complements existing enhancement strategies. The method perturbs the spectral features of the feature map, because enhancing the spectral features does not change the orthogonal basis of the feature map, so semantic relevance is preserved. At the same time, the method obtains an approximate representation of the matrix features through dimension reduction, which helps to reduce storage space and computational complexity and can effectively filter noise and redundant information in the original user and item embedded representations. The approach is compatible with other enhancement strategies and contrastive losses, and can improve the generalization ability and accuracy of the recommendation system model.

Therefore, to solve the above problems, a contrast learning recommendation method based on feature space semantic enhancement is proposed.

Disclosure of Invention

In view of the problems in the prior art, the invention provides a contrast learning recommendation method based on feature space semantic enhancement, which perturbs the spectral features of the feature matrix, can effectively filter noise and redundant information in the original user and item embedded representations, and improves the generalization capability and accuracy of the recommendation system model.

In order to achieve the above purpose, the present invention provides the following technical solutions:

The invention provides a contrast learning recommendation method based on feature space semantic enhancement, which comprises the following steps:

(a) Dividing an e-commerce data set into a training set, a verification set and a test set through preprocessing, wherein the training set comprises user information and item information; constructing a user-item purchase interaction bipartite graph from the training set; and initializing the embedding parameters of the user nodes and item nodes to generate node embedding information for the users and items;

(b) Calculating node similarity from the node embedding information of the users and items, setting a threshold value, and removing noisy edges of the interaction graph whose similarity scores fall below the threshold;

(c) Learning the node embeddings by establishing a contrast learning recommendation method model based on feature space semantic enhancement to obtain the final representations of the users and items, and perturbing the spectral features of the user and item feature matrix according to the learned final representations to generate enhanced sample matrices for the users and items;

(d) Iterating multiple times with the BPR loss function used in recommendation systems and the contrast-based InfoNCE loss function to obtain a trained contrast learning recommendation method model based on feature space semantic enhancement;

(e) Calculating the predicted score of a user for an item through a formula, the score between the user and the item being calculated with an inner product, and recommending items to the user in order of score.

Further, in step (a), for the e-commerce dataset, the items interacted with by users are randomly divided in the ratio 8:1:1 to generate a training set, a verification set and a test set. From the training set, an interaction matrix $R \in \{0,1\}^{M \times N}$ is built between a set of users $\mathcal{U}$ and a set of items $\mathcal{I}$, where $M$ and $N$ are the numbers of users and items respectively, $R_{ui} = 1$ indicates that user $u$ and item $i$ have interacted, and $R_{ui} = 0$ indicates that user $u$ and item $i$ have not interacted. The embedding parameters of the user nodes and item nodes are each initialized with Xavier initialization, with an embedding dimension of 64, generating the embedding information $e_u^{(0)}$ of the user nodes and the embedding information $e_i^{(0)}$ of the item nodes. Xavier initialization is an initialization method that addresses the problem of random parameter initialization.
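
Step (a) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names (`split_interactions`, `build_interaction_matrix`, `xavier_uniform`) and the toy data are hypothetical, the split is taken globally over all (user, item) pairs, and the uniform variant of Xavier initialization is assumed because the text does not say which variant is used.

```python
import numpy as np

def split_interactions(pairs, rng, ratios=(0.8, 0.1, 0.1)):
    """Randomly split (user, item) pairs into train/validation/test sets."""
    pairs = np.asarray(pairs)
    order = rng.permutation(len(pairs))
    n_train = int(ratios[0] * len(pairs))
    n_valid = int(ratios[1] * len(pairs))
    train = pairs[order[:n_train]]
    valid = pairs[order[n_train:n_train + n_valid]]
    test = pairs[order[n_train + n_valid:]]  # remainder goes to the test set
    return train, valid, test

def build_interaction_matrix(train_pairs, num_users, num_items):
    """Binary matrix R with R[u, i] = 1 when user u interacted with item i."""
    R = np.zeros((num_users, num_items), dtype=np.float64)
    R[train_pairs[:, 0], train_pairs[:, 1]] = 1.0
    return R

def xavier_uniform(rows, cols, rng):
    """Xavier (Glorot) uniform initialization: U(-a, a), a = sqrt(6 / (rows + cols))."""
    limit = np.sqrt(6.0 / (rows + cols))
    return rng.uniform(-limit, limit, size=(rows, cols))

rng = np.random.default_rng(0)
M, N, d = 5, 7, 64  # toy numbers of users and items; embedding dimension 64 as in the text
toy_pairs = [(u, i) for u in range(M) for i in range(N) if (u + i) % 2 == 0]
train, valid, test = split_interactions(toy_pairs, rng)
R = build_interaction_matrix(train, M, N)
E_user = xavier_uniform(M, d, rng)  # initial user embeddings
E_item = xavier_uniform(N, d, rng)  # initial item embeddings
```

Only the training pairs enter `R`, so the held-out interactions stay invisible to the graph that the later steps operate on.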

Further, in step (b), the similarity value of a user node and an item node is calculated by the formula $s_{ui} = \sigma\left(\langle W_1 e_u,\; W_2 e_i \rangle\right)$,

where $e_u$ denotes the embedding information of a user that has an interaction in the original interaction matrix, $e_i$ denotes the embedding information of the item corresponding to that interacting user in the original interaction matrix, $\sigma$ is the Sigmoid activation function, $W_1$ and $W_2$ are two trainable parameter matrices that map the node features, $M$ and $N$ denote the numbers of users and items respectively, $d$ denotes the embedding dimension, $\langle \cdot, \cdot \rangle$ denotes the vector inner product operation, and $s_{ui}$ denotes the similarity value of user $u$ and item $i$;

low-reliability interactions in the graph-structured interaction information are removed by the formula $\hat{R}_{ui} = \mathbb{1}(s_{ui} \geq \gamma) \cdot s_{ui}$,

where $\mathbb{1}(\cdot)$ is a binary indicator function that judges true or false, $s_{ui}$ denotes the similarity value of the user and the item, $\gamma$ is the threshold against which the user-item similarity is compared, and $\hat{R}_{ui}$ denotes the interaction edge information of the finally generated noise-reduced graph $\hat{\mathcal{G}}$.
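
The denoising of step (b) can be illustrated as follows. This is a sketch under assumptions: the exact form of the similarity formula is not recoverable from the text, so the code assumes that each embedding is mapped by a square trainable matrix (`W1`, `W2`, both hypothetical d x d arrays) before the inner product and the Sigmoid. Following the embodiment, edges below the threshold are set to 0 and surviving edges take their similarity value as weight.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def denoise_edges(R, E_user, E_item, W1, W2, threshold):
    """Score every user-item pair, then keep only observed edges scoring >= threshold."""
    # Assumed similarity: sigmoid of the inner product of the mapped embeddings.
    S = sigmoid((E_user @ W1) @ (E_item @ W2).T)  # shape (M, N), values in (0, 1)
    keep = (R > 0) & (S >= threshold)             # observed and reliable edges
    return np.where(keep, S, 0.0), S              # kept edges carry their similarity

rng = np.random.default_rng(0)
M, N, d = 4, 5, 8
R = (rng.random((M, N)) < 0.5).astype(float)      # toy interaction matrix
E_user = rng.normal(size=(M, d))
E_item = rng.normal(size=(N, d))
W1 = 0.1 * rng.normal(size=(d, d))                # stand-ins for the trainable matrices
W2 = 0.1 * rng.normal(size=(d, d))
R_denoised, S = denoise_edges(R, E_user, E_item, W1, W2, threshold=0.5)
```

Because the Sigmoid maps into (0, 1), a single threshold is comparable across all user-item pairs.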

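Further, in step (c), the graph convolution results $e_u^{(l+1)}$ and $e_i^{(l+1)}$ of the user node and the item node at layer $l+1$ are obtained by the formulas $e_u^{(l+1)} = \sum_{i \in \mathcal{N}_u} \frac{1}{\sqrt{|\mathcal{N}_u|}\sqrt{|\mathcal{N}_i|}}\, e_i^{(l)}$ and $e_i^{(l+1)} = \sum_{u \in \mathcal{N}_i} \frac{1}{\sqrt{|\mathcal{N}_i|}\sqrt{|\mathcal{N}_u|}}\, e_u^{(l)}$,

where $l$ denotes the $l$-th convolution layer, $L$ denotes the number of layers over which the graph is convolved, $\mathcal{N}_u$ is the set of items that interact with user $u$, and $\mathcal{N}_i$ is the set of users that interact with item $i$;

the final embedding $e_u$ of the user and the final embedding $e_i$ of the item are obtained respectively by the formulas $e_u = \sum_{l=0}^{L} \alpha_l e_u^{(l)}$ and $e_i = \sum_{l=0}^{L} \alpha_l e_i^{(l)}$,

where $e_u^{(l)}$ denotes the embedding information learned by the user node at the current layer, $e_i^{(l)}$ denotes the embedding information learned by the item node at the current layer, and $\alpha_l$ denotes the importance of the embedding information of the current layer;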
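
The layer-wise propagation and the weighted combination of layers can be sketched as below. The symmetric degree normalization and the uniform default layer weights are assumptions borrowed from common graph collaborative filtering practice; the text only states that neighbours are aggregated over a number of layers and that each layer has an importance weight. The function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Scale each edge (u, i) by 1 / (sqrt(deg(u)) * sqrt(deg(i)))."""
    deg_u = A.sum(axis=1)
    deg_i = A.sum(axis=0)
    # Nodes without edges get a zero factor instead of a division by zero.
    inv_u = np.where(deg_u > 0, 1.0 / np.sqrt(np.maximum(deg_u, 1e-12)), 0.0)
    inv_i = np.where(deg_i > 0, 1.0 / np.sqrt(np.maximum(deg_i, 1e-12)), 0.0)
    return inv_u[:, None] * A * inv_i[None, :]

def propagate(A, E_user, E_item, num_layers, layer_weights=None):
    """Aggregate neighbours for num_layers layers, then sum the weighted layers."""
    A_norm = normalize_adjacency(A)
    if layer_weights is None:
        # Assumption: every layer (including layer 0) is equally important.
        layer_weights = np.full(num_layers + 1, 1.0 / (num_layers + 1))
    user_layers, item_layers = [E_user], [E_item]
    for _ in range(num_layers):
        next_user = A_norm @ item_layers[-1]    # users aggregate their items
        next_item = A_norm.T @ user_layers[-1]  # items aggregate their users
        user_layers.append(next_user)
        item_layers.append(next_item)
    final_user = sum(w * e for w, e in zip(layer_weights, user_layers))
    final_item = sum(w * e for w, e in zip(layer_weights, item_layers))
    return final_user, final_item

A = np.array([[1.0, 1.0], [0.0, 1.0]])       # toy 2-user, 2-item graph
E_user = np.array([[1.0, 0.0], [0.0, 1.0]])
E_item = np.array([[1.0, 1.0], [2.0, 0.0]])
final_user, final_item = propagate(A, E_user, E_item, num_layers=1)
```

The same functions accept a weighted adjacency matrix, such as a denoised graph whose edges carry similarity values, in which case the degrees become weighted degrees.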

the original feature matrix is decomposed by the formula $U \Sigma V^{\top} = \mathrm{SVD}(E)$ to obtain the spectral features of the original matrix $E$,

where $E$ denotes the learned user and item embedding information, $U$ and $V$ denote the left singular vector matrix and the right singular vector matrix of the original feature matrix $E$,

$\Sigma$ is a diagonal matrix whose elements satisfy $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$, the elements on the diagonal being the singular values of the original matrix $E$, $\top$ denotes the transpose, and $\mathrm{SVD}(\cdot)$ denotes singular value decomposition of the matrix;

an approximate decomposition is obtained by the formula $\hat{U} \hat{\Sigma} \hat{V}^{\top} = \mathrm{ApproxSVD}(E, k)$,

where $\hat{U}$ and $\hat{V}$ denote the dimension-reduced decomposition matrices, $k$ denotes the dimension to which the original dimension is reduced, $\hat{\Sigma}$ is the dimension-reduced diagonal matrix whose elements satisfy $\hat{\sigma}_1 \geq \hat{\sigma}_2 \geq \cdots \geq \hat{\sigma}_k$, and $\mathrm{ApproxSVD}(\cdot)$ denotes approximate decomposition of the matrix;

interference is added to the original spectral features to obtain two perturbed diagonal matrices $\Sigma'$ and $\Sigma''$, the diagonal elements of each of which form a decreasing sequence;

the perturbed node embedded representations $E'$ and $E''$ of the users and items are then obtained respectively by the formulas $E' = \hat{U} \Sigma' \hat{V}^{\top}$ and $E'' = \hat{U} \Sigma'' \hat{V}^{\top}$, where $\Sigma'$ and $\Sigma''$ respectively denote the spectral features after interference.
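
The spectral augmentation can be illustrated with NumPy's SVD. This is a sketch under assumptions: the precise perturbation rule is not recoverable from the text, so the code adds a non-negative, decreasing random offset to the top-k singular values (which keeps the perturbed spectrum decreasing), and it truncates an exact SVD rather than calling a randomized approximate routine. The function name and the `scale` parameter are hypothetical.

```python
import numpy as np

def spectral_views(E, k, rng, scale=0.1):
    """Build two perturbed rank-k views of E that share the same singular vectors."""
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]  # rank-k truncation
    views, spectra = [], []
    for _ in range(2):
        # Assumed perturbation: decreasing offsets, so s_k + delta stays decreasing.
        delta = np.sort(rng.uniform(0.0, scale * s_k[0], size=k))[::-1]
        s_pert = s_k + delta
        views.append((U_k * s_pert) @ Vt_k)      # same as U_k @ diag(s_pert) @ Vt_k
        spectra.append(s_pert)
    return views, spectra, U_k

rng = np.random.default_rng(0)
E = rng.normal(size=(12, 8))                     # stacked user and item embeddings (toy)
(view_a, view_b), (spec_a, spec_b), U_k = spectral_views(E, k=4, rng=rng)
```

Both views reuse `U_k` and `Vt_k`, so they differ from each other only in how strongly each shared direction is weighted.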

Further, in step (d), the BPR loss is calculated by the formula $\mathcal{L}_{BPR} = -\sum_{u \in \mathcal{U}} \sum_{i \in \mathcal{N}_u} \sum_{j \notin \mathcal{N}_u} \ln \sigma\left(e_u^{\top} e_i - e_u^{\top} e_j\right)$,

where $\mathcal{N}_u$ is the set of items that interact with user $u$, $i$ denotes a positive sample item that user $u$ has interacted with, $j$ denotes a negative sample item that user $u$ has not interacted with, $e_u^{\top}$ denotes the transpose of the final embedding of the user, and $\sigma$ denotes the Sigmoid activation function; the BPR loss function is a function used in recommendation systems to learn the personalized preferences of users;
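
A small NumPy sketch of the BPR loss may help make the roles of the positive and negative items concrete. It assumes that negatives have already been sampled into (user, positive item, negative item) triples; the function name and toy embeddings are hypothetical.

```python
import numpy as np

def bpr_loss(E_user, E_item, triples):
    """Sum of -ln sigmoid(score(u, i) - score(u, j)) over (u, i, j) triples."""
    u, i, j = triples[:, 0], triples[:, 1], triples[:, 2]
    pos = np.sum(E_user[u] * E_item[i], axis=1)  # inner product with positive item
    neg = np.sum(E_user[u] * E_item[j], axis=1)  # inner product with negative item
    # -ln sigmoid(x) = ln(1 + exp(-x)), evaluated stably with logaddexp.
    return np.sum(np.logaddexp(0.0, -(pos - neg)))

E_user = np.array([[1.0, 0.0]])
E_item = np.array([[2.0, 0.0], [0.5, 0.0], [2.0, 0.0]])
well_ranked = bpr_loss(E_user, E_item, np.array([[0, 0, 1]]))  # positive scores higher
mis_ranked = bpr_loss(E_user, E_item, np.array([[0, 1, 0]]))   # positive scores lower
tied = bpr_loss(E_user, E_item, np.array([[0, 0, 2]]))         # equal scores
```

The loss falls as the positive item's score rises above the negative item's score, which is what pushes interacted items up the ranking.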

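the contrastive loss function is calculated by the formula $\mathcal{L}_{CL} = \sum_{u \in \mathcal{U}} -\log \frac{\exp\left(s(e_u', e_u'')/\tau\right)}{\sum_{v \in \mathcal{U}} \exp\left(s(e_u', e_v'')/\tau\right)}$,

where the enhanced nodes of the same user in the two embedding matrices that share the same essence, $(e_u', e_u'')$, are regarded as the anchor and the positive sample, and the enhanced embeddings of other users in the other embedding matrix are regarded as negative samples $(e_u', e_v'')$ with $v \neq u$; $\tau$ is a temperature hyperparameter and $s(\cdot)$ is the cosine similarity function. Optimization is carried out by the formula $\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{CL} + \lambda_2 \lVert \Theta \rVert_2^2$, where $\mathcal{L}_{BPR}$ denotes the BPR loss function, $\mathcal{L}_{CL}$ denotes the InfoNCE contrastive loss function, which is a contrast-based function, $\lambda_2 \lVert \Theta \rVert_2^2$ is the regularization term, $\Theta$ is the model parameter set, $\lVert \cdot \rVert_2$ is the $L_2$ norm, $\lambda_1$ is the weight of the contrastive loss function $\mathcal{L}_{CL}$, and $\lambda_2$ is the hyperparameter controlling the regularization strength. The parameters of the contrast learning recommendation method model based on feature space semantic enhancement are optimized by Adam using the BPR loss and the InfoNCE contrastive loss, Adam being an algorithm for optimizing stochastic objective functions based on first-order gradients.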
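
The contrastive term and the joint objective can be sketched as follows. The code is illustrative only: it assumes the denominator of the InfoNCE term sums over all rows of the second view (including the positive), that no embedding row is all zeros, and that the regularizer is the squared L2 norm of the parameters. The names `info_nce`, `total_loss`, `lambda_1` and `lambda_2` are hypothetical.

```python
import numpy as np

def info_nce(view_a, view_b, tau=0.2):
    """InfoNCE over two views: row u of view_a is the anchor, row u of view_b its positive."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = (a @ b.T) / tau                              # cosine similarity / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # stabilise the softmax
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.sum(np.diag(log_prob))                     # positives sit on the diagonal

def total_loss(l_bpr, l_cl, params, lambda_1, lambda_2):
    """Joint objective: BPR + lambda_1 * contrastive + lambda_2 * squared L2 norm."""
    reg = sum(np.sum(p ** 2) for p in params)
    return l_bpr + lambda_1 * l_cl + lambda_2 * reg

eye = np.eye(3)
aligned = info_nce(eye, eye, tau=1.0)                 # each anchor matches its positive
shuffled = info_nce(eye, eye[[1, 2, 0]], tau=1.0)     # positives moved to wrong rows
objective = total_loss(1.0, 2.0, [np.ones(2)], lambda_1=0.5, lambda_2=0.1)
```

A lower temperature `tau` sharpens the softmax, so hard negatives contribute more to the gradient.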

Further, in step (e), the predicted score $\hat{y}_{ui}$ of the user for the item is calculated by the formula $\hat{y}_{ui} = e_u^{\top} e_i$, where $\top$ denotes the transpose and $e_u$ and $e_i$ denote the final embeddings of the user and the item respectively; the score between the user and the item is calculated with the inner product, and items are recommended to the user in order of score.
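
Scoring and ranking in step (e) reduce to one matrix product and a sort. The sketch below also masks items the user already interacted with in training; that masking is a common evaluation convention assumed here, not something the text states, and `recommend` is a hypothetical helper name.

```python
import numpy as np

def recommend(E_user, E_item, R_train, top_k):
    """Rank items for every user by inner-product score, skipping training items."""
    scores = E_user @ E_item.T                       # predicted score for every pair
    scores = np.where(R_train > 0, -np.inf, scores)  # assumption: hide seen items
    ranked = np.argsort(-scores, axis=1)[:, :top_k]  # highest score first
    return ranked, scores

E_user = np.array([[1.0, 0.0]])
E_item = np.array([[0.9, 0.0], [0.5, 0.0], [0.1, 0.0]])
R_train = np.array([[1.0, 0.0, 0.0]])                # item 0 was already purchased
ranked, scores = recommend(E_user, E_item, R_train, top_k=2)
```

Here item 0 has the highest raw score but is excluded, so the recommendation list starts with item 1.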

The invention has the beneficial effects that:

The invention provides a contrast learning recommendation method based on feature space semantic enhancement that perturbs the spectral features of the feature map. Because enhancing the spectral features does not change the orthogonal basis of the feature map, semantic relevance is maintained; at the same time, an approximate representation of the matrix features is obtained through dimension reduction, so noise and redundant information in the original user and item embedded representations can be effectively filtered, and the generalization capability and accuracy of the recommendation system model can be improved.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and constitute a part of this specification; together with the embodiments of the invention, they serve to explain the invention.

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a model diagram of a contrast learning recommendation method based on feature space semantic enhancement.

Detailed Description

The following detailed description of the embodiments of the invention is provided for the purpose of enabling those of ordinary skill in the art to make and use the invention without undue burden.

Embodiment 1: a contrast learning recommendation method based on feature space semantic enhancement includes the following steps:

According to embodiment 1, step (a) comprises the following: for the e-commerce dataset, the data are randomly divided in the ratio 8:1:1 to generate a training set, a verification set and a test set. From the training set, an interaction matrix $R \in \{0,1\}^{M \times N}$ is built between a set of users $\mathcal{U}$ and a set of items $\mathcal{I}$, where $M$ and $N$ are the numbers of users and items respectively, $R_{ui} = 1$ indicates that user $u$ and item $i$ have interacted, and $R_{ui} = 0$ indicates that user $u$ and item $i$ have not interacted. The embedding parameters of the user nodes and item nodes are each initialized with Xavier initialization, with an embedding dimension of 64, generating the embedding information $e_u^{(0)}$ of the user nodes and the embedding information $e_i^{(0)}$ of the item nodes. Xavier initialization is an initialization method that addresses the problem of random parameter initialization; it is an effective neural network initialization method and is frequently used in deep networks.

According to embodiment 1, step (b) comprises the following: the similarity value of a user node and an item node is calculated by the formula $s_{ui} = \sigma\left(\langle W_1 e_u,\; W_2 e_i \rangle\right)$,

where $e_u$ denotes the embedding information of a user that has an interaction in the original interaction matrix, $e_i$ denotes the embedding information of the item corresponding to that interacting user in the original interaction matrix, $\sigma$ is the Sigmoid activation function, $W_1$ and $W_2$ are two trainable parameter matrices that map the node features, $M$ and $N$ denote the numbers of users and items respectively, $d$ denotes the embedding dimension, $\langle \cdot, \cdot \rangle$ denotes the vector inner product operation, and $s_{ui}$ denotes the similarity value of user $u$ and item $i$;

low-reliability interactions are removed by the formula $\hat{R}_{ui} = \mathbb{1}(s_{ui} \geq \gamma) \cdot s_{ui}$, where $\mathbb{1}(\cdot)$ is a binary indicator function that judges true or false, $s_{ui}$ denotes the similarity value of the user and the item, and $\gamma$ is the threshold against which the user-item similarity is compared. When the similarity value of a user and an item is lower than the set threshold, the value of the interaction edge is set to 0, that is, the interaction regarded as noise is removed directly; otherwise, the value of the original interaction edge is replaced by the similarity value, so as to represent the different degrees of importance of interactions. $\hat{R}_{ui}$ denotes the interaction edge information of the finally generated noise-reduced graph $\hat{\mathcal{G}}$. The Sigmoid activation function restricts the similarity value to the range $(0, 1)$ and is widely used in deep learning.

According to embodiment 1, step (c) comprises the following: the graph convolution results $e_u^{(l+1)}$ and $e_i^{(l+1)}$ of the user node and the item node at layer $l+1$ are obtained by the formulas $e_u^{(l+1)} = \sum_{i \in \mathcal{N}_u} \frac{1}{\sqrt{|\mathcal{N}_u|}\sqrt{|\mathcal{N}_i|}}\, e_i^{(l)}$ and $e_i^{(l+1)} = \sum_{u \in \mathcal{N}_i} \frac{1}{\sqrt{|\mathcal{N}_i|}\sqrt{|\mathcal{N}_u|}}\, e_u^{(l)}$;

the final embeddings are obtained by the formulas $e_u = \sum_{l=0}^{L} \alpha_l e_u^{(l)}$ and $e_i = \sum_{l=0}^{L} \alpha_l e_i^{(l)}$, where $e_u^{(l)}$ denotes the embedding information learned by the user node at the current layer and $e_i^{(l)}$ denotes the embedding information learned by the item node at the current layer; the embedding information of the users and items at all layers is aggregated to generate the final embedded representations of the user and item nodes, and $\alpha_l$ denotes the importance of the embedding information of the current layer;

the original feature matrix is decomposed by the formula $U \Sigma V^{\top} = \mathrm{SVD}(E)$, where $E$ denotes the learned user and item embedding information concatenated along dimension 0; $U$ and $V$ denote the left singular vector matrix and the right singular vector matrix of the original feature matrix $E$, which are orthogonal matrices and represent the row feature vectors and column feature vectors corresponding to the original matrix; $\Sigma$ is a diagonal matrix whose elements satisfy $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$, the elements on the diagonal being the singular values of the original matrix $E$, which are called singular value features or spectral features; $\top$ denotes the transpose; and $\mathrm{SVD}(\cdot)$ denotes singular value decomposition of the matrix;

an approximate decomposition is obtained by the formula $\hat{U} \hat{\Sigma} \hat{V}^{\top} = \mathrm{ApproxSVD}(E, k)$,

where $\hat{U}$ and $\hat{V}$ denote the dimension-reduced decomposition matrices, $k$ denotes the dimension to which the original dimension is reduced, whose value is much smaller than the dimensions of the original matrix, $\hat{\Sigma}$ is the dimension-reduced diagonal matrix whose elements satisfy $\hat{\sigma}_1 \geq \hat{\sigma}_2 \geq \cdots \geq \hat{\sigma}_k$, and $\mathrm{ApproxSVD}(\cdot)$ denotes approximate decomposition of the matrix;

interference is added to the original spectral features to obtain two perturbed diagonal matrices $\Sigma'$ and $\Sigma''$, the diagonal elements of each of which form a decreasing sequence. The perturbed node embedded representations $E'$ and $E''$ of the users and items are obtained respectively by the formulas $E' = \hat{U} \Sigma' \hat{V}^{\top}$ and $E'' = \hat{U} \Sigma'' \hat{V}^{\top}$. The orthogonal matrices of the original matrix are still used here, so the essential characteristics of the matrix are not changed; $\Sigma'$ and $\Sigma''$ respectively denote the spectral features after interference. Because $\hat{U}$ and $\hat{V}$ are retained, the essence of the matrix is kept unchanged and the essential information in the original matrix is preserved.
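
The statement that perturbing only the spectral features leaves the orthogonal bases unchanged can be checked in a few lines. The notation below is assumed for illustration: $\hat{U}$, $\hat{\Sigma}$, $\hat{V}$ are the rank-$k$ factors, $\hat{u}_j$ and $\hat{v}_j$ their columns, and the perturbation is written additively as $\Sigma' = \hat{\Sigma} + \Delta$ with a diagonal $\Delta = \mathrm{diag}(\delta_1, \ldots, \delta_k)$; the additive form in particular is an assumption.

```latex
\begin{aligned}
\hat{E} &= \hat{U}\hat{\Sigma}\hat{V}^{\top}, \qquad
  \hat{U}^{\top}\hat{U} = \hat{V}^{\top}\hat{V} = I_k, \\
E' &= \hat{U}\,\Sigma'\,\hat{V}^{\top}
    = \hat{U}\,(\hat{\Sigma} + \Delta)\,\hat{V}^{\top}
    = \hat{E} + \sum_{j=1}^{k} \delta_j\, \hat{u}_j \hat{v}_j^{\top}, \\
\hat{U}^{\top} E'\, \hat{V} &= \Sigma', \qquad
\lVert E' - \hat{E} \rVert_F = \lVert \Delta \rVert_F
    = \Big( \sum_{j=1}^{k} \delta_j^{2} \Big)^{1/2}.
\end{aligned}
```

The second line shows that the perturbed matrix only reweights the existing rank-one directions $\hat{u}_j \hat{v}_j^{\top}$, and the last line shows that the size of the change is controlled entirely by the size of the perturbation of the spectrum.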

According to embodiment 1, step (d) comprises the following: the BPR loss is calculated by the formula $\mathcal{L}_{BPR} = -\sum_{u \in \mathcal{U}} \sum_{i \in \mathcal{N}_u} \sum_{j \notin \mathcal{N}_u} \ln \sigma\left(e_u^{\top} e_i - e_u^{\top} e_j\right)$,

the contrastive loss function is calculated by the formula $\mathcal{L}_{CL} = \sum_{u \in \mathcal{U}} -\log \frac{\exp\left(s(e_u', e_u'')/\tau\right)}{\sum_{v \in \mathcal{U}} \exp\left(s(e_u', e_v'')/\tau\right)}$,

where the enhanced nodes of the same user in the two embedding matrices that share the same essence, $(e_u', e_u'')$, are regarded as the anchor and the positive sample, and the enhanced embeddings of other users in the other embedding matrix are regarded as negative samples $(e_u', e_v'')$ with $v \neq u$; $\tau$ is a temperature hyperparameter and $s(\cdot)$ is the cosine similarity function. Optimization is carried out by the formula $\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{CL} + \lambda_2 \lVert \Theta \rVert_2^2$, where $\mathcal{L}_{BPR}$ denotes the BPR loss function, $\mathcal{L}_{CL}$ denotes the InfoNCE contrastive loss function, which is a contrast-based function, $\lambda_2 \lVert \Theta \rVert_2^2$ is the regularization term, $\Theta$ is the model parameter set, $\lVert \cdot \rVert_2$ is the $L_2$ norm, $\lambda_1$ is the weight of the contrastive loss function $\mathcal{L}_{CL}$, and $\lambda_2$ is the hyperparameter controlling the regularization strength;

the parameters of the contrast learning recommendation method model based on feature space semantic enhancement are optimized by Adam using the BPR loss and the InfoNCE contrastive loss, Adam being an algorithm for optimizing stochastic objective functions based on first-order gradients. In a recommendation system, the historical behavior data of users usually exists in the form of implicit feedback, and the BPR loss function was proposed to solve the recommendation problem under implicit feedback data. Compared with other recommendation algorithms, BPR has better performance and scalability and can process large-scale implicit feedback data. The InfoNCE contrastive loss function guides the model in learning from positive and negative sample pairs.
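
For readers unfamiliar with Adam, one update step can be written out directly. This is a generic sketch of the standard Adam rule with its usual default hyperparameters, shown on a toy scalar objective; it is not taken from the patent, which does not state the learning rate or other settings, and `adam_step` is a hypothetical helper name.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)              # bias correction for the first moment
    v_hat = v / (1.0 - beta2 ** t)              # bias correction for the second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(x) = x^2 with gradient 2x, starting from x = 1.
x = np.array([1.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
x_next, m_next, v_next = adam_step(x, 2.0 * x, m, v, t=1, lr=0.1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's scale.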

According to embodiment 1, step (e) comprises the following: the predicted score $\hat{y}_{ui}$ of the user for the item is calculated by the formula $\hat{y}_{ui} = e_u^{\top} e_i$, where $\top$ denotes the transpose and $e_u$ and $e_i$ denote the final embeddings of the user and the item respectively; the score between the user and the item is calculated with the inner product, and items are recommended to the user in order of score.

With the above technical scheme, the invention provides a simple and efficient data enhancement method aimed at feature enhancement, which complements existing enhancement strategies. The method perturbs the spectral features of the feature map: changing the magnitudes of the spectral feature values changes the importance of the feature described by each vector in the orthogonal basis, and thereby the weight that feature carries in the matrix representation. By reasonably controlling the range of the spectral features, the essence of the matrix is kept unchanged and the essential information in the original matrix is retained; and because enhancing the spectral features does not change the orthogonal basis of the feature map, semantic relevance is preserved. The specific steps are as follows: a data set of items purchased by users is preprocessed and an interaction graph is constructed, and noise and interference information in the interaction graph is removed to improve the accuracy of the model; spectral feature perturbation is applied to the learned feature matrix to generate two enhanced sample matrices, and a multi-task learning strategy is used to unify the original feature matrix with the enhanced feature samples, improving the robustness of the node embeddings; and the score between the user and item embeddings is calculated with an inner product, and items are recommended to the user by ranking the scores. The method provided by the invention therefore perturbs the spectral features of the feature matrix, so that noise and redundant information in the original user and item embedded representations can be effectively filtered, and the generalization capability and accuracy of the recommendation system model are improved.

While the invention has been described with reference to the preferred embodiments, it is not limited thereto, and various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A contrast learning recommendation method based on feature space semantic enhancement is characterized by comprising the following steps:

2. The feature space semantic enhancement based contrast learning recommendation method of claim 1, wherein step (a) comprises the steps of:

(a-1) for an e-commerce dataset, randomly dividing the data in the ratio 8:1:1 to generate a training set, a verification set and a test set;

(a-2) building, from the training set, an interaction matrix $R \in \{0,1\}^{M \times N}$ between a set of users $\mathcal{U}$ and a set of items $\mathcal{I}$, where $M$ and $N$ are the numbers of users and items respectively, $R_{ui} = 1$ indicates that user $u$ and item $i$ have interacted, and $R_{ui} = 0$ indicates that user $u$ and item $i$ have not interacted;

(a-3) initializing the embedding parameters of the user nodes and item nodes respectively with Xavier initialization, with an embedding dimension of 64, and generating the embedding information $e_u^{(0)}$ of the user nodes and the embedding information $e_i^{(0)}$ of the item nodes, Xavier initialization being an initialization method that addresses the problem of random parameter initialization.

3. The feature space semantic enhancement based contrast learning recommendation method of claim 1, wherein step (b) comprises the steps of:

(b-1) calculating the similarity value of a user node and an item node by the formula $s_{ui} = \sigma\left(\langle W_1 e_u,\; W_2 e_i \rangle\right)$,

where $e_u$ denotes the embedding information of a user that has an interaction in the original interaction matrix, $e_i$ denotes the embedding information of the item corresponding to that interacting user in the original interaction matrix, $\sigma$ is the Sigmoid activation function, $W_1$ and $W_2$ are two trainable parameter matrices that map the node features, $M$ and $N$ denote the numbers of users and items respectively, $d$ denotes the embedding dimension, $\langle \cdot, \cdot \rangle$ denotes the vector inner product operation, and $s_{ui}$ denotes the similarity value of user $u$ and item $i$;

(b-2) removing low-reliability interactions in the graph-structured interaction information by the formula $\hat{R}_{ui} = \mathbb{1}(s_{ui} \geq \gamma) \cdot s_{ui}$,

where $\mathbb{1}(\cdot)$ is a binary indicator function that judges true or false, $s_{ui}$ denotes the similarity value of the user and the item, $\gamma$ is the threshold against which the user-item similarity is compared, and $\hat{R}_{ui}$ denotes the interaction edge information of the finally generated noise-reduced graph $\hat{\mathcal{G}}$.

4. The feature space semantic enhancement based contrast learning recommendation method of claim 1, wherein step (c) comprises the steps of:

(c-1) obtaining the graph convolution results $e_u^{(l+1)}$ and $e_i^{(l+1)}$ of the user node and the item node at layer $l+1$ by the formulas $e_u^{(l+1)} = \sum_{i \in \mathcal{N}_u} \frac{1}{\sqrt{|\mathcal{N}_u|}\sqrt{|\mathcal{N}_i|}}\, e_i^{(l)}$ and $e_i^{(l+1)} = \sum_{u \in \mathcal{N}_i} \frac{1}{\sqrt{|\mathcal{N}_i|}\sqrt{|\mathcal{N}_u|}}\, e_u^{(l)}$,

(c-2) obtaining the final embedding $e_u$ of the user and the final embedding $e_i$ of the item respectively by the formulas $e_u = \sum_{l=0}^{L} \alpha_l e_u^{(l)}$ and $e_i = \sum_{l=0}^{L} \alpha_l e_i^{(l)}$,

(c-3) decomposing the original feature matrix by the formula $U \Sigma V^{\top} = \mathrm{SVD}(E)$ to obtain the spectral features of the original matrix $E$,

where $E$ denotes the learned user and item embedding information, $U$ and $V$ denote the left singular vector matrix and the right singular vector matrix of the original feature matrix $E$,

$\Sigma$ is a diagonal matrix whose elements satisfy $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$,

the elements on the diagonal being the singular values of the original matrix $E$, $\top$ denotes the transpose, and $\mathrm{SVD}(\cdot)$ denotes singular value decomposition of the matrix;

(c-4) obtaining an approximate decomposition by the formula $\hat{U} \hat{\Sigma} \hat{V}^{\top} = \mathrm{ApproxSVD}(E, k)$,

where $\hat{U}$ and $\hat{V}$ denote the dimension-reduced decomposition matrices, $k$ denotes the dimension to which the original dimension is reduced,

$\hat{\Sigma}$ is the dimension-reduced diagonal matrix whose elements satisfy $\hat{\sigma}_1 \geq \hat{\sigma}_2 \geq \cdots \geq \hat{\sigma}_k$, and $\mathrm{ApproxSVD}(\cdot)$ denotes approximate decomposition of the matrix;

(c-5) adding interference to the original spectral features to obtain two perturbed diagonal matrices $\Sigma'$ and $\Sigma''$,

the diagonal elements of each of the two matrices forming a decreasing sequence,

and obtaining the perturbed node embedded representations $E'$ and $E''$ of the users and items respectively by the formulas $E' = \hat{U} \Sigma' \hat{V}^{\top}$ and $E'' = \hat{U} \Sigma'' \hat{V}^{\top}$, where $\Sigma'$ and $\Sigma''$ respectively denote the spectral features after interference.

5. The feature space semantic enhancement based contrast learning recommendation method of claim 1, wherein step (d) comprises the steps of:

(d-1) calculating the BPR loss by the formula $\mathcal{L}_{BPR} = -\sum_{u \in \mathcal{U}} \sum_{i \in \mathcal{N}_u} \sum_{j \notin \mathcal{N}_u} \ln \sigma\left(e_u^{\top} e_i - e_u^{\top} e_j\right)$,

where $\mathcal{N}_u$ is the set of items that interact with user $u$, $i$ denotes a positive sample item that user $u$ has interacted with, $j$ denotes a negative sample item that user $u$ has not interacted with, $e_u^{\top}$ denotes the transpose of the final embedding of the user, and $\sigma$ denotes the Sigmoid activation function, the BPR loss function being a function used in recommendation systems to learn the personalized preferences of users;

(d-2) calculating the contrastive loss function by the formula $\mathcal{L}_{CL} = \sum_{u \in \mathcal{U}} -\log \frac{\exp\left(s(e_u', e_u'')/\tau\right)}{\sum_{v \in \mathcal{U}} \exp\left(s(e_u', e_v'')/\tau\right)}$,

where the enhanced nodes of the same user in the two embedding matrices that share the same essence, $(e_u', e_u'')$, are regarded as the anchor and the positive sample, the enhanced embeddings of other users in the other embedding matrix are regarded as negative samples $(e_u', e_v'')$ with $v \neq u$, $\tau$ is a temperature hyperparameter, and $s(\cdot)$ is the cosine similarity function;

(d-3) carrying out optimization by the formula $\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{CL} + \lambda_2 \lVert \Theta \rVert_2^2$,

where $\mathcal{L}_{BPR}$ denotes the BPR loss function, $\mathcal{L}_{CL}$ denotes the InfoNCE contrastive loss function, which is a contrast-based function, $\lambda_2 \lVert \Theta \rVert_2^2$ is the regularization term, $\Theta$ is the model parameter set, $\lVert \cdot \rVert_2$ is the $L_2$ norm, $\lambda_1$ is the weight of the contrastive loss function $\mathcal{L}_{CL}$, and $\lambda_2$ is the hyperparameter controlling the regularization strength;

(d-4) optimizing the parameters of the contrast learning recommendation method model based on feature space semantic enhancement by Adam using the BPR loss and the InfoNCE contrastive loss, Adam being an algorithm for optimizing stochastic objective functions based on first-order gradients.

6. The feature space semantic enhancement based contrast learning recommendation method of claim 1, wherein step (e) comprises the steps of: calculating the predicted score $\hat{y}_{ui}$ of the user for the item by the formula $\hat{y}_{ui} = e_u^{\top} e_i$, where $\top$ denotes the transpose and $e_u$ and $e_i$ denote the final embeddings of the user and the item respectively; calculating the score between the user and the item with the inner product; and recommending items to the user in order of score.