CN115391677A - Negative sample-based collaborative recommendation method and device, terminal and readable storage medium - Google Patents


Info

Publication number
CN115391677A
Authority
CN
China
Prior art keywords
user
preset
collaborative
filtering model
collaborative filtering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211148400.8A
Other languages
Chinese (zh)
Inventor
鲜学丰
赵朋朋
董虎胜
吴童语
方立刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Vocational University
Original Assignee
Suzhou Vocational University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Vocational University
Priority to CN202211148400.8A
Publication of CN115391677A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention discloses a negative sample-based collaborative recommendation method, device, terminal and readable storage medium. The collaborative recommendation method comprises the following steps: obtaining a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtaining a negative sample corresponding to a user in a preset time period, and obtaining an initial embedding vector corresponding to the user from the preset collaborative filtering model; generating an embedding vector and a modified embedding vector; and recommending items to the user based on the modified embedding vector. The collaborative recommendation algorithm has the advantage of high accuracy.

Description

Negative sample-based collaborative recommendation method and device, terminal and readable storage medium
Technical Field
The invention relates to the technical field of big data processing, in particular to a negative sample-based collaborative recommendation method, a negative sample-based collaborative recommendation device, a terminal and a readable storage medium.
Background
In recent years, the rapid development of the Internet has led to a rapid increase in the total amount of information online, and electronic commerce has kept expanding. Because of this huge amount of data, users spend a great deal of time searching for the items they want, and having to sift through large amounts of useless information keeps them from enjoying the convenience of the Internet. Personalized recommendation systems emerged to address these issues. A personalized recommendation system is an advanced intelligent platform built on large-scale data mining; it recommends information and commodities of interest to a user according to the user's interests and other information, thereby providing fully personalized decision support and information services. Personalized recommendation plays an important role in promoting economic and network development, and how to improve recommendation efficiency and accuracy is an active research problem.
The collaborative filtering recommendation algorithm is one of the most common and effective recommendation algorithms in personalized recommendation systems. Unlike traditional content-based recommendation, a collaborative filtering algorithm analyzes users' interests, finds users similar to a specified user within the user group, combines those similar users' evaluations of an item, and finally forms a prediction of how much the specified user will like the item. Although collaborative filtering recommendation is widely applied, the problems caused by data sparsity and a single information source remain difficult to overcome: the sparsity of the user-item matrix makes the calculated user similarity inaccurate, which in turn reduces recommendation accuracy.
Therefore, designing a collaborative recommendation method with high accuracy becomes an urgent problem to be solved.
Disclosure of Invention
In view of the above, the present invention provides a negative sample-based collaborative recommendation method, apparatus, terminal and readable storage medium.
To achieve the above purpose, the technical solution of the invention is as follows. A negative sample-based collaborative recommendation method comprises the following steps: obtaining a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtaining a negative sample corresponding to a user U in a preset time period, wherein the negative sample comprises Num items N_1, N_2, ..., N_Num; generating, based on the coding layer in the preset collaborative filtering model, a D-dimensional initial embedding vector E'_h corresponding to item N_h, wherein D, h and Num are natural numbers and h = 1, 2, ..., Num; obtaining a D-dimensional initial embedding vector E_u corresponding to the user U from the preset collaborative filtering model; generating an embedding vector
Figure BDA0003853900220000021
wherein I_h indicates whether item N_h is masked: I_h = 1 means that item N_h is retained, and I_h = 0 means that item N_h is discarded;
Figure BDA0003853900220000022
the modified embedding vector corresponding to the user U is H_u = g·E_u + (1 − g)·P_u·V, wherein g is a real number with 0 ≤ g ≤ 1 and V is a real matrix with D rows and D columns; and recommending items to the user U based on the modified embedding vector H_u.
As an improvement of the embodiment of the present invention, in the preset collaborative filtering model, the initial embedding vector E_u is replaced by the modified embedding vector H_u.
As an improvement of the embodiment of the present invention, the method further comprises the following step: optimizing the preset collaborative filtering model using the cosine contrastive loss
Figure BDA0003853900220000023
Figure BDA0003853900220000024
wherein
Figure BDA0003853900220000025
M and L are thresholds with 0 ≤ M < L, w is a constant, and cos() is the cosine similarity function.
As an improvement of the embodiment of the invention, 0 ≤ M ≤ 1.
As an improvement of the embodiment of the invention, 0 ≤ L ≤ 1.
An embodiment of the invention also provides a negative sample-based collaborative recommendation device, which comprises the following modules: an information acquisition module, configured to obtain a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; to obtain a negative sample corresponding to a user U in a preset time period, wherein the negative sample comprises Num items N_1, N_2, ..., N_Num; and to generate, based on the coding layer in the preset collaborative filtering model, a D-dimensional initial embedding vector E'_h corresponding to item N_h, wherein D, h and Num are natural numbers and h = 1, 2, ..., Num; a processing module, configured to obtain a D-dimensional initial embedding vector E_u corresponding to the user U from the preset collaborative filtering model and to generate an embedding vector
Figure BDA0003853900220000026
wherein I_h indicates whether item N_h is masked: I_h = 1 means that item N_h is retained, and I_h = 0 means that item N_h is discarded;
Figure BDA0003853900220000027
the modified embedding vector corresponding to the user U being H_u = g·E_u + (1 − g)·P_u·V, wherein g is a real number with 0 ≤ g ≤ 1 and V is a real matrix with D rows and D columns; and a recommendation module, configured to recommend items to the user U based on the modified embedding vector H_u.
As an improvement of the embodiment of the present invention, in the preset collaborative filtering model, the initial embedding vector E_u is replaced by the modified embedding vector H_u.
As an improvement of the embodiment of the present invention, the device further includes the following module: a training module, configured to optimize the preset collaborative filtering model using the cosine contrastive loss
Figure BDA0003853900220000028
wherein
Figure BDA0003853900220000031
M and L are thresholds with 0 ≤ M < L, w is a constant, and cos() is the cosine similarity function.
An embodiment of the present invention further provides a terminal, including: a memory for storing a computer program; and the processor is used for realizing the steps of the collaborative recommendation method when the computer program is executed.
An embodiment of the present invention further provides a readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps of the foregoing collaborative recommendation method.
The negative sample-based collaborative recommendation method, device, terminal and readable storage medium provided by the embodiments of the invention have the following advantages. The collaborative recommendation method comprises the following steps: obtaining a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtaining a negative sample corresponding to a user in a preset time period, and obtaining an initial embedding vector corresponding to the user from the preset collaborative filtering model; generating an embedding vector and a modified embedding vector; and recommending items to the user based on the modified embedding vector. The collaborative recommendation algorithm has the advantage of high accuracy.
Drawings
Fig. 1 is a schematic flowchart of a collaborative recommendation method according to an embodiment of the present invention;
Figs. 2A, 2B, 3A, 3B and 3C are graphs of experimental results of the collaborative recommendation method.
Detailed Description
The present invention will be described in detail below with reference to embodiments shown in the drawings. The present invention is not limited to the embodiment, and structural, methodological, or functional changes made by one of ordinary skill in the art according to the embodiment are included in the scope of the present invention.
The following description and the drawings sufficiently illustrate specific embodiments herein to enable those skilled in the art to practice them. Portions and features of some embodiments may be included in or substituted for those of others. The scope of the embodiments herein includes the full breadth of the claims, as well as all available equivalents of the claims. The terms "first," "second," and the like are used herein solely to distinguish one element from another, without requiring or implying any actual relationship or order between such elements. In practice, a first element can also be referred to as a second element, and vice versa. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a structure, apparatus, or device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such structure, apparatus, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in the structure, device, or apparatus that comprises the element. The various embodiments are described in a progressive manner, with each embodiment focusing on its differences from the other embodiments, and like parts may be understood by reference to one another.
The terms "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, as used herein, denote orientations or positional relationships based on those shown in the drawings. They are used only for convenience and simplicity of description and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be construed as limiting the present invention. In the description herein, unless otherwise specified and limited, the terms "mounted," "coupled," and "connected" are to be construed broadly and may include, for example, mechanical or electrical connections, communication between two elements, direct connections, and indirect connections via intermediary media; the specific meaning of these terms can be understood by those skilled in the art according to the circumstances.
An embodiment of the invention provides a negative sample-based collaborative recommendation method. The method can be executed by a computer system, such as the server of an online movie ticketing website or of an online shopping website; accordingly, in this document an item may be a movie ticket, a commodity, or the like. The computer system executes the collaborative recommendation method at preset intervals or when the user logs in.
As shown in fig. 1, the method comprises the following steps:
Step 101: obtaining a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtaining a negative sample corresponding to a user U in a preset time period, wherein the negative sample comprises Num items N_1, N_2, ..., N_Num; and generating, based on the coding layer in the preset collaborative filtering model, a D-dimensional initial embedding vector E'_h corresponding to item N_h, wherein D, h and Num are natural numbers and h = 1, 2, ..., Num;
Here, in the coding layer, initial vector representations are generated by embedding the users and the items separately, and a user-item interaction matrix is then constructed from the users' historical interaction data. To build this matrix, let M and N denote the numbers of users and items (this M is unrelated to the threshold M used in the loss below), and let U and I denote the user set and the item set. The user-item interaction matrix Y ∈ R^(M×N) is defined with Y_ij denoting the element in row i and column j: if user i has interacted with item j, then Y_ij = 1; otherwise Y_ij = 0.
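The construction of the binary interaction matrix can be sketched as follows. This is only an illustration in plain Python: the toy interaction pairs and the function name `build_interaction_matrix` are hypothetical and do not come from the patent.

```python
# Hypothetical toy data: (user_index, item_index) pairs taken from an interaction log.
interactions = [(0, 1), (0, 2), (1, 0), (2, 2)]
num_users, num_items = 3, 4  # M and N in the text


def build_interaction_matrix(pairs, num_users, num_items):
    """Return Y as a list of rows, with Y[i][j] = 1 if user i interacted with item j."""
    matrix = [[0] * num_items for _ in range(num_users)]
    for user, item in pairs:
        matrix[user][item] = 1
    return matrix


Y = build_interaction_matrix(interactions, num_users, num_items)
for row in Y:
    print(row)
```

A dense list of lists is used only to keep the sketch short; a real data set of this kind is usually sparse, and a sparse structure would normally be preferred.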
In the aggregation layer, the approach differs from most traditional collaborative filtering methods, which compute the user-item similarity directly from the user's original vector representation. Here, to better model user behavior, all items that each user has interacted with are fed into the model as supplementary information, yielding a user feature vector with stronger representational power.
In the prediction layer, once the final representations of the user and the item are obtained, the user's predicted score for the item is computed as their cosine similarity.
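As a rough illustration of the prediction layer, the sketch below scores items by cosine similarity and returns the top-ranked ones. The helper names `cosine_similarity` and `rank_items`, the toy vectors, and the convention of returning 0 for a zero vector are assumptions made for this example only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch, not taken from the patent
    return dot / (norm_a * norm_b)


def rank_items(user_vec, item_vecs, top_k):
    """Return the top_k item keys, ordered by predicted score (highest first)."""
    scores = {item: cosine_similarity(user_vec, vec) for item, vec in item_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical final representations of one user and three items.
user_vec = [1.0, 0.0]
item_vecs = {"a": [2.0, 0.0], "b": [0.0, 3.0], "c": [1.0, 1.0]}
print(rank_items(user_vec, item_vecs, top_k=2))
```

Because the score depends only on the angle between the vectors, rescaling either vector leaves the ranking unchanged.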
Step 102: obtaining a D-dimensional initial embedding vector E_u corresponding to the user U from the preset collaborative filtering model; generating an embedding vector
Figure BDA0003853900220000051
wherein I_h indicates whether item N_h is masked: I_h = 1 means that item N_h is retained, and I_h = 0 means that item N_h is discarded;
Figure BDA0003853900220000052
the modified embedding vector corresponding to the user U is H_u = g·E_u + (1 − g)·P_u·V, wherein g is a real number with 0 ≤ g ≤ 1 and V is a real matrix with D rows and D columns;
Existing collaborative filtering models almost always compute similarity directly from the user's original vector representation. Considering that each user's representation is distinctive, the collaborative recommendation method instead aggregates the sequence of items the user has interacted with as supplementary information to enhance the user representation. When I_h = 0, item N_h is discarded from the aggregation. Here, g can be understood as a hyperparameter weight that controls how much the behavior aggregation vector contributes.
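The gated combination H_u = g·E_u + (1 − g)·P_u·V can be sketched as below. The formula for the aggregated vector is available in the source only as an image, so this sketch assumes that P_u is the mean of the retained item embeddings E'_h (those with I_h = 1); that pooling choice, the function names, and the toy values are assumptions, not the patent's definition.

```python
def masked_mean(item_vecs, mask):
    """Average the item vectors whose mask entry is 1 (assumed form of P_u)."""
    dim = len(item_vecs[0])
    kept = sum(mask)
    if kept == 0:
        return [0.0] * dim
    return [sum(m * vec[d] for vec, m in zip(item_vecs, mask)) / kept for d in range(dim)]


def vec_times_matrix(vec, matrix):
    """Row vector (length D) times a D x D matrix given as a list of rows."""
    cols = len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec))) for j in range(cols)]


def modified_embedding(e_u, item_vecs, mask, v, g):
    """H_u = g * E_u + (1 - g) * P_u * V, with 0 <= g <= 1."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must lie in [0, 1]")
    p_u = masked_mean(item_vecs, mask)
    projected = vec_times_matrix(p_u, v)
    return [g * e + (1.0 - g) * p for e, p in zip(e_u, projected)]


# Hypothetical values with D = 2: the third item is masked out (I_3 = 0).
e_u = [1.0, 0.0]
item_vecs = [[0.0, 2.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
identity = [[1.0, 0.0], [0.0, 1.0]]
h_u = modified_embedding(e_u, item_vecs, mask, identity, g=0.5)
print(h_u)
```

With g = 1 the function returns E_u unchanged, which corresponds to switching the aggregation off.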
Step 103: recommending items to the user U based on the modified embedding vector H_u.
In this embodiment, in the preset collaborative filtering model, the initial embedding vector E_u is replaced by the modified embedding vector H_u.
Here, the user's historical representation E_u is replaced by the modified representation H_u, so that the next time the collaborative recommendation method is run, the historical representation E_u it obtains is the modified representation H_u produced by this execution.
In this embodiment, the method further includes the following step: optimizing the preset collaborative filtering model using the cosine contrastive loss
Figure BDA0003853900220000053
Figure BDA0003853900220000054
wherein
Figure BDA0003853900220000055
M and L are thresholds with 0 ≤ M < L, w is a constant, and cos() is the cosine similarity function.
In this embodiment, 0 ≤ M ≤ 1.
In this embodiment, 0 ≤ L ≤ 1.
Here, the cosine contrastive loss L_CCL(H_u, E'_h) is calculated only when
Figure BDA0003853900220000056
satisfies
Figure BDA0003853900220000057
For any
Figure BDA0003853900220000058
that does not satisfy this condition, the cosine contrastive loss is not calculated, i.e., it is discarded. The cosine contrastive loss L_CCL is then used to compute the final overall loss, which updates the model parameters through backpropagation and gradient descent.
Because the two vectors H_u and E'_h are L2-normalized, the cosine similarity measures only their angular difference and is unaffected by the magnitude of the representations; this is particularly advantageous because, in collaborative filtering tasks, the magnitudes of user and item representations may be strongly biased by their popularity.
Here, M can be understood as the lower bound (Margin) for filtering negative samples, and L as the upper bound (Limit) for filtering negative samples. L_CCL(H_u, E'_h) maximizes the similarity of positive pairs and minimizes the similarity of negative pairs within the selected interval. w, M and L are hyperparameters: w controls the relative weight of the positive and negative sample losses, while M and L bound the interval of negative samples that contribute to the loss.
When the number of negative samples increases, many uninformative samples are usually included. Existing loss functions (e.g., BPR), however, treat every negative sample equally, so model training may be overwhelmed by these uninformative samples, which can significantly reduce model performance and slow convergence. Because the negative samples follow a skewed distribution, hard negatives are concentrated in the high-similarity region, whereas low-similarity negatives are generally easy negatives. The cosine contrastive loss L_CCL can therefore retain only the hard negatives by filtering out uninformative ones with an appropriate Margin. When the cosine similarity of a negative sample is below the Margin, the sample is considered uninformative and the inventors set its contribution to the loss function to zero. This operation automatically identifies the hard negatives whose cosine similarity exceeds the Margin, which facilitates better training of the model.
In studying the image domain, the inventors also examined the distribution of false negative samples and found that, as hard negatives are screened, the false negatives concentrate at the top of the hard negatives. The inventors therefore conjecture that false negatives are distributed similarly in the recommendation domain. False negatives have similarity equal to or even higher than that of hard negatives and, if ignored, degrade model performance. To reduce or even eliminate this effect, the inventors set an upper threshold (Limit) on the range from which negatives are selected: when the cosine similarity of a negative sample exceeds the Limit, the sample is more likely to be a false negative, and the inventors set its contribution to the loss function to zero. This helps negative sampling obtain better, genuinely hard negatives and thus improves the recommendation performance of the model.
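The effect of the two thresholds can be illustrated with a small sketch. The exact loss formula appears in the source only as an image, so the form used here is an assumption: the positive term 1 − cos and the w-weighted, averaged negative term follow the cosine contrastive loss of the SimpleX paper, and negatives whose similarity falls outside the open interval (M, L) are simply given zero loss. Whether the interval bounds are strict, and the name `banded_ccl`, are likewise assumptions.

```python
def banded_ccl(pos_sim, neg_sims, margin, limit, w):
    """Cosine contrastive loss with a lower Margin and an upper Limit on negatives.

    pos_sim  : cosine similarity of the positive user-item pair
    neg_sims : cosine similarities of the sampled negative pairs
    margin   : M, negatives at or below this similarity contribute nothing
    limit    : L, negatives at or above this similarity contribute nothing
    w        : weight of the negative term relative to the positive term
    """
    if not 0.0 <= margin < limit:
        raise ValueError("expected 0 <= margin < limit")
    positive_term = 1.0 - pos_sim
    if not neg_sims:
        return positive_term
    # Keep only negatives inside the band; easy and suspected false negatives drop out.
    kept = [s - margin for s in neg_sims if margin < s < limit]
    negative_term = w * sum(kept) / len(neg_sims)
    return positive_term + negative_term


# Hypothetical similarities: 0.1 is an easy negative, 0.95 a suspected false negative.
loss = banded_ccl(pos_sim=0.8, neg_sims=[0.1, 0.5, 0.95, 0.6], margin=0.4, limit=0.9, w=2.0)
print(round(loss, 6))
```

Raising `limit` above every negative similarity reduces this sketch to a margin-only loss, which is the kind of comparison made in the ablation that removes the upper limit.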
Here, the embedded representation of each user and item is updated and optimized step by step through backpropagation during model training, thereby updating the user-item interaction matrix.
To verify the usability of the collaborative recommendation method, the inventors carried out corresponding experiments. The following first briefly introduces the data sets, comparison models and evaluation metrics used; then analyzes the experimental parameter settings and results in detail; and finally analyzes, qualitatively and quantitatively, the ablation experiments and the hyperparameter tuning process.
Data set
The model was tested on three existing public data sets: MovieLens-100k, Amazon-Electronics and Amazon-Music. Each is briefly introduced below:
MovieLens-100k: the MovieLens data set, collected by the GroupLens research project, contains 100,000 ratings (1-5) of 1,682 movies by 943 users, where each user rated at least 20 movies.
Amazon-Electronics: the Amazon data set consists of user reviews collected from the e-commerce website Amazon. The Electronics category used here contains records of electronic product purchases on Amazon. The inventors selected the 5-core version (users and items with fewer than 5 purchase records are removed).
Amazon-Music: this data set uses the Digital Music category of the Amazon data set, which contains purchase and review records of digital music on Amazon. The inventors selected the 5-core version (users and music items with fewer than 5 purchase or review records are removed).
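For readers unfamiliar with 5-core filtering, the sketch below shows one common way to compute a k-core of an interaction list: users and items with fewer than k records are removed repeatedly until nothing changes. The patent uses the ready-made 5-core data sets and does not describe this procedure, so the function `k_core` and the toy data are purely illustrative.

```python
from collections import Counter


def k_core(pairs, k):
    """Keep only (user, item) pairs whose user and item both have at least k records.

    Removal is repeated because dropping a pair can push another user or item below k.
    """
    pairs = list(pairs)
    while True:
        user_counts = Counter(user for user, _ in pairs)
        item_counts = Counter(item for _, item in pairs)
        kept = [
            (user, item)
            for user, item in pairs
            if user_counts[user] >= k and item_counts[item] >= k
        ]
        if len(kept) == len(pairs):
            return kept
        pairs = kept


# Hypothetical toy log, filtered with k = 2 to keep the example small.
log = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "a"), ("u3", "c")]
print(k_core(log, k=2))
```

In this toy log, item "c" is removed first, which leaves user "u3" with a single record, so "u3" is removed in the next pass.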
For the experiments, the inventors preprocessed the three data sets as follows: only the user ID, item ID and rating behavior were retained, and other redundant information was deleted; a user-item interaction sequence was then constructed from each user's rating behavior and stored row by row.
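The preprocessing described above might look like the following sketch, which groups rating records into one interaction sequence per user and renders each sequence as a row of text. The record layout (user ID, item ID, rating), the space-separated row format, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def build_sequences(records):
    """Group item IDs by user, preserving the order in which records appear."""
    sequences = defaultdict(list)
    for user_id, item_id, _rating in records:
        sequences[user_id].append(item_id)
    return dict(sequences)


def to_rows(sequences):
    """Render one text row per user: the user ID followed by the item IDs."""
    return [
        " ".join([str(user_id)] + [str(item_id) for item_id in items])
        for user_id, items in sorted(sequences.items())
    ]


# Hypothetical rating records: (user ID, item ID, rating).
records = [(1, 10, 5), (2, 11, 3), (1, 12, 4)]
sequences = build_sequences(records)
rows = to_rows(sequences)
print("\n".join(rows))
```

The rating value is read but not stored here, because only the fact that an interaction occurred is needed to build the sequence.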
Comparison model
The comparison models used in this experiment include NGCF, LightGCN, ENMF, IMP-GCN, DGCF and SimpleX. They are described one by one below.
NGCF: NGCF exploits the latent collaborative signals in user-item interactions by propagating embeddings over the user-item graph structure. This expresses and models the high-order connectivity in the user-item graph and explicitly injects the collaborative signal into the embedding process. The NGCF model is disclosed in the paper "Neural Graph Collaborative Filtering", available at: https://arxiv.org/abs/1905.08108?context=cs.IR.
LightGCN: to simplify the design of GCN and make it more concise and better suited to recommendation, LightGCN keeps only the most essential component of GCN, neighborhood aggregation, for collaborative filtering. Specifically, LightGCN learns user and item embeddings by propagating them linearly on the user-item interaction graph and uses the weighted sum of the embeddings learned at all layers as the final embedding. The LightGCN model is disclosed in the paper "LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation", available at: https://arxiv.org/abs/2002.02126.
ENMF: to learn a neural recommendation model from the entire training data without sampling, ENMF uses three new optimization methods and can efficiently learn model parameters from the whole data (including all missing data) with relatively low time complexity. The ENMF model is disclosed in the paper "Efficient Neural Matrix Factorization without Sampling for Recommendation", available at: https:// chenchunghu. Io/files/TOIS _ enmf. Pdf.
IMP-GCN: IMP-GCN performs high-order graph convolution inside subgraphs. Each subgraph consists of users with similar interests and the items they have interacted with. To form the subgraphs, the method designs an unsupervised subgraph generation module that effectively identifies users with common interests by using the user features and the graph structure. The model thereby avoids propagating negative information from higher-order neighbors into embedding learning. The IMP-GCN model is disclosed in the paper "Interest-aware Message-Passing GCN for Recommendation", available at: https://arxiv.org/abs/2102.10044.
SimpleX: this method holds that the choice of loss function and the negative sampling ratio are equally important. It proposes the cosine contrastive loss (CCL) and incorporates it into a simple unified collaborative filtering model. The model is disclosed in the paper "SimpleX: A Simple and Strong Baseline for Collaborative Filtering", available at: https://arxiv.org/abs/2109.12613.
Evaluation index
In the experiment, common evaluation indexes Recall and NDCG in a recommendation system are adopted to measure the quality of recommendation performance.
Recall is an evaluation metric commonly used in the recall (candidate retrieval) stage of recommendation systems. It measures the proportion of actual positive samples that are correctly predicted as positive. It is formalized as follows:
Figure BDA0003853900220000081
where u is the user, R(u) is the set of items the model recommends to the user, and T(u) is the set of items the user actually interacted with in the test set.
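A per-user Recall@K under the usual definition, |R(u) ∩ T(u)| / |T(u)|, can be sketched as follows. The formula in the source is available only as an image, so this standard form, the cut-off parameter `k`, and the function name are assumptions.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of the user's test items that appear in the top-k recommendations.

    recommended : ranked list of distinct item IDs produced by the model, R(u)
    relevant    : items the user actually interacted with in the test set, T(u)
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0  # convention chosen for this sketch
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical ranked list and test items for one user.
recommended = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
print(recall_at_k(recommended, relevant, k=3))
```

The data-set-level score is then typically the average of this value over all test users.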
NDCG (Normalized Discounted Cumulative Gain) is an evaluation metric for ranking results that measures ranking accuracy. When the recommendation model returns a recommendation list, NDCG evaluates the gap between the ranked list and the user's real interaction list.
Figure BDA0003853900220000082
Figure BDA0003853900220000083
Figure BDA0003853900220000084
DCG yields a score for each user's recommendation list. To make the scores of different users comparable, they must be normalized: the DCG of each user's real (ideal) list is computed and denoted IDCG, the ratio of each user's DCG to IDCG is taken as that user's normalized score, and the scores are finally averaged over all users to obtain the final NDCG.
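The DCG, IDCG and averaging steps can be sketched as below. The source gives the formulas only as images, so this sketch assumes binary relevance with the common 1/log2(rank + 1) discount; the function names and toy data are hypothetical.

```python
import math


def dcg_at_k(recommended, relevant, k):
    """Discounted cumulative gain with binary relevance; ranks start at 1."""
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(recommended[:k], start=1)
        if item in relevant
    )


def ndcg_at_k(recommended, relevant, k):
    """DCG divided by the DCG of an ideal list that ranks all relevant items first."""
    relevant = set(relevant)
    ideal_hits = min(k, len(relevant))
    if ideal_hits == 0:
        return 0.0  # convention chosen for this sketch
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg_at_k(recommended, relevant, k) / idcg


def mean_ndcg(per_user, k):
    """Average NDCG@k over users; per_user maps user -> (recommended, relevant)."""
    scores = [ndcg_at_k(rec, rel, k) for rec, rel in per_user.values()]
    return sum(scores) / len(scores)


# Hypothetical lists for two users.
per_user = {
    "u1": (["a", "b", "c"], {"a", "c"}),
    "u2": (["x", "y", "z"], {"x"}),
}
print(mean_ndcg(per_user, k=3))
```

For user "u1" the relevant items sit at ranks 1 and 3 rather than 1 and 2, so the score falls slightly below 1; user "u2" has a perfect ranking.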
Analysis of Experimental results
Table 1 shows the experimental results of each model on the three benchmark data sets. Analyzing the data in the table leads to the following conclusions:
(1) The inventors' model adds an upper bound on negative sampling to the original CCL. Comparison with the results of SimpleX shows that the inventors' model outperforms SimpleX, the strongest previous comparison model, on all three benchmark data sets, demonstrating the effectiveness of the method.
Figure BDA0003853900220000085
Figure BDA0003853900220000091
Table 1. Recommendation performance comparison of each model on 3 data sets
Note: the "HNCF (ours)" row contains the best results, and the "SimpleX" row contains the second-best results.
(2) As the strongest baseline, SimpleX performs well because it screens negative samples effectively; the data show that generating harder negative samples lets a model outperform most collaborative filtering models.
(3) Building on the existing LightGCN, IMP-GCN uses subgraphs to filter negative samples out of the original graph; the experiments show that, after this denoising, IMP-GCN achieves some improvement over LightGCN.
Ablation experiment and hyper-parameter tuning
To better demonstrate the effectiveness and robustness of the proposed model, the inventors removed the upper limit on negative samples from the model and recorded the corresponding experimental results. To verify the effectiveness of the aggregation module, the inventors also set g to 1, that is, information from the interacted items is not aggregated when the user representation is generated, and recorded the corresponding results. In these experiments the other parameters are kept at their optimal settings, and the variants are compared with the full model on each of the three benchmark data sets; the results are shown in FIGS. 2A and 2B.
By analyzing FIGS. 2A and 2B, the inventors conclude that:
(1) The aggregation module is effective in most scenarios: it provides richer information for the user representation and thereby improves the final recommendation results.
(2) On the Music data set, the model works best when g = 1, indicating that aggregating a user's interaction information is not always effective and that the optimal setting must be tested on each data set.
(3) Comparing the results of the model with and without the limit shows that adding an upper bound when screening negative samples helps the model's final prediction.
During hyperparameter tuning, the inventors performed a grid search over the key parameters of the model one at a time. The 3 parameters most relevant to the model were selected for analysis, and the curves showing their effect are given in FIGS. 3A, 3B and 3C.
According to FIGS. 3A, 3B and 3C, on the ml-100k data set a lower aggregation coefficient g gives better results, whereas on the Music data set a higher g is better. For the weight w, lower values are better on both data sets. For the limit parameter, the grid search shows that the optimal value is stable at about 0.94, which supports the effectiveness of the proposed method; when the limit is set too high, false negative samples may be included, which degrades the experimental results.
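The one-parameter-at-a-time grid search described above can be sketched as follows. This is only an illustration: the `evaluate` callback, the parameter names `g`, `w` and `limit`, and the candidate grids are assumptions, and in practice `evaluate` would train the model and return a validation metric such as Recall or NDCG.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def coordinate_grid_search(
    evaluate: Callable[[Params], float],
    grids: Dict[str, List[float]],
    start: Params,
) -> Tuple[Params, float]:
    """Tune each parameter in turn while holding the others fixed."""
    best = dict(start)
    best_score = evaluate(best)
    for name, candidates in grids.items():
        for value in candidates:
            trial = dict(best)
            trial[name] = value
            score = evaluate(trial)
            if score > best_score:  # higher metric value is better
                best, best_score = trial, score
    return best, best_score


def toy_metric(p: Params) -> float:
    """Hypothetical stand-in for a validation metric, for illustration only."""
    return -((p["g"] - 0.3) ** 2) - (p["w"] - 1.0) ** 2 - (p["limit"] - 0.94) ** 2


grids = {"g": [0.1, 0.3, 0.5], "w": [1.0, 2.0, 4.0], "limit": [0.90, 0.94, 0.98]}
best_params, best_value = coordinate_grid_search(
    toy_metric, grids, {"g": 0.5, "w": 4.0, "limit": 0.98}
)
```

Searching one parameter at a time needs far fewer training runs than a full Cartesian grid, at the cost of possibly missing interactions between parameters.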
The embodiment of the invention provides a negative sample-based collaborative recommendation device, which comprises the following modules:
an information acquisition module, configured to acquire a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtain a negative sample corresponding to a user U in a preset time period, wherein the negative sample comprises Num items N_1, N_2, ..., N_Num; and generate, based on the coding layer in the preset collaborative filtering model, a D-dimensional initial embedding vector E'_h corresponding to item N_h, where D, h and Num are natural numbers and h = 1, 2, ..., Num;
a processing module, configured to obtain a D-dimensional initial embedding vector E_u corresponding to the user U from the preset collaborative filtering model; and generate an embedded vector
Figure BDA0003853900220000101
wherein I_h indicates whether item N_h is masked: I_h = 1 means that item N_h is retained, and I_h = 0 means that item N_h is discarded;
Figure BDA0003853900220000102
the modified embedding vector corresponding to user U is H_u = g·E_u + (1-g)·P_u·V, where g is a real number with 0 ≤ g ≤ 1 and V is a real matrix with D rows and D columns;
a recommendation module, configured to recommend items to user U based on the modified embedding vector H_u.
In this embodiment, in the preset collaborative filtering model, the initial embedding vector E_u is replaced by the modified embedding vector H_u.
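The computation of the modified embedding vector H_u = g·E_u + (1-g)·P_u·V can be sketched in plain Python. Because the formula for the embedded vector is only available as a figure, this sketch assumes that P_u is the average of the item embeddings E'_h whose mask I_h equals 1; that aggregation rule, and all function names, are assumptions rather than the patent's stated definition.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def masked_mean(item_embeddings: Sequence[Vector], mask: Sequence[int]) -> Vector:
    """Assumed form of P_u: mean of the item embeddings with I_h = 1."""
    if not item_embeddings:
        raise ValueError("at least one item embedding is required")
    dim = len(item_embeddings[0])
    kept = [e for e, keep in zip(item_embeddings, mask) if keep == 1]
    if not kept:
        return [0.0] * dim  # every item was discarded
    return [sum(e[d] for e in kept) / len(kept) for d in range(dim)]


def vec_mat(p: Vector, v: Matrix) -> Vector:
    """Row vector p (length D) times matrix V (D rows, D columns)."""
    dim = len(p)
    return [sum(p[i] * v[i][j] for i in range(dim)) for j in range(dim)]


def modified_embedding(e_u: Vector, p_u: Vector, v: Matrix, g: float) -> Vector:
    """H_u = g * E_u + (1 - g) * P_u * V, with 0 <= g <= 1."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must lie in [0, 1]")
    pv = vec_mat(p_u, v)
    return [g * a + (1.0 - g) * b for a, b in zip(e_u, pv)]
```

With g = 1 the aggregation term vanishes and H_u equals E_u, which matches the ablation setting in which interacted-item information is not aggregated.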
In this embodiment, the following modules are further included:
a training module, configured to use the cosine contrastive loss
Figure BDA0003853900220000103
Figure BDA0003853900220000104
to optimize the preset collaborative filtering model, wherein
Figure BDA0003853900220000105
Figure BDA0003853900220000106
M and L are thresholds with 0 ≤ M < L, w is a constant, and cos() is the cosine similarity function.
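A bounded cosine contrastive loss of the kind described above might look like the following sketch. The loss formulas themselves are only available as figures, so the exact form here is an assumption: the positive term is 1 - cos(user, positive item), and a negative item contributes cos - M only when its similarity lies strictly between the lower threshold M and the upper limit L, so that very similar negatives (likely false negatives) are ignored. The function names and the averaging over negatives are likewise assumptions.

```python
import math
from typing import List, Sequence

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def bounded_cosine_contrastive_loss(
    user: Vector,
    positive: Vector,
    negatives: Sequence[Vector],
    m: float,
    upper: float,
    w: float,
) -> float:
    """Assumed loss: (1 - cos_pos) + w * mean of (cos_neg - M) for M < cos_neg < L."""
    if not 0.0 <= m < upper:
        raise ValueError("thresholds must satisfy 0 <= M < L")
    pos_term = 1.0 - cosine(user, positive)
    neg_sum = 0.0
    for item in negatives:
        sim = cosine(user, item)
        if m < sim < upper:  # hard but plausible negative (assumed rule)
            neg_sum += sim - m
    neg_term = neg_sum / len(negatives) if negatives else 0.0
    return pos_term + w * neg_term
```

Under this assumed form, raising the upper limit L lets more high-similarity negatives into the loss, which is consistent with the observation that an overly high limit admits false negatives.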
An embodiment of the present invention provides a terminal, including: a memory for storing a computer program; a processor for implementing the steps of the collaborative recommendation method according to the first embodiment when executing the computer program.
A fourth embodiment of the present invention provides a readable storage medium, where a computer program is stored on the readable storage medium, and when the computer program is executed by a processor, the steps of the collaborative recommendation method in the first embodiment are implemented.
It should be understood that although this description is organized by embodiments, not every embodiment contains only a single technical solution. The description is written this way for clarity only; those skilled in the art should take the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.
The detailed description above is only a specific description of feasible embodiments of the present invention and is not intended to limit the scope of protection of the present invention; equivalent embodiments or modifications made without departing from the technical spirit of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A negative sample-based collaborative recommendation method is characterized by comprising the following steps:
acquiring a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtaining a negative sample corresponding to a user U in a preset time period, wherein the negative sample comprises Num items N_1, N_2, ..., N_Num; generating, based on the coding layer in the preset collaborative filtering model, a D-dimensional initial embedding vector E'_h corresponding to item N_h, where D, h and Num are natural numbers and h = 1, 2, ..., Num;
obtaining a D-dimensional initial embedding vector E_u corresponding to the user U from the preset collaborative filtering model; generating an embedded vector
Figure FDA0003853900210000011
wherein I_h indicates whether item N_h is masked: I_h = 1 means that item N_h is retained, and I_h = 0 means that item N_h is discarded;
Figure FDA0003853900210000012
the modified embedding vector corresponding to user U is H_u = g·E_u + (1-g)·P_u·V, where g is a real number with 0 ≤ g ≤ 1 and V is a real matrix with D rows and D columns;
recommending items to the user U based on the modified embedding vector H_u.
2. The collaborative recommendation method according to claim 1, wherein:
in the preset collaborative filtering model, the initial embedding vector E_u is replaced by the modified embedding vector H_u.
3. The collaborative recommendation method according to claim 1, further comprising the steps of:
using the cosine contrastive loss
Figure FDA0003853900210000013
Figure FDA0003853900210000017
to optimize the preset collaborative filtering model, wherein
Figure FDA0003853900210000014
Figure FDA0003853900210000015
M and L are thresholds with 0 ≤ M < L, w is a constant, and cos() is the cosine similarity function.
4. The collaborative recommendation method according to claim 1, wherein:
0≤M≤1。
5. The collaborative recommendation method according to claim 1, wherein:
0≤L≤1。
6. A negative sample-based collaborative recommendation device, characterized by comprising the following modules:
an information acquisition module, configured to acquire a preset collaborative filtering model, wherein the preset collaborative filtering model comprises a coding layer, an aggregation layer and a prediction layer; obtain a negative sample corresponding to a user U in a preset time period, wherein the negative sample comprises Num items N_1, N_2, ..., N_Num; and generate, based on the coding layer in the preset collaborative filtering model, a D-dimensional initial embedding vector E'_h corresponding to item N_h, where D, h and Num are natural numbers and h = 1, 2, ..., Num;
a processing module, configured to obtain a D-dimensional initial embedding vector E_u corresponding to the user U from the preset collaborative filtering model; and generate an embedded vector
Figure FDA0003853900210000016
wherein I_h indicates whether item N_h is masked: I_h = 1 means that item N_h is retained, and I_h = 0 means that item N_h is discarded;
Figure FDA0003853900210000021
the modified embedding vector corresponding to user U is H_u = g·E_u + (1-g)·P_u·V, where g is a real number with 0 ≤ g ≤ 1 and V is a real matrix with D rows and D columns;
a recommendation module, configured to recommend items to the user U based on the modified embedding vector H_u.
7. The collaborative recommendation device according to claim 6, wherein:
in the preset collaborative filtering model, the initial embedding vector E_u is replaced by the modified embedding vector H_u.
8. The collaborative recommendation device according to claim 6, further comprising the following modules:
a training module, configured to use the cosine contrastive loss
Figure FDA0003853900210000022
Figure FDA0003853900210000023
to optimize the preset collaborative filtering model, wherein
Figure FDA0003853900210000024
Figure FDA0003853900210000025
M and L are thresholds with 0 ≤ M < L, w is a constant, and cos() is the cosine similarity function.
9. A terminal, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the collaborative recommendation method according to any one of claims 1 to 5 when executing said computer program.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the collaborative recommendation method according to any one of claims 1 to 5.
CN202211148400.8A 2022-09-20 2022-09-20 Negative sample-based collaborative recommendation method and device, terminal and readable storage medium Pending CN115391677A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211148400.8A CN115391677A (en) 2022-09-20 2022-09-20 Negative sample-based collaborative recommendation method and device, terminal and readable storage medium

Publications (1)

Publication Number Publication Date
CN115391677A true CN115391677A (en) 2022-11-25

Family

ID=84126376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211148400.8A Pending CN115391677A (en) 2022-09-20 2022-09-20 Negative sample-based collaborative recommendation method and device, terminal and readable storage medium

Country Status (1)

Country Link
CN (1) CN115391677A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115688907A (en) * 2022-12-30 2023-02-03 中国科学技术大学 Recommendation model training method based on graph propagation and recommendation method based on graph propagation
CN115688907B (en) * 2022-12-30 2023-04-21 中国科学技术大学 Recommendation model training method based on graph propagation and recommendation method based on graph propagation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination