CN109360069A - A recommendation model based on pairwise adversarial training
- Publication number: CN109360069A
- Application number: CN201811265107.3A
- Authority: CN (China)
- Prior art keywords: item, discriminator, generator, pairwise, user
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The invention discloses a recommendation model based on pairwise adversarial training. The model consists mainly of two parts, a generator and a discriminator. The generator models the user's preferences and generates items the user is likely to like, while the discriminator judges whether the user likes a given item. Based on the assumption that "compared with items produced by the generator, the discriminator believes the user prefers items already interacted with," a pairwise loss function links the generator and the discriminator. Specifically, the discriminator improves its discriminative ability by minimizing the pairwise loss, while the generator models user preferences and fools the discriminator by maximizing the pairwise loss. In addition, the invention replaces conventional sampling with a differentiable sampling scheme, so that the connection between the generator and the discriminator is differentiable and the model can be trained with gradient-based methods. Compared with existing methods, the invention can improve the stability and convergence speed of adversarial training in recommender systems.
Description
Technical field
The invention belongs to the technical field of recommender systems and, more specifically, relates to a recommendation model based on a pairwise loss under an adversarial training framework.
Background
With the rapid development of e-commerce and online websites such as Taobao and Douban, users enjoy convenient services but are also troubled by information overload. Recommender systems are regarded as an effective tool for alleviating this problem: they model users' historical behavior and recommend items that may interest them.
Recommender-system models can be divided into generative models and discriminative models. Generative models model the user's behavioral preferences and have a sound theoretical foundation, but they have difficulty exploiting side information about users and items, such as users' reviews of items and the items' visual information. Discriminative models judge the relationship between a user and an item directly from the features of both, but they cannot learn from unlabeled data.
To combine the advantages of the two kinds of models, IRGAN (IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models) uses an adversarial training framework to join a generative model and a discriminative model and thereby improve model accuracy. As a subfield of information retrieval, recommender systems can also use the adversarial training framework to increase a model's representational power. Unlike a conventional optimization problem, solving an adversarial model is a minimax game, and this form of objective makes model training unstable. In addition, conventional adversarial training optimizes the model by gradient descent and therefore requires the whole model to be differentiable. Because items in information retrieval are discrete and non-differentiable, IRGAN optimizes the model with policy gradient. However, the variance of the policy gradient is proportional to the number of items, and the very large number of items in a recommender system gives the policy gradient a high variance, which makes training even less stable. For these two reasons, IRGAN trains unstably and converges slowly in the recommender-system domain.
Summary of the invention
In view of the above shortcomings of the prior art and the need for improvement, the invention combines the characteristics of recommender systems and adversarial models to provide a recommendation model based on pairwise adversarial training. Its purpose is to improve the stability and convergence speed of adversarial training in recommender systems and to make recommendations accurately and quickly.
Meanwhile we use and are based on Geng Beier-flexibility maximum value (Gumbel-Softmax) classification reparameterization
(Categorical Reparameterization with Gumbel-Softmax) can be sampled micro-ly and be used successive value table
Show discrete articles, therefore model proposed by the present invention is that entirety can be micro-, can directly use the method based on gradient to carry out excellent
Change.Compared to Policy-Gradient, Geng Beier-flexibility max methods gradient has lower variance, being capable of further Lifting Modules
The stability of type training.
The model of the invention consists mainly of two parts: a generator and a discriminator. The generator is responsible for modeling the user's behavioral preferences; it can produce the user's preference list and items the user may like. The discriminator is responsible for judging whether the user likes a given item.
Based on the assumption that "compared with items produced by the generator, the discriminator believes the user prefers items already interacted with," the invention unifies the generator and the discriminator under a pairwise adversarial training framework and trains the whole model by optimizing a pairwise loss function. Specifically, the generator models the user's behavioral preferences, generates items the user likes, and tries to fool the discriminator; its goal is to maximize the pairwise loss. The discriminator minimizes the pairwise loss so that the model's assumption continues to hold. Through repeated alternating adversarial training, the generator and the discriminator reach a Nash equilibrium. At that point, the generator can imitate the user's behavioral preferences and generate items the user likes, and the discriminator can no longer distinguish the user's preference for generated items from the preference for interacted items.
One training pass of the invention comprises the following steps:
(1) Fix the parameters of the generator and train the discriminator:
Select an interacted user-item pair from the data set. The generator produces the user's preference probabilities over the items, and the differentiable Gumbel-Softmax sampling method is then used to generate an item the user may like;
Take the triple (user, interacted item, generated item) as the input of the discriminator and, with the goal of minimizing the model's pairwise loss function, update the discriminator's parameters with a gradient-based method.
(2) Fix the parameters of the discriminator and train the generator:
Select an interacted user-item pair from the data set. The generator produces the user's preference probabilities over the items, and the differentiable Gumbel-Softmax sampling method is then used to generate an item the user may like;
Take the triple (user, interacted item, generated item) as the input of the discriminator and, with the goal of maximizing the model's pairwise loss function, update the generator's parameters with a gradient-based method.
After training, items the user is likely to like are recommended according to the generator's model of the user's behavioral preferences.
With the above method provided by the invention, the stability and convergence speed of adversarial training in recommender systems can be improved.
With the above method provided by the invention, items the user likes can be recommended more accurately and efficiently.
Brief description of the drawings
To explain the technical solution of the invention more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below.
Fig. 1 is the model structure of the invention;
Fig. 2 is an example diagram of the differentiable Gumbel-Softmax sampling method and the influence of its parameter;
Fig. 3 is the flow chart of the model training algorithm of the invention;
Fig. 4 shows the learning curves of the model.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below. The specific embodiments described here serve only to explain the invention and are not intended to limit it.
Let m and n denote the numbers of users and items, and let S ∈ R^{m×n} be the user-item interaction matrix: if user u has an interaction record with item i, then s_ui = 1; otherwise s_ui = 0. Let W ∈ R^{m×d} and V ∈ R^{n×d} denote the feature matrices of users and items, where w_u and v_i are the feature vectors of user u and item i, respectively, and d is the feature dimension. b ∈ R^{n×1} denotes the item bias vector. Let g and f denote the generator and the discriminator, respectively. As described above, the objective function of the invention is as follows:
where L is a pairwise loss function, for example the logarithmic or hinge pairwise loss. θ and φ are the parameter sets of the generator and the discriminator, respectively, p_real denotes the probability distribution of interacted items, and p_θ denotes the probability distribution of items produced by the generator.
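The objective formula itself is not shown in this text. A form consistent with the symbols defined here and with claim 2 (the discriminator minimizes, the generator maximizes) would be the following; the conditioning on user u and the two-argument form of L are assumptions, and any aggregation over users is left out:

```latex
\min_{\phi}\,\max_{\theta}\;
\mathbb{E}_{\,i \sim p_{\mathrm{real}}(\cdot \mid u),\; j \sim p_{\theta}(\cdot \mid u)}
\Big[\, L\big(f_{\phi}(i \mid u),\; f_{\phi}(j \mid u)\big) \Big]
```

With the logarithmic loss of formula (11), L(f(i|u), f(j|u)) = log(1 + exp(f(j|u) - f(i|u))), which is small when the interacted item i is scored above the generated item j.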
In our implementation, both the generator and the discriminator are matrix factorization (MF, Matrix Factorization) models. MF describes user u's preference value r_ui for item i using the inner product of their vectors:
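The MF formula is not shown in this text. The sketch below assumes the usual biased form r_ui = w_u · v_i + b_i, which matches the symbols W, V and b defined above; the function names and the toy numbers are illustrative only.

```python
def mf_score(w_u, v_i, b_i):
    """Preference r_ui = w_u . v_i + b_i (assumed biased-MF form)."""
    return sum(wk * vk for wk, vk in zip(w_u, v_i)) + b_i


def preference_vector(w_u, V, b):
    """Scores of one user for all n items: r_u = (r_u1, ..., r_un)."""
    return [mf_score(w_u, v_i, b_i) for v_i, b_i in zip(V, b)]


# Toy example: d = 2 features, n = 3 items (values are made up).
w_u = [0.5, -1.0]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
b = [0.1, 0.0, -0.2]
r_u = preference_vector(w_u, V, b)
```

The same scoring function can serve both the generator (with parameters θ) and the discriminator (with parameters φ), since the text uses an MF model for each.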
The training steps of the invention are as follows:
Step (1): Pre-training. Pre-train the generator with Bayesian Personalized Ranking (BPR, BPR: Bayesian Personalized Ranking from Implicit Feedback) until it converges.
Step (2): Train the discriminator. Traverse the user-item interaction records (u, i) in matrix S and perform the following operations for each (u, i):
Step (2-1): Use the generator g to produce user u's preference vector over all items, r_u = (r_u1, …, r_un):
where θ denotes the parameters of the generator g.
Step (2-2): Normalize user u's item preferences r_u into probabilities:
where the superscript f denotes the sampling probability used when training the discriminator, and the parameter τ ∈ (0, 1] controls how concentrated the sampling is: the smaller τ is, the larger the probability of items with higher r_ui.
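The normalization formula is not shown in this text. A minimal sketch, assuming it is a softmax with temperature, p_u^f = softmax(r_u / τ), which matches the described effect of τ; the function name is hypothetical.

```python
import math


def softmax_with_temperature(r_u, tau):
    """Assumed form of step (2-2): p = softmax(r_u / tau), tau in (0, 1]."""
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    scaled = [r / tau for r in r_u]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


r_u = [0.6, -1.0, -0.2]
p_soft = softmax_with_temperature(r_u, 1.0)   # plain softmax
p_sharp = softmax_with_temperature(r_u, 0.2)  # mass concentrates on the top item
```

Lowering τ moves probability toward the highest-scoring item, so the discriminator is trained against the generator's strongest candidates.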
Step (2-3): Use the Gumbel-Softmax method to generate an item j differentiably from user u's preference probabilities:
j = softmax((log p_u^f + z) / t)
where z is a noise vector sampled from the Gumbel(0, 1) distribution. Item j is an n-dimensional one-hot-like vector representing the item generated by the generator g. As the parameter t approaches 0, j approaches a one-hot vector; as t approaches positive infinity, j becomes a uniform vector. Fig. 2 illustrates the differentiable sampling process and the influence of the parameter t on the one-hot-like vector j.
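The sampling rule j = softmax((log p_u + z) / t), as stated in claim 3, can be sketched as follows. The helper names are illustrative, and the clamping of the uniform draw away from 0 and 1 is an implementation detail added here to keep the logarithms finite.

```python
import math
import random


def sample_gumbel(n, rng):
    """Draw n values from Gumbel(0, 1) via -log(-log(U)), U ~ Uniform(0, 1)."""
    return [-math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
            for _ in range(n)]


def gumbel_softmax(p_u, z, t):
    """Soft item j = softmax((log p_u + z) / t); every entry of p_u must be > 0."""
    logits = [(math.log(p) + g) / t for p, g in zip(p_u, z)]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
p_u = [0.7, 0.1, 0.2]
z = sample_gumbel(len(p_u), rng)
j_sharp = gumbel_softmax(p_u, z, t=0.05)    # close to a one-hot vector
j_flat = gumbel_softmax(p_u, z, t=1000.0)   # close to the uniform vector
```

Because j is a smooth function of p_u for a fixed noise vector z, gradients of the loss can flow from the discriminator back to the generator's parameters.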
Step (2-4): Compute the discriminator f's score for item i, which user u has interacted with:
where φ denotes the parameters of the discriminator f.
Step (2-5): The generated item j is not a real item, so its feature vector and bias in the discriminator f must be computed:
v_j^φ = j V^φ    (8)
b_j^φ = j b^φ    (9)
where j ∈ R^{1×n} is the one-hot-like vector, V^φ ∈ R^{n×d} is the item feature matrix in the discriminator, and b^φ ∈ R^{n×1} is the item bias vector, so b_j^φ and v_j^φ ∈ R^{1×d} can serve as the bias and feature vector of item j.
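The feature vector and bias of the generated item, v_j = jV and b_j = jb as given in claim 4, are weighted averages over the real items. A minimal sketch follows; the scoring function assumes the MF form f(j|u) = w_u · v_j + b_j, which this text does not show explicitly, and all names and numbers are illustrative.

```python
def soft_item_embedding(j, V, b):
    """Return (v_j, b_j) with v_j = j V and b_j = j b for a soft item j."""
    d = len(V[0])
    v_j = [sum(j[k] * V[k][c] for k in range(len(j))) for c in range(d)]
    b_j = sum(jk * bk for jk, bk in zip(j, b))
    return v_j, b_j


def discriminator_score(w_u, v, bias):
    """Assumed MF score f(item | u) = w_u . v + bias."""
    return sum(wk * vk for wk, vk in zip(w_u, v)) + bias


w_u = [0.5, -1.0]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
b = [0.1, 0.0, -0.2]

# An exact one-hot j simply selects the row and bias of a real item.
v_hard, b_hard = soft_item_embedding([0.0, 1.0, 0.0], V, b)

# A one-hot-like j blends the items it puts weight on.
v_soft, b_soft = soft_item_embedding([0.1, 0.2, 0.7], V, b)
f_soft = discriminator_score(w_u, v_soft, b_soft)
```

Since both products are linear in j, the score of the generated item is differentiable with respect to j and therefore with respect to the generator's output.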
Step (2-6): Compute the discriminator f's score for the generated item j:
Step (2-7): Compute the discriminator f's pairwise loss with respect to items i and j:
Loss = log(1 + exp(f(j|u) - f(i|u)))    (11)
Formula (11) uses the logarithmic pairwise loss; other pairwise loss functions, such as the hinge loss, can also be used.
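Formula (11) and a hinge alternative can be sketched as below. The logarithmic loss follows formula (11) directly; the hinge form max(0, margin + f(j|u) - f(i|u)) and its margin parameter are assumptions, since the text only names the hinge loss without giving its formula.

```python
import math


def log_pairwise_loss(f_i, f_j):
    """Formula (11): log(1 + exp(f_j - f_i)), computed in a stable way."""
    x = f_j - f_i
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def hinge_pairwise_loss(f_i, f_j, margin=1.0):
    """Assumed hinge variant: zero once f_i exceeds f_j by at least margin."""
    return max(0.0, margin + f_j - f_i)


# f_i is the score of the interacted item, f_j that of the generated item.
loss_good = log_pairwise_loss(f_i=2.0, f_j=0.0)  # interacted item ranked higher
loss_bad = log_pairwise_loss(f_i=0.0, f_j=2.0)   # generated item ranked higher
```

Both losses are small when the interacted item outranks the generated one, which is the ordering the discriminator is trained to enforce and the generator is trained to break.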
Step (2-8): The discriminator f minimizes the objective function, so the parameters φ of f are updated by gradient descent:
φ ← φ - α ∇_φ Loss    (12)
where α is the learning rate.
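One gradient-descent step of the discriminator can be sketched with hand-derived gradients. This assumes an MF discriminator with score f(k|u) = w_u · v_k + b_k and the logarithmic loss of formula (11); the function names and toy values are illustrative, and only user u's vector is updated on the user side.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def score_gap(w_u, V, b, i, j):
    """x = f(j|u) - f(i|u) for a soft item j and an interacted item index i."""
    scores = [dot(w_u, v_k) + b_k for v_k, b_k in zip(V, b)]
    return dot(j, scores) - scores[i]


def discriminator_step(w_u, V, b, i, j, alpha):
    """Return parameters after one descent step on log(1 + exp(x))."""
    x = score_gap(w_u, V, b, i, j)
    sig = 1.0 / (1.0 + math.exp(-x))  # dLoss/dx
    coeff = list(j)
    coeff[i] -= 1.0                   # dx/d(score_k) = j_k - [k == i]
    grad_w = [sum(c * v_k[col] for c, v_k in zip(coeff, V))
              for col in range(len(w_u))]
    new_w = [w - alpha * sig * g for w, g in zip(w_u, grad_w)]
    new_V = [[v - alpha * sig * c * w for v, w in zip(v_k, w_u)]
             for c, v_k in zip(coeff, V)]
    new_b = [b_k - alpha * sig * c for b_k, c in zip(b, coeff)]
    return new_w, new_V, new_b


w_u = [0.5, -1.0]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
b = [0.1, 0.0, -0.2]
j = [0.1, 0.2, 0.7]  # soft item produced by the generator (held fixed here)
loss_before = math.log1p(math.exp(score_gap(w_u, V, b, 0, j)))
w_u2, V2, b2 = discriminator_step(w_u, V, b, 0, j, alpha=0.1)
loss_after = math.log1p(math.exp(score_gap(w_u2, V2, b2, 0, j)))
```

In practice an automatic-differentiation library would compute these gradients; writing them out here only makes the update rule concrete.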
Step (3): Train the generator. Traverse the user-item interaction records (u, i) in matrix S and perform the following operations for each (u, i):
Step (3-1): Use formula (3) to produce user u's preference vector over all items, r_u = (r_u1, …, r_un).
Step (3-2): Use formula (13) to normalize user u's item preference vector into probabilities:
p_u = softmax(r_u)    (13)
Step (3-3): The generator is meant to fit the user's preferences, so importance sampling is used when training the generator, giving interacted items a larger share in sampling:
where the superscript g denotes the sampling probability used when training the generator, |{s_ui | s_ui = 1}| is the number of items user u has interacted with, and λ is the parameter controlling importance sampling: the larger its value, the larger the probability assigned to interacted items.
Step (3-4): Use the Gumbel-Softmax method to generate an item j differentiably from user u's preference probabilities:
Step (3-5): Use formula (7) to compute the discriminator f's score for item i, which user u has interacted with.
Step (3-6): Use formulas (8) and (9) to compute the feature vector and bias of item j in the discriminator f.
Step (3-7): Use formula (10) to compute the discriminator f's score for the generated item j.
Step (3-8): Use formula (11) to compute the discriminator f's pairwise loss with respect to items i and j.
Step (3-9): The generator maximizes the objective function, so the parameters θ of g are updated by gradient ascent:
θ ← θ + α ∇_θ Loss
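To show that the loss is differentiable all the way back to the generator, the sketch below takes one gradient-ascent step on the generator's preference scores r_u rather than on θ itself; the chain rule through the MF model would carry this gradient on to θ. It assumes a fixed MF discriminator, so that f(j|u) = Σ_k j_k · scores[k], uses p_u = softmax(r_u) directly, and omits the importance sampling of step (3-3). All names and numbers are illustrative.

```python
import math


def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def generator_loss(r_u, z, t, scores, i):
    """Return (loss, j, f_j) for formula (11) as a function of r_u."""
    p_u = softmax(r_u)
    j = softmax([(math.log(p) + g) / t for p, g in zip(p_u, z)])
    f_j = sum(jk * sk for jk, sk in zip(j, scores))
    return math.log1p(math.exp(f_j - scores[i])), j, f_j


def loss_gradient(r_u, z, t, scores, i):
    """Analytic dLoss/dr_u via the chain rule through both softmaxes."""
    _, j, f_j = generator_loss(r_u, z, t, scores, i)
    sig = 1.0 / (1.0 + math.exp(-(f_j - scores[i])))
    return [sig * jk * (sk - f_j) / t for jk, sk in zip(j, scores)]


def ascent_step(r_u, z, t, scores, i, alpha):
    grad = loss_gradient(r_u, z, t, scores, i)
    return [r + alpha * g for r, g in zip(r_u, grad)]


r_u = [0.2, -0.1, 0.4, 0.0]      # generator scores for 4 items
z = [0.3, -0.5, 1.0, 0.1]        # one fixed Gumbel noise draw
scores = [1.0, -0.5, 0.3, 0.8]   # discriminator scores, held fixed
loss_before = generator_loss(r_u, z, 0.5, scores, 0)[0]
r_new = ascent_step(r_u, z, 0.5, scores, 0, alpha=0.1)
loss_after = generator_loss(r_new, z, 0.5, scores, 0)[0]
```

The step raises the generator's scores for items the discriminator rates above the current soft item, which is how the generator learns to produce items that are hard to tell apart from interacted ones.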
Step (4): If the model has converged, stop training; otherwise return to step (2).
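The alternation of steps (2) to (4) can be sketched as a small driver loop. The step functions and the convergence test are passed in as callables because the text does not specify a convergence criterion; the names, the max_rounds cap, and the omission of the BPR pre-training of step (1) are all choices made for this illustration.

```python
def train_adversarially(interactions, discriminator_step, generator_step,
                        has_converged, max_rounds=100):
    """Alternate discriminator and generator passes until convergence.

    interactions: iterable of (u, i) pairs with s_ui = 1.
    Returns the number of rounds that were run.
    """
    for round_idx in range(1, max_rounds + 1):
        for u, i in interactions:      # step (2): train the discriminator
            discriminator_step(u, i)
        for u, i in interactions:      # step (3): train the generator
            generator_step(u, i)
        if has_converged(round_idx):   # step (4): stop or go back to step (2)
            return round_idx
    return max_rounds


# Dummy callables that only record the call order.
calls = []
rounds = train_adversarially(
    [(0, 1), (2, 3)],
    discriminator_step=lambda u, i: calls.append(("D", u, i)),
    generator_step=lambda u, i: calls.append(("G", u, i)),
    has_converged=lambda r: r >= 2,
)
```

Each round holds one network fixed while the other is updated over all interaction records, matching the alternating scheme described above.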
The training flow chart of the invention is shown in Fig. 3.
Fig. 4 shows the learning curves of the invention on two data sets, illustrating the stability and convergence speed of the model.
In the item recommendation stage, for user u the generator g uses formula (3) to produce the user's behavioral preferences r_u, sorts them, and recommends the highest-scoring items to the user.
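The recommendation stage amounts to a top-k selection over the generator's scores. A minimal sketch, with a hypothetical function name and cut-off k:

```python
def recommend_top_k(r_u, k):
    """Return the indices of the k items with the highest preference scores."""
    ranked = sorted(range(len(r_u)), key=lambda item: r_u[item], reverse=True)
    return ranked[:k]


r_u = [0.6, -1.0, -0.2, 0.9]
top_two = recommend_top_k(r_u, 2)
```

Whether items the user has already interacted with should be filtered out before ranking is not stated in the text, so this sketch ranks all items.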
Those skilled in the art will readily understand that the above are merely preferred embodiments of the invention and are not intended to limit the invention.
Claims (5)
1. A recommendation model based on pairwise adversarial training, characterized in that: the model comprises two parts, a generator and a discriminator; the generator models the user's behavioral preferences and generates a list of items the user likes, and the discriminator judges whether the user likes a given item; based on the assumption that "compared with items produced by the generator, the discriminator believes the user prefers items already interacted with," a pairwise adversarial loss function links the generator and the discriminator; the discriminator's goal is to minimize the pairwise loss and improve its own discriminative ability, while the generator's goal is to maximize the pairwise loss, fool the discriminator, and improve its ability to model user preferences; a differentiable sampling method and an optimization method based on gradient descent are used to improve the stability of adversarial training.
2. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: based on the assumption that "compared with items produced by the generator, the discriminator believes the user prefers items already interacted with," a pairwise loss function unifies the generator and the discriminator under an adversarial training framework, in which the discriminator minimizes the objective function and the generator maximizes it, that is:
where i is an item user u has interacted with, j is an item produced by the generator g, f is the discriminator, L is the pairwise loss function, θ and φ are the parameter sets of the generator and the discriminator, respectively, p_real denotes the probability distribution of interacted items, and p_θ denotes the probability distribution of items produced by the generator; in the training stage, the generator and the discriminator are optimized alternately.
3. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: the sampling process of the model is differentiable, the sampling process being:
j = softmax((log p_u + z) / t)
where p_u is the probability distribution over items produced by the generator for user u, z is a noise vector sampled from the Gumbel(0, 1) distribution, and item j is an n-dimensional approximately one-hot vector representing the item produced by the generator g; as the parameter t approaches 0, j approaches a one-hot vector, and as t approaches positive infinity, j becomes a uniform vector.
4. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: the feature vector and bias of the generated item j in the discriminator are obtained by a differentiable process:
v_j = jV
b_j = jb
where j ∈ R^{1×n} is an approximately one-hot vector representing the generated item, V ∈ R^{n×d} is the item feature matrix in the discriminator, and b ∈ R^{n×1} is the item bias vector, so b_j and v_j ∈ R^{1×d} can serve as the bias and feature vector of item j.
5. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: the model is differentiable as a whole and its parameters are trained alternately with a gradient-based optimization method; the discriminator's goal is to minimize the pairwise loss, and its parameters φ are updated by gradient descent:
The generator's goal is to maximize the pairwise loss, and its parameters θ are updated by gradient ascent:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811265107.3A CN109360069B (en) | 2018-10-29 | 2018-10-29 | Method for recommending model based on pairwise confrontation training |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109360069A true CN109360069A (en) | 2019-02-19 |
CN109360069B CN109360069B (en) | 2022-04-01 |
Family
ID=65346970
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811265107.3A Active CN109360069B (en) | 2018-10-29 | 2018-10-29 | Method for recommending model based on pairwise confrontation training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109360069B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110210933A (en) * | 2019-05-21 | 2019-09-06 | 清华大学深圳研究生院 | A kind of enigmatic language justice recommended method based on generation confrontation network |
CN110399553A (en) * | 2019-06-28 | 2019-11-01 | 南京工业大学 | A kind of session recommendation list generation method based on confrontation study |
CN110442804A (en) * | 2019-08-13 | 2019-11-12 | 北京市商汤科技开发有限公司 | A kind of training method, device, equipment and the storage medium of object recommendation network |
CN110727868A (en) * | 2019-10-12 | 2020-01-24 | 腾讯音乐娱乐科技(深圳)有限公司 | Object recommendation method, device and computer-readable storage medium |
CN111027714A (en) * | 2019-12-11 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Artificial intelligence-based object recommendation model training method, recommendation method and device |
CN111259244A (en) * | 2020-01-14 | 2020-06-09 | 郑州大学 | Method for using countermeasure model on discrete data |
CN113268660A (en) * | 2021-04-28 | 2021-08-17 | 重庆邮电大学 | Diversity recommendation method and device based on generation countermeasure network and server |
WO2021169451A1 (en) * | 2020-09-28 | 2021-09-02 | 平安科技(深圳)有限公司 | Content recommendation method and apparatus based on adversarial learning, and computer device |
CN117093783A (en) * | 2023-04-12 | 2023-11-21 | 浙江卡赢信息科技有限公司 | Intelligent recommendation system and method for point exchange combined with user social data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170278135A1 (en) * | 2016-02-18 | 2017-09-28 | Fitroom, Inc. | Image recognition artificial intelligence system for ecommerce |
CN108595493A (en) * | 2018-03-15 | 2018-09-28 | 腾讯科技(深圳)有限公司 | Method for pushing and device, storage medium, the electronic device of media content |
CN108665058A (en) * | 2018-04-11 | 2018-10-16 | 徐州工程学院 | A kind of generation confrontation network method based on segmentation loss |
Non-Patent Citations (2)
Title |
---|
HOMANGA BHARADHWAJ: "RecGAN: recurrent generative adversarial networks for recommendation systems", 《RECSYS "18: PROCEEDINGS OF THE 12TH ACM CONFERENCE ON RECOMMENDER SYSTEMS》 * |
JUN WANG: "IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models", 《SIGIR "17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110210933A (en) * | 2019-05-21 | 2019-09-06 | 清华大学深圳研究生院 | A kind of enigmatic language justice recommended method based on generation confrontation network |
CN110399553A (en) * | 2019-06-28 | 2019-11-01 | 南京工业大学 | A kind of session recommendation list generation method based on confrontation study |
CN110442804A (en) * | 2019-08-13 | 2019-11-12 | 北京市商汤科技开发有限公司 | A kind of training method, device, equipment and the storage medium of object recommendation network |
CN110727868A (en) * | 2019-10-12 | 2020-01-24 | 腾讯音乐娱乐科技(深圳)有限公司 | Object recommendation method, device and computer-readable storage medium |
CN111027714A (en) * | 2019-12-11 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Artificial intelligence-based object recommendation model training method, recommendation method and device |
CN111027714B (en) * | 2019-12-11 | 2023-03-14 | 腾讯科技(深圳)有限公司 | Artificial intelligence-based object recommendation model training method, recommendation method and device |
CN111259244A (en) * | 2020-01-14 | 2020-06-09 | 郑州大学 | Method for using countermeasure model on discrete data |
CN111259244B (en) * | 2020-01-14 | 2022-12-16 | 郑州大学 | Recommendation method based on countermeasure model |
WO2021169451A1 (en) * | 2020-09-28 | 2021-09-02 | 平安科技(深圳)有限公司 | Content recommendation method and apparatus based on adversarial learning, and computer device |
CN113268660A (en) * | 2021-04-28 | 2021-08-17 | 重庆邮电大学 | Diversity recommendation method and device based on generation countermeasure network and server |
CN117093783A (en) * | 2023-04-12 | 2023-11-21 | 浙江卡赢信息科技有限公司 | Intelligent recommendation system and method for point exchange combined with user social data |
Also Published As
Publication number | Publication date |
---|---|
CN109360069B (en) | 2022-04-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109360069A (en) | A kind of recommended models based on pairs of dual training | |
CN110969516B (en) | Commodity recommendation method and device | |
WO2019029046A1 (en) | Video recommendation method and system | |
CN106980648B (en) | Personalized recommendation method based on probability matrix decomposition and combined with similarity | |
CN107563841B (en) | Recommendation system based on user score decomposition | |
CN108363804A (en) | Partial model Weighted Fusion Top-N films based on user clustering recommend method | |
CN111199458B (en) | Recommendation system based on meta learning and reinforcement learning | |
CN109740064A (en) | A kind of CF recommended method of fusion matrix decomposition and excavation user items information | |
CN106202377B (en) | A kind of online collaboration sort method based on stochastic gradient descent | |
Park et al. | Uniwalk: Explainable and accurate recommendation for rating and network data | |
CN108595493A (en) | Method for pushing and device, storage medium, the electronic device of media content | |
CN107016122B (en) | Knowledge recommendation method based on time migration | |
US20190340176A1 (en) | System and method for data mining and similarity estimation | |
CN110083764A (en) | A kind of collaborative filtering cold start-up way to solve the problem | |
CN112328908B (en) | Personalized recommendation method based on collaborative filtering | |
CN112149734B (en) | Cross-domain recommendation method based on stacked self-encoder | |
CN111159473A (en) | Deep learning and Markov chain based connection recommendation method | |
CN112256965A (en) | Neural collaborative filtering model recommendation method based on lambdamat | |
CN109934681A (en) | The recommended method of user's commodity interested | |
CN110825978B (en) | Multitask collaborative filtering method based on neighbor user feature sharing | |
CN114168790A (en) | Personalized video recommendation method and system based on automatic feature combination | |
CN105760965A (en) | Pre-estimated model parameter training method, service quality pre-estimation method and corresponding devices | |
CN112818238A (en) | Self-adaptive online recommendation method and system | |
CN106991122A (en) | A kind of film based on particle cluster algorithm recommends method | |
CN116308618A (en) | Training method of recommendation model, recommendation model and commodity recommendation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||