CN109360069A - A recommendation model based on pairwise adversarial training

Info

Publication number
CN109360069A
Authority
CN
China
Prior art keywords
item
discriminator
generator
pairwise
user
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811265107.3A
Other languages
Chinese (zh)
Other versions
CN109360069B (en)
Inventor
叶阳东
孙中川
吴宾
吴云鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University
Original Assignee
Zhengzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-10-29
Filing date
2018-10-29
Publication date
2019-02-19
Application filed by Zhengzhou University
Priority to CN201811265107.3A
Publication of CN109360069A
Application granted
Publication of CN109360069B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses a recommendation model based on pairwise adversarial training. The model consists of two parts, a generator and a discriminator. The generator models user preferences and generates items the user is likely to enjoy; the discriminator judges whether a user likes a given item. Under the assumption that "compared with an item produced by the generator, the discriminator believes the user prefers an item the user has already interacted with", a pairwise loss function links the generator and the discriminator. Specifically, the discriminator improves its discriminative ability by minimizing the pairwise loss, while the generator maximizes the pairwise loss in order to model user preferences and fool the discriminator. In addition, the invention replaces conventional sampling with a differentiable sampling scheme, so that the connection between the generator and the discriminator is differentiable and the model can be trained with gradient-based methods. Compared with existing methods, the invention can improve the stability and convergence speed of adversarial training in recommender systems.

Description

A recommendation model based on pairwise adversarial training
Technical field
The invention belongs to the technical field of recommender systems; more specifically, it is a recommendation model based on a pairwise loss under the adversarial training framework.
Background
With the rapid development of e-commerce and online websites such as Taobao and Douban, users enjoy convenient services but are also troubled by information overload. Recommender systems are regarded as an effective tool for alleviating this problem: they model a user's historical behavior and recommend items the user may be interested in.
Recommender-system models can be divided into generative models and discriminative models. Generative models model the user's behavioral preferences and have a sound theoretical basis, but they have difficulty exploiting side information about users and items, such as user reviews of items and visual information about items. Discriminative models judge the relationship between a user and an item directly from the features of both, but they cannot learn from unlabeled data.
To combine the advantages of the two kinds of models, IRGAN (IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models) uses the adversarial training framework to unify a generative model and a discriminative model and thereby improve model accuracy. As a subfield of information retrieval, recommender systems can also use the adversarial training framework to increase the representational power of the model. Unlike conventional optimization problems, solving an adversarial model is a minimax game, and this objective makes model training unstable. In addition, conventional adversarial training optimizes the model by gradient descent and therefore requires the whole model to be differentiable. To handle the discrete, non-differentiable items in information retrieval, IRGAN optimizes the model with policy gradients. However, the variance of the policy gradient grows in proportion to the number of items, and the very large number of items in a recommender system gives the policy gradient a high variance, which makes training even less stable. For these two reasons, IRGAN suffers from unstable training and slow convergence in the recommender-system field.
Summary of the invention
In view of the above shortcomings of the prior art and the need for improvement, the invention combines the characteristics of recommender systems and adversarial models to provide a recommendation model based on pairwise adversarial training. Its purpose is to improve the stability and convergence speed of adversarial training in recommender systems and to make accurate and fast recommendations.
In addition, we use categorical reparameterization with Gumbel-Softmax (Categorical Reparameterization with Gumbel-Softmax) to sample differentiably and to represent discrete items with continuous values. The proposed model is therefore differentiable as a whole and can be optimized directly with gradient-based methods. Compared with the policy gradient, the gradient of the Gumbel-Softmax method has lower variance, which further improves the stability of model training.
The model of the invention consists of two parts: a generator and a discriminator. The generator models the user's behavioral preferences and can generate the user's preference list and items the user may like. The discriminator judges whether the user likes a given item.
Based on the assumption that "compared with an item produced by the generator, the discriminator believes the user prefers an item the user has already interacted with", the invention unifies the generator and the discriminator under a pairwise adversarial training framework and trains the whole model by optimizing a pairwise loss function. Specifically, the generator models the user's behavioral preferences, generates items the user likes and tries to fool the discriminator; its goal is to maximize the pairwise loss. The discriminator minimizes the pairwise loss so that the model's assumption continues to hold. Through repeated alternating adversarial training, the generator and the discriminator reach a Nash equilibrium. At that point the generator can simulate the user's behavioral preferences and generate items the user likes, and the discriminator can no longer distinguish the user's preference for generated items from the preference for interacted items.
One training iteration of the invention comprises the following steps:
(1) Fix the parameters of the generator and train the discriminator:
Select an interacted user-item pair from the data set; the generator produces the user's preference probabilities over all items, and the differentiable Gumbel-Softmax sampling method then generates an item the user may like;
Take the triple (user, interacted item, generated item) as the input of the discriminator, and update the parameters of the discriminator with a gradient-based method, with the goal of minimizing the model's pairwise loss function.
(2) Fix the parameters of the discriminator and train the generator:
Select an interacted user-item pair from the data set; the generator produces the user's preference probabilities over all items, and the differentiable Gumbel-Softmax sampling method then generates an item the user may like;
Take the triple (user, interacted item, generated item) as the input of the discriminator, and update the parameters of the generator with a gradient-based method, with the goal of maximizing the model's pairwise loss function.
After training, items the user is likely to enjoy are recommended according to the generator's model of the user's behavioral preferences.
With the above method provided by the invention, the stability and convergence speed of adversarial training in recommender systems can be improved.
With the above method provided by the invention, items the user likes can be recommended more accurately and effectively.
Brief description of the drawings
To explain the technical solution of the invention more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below.
Fig. 1 is the model structure of the invention;
Fig. 2 is an example of the differentiable Gumbel-Softmax sampling method and the influence of its parameter;
Fig. 3 is the flow chart of the model training algorithm of the invention;
Fig. 4 shows the learning curves of the model.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below. The specific embodiments described here serve only to explain the invention and are not intended to limit it.
Let m and n denote the numbers of users and items, and let S ∈ R^{m×n} be the user-item interaction matrix: s_ui = 1 if user u and item i have an interaction record, and s_ui = 0 otherwise. Let W ∈ R^{m×d} and V ∈ R^{n×d} denote the feature matrices of users and items, where w_u and v_i are the feature vectors of user u and item i, and d is the feature dimension. b ∈ R^{n×1} is the item bias vector. Let g and f denote the generator and the discriminator, respectively. As described above, the objective function of the invention is as follows:
where L is a pairwise loss function, for example the logarithmic or the hinge pairwise loss; θ and φ are the parameter sets of the generator and the discriminator, respectively; p_real denotes the probability distribution of interacted items, and p_θ denotes the probability distribution of the items generated by the generator.
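The objective itself is not reproduced in this text. A form consistent with the definitions above and with claim 2 (the discriminator minimizes, the generator maximizes) is sketched below; the sum over users, the subscripts on f and the argument order of L are assumptions, not the patent's exact notation.

```latex
\min_{\phi}\;\max_{\theta}\;
\sum_{u}\;
\mathbb{E}_{\, i \sim p_{\mathrm{real}}(\cdot \mid u),\; j \sim p_{\theta}(\cdot \mid u)}
\Big[\, L\big( f_{\phi}(i \mid u),\; f_{\phi}(j \mid u) \big) \Big]
```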
In our implementation, both the generator and the discriminator are matrix factorization (MF, Matrix Factorization) models. MF describes user u's preference value r_ui for item i with an inner product of vectors:
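As a hedged illustration of such an MF score, the sketch below assumes the common biased form r_ui = w_u · v_i + b_i, using the feature vectors and the item bias vector b defined earlier. Whether the bias enters the score in exactly this way is an assumption, and the function name mf_scores is hypothetical.

```python
def mf_scores(w_u, V, b):
    """Preference of one user for every item, assuming r_ui = <w_u, v_i> + b_i.

    w_u: user feature vector of length d
    V:   item feature matrix, n rows of length d
    b:   item biases, length n
    """
    return [
        sum(w_k * v_k for w_k, v_k in zip(w_u, v_i)) + b_i
        for v_i, b_i in zip(V, b)
    ]


# Toy example with d = 2 features and n = 3 items (values are illustrative).
w_u = [1.0, 2.0]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.5, -1.0]
print(mf_scores(w_u, V, b))  # [1.0, 2.5, 2.0]
```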
The training steps of the invention are as follows:
Step (1): pre-training. Pre-train the generator with Bayesian Personalized Ranking (BPR, BPR: Bayesian Personalized Ranking from Implicit Feedback) until it converges.
Step (2): train the discriminator. Traverse the user-item interaction records (u, i) in matrix S and perform the following operations for each (u, i):
Step (2-1): use the generator g to produce user u's preference vector over all items, r_u = (r_u1, …, r_un):
where θ denotes the parameters of the generator g.
Step (2-2): normalize user u's item preferences r_u into probabilities:
where the subscript f denotes the sampling probability used when training the discriminator, and the parameter τ ∈ (0, 1] controls where the sampling mass is concentrated: the smaller τ is, the larger the probability of items with a higher r_ui.
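The normalization formula is not reproduced in this text. A minimal sketch, assuming it is a softmax with temperature τ (which matches the stated behavior that a smaller τ concentrates probability on high-scoring items), could look as follows. The function name is hypothetical.

```python
import math


def softmax_with_temperature(r_u, tau):
    """Turn preference scores into probabilities, assuming p_ui ∝ exp(r_ui / tau)."""
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau is expected to lie in (0, 1]")
    scaled = [r / tau for r in r_u]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


r_u = [1.0, 2.0, 3.0]
flat = softmax_with_temperature(r_u, 1.0)
sharp = softmax_with_temperature(r_u, 0.1)
# The top-scoring item receives more probability mass at the smaller tau.
print(sharp[2] > flat[2])  # True
```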
Step (2-3): use the Gumbel-Softmax method to differentiably generate an item j from user u's preference probabilities:
where z is a noise vector sampled from the Gumbel(0, 1) distribution. Item j is an n-dimensional approximately one-hot vector that represents the item generated by the generator g. As the parameter t approaches 0, j approaches a one-hot vector; as t approaches positive infinity, j becomes a uniform vector. Fig. 2 illustrates the differentiable sampling process and the influence of the parameter t on the approximately one-hot vector j.
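Claim 3 states the sampling rule as j = softmax((log p_u + z) / t). The sketch below illustrates that rule in plain Python; it only shows the forward computation and does not compute gradients. The helper names, the inverse-transform sampling of the Gumbel noise and the small clamp inside the logarithm are assumptions added for the example.

```python
import math
import random


def sample_gumbel(n, rng):
    """Draw n Gumbel(0, 1) samples via z = -log(-log(U)), U ~ Uniform(0, 1)."""
    noise = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        noise.append(-math.log(-math.log(u)))
    return noise


def gumbel_softmax(p_u, t, rng):
    """Relaxed sample j = softmax((log p_u + z) / t); returns a vector summing to 1."""
    z = sample_gumbel(len(p_u), rng)
    # Clamp probabilities away from zero so the logarithm is defined.
    logits = [(math.log(max(p, 1e-20)) + z_k) / t for p, z_k in zip(p_u, z)]
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
p_u = [0.2, 0.3, 0.5]
j_sharp = gumbel_softmax(p_u, 0.1, rng)  # small t: close to one-hot
j_flat = gumbel_softmax(p_u, 1e6, rng)   # very large t: close to uniform
print(j_sharp, j_flat)
```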
Step (2-4): compute the discriminator f's score for the item i that user u has interacted with:
where φ denotes the parameters of the discriminator f.
Step (2-5): the generated item j is not a real item, so its feature vector and bias in the discriminator f must be computed:
where j ∈ R^{1×n} is the approximately one-hot vector, V^φ ∈ R^{n×d} is the item feature matrix of the discriminator, and b^φ ∈ R^{n×1} is the item bias vector, so that b_j^φ and v_j^φ ∈ R^{1×d} serve as the bias and the feature vector of item j.
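Claim 4 gives the two mappings as v_j = jV and b_j = jb, so the generated item is a weighted mixture of real item embeddings. The sketch below illustrates this and, as an assumption based on the discriminator being an MF model, scores the mixture with w_u · v_j + b_j. The function names are hypothetical.

```python
def soft_item_embedding(j, V, b):
    """Feature vector and bias of a generated item: v_j = j V, b_j = j b."""
    d = len(V[0])
    v_j = [sum(j[k] * V[k][c] for k in range(len(j))) for c in range(d)]
    b_j = sum(j_k * b_k for j_k, b_k in zip(j, b))
    return v_j, b_j


def discriminator_score(w_u, v, bias):
    """Assumed MF score of the discriminator: inner product plus item bias."""
    return sum(w_k * v_k for w_k, v_k in zip(w_u, v)) + bias


V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.5, -1.0]

# An exactly one-hot j selects the embedding and bias of a real item.
v_j, b_j = soft_item_embedding([0.0, 1.0, 0.0], V, b)
print(v_j, b_j)  # [0.0, 1.0] 0.5

# A soft j blends several items; the result stays differentiable in j.
v_soft, b_soft = soft_item_embedding([0.5, 0.5, 0.0], V, b)
print(discriminator_score([1.0, 2.0], v_soft, b_soft))  # 1.75
```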
Step (2-6): compute the discriminator f's score for the generated item j:
Step (2-7): compute the discriminator f's pairwise loss over items i and j:
Loss = log(1 + exp(f(j|u) − f(i|u)))  (11)
Formula (11) uses the logarithmic pairwise loss; other pairwise loss functions, such as the hinge loss, can also be used.
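The two pairwise losses mentioned here can be sketched as follows. The logarithmic loss follows formula (11); the hinge variant and its margin of 1.0 are assumptions, since the text names the hinge loss without giving its form.

```python
import math


def log_pairwise_loss(f_i, f_j):
    """Formula (11): log(1 + exp(f(j|u) - f(i|u))), computed in a stable way."""
    x = f_j - f_i
    # log(1 + e^x) = max(x, 0) + log(1 + e^{-|x|}) avoids overflow for large x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def hinge_pairwise_loss(f_i, f_j, margin=1.0):
    """Assumed hinge form: penalize unless f(i|u) exceeds f(j|u) by the margin."""
    return max(0.0, margin - (f_i - f_j))


# The loss is small when the interacted item i outscores the generated item j.
print(log_pairwise_loss(3.0, 1.0) < log_pairwise_loss(1.0, 3.0))  # True
print(hinge_pairwise_loss(3.0, 1.0), hinge_pairwise_loss(1.0, 3.0))  # 0.0 3.0
```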
Step (2-8): the discriminator f minimizes the objective function, so the parameters φ of f are updated by gradient descent:
where α is the learning rate.
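The update rule is not reproduced in this text. For a gradient descent step on the loss of formula (11) with learning rate α, the standard form would be the following; treating it as the patent's exact formula is an assumption.

```latex
\phi \;\leftarrow\; \phi \;-\; \alpha \,\nabla_{\phi}\,\mathrm{Loss}
```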
Step (3): train the generator. Traverse the user-item interaction records (u, i) in matrix S and perform the following operations for each (u, i):
Step (3-1): use formula (3) to produce user u's preference vector over all items, r_u = (r_u1, …, r_un).
Step (3-2): use formula (13) to normalize user u's item preference vector into probabilities:
p_u = softmax(r_u)  (13)
Step (3-3): the generator aims to fit the user's preferences, so importance sampling is used when training the generator, giving interacted items a larger weight in sampling:
where the subscript g denotes the sampling probability used when training the generator, |{s_ui | s_ui = 1}| is the number of items user u has interacted with, and λ is the parameter that controls importance sampling: the larger its value, the larger the probability assigned to interacted items.
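The importance-sampling formula is not reproduced in this text. One form consistent with the description is a mixture that blends the generator's distribution with a uniform distribution over the user's interacted items; the sketch below assumes that mixture and λ in [0, 1]. Both the formula and the function name are assumptions for illustration only.

```python
def importance_sampling_probs(p_u, s_u, lam):
    """Assumed mixture: (1 - lam) * p_ui + lam * s_ui / |{s_ui = 1}|.

    p_u: generator probabilities over all items (sums to 1)
    s_u: 0/1 interaction indicators of the user
    lam: weight in [0, 1]; larger values favor interacted items
    """
    n_pos = sum(1 for s in s_u if s == 1)
    if n_pos == 0:  # no interactions recorded: fall back to p_u
        return list(p_u)
    return [
        (1.0 - lam) * p + (lam / n_pos if s == 1 else 0.0)
        for p, s in zip(p_u, s_u)
    ]


p_u = [0.5, 0.3, 0.2]
s_u = [0, 1, 1]
print(importance_sampling_probs(p_u, s_u, 0.4))
```

With λ = 0 the generator's own distribution is returned unchanged, and with λ = 1 only interacted items can be sampled.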
Step (3-4): use the Gumbel-Softmax method to differentiably generate an item j from user u's preference probabilities:
Step (3-5): use formula (7) to compute the discriminator f's score for the item i that user u has interacted with.
Step (3-6): use formulas (8) and (9) to compute the feature vector and bias of item j in the discriminator f.
Step (3-7): use formula (10) to compute the discriminator f's score for the generated item j.
Step (3-8): use formula (11) to compute the discriminator f's pairwise loss over items i and j.
Step (3-9): the generator maximizes the objective function, so the parameters θ of g are updated by gradient ascent:
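The update rule is not reproduced in this text. For a gradient ascent step on the same loss, the standard form would be the following; reusing the learning rate α from the discriminator update is an assumption.

```latex
\theta \;\leftarrow\; \theta \;+\; \alpha \,\nabla_{\theta}\,\mathrm{Loss}
```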
Step (4): if the model has converged, stop training; otherwise return to step (2).
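The control flow of steps (1) to (4) can be sketched as a simple alternating loop. The four callables and the max_rounds safety cap are hypothetical placeholders introduced for this example; the patent does not specify a convergence test or an iteration limit.

```python
def train(pretrain_generator, discriminator_epoch, generator_epoch,
          has_converged, max_rounds=100):
    """Alternate discriminator and generator updates until convergence.

    Returns the number of adversarial rounds that were run.
    """
    pretrain_generator()          # step (1): BPR pre-training of the generator
    for round_idx in range(1, max_rounds + 1):
        discriminator_epoch()     # step (2): generator fixed, minimize the loss
        generator_epoch()         # step (3): discriminator fixed, maximize the loss
        if has_converged():       # step (4): stop, or go back to step (2)
            return round_idx
    return max_rounds


# Dry run that only records the order of the calls.
calls = []
rounds = train(
    lambda: calls.append("pretrain"),
    lambda: calls.append("D"),
    lambda: calls.append("G"),
    lambda: len(calls) >= 5,
)
print(rounds, calls)  # 2 ['pretrain', 'D', 'G', 'D', 'G']
```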
The training flow chart of the invention is shown in Fig. 3.
Fig. 4 shows the learning curves of the invention on two data sets, illustrating the stability and convergence speed of the model.
In the item recommendation stage, for user u the generator g produces the user's behavioral preferences r_u with formula (3) and sorts them; the items with the highest scores are recommended to the user.
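The recommendation stage amounts to sorting the generator's scores and returning the top entries. In the sketch below, the cutoff k and the option to skip items the user has already interacted with are assumptions; the text only says that the highest-scoring items are recommended.

```python
def recommend_top_k(r_u, k, seen=()):
    """Return the indices of the k highest-scoring items for one user.

    r_u:  preference scores over all items, as produced by the generator
    seen: optional item indices to exclude (an assumption, not in the source)
    """
    excluded = set(seen)
    candidates = [i for i in range(len(r_u)) if i not in excluded]
    candidates.sort(key=lambda i: r_u[i], reverse=True)
    return candidates[:k]


r_u = [0.1, 0.9, 0.5, 0.7]
print(recommend_top_k(r_u, 2))            # [1, 3]
print(recommend_top_k(r_u, 2, seen=[1]))  # [3, 2]
```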
Those skilled in the art will readily understand that the above are merely preferred embodiments of the invention and are not intended to limit the invention.

Claims (5)

1. A recommendation model based on pairwise adversarial training, characterized in that: the model comprises two parts, a generator and a discriminator; the generator models the user's behavioral preferences and generates a list of items the user likes, and the discriminator judges whether the user likes a given item; based on the assumption that "compared with an item produced by the generator, the discriminator believes the user prefers an item the user has already interacted with", a pairwise adversarial loss function establishes the connection between the generator and the discriminator; the goal of the discriminator is to minimize the pairwise loss and improve its own discriminative ability, and the goal of the generator is to maximize the pairwise loss, fool the discriminator and improve its ability to model user preferences; a differentiable sampling method and a gradient-descent-based optimization method are used to improve the stability of adversarial training.
2. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: based on the assumption that "compared with an item produced by the generator, the discriminator believes the user prefers an item the user has already interacted with", a pairwise loss function unifies the generator and the discriminator under the adversarial training framework, wherein the discriminator minimizes the objective function and the generator maximizes the objective function, namely:
where i is an item that user u has interacted with, j is an item generated by the generator g, f is the discriminator, L is the pairwise loss function, θ and φ are the parameter sets of the generator and the discriminator, respectively, p_real denotes the probability distribution of interacted items, and p_θ denotes the probability distribution of the items generated by the generator; in the training stage, the generator and the discriminator are optimized alternately.
3. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: the sampling process of the model is differentiable, the sampling process being:
j = softmax((log p_u + z) / t)
where p_u is the probability distribution over items produced by the generator for user u, z is a noise vector sampled from the Gumbel(0, 1) distribution, and item j is an n-dimensional approximately one-hot vector that represents the item generated by the generator g; as the parameter t approaches 0, j approaches a one-hot vector, and as t approaches positive infinity, j becomes a uniform vector.
4. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: the feature vector and bias of the generated item j in the discriminator are obtained by a differentiable process:
v_j = jV
b_j = jb
where j ∈ R^{1×n} is the approximately one-hot vector that represents the generated item, V ∈ R^{n×d} is the item feature matrix of the discriminator, and b ∈ R^{n×1} is the item bias vector, so that b_j and v_j ∈ R^{1×d} serve as the bias and the feature vector of item j.
5. The recommendation model based on pairwise adversarial training according to claim 1, characterized in that: the model is differentiable as a whole, and a gradient-based optimization method is used to train the model parameters alternately; the goal of the discriminator is to minimize the pairwise loss, so its parameters φ are updated by gradient descent:
The goal of the generator is to maximize the pairwise loss, so its parameters θ are updated by gradient ascent:
CN201811265107.3A 2018-10-29 2018-10-29 Method for recommending model based on pairwise confrontation training Active CN109360069B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811265107.3A CN109360069B (en) 2018-10-29 2018-10-29 Method for recommending model based on pairwise confrontation training

Publications (2)

Publication Number Publication Date
CN109360069A (en) 2019-02-19
CN109360069B (en) 2022-04-01

Family

ID=65346970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811265107.3A Active CN109360069B (en) 2018-10-29 2018-10-29 Method for recommending model based on pairwise confrontation training

Country Status (1)

Country Link
CN (1) CN109360069B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170278135A1 (en) * 2016-02-18 2017-09-28 Fitroom, Inc. Image recognition artificial intelligence system for ecommerce
CN108595493A (en) * 2018-03-15 2018-09-28 腾讯科技(深圳)有限公司 Method for pushing and device, storage medium, the electronic device of media content
CN108665058A (en) * 2018-04-11 2018-10-16 徐州工程学院 A kind of generation confrontation network method based on segmentation loss

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HOMANGA BHARADHWAJ: "RecGAN: recurrent generative adversarial networks for recommendation systems", 《RECSYS "18: PROCEEDINGS OF THE 12TH ACM CONFERENCE ON RECOMMENDER SYSTEMS》 *
JUN WANG: "IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models", 《SIGIR "17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210933A (en) * 2019-05-21 2019-09-06 清华大学深圳研究生院 A kind of enigmatic language justice recommended method based on generation confrontation network
CN110399553A (en) * 2019-06-28 2019-11-01 南京工业大学 A kind of session recommendation list generation method based on confrontation study
CN110442804A (en) * 2019-08-13 2019-11-12 北京市商汤科技开发有限公司 A kind of training method, device, equipment and the storage medium of object recommendation network
CN110727868A (en) * 2019-10-12 2020-01-24 腾讯音乐娱乐科技(深圳)有限公司 Object recommendation method, device and computer-readable storage medium
CN111027714A (en) * 2019-12-11 2020-04-17 腾讯科技(深圳)有限公司 Artificial intelligence-based object recommendation model training method, recommendation method and device
CN111027714B (en) * 2019-12-11 2023-03-14 腾讯科技(深圳)有限公司 Artificial intelligence-based object recommendation model training method, recommendation method and device
CN111259244A (en) * 2020-01-14 2020-06-09 郑州大学 Method for using countermeasure model on discrete data
CN111259244B (en) * 2020-01-14 2022-12-16 郑州大学 Recommendation method based on countermeasure model
WO2021169451A1 (en) * 2020-09-28 2021-09-02 平安科技(深圳)有限公司 Content recommendation method and apparatus based on adversarial learning, and computer device
CN113268660A (en) * 2021-04-28 2021-08-17 重庆邮电大学 Diversity recommendation method and device based on generation countermeasure network and server
CN117093783A (en) * 2023-04-12 2023-11-21 浙江卡赢信息科技有限公司 Intelligent recommendation system and method for point exchange combined with user social data

Also Published As

Publication number Publication date
CN109360069B (en) 2022-04-01

Similar Documents

Publication Publication Date Title
CN109360069A (en) A kind of recommended models based on pairs of dual training
CN110969516B (en) Commodity recommendation method and device
WO2019029046A1 (en) Video recommendation method and system
CN106980648B (en) Personalized recommendation method based on probability matrix decomposition and combined with similarity
CN107563841B (en) Recommendation system based on user score decomposition
CN108363804A (en) Partial model Weighted Fusion Top-N films based on user clustering recommend method
CN111199458B (en) Recommendation system based on meta learning and reinforcement learning
CN109740064A (en) A kind of CF recommended method of fusion matrix decomposition and excavation user items information
CN106202377B (en) A kind of online collaboration sort method based on stochastic gradient descent
Park et al. Uniwalk: Explainable and accurate recommendation for rating and network data
CN108595493A (en) Method for pushing and device, storage medium, the electronic device of media content
CN107016122B (en) Knowledge recommendation method based on time migration
US20190340176A1 (en) System and method for data mining and similarity estimation
CN110083764A (en) A kind of collaborative filtering cold start-up way to solve the problem
CN112328908B (en) Personalized recommendation method based on collaborative filtering
CN112149734B (en) Cross-domain recommendation method based on stacked self-encoder
CN111159473A (en) Deep learning and Markov chain based connection recommendation method
CN112256965A (en) Neural collaborative filtering model recommendation method based on lambdamat
CN109934681A (en) The recommended method of user's commodity interested
CN110825978B (en) Multitask collaborative filtering method based on neighbor user feature sharing
CN114168790A (en) Personalized video recommendation method and system based on automatic feature combination
CN105760965A (en) Pre-estimated model parameter training method, service quality pre-estimation method and corresponding devices
CN112818238A (en) Self-adaptive online recommendation method and system
CN106991122A (en) A kind of film based on particle cluster algorithm recommends method
CN116308618A (en) Training method of recommendation model, recommendation model and commodity recommendation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant