CN109446420A - Cross-domain collaborative filtering method and system - Google Patents
- Publication number
- CN109446420A (application CN201811209371.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
Abstract
The invention discloses a cross-domain collaborative filtering method. After user-item rating data is converted into a training sample set, Funk-SVD decomposition is applied to the user-item rating matrix of each auxiliary domain to obtain user latent vectors. The training sample set is then extended with the user latent vectors to obtain a first extended training sample set, and item feature information is added to extend the first extended training sample set into a second extended training sample set. An imbalanced classifier is trained on the second extended training sample set, and the missing data of the user-item rating data is finally predicted with the imbalanced classifier to generate recommendations. Extending the data with auxiliary-domain data addresses the sparsity of the target-domain data; training an imbalanced classifier on the extended training samples and using it to predict the missing entries of the target domain yields the recommendation data, addressing both the sparsity and the imbalance of data sets in existing recommender systems.
Description
Technical field
The invention belongs to the technical field of information recommendation and, specifically, relates to a cross-domain collaborative filtering method and system.
Background
The rapid growth of information on the Internet calls for effective intelligent information agents that can filter all available information and find the information most valuable to the user.
In recent years, recommender systems have been widely used on e-commerce websites and online social media. Current mainstream recommendation methods include content-based recommendation, collaborative-filtering-based recommendation, association-rule-based recommendation, utility-based recommendation, knowledge-based recommendation, hybrid recommendation and so on. Among these, collaborative filtering is the most successful strategy. Its basic idea is that a user is likely to like the resources liked by users similar to them, and that a user who likes a certain resource is likely to also like other resources similar to it. In other words, through their own behavior on a website, such as rating and browsing resources, users collectively help one another discover and filter content of interest.
However, in real recommender systems users are usually reluctant to rate items they do not like, which makes most rating data sets imbalanced.
Summary of the invention
The application provides a cross-domain collaborative filtering method and system that address the technical problem of data set imbalance in existing recommender systems.
To solve the above technical problem, the application adopts the following technical solutions:
A cross-domain collaborative filtering method is proposed, comprising the following steps: converting user-item rating data into a training sample set for a classification algorithm; performing Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain user latent vectors; extending the feature vectors of users in the training sample set with the user latent vectors to obtain a first extended training sample set; adding item feature information to extend the feature vectors of items in the first extended training sample set, obtaining a second extended training sample set; training an imbalanced classifier on the second extended training sample set; and predicting the missing data of the user-item rating data with the imbalanced classifier and generating recommendations.
Further, converting the user-item rating data into a training sample set for a classification algorithm specifically comprises: using $L_u$ to denote the row of a user in the user-item rating matrix and $L_i$ to denote the column of an item in the user-item rating matrix; and, based on the feature vector $(L_u, L_i)$, constructing the classification training sample set $\{(L_u, L_i, R_{ui}) \mid (u, i) \in \kappa\}$ of the user-item rating data, where $\kappa$ is the set of "user-item" pairs that have a rating in the rating matrix and $R_{ui}$ is the rating given by user u to item i.
Further, performing Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain the user latent vectors specifically comprises: setting the objective function $\min_{p^*, q^*} \sum_{(u,i) \in \kappa} (R_{ui} - q_i^T p_u)^2 + \lambda(\|q_i\|^2 + \|p_u\|^2)$; updating $p_u$ and $q_i$ by $p_u \leftarrow p_u + \gamma(e_{ui} q_i - \lambda p_u)$ and $q_i \leftarrow q_i + \gamma(e_{ui} p_u - \lambda q_i)$ to optimize the objective function, where $\lambda$ is the regularization parameter and $\gamma$ is the learning rate; and obtaining, from the optimization result, the latent vector of user u on the j-th auxiliary domain, where j runs from 1 to K and K is the number of auxiliary domains.
Further, training the imbalanced classifier on the second extended training sample set specifically comprises: initializing the sample weight of each sample $x_a$ in the second extended training sample set, where A is the number of samples and $1 \le a \le A$; and repeating the following steps T times:
1) at the t-th iteration, training a weak classifier $h_t$ according to all sample weights $\{D_t(x_a) \mid 1 \le a \le A\}$, where t runs from 1 to T;
2) computing the penalty term $p_t = 1 - |amb|$ of each training sample $x_a$, where $amb$ is the ambiguity term, and computing the weight $\alpha_t$ of the weak classifier;
3) updating the sample weights, where $Z_t$ is the normalization factor and $\lambda \in [0.5, 12]$ is the update step size of the penalty term;
and then computing the imbalanced classifier.
A cross-domain collaborative filtering system is proposed, comprising a training sample conversion module, a user latent vector generation module, a first training sample extension module, a second training sample extension module, an imbalanced classifier training module and a recommendation module. The training sample conversion module is configured to convert user-item rating data into a training sample set for a classification algorithm. The user latent vector generation module is configured to perform Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain user latent vectors. The first training sample extension module is configured to extend the feature vectors of users in the training sample set with the user latent vectors, obtaining a first extended training sample set. The second training sample extension module is configured to add item feature information to extend the feature vectors of items in the first extended training sample set, obtaining a second extended training sample set. The imbalanced classifier training module is configured to train an imbalanced classifier on the second extended training sample set. The recommendation module is configured to predict the missing data of the user-item rating data with the imbalanced classifier and generate recommendations.
Further, the training sample conversion module is specifically configured to use $L_u$ to denote the row of a user in the user-item rating matrix and $L_i$ to denote the column of an item in the user-item rating matrix, and, based on the feature vector $(L_u, L_i)$, to construct the classification training sample set $\{(L_u, L_i, R_{ui}) \mid (u, i) \in \kappa\}$ of the user-item rating data, where $\kappa$ is the set of "user-item" pairs that have a rating in the rating matrix and $R_{ui}$ is the rating given by user u to item i.
Further, the user latent vector generation module comprises an objective function setting unit, an objective function optimization unit and a user latent vector generation unit. The objective function setting unit is configured to set the objective function $\min_{p^*, q^*} \sum_{(u,i) \in \kappa} (R_{ui} - q_i^T p_u)^2 + \lambda(\|q_i\|^2 + \|p_u\|^2)$. The objective function optimization unit is configured to update $p_u$ and $q_i$ by $p_u \leftarrow p_u + \gamma(e_{ui} q_i - \lambda p_u)$ and $q_i \leftarrow q_i + \gamma(e_{ui} p_u - \lambda q_i)$ to optimize the objective function, where $\lambda$ is the regularization parameter and $\gamma$ is the learning rate. The user latent vector generation unit is configured to obtain, from the optimization result, the latent vector of user u on the j-th auxiliary domain, where j runs from 1 to K and K is the number of auxiliary domains.
Further, the imbalanced classifier training module comprises a sample weight initialization unit, a weak classifier training unit, a sample weight updating unit and an imbalanced classifier generation unit. The sample weight initialization unit is configured to initialize the sample weight of each sample $x_a$ in the second extended training sample set, where A is the number of samples and $1 \le a \le A$. The weak classifier training unit is configured to train, at the t-th iteration, a weak classifier $h_t$ according to all sample weights $\{D_t(x_a) \mid 1 \le a \le A\}$, where t runs from 1 to T. The sample weight updating unit is configured to compute the penalty term $p_t = 1 - |amb|$ of each training sample $x_a$, where $amb$ is the ambiguity term, to compute the weight $\alpha_t$ of the weak classifier, and to update the sample weights, where $Z_t$ is the normalization factor and $\lambda \in [0.5, 12]$ is the update step size of the penalty term. The imbalanced classifier generation unit is configured to compute the imbalanced classifier after the weak classifier training unit and the sample weight updating unit have repeated their computations T times.
Compared with the prior art, the advantages and positive effects of the application are as follows. In the cross-domain collaborative filtering method and system proposed by the application, each rating in the user-item rating matrix is converted into a training sample whose feature vector is the rating's position in the matrix. User latent vectors are then obtained by Funk-SVD decomposition from auxiliary domains that contain relatively rich information, and the training sample set is extended with these user latent vectors to obtain the first extended training sample set, which reduces the sparsity of the target domain. The item feature information of the auxiliary domains is then used to extend the first extended training sample set into the second extended training sample set. Finally, an imbalanced classifier is trained on the extended training sample set, that is, the converted and extended training set is classified, the missing data of the target domain's user-item rating matrix is predicted, and recommendation data for the user is generated. By using an imbalanced classification model, the application addresses the data set imbalance of existing recommender systems and effectively overcomes the skewed distribution of ratings.
Other features and advantages of the application will become clearer after reading the detailed description of the embodiments in conjunction with the drawings.
Brief description of the drawings
Fig. 1 is a flow chart of the cross-domain collaborative filtering method proposed by the application;
Fig. 2 is a system architecture diagram of the cross-domain collaborative filtering system proposed by the application.
Detailed description of the embodiments
The embodiments of the application are described in more detail below with reference to the drawings.
The cross-domain collaborative filtering method proposed by the application converts the user-item rating matrix of the target domain into a training sample set, extends it with auxiliary-domain data to address the sparsity of the target-domain data, trains an imbalanced classifier on the extended training samples, and uses the imbalanced classifier to predict the missing entries of the target domain and thus obtain the recommendation data. This addresses both the sparsity and the imbalance of data sets in existing recommender systems. The method specifically comprises the following steps:
Step S11: convert the user-item rating data into a training sample set for a classification algorithm.
In the embodiment of the application, the target domain is denoted T, and u and i denote a user and an item respectively. The relation between users and items is expressed as u × i → R, where R is the rating, whose range is set to {1, 2, 3, 4, 5}. $L_u$ denotes the row of user u in the user-item rating matrix and $L_i$ denotes the column of item i in the user-item rating matrix. Each rating in the user-item rating data can then be expressed as a training sample, giving the set $\{(L_u, L_i, R_{ui}) \mid (u, i) \in \kappa\}$, where $\kappa$ is the set of "user-item" pairs that have a rating in the rating matrix. In other words, the user-item rating matrix shown in Table 1 is converted into the training sample set shown in Table 2:
Table 1
   | i1 | i2 | i3 | i4 |
u1 | 5  |    | 4  |    |
u2 |    | 5  |    | 1  |
u3 | 2  |    | 4  | 3  |
Table 2
Lu | Li | label |
1 | 1 | 5 |
1 | 3 | 4 |
2 | 2 | 5 |
2 | 4 | 1 |
3 | 1 | 2 |
3 | 3 | 4 |
3 | 4 | 3 |
In Table 1, u1, u2 and u3 are three users and i1, i2, i3 and i4 are four items. The row position of a user in the user-item rating matrix is used as $L_u$, and the column position of an item in the user-item rating matrix is used as $L_i$. The relation between u1 and i1 can therefore be represented by (1, 1, 5), and the user-item rating matrix of Table 1 is converted into the training sample set of Table 2. That is, the training sample set of the user-item rating data is generated from the feature vector $(L_u, L_i)$.
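The conversion of step S11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ratings_to_samples` and the use of `None` for a missing rating are not part of the application.

```python
from typing import List, Optional, Tuple

def ratings_to_samples(matrix: List[List[Optional[int]]]) -> List[Tuple[int, int, int]]:
    """Turn every observed rating into a (L_u, L_i, label) training sample."""
    samples = []
    for row, ratings in enumerate(matrix, start=1):      # L_u: 1-based row position
        for col, rating in enumerate(ratings, start=1):  # L_i: 1-based column position
            if rating is not None:                        # missing ratings yield no sample
                samples.append((row, col, rating))
    return samples

# Table 1, with None marking a missing rating (an assumed encoding).
table_one = [
    [5, None, 4, None],
    [None, 5, None, 1],
    [2, None, 4, 3],
]
samples = ratings_to_samples(table_one)
for sample in samples:
    print(sample)
```

Applied to Table 1, the function yields the seven rows of Table 2 in the same order.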
Step S12: perform Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain the user latent vectors.
In traditional collaborative filtering methods, the sparsity of the user-item rating matrix is usually addressed by looking for useful information within the same domain, for example by inferring the relation between users and items from social networks, trust relationships or review information. Such information, however, is not easy to obtain within the same domain. In the embodiment of the application, the sparsity of the target-domain data is instead addressed in a cross-domain manner, by extracting useful information from auxiliary domains.
In the embodiment of the application, Funk-SVD decomposition is applied to the user-item rating matrix of each auxiliary domain to obtain the user latent vectors. That is, Funk-SVD factorizes the user-item rating matrix into the product of user latent factors and item latent factors, so that the high-dimensional rating matrix is decomposed into two low-dimensional matrices. For example, X (m×n) is decomposed into U (m×k) × V (k×n), where m and n are the numbers of rows and columns of the user-item rating matrix, k is the latent factor dimension, and k is far smaller than min(m, n). Funk-SVD aims to fit the known entries of X as closely as possible in order to predict its unknown entries. If k is too small the model may fail to fit the data, while if k is too large it may overfit. Let $\hat{R}_{ui}$ denote the predicted rating of user u for item i; then $\hat{R}_{ui} = q_i^T p_u$, where $p_u$ is the latent factor vector of user u and $q_i$ is the latent factor vector of item i.
In the decomposition, the objective function is set to $\min_{p^*, q^*} \sum_{(u,i) \in \kappa} (R_{ui} - q_i^T p_u)^2 + \lambda(\|q_i\|^2 + \|p_u\|^2)$, where $p^* = \{p_{user} \mid user \in userset\}$ is the set of all user latent vectors and $q^* = \{q_{item} \mid item \in itemset\}$ is the set of all item latent vectors.
The objective function is optimized by updating $p_u$ and $q_i$ with $p_u \leftarrow p_u + \gamma(e_{ui} q_i - \lambda p_u)$ and $q_i \leftarrow q_i + \gamma(e_{ui} p_u - \lambda q_i)$, where $e_{ui} = R_{ui} - q_i^T p_u$ is the prediction error.
Finally, the latent vector of user u on the j-th auxiliary domain is obtained from the optimization result, where j runs from 1 to K and K is the number of auxiliary domains. Here $\lambda$ is the regularization parameter and $\gamma$ is the learning rate: if $\gamma$ is too large the algorithm will not converge, and if it is too small the algorithm needs much more time to converge.
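The stochastic gradient updates described for step S12 can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the default values of `k`, `gamma`, `lam` and `epochs`, the random initialization range and the toy ratings are all assumptions, and the prediction error is taken as the observed rating minus the dot product of the two latent vectors.

```python
import random

def funk_svd(ratings, n_users, n_items, k=2, gamma=0.01, lam=0.05, epochs=500, seed=0):
    """Factorise observed ratings by SGD; returns user and item latent vectors."""
    rng = random.Random(seed)
    p = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - sum(a * b for a, b in zip(p[u], q[i]))  # prediction error e_ui
            for f in range(k):
                pu, qi = p[u][f], q[i][f]  # both updates use the old values
                p[u][f] = pu + gamma * (e * qi - lam * pu)
                q[i][f] = qi + gamma * (e * pu - lam * qi)
    return p, q

def rmse(ratings, p, q):
    """Root mean squared error over the observed ratings."""
    sq = [(r - sum(a * b for a, b in zip(p[u], q[i]))) ** 2 for u, i, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5

# Toy auxiliary-domain ratings as (user index, item index, rating), 0-based.
aux_ratings = [(0, 0, 5), (0, 2, 4), (1, 1, 5), (1, 3, 1), (2, 0, 2), (2, 2, 4), (2, 3, 3)]
p, q = funk_svd(aux_ratings, n_users=3, n_items=4)
user_latent = p  # p[u] is the latent vector of user u on this auxiliary domain
print(rmse(aux_ratings, p, q))
```

Running the function once per auxiliary domain would give the K user latent vectors used in the next step. Only users who appear in both the target domain and the auxiliary domain receive a meaningful vector.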
Step S13: extend the feature vectors of users in the training sample set with the user latent vectors to obtain the first extended training sample set.
Step S14: add item feature information to extend the feature vectors of items in the first extended training sample set, obtaining the second extended training sample set.
The user latent vectors obtained in step S12 are added to the training samples of the target domain, that is, the user latent vectors are appended to the feature vector $(L_u, L_i)$, which yields the first extended training sample set.
In addition, item feature information is added to extend the first extended training sample set into the second extended training sample set, in order to improve recommendation performance. Taking the movie domain as an example, movie attributes can be added to the feature vector: the attribute information of every movie is retrieved by movie title, and a set number of attributes, such as director, genre, actors, country and language, is selected as the item features appended to the feature vector. In the resulting second extended training sample set, Q denotes the number of item features.
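Steps S13 and S14 amount to concatenating extra columns onto each sample's feature vector. The sketch below illustrates this under assumptions that are not in the application: the function name `extend_samples`, dictionaries keyed by row and column position, and item attributes already encoded as integers are all hypothetical choices.

```python
def extend_samples(samples, latent_by_domain, item_features):
    """Build the first and second extended training sample sets.

    samples: list of (L_u, L_i, rating)
    latent_by_domain: one dict per auxiliary domain, mapping L_u to a latent vector
    item_features: dict mapping L_i to a list of Q encoded item attributes
    """
    first, second = [], []
    for l_u, l_i, rating in samples:
        features = [l_u, l_i]
        for domain in latent_by_domain:           # step S13: append user latent vectors
            features.extend(domain[l_u])
        first.append((list(features), rating))
        features.extend(item_features[l_i])       # step S14: append Q item features
        second.append((features, rating))
    return first, second

# Two hypothetical auxiliary domains and two encoded attributes (e.g. genre id, country id).
latent_by_domain = [{1: [0.1, 0.2], 2: [0.3, 0.4]}, {1: [0.5], 2: [0.6]}]
item_features = {1: [7, 2], 4: [9, 3]}
first_set, second_set = extend_samples([(1, 1, 5), (2, 4, 1)], latent_by_domain, item_features)
print(first_set)
print(second_set)
```

Categorical attributes such as director or genre need some numeric encoding before they can be appended. The integer codes used here are only placeholders for whatever encoding a real system would choose.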
Step S15: train an imbalanced classifier on the second extended training sample set.
In the embodiment of the application, the AdaBoost.NC imbalanced learning algorithm is used to classify the second extended training sample set obtained after conversion and extension. Its basic principle is to combine several weak classifiers sensibly into one strong classifier. The algorithm is iterative: each iteration trains only one weak classifier, and the weak classifiers already trained take part in the next iteration. In other words, after the N-th iteration there are N weak classifiers in total, of which N-1 were trained earlier and have parameters that no longer change, and the current iteration trains the N-th one. The weak classifiers are related in that the N-th weak classifier is more likely to classify correctly the data that the previous N-1 weak classifiers misclassified, and the final classification output depends on the combined effect of all N classifiers. The AdaBoost.NC algorithm has two kinds of weights. The first is the sample weight of each sample in the training sample set, represented by the vector D. After each round of learning the sample weights are readjusted: the weights of the samples misclassified in this round are increased so that the next round focuses on them. The second is the weight of each weak classifier, represented by the vector $\alpha$. Because there are several classifiers, an ambiguity term, denoted $amb$, is introduced to measure the difference between the classifiers. Here $h_t$ denotes the classification result of the t-th weak classifier: if training sample x is classified correctly by the t-th weak classifier, the value of $h_t$ is 1, otherwise it is -1. H is the classification result obtained by combining all classifiers.
Specifically, in the embodiment of the application, the sample weight of each sample $x_a$ in the second extended training sample set is first initialized, where A is the number of samples and $1 \le a \le A$. With the number of weak classifiers set to T, the following steps are then repeated T times, following the AdaBoost.NC imbalanced learning algorithm:
1) at the t-th iteration, train a weak classifier $h_t$ according to all sample weights $\{D_t(x_a) \mid 1 \le a \le A\}$, where t starts at 1 and increases by 1 with each repetition until it reaches T;
2) compute the penalty term $p_t = 1 - |amb|$ of each training sample $x_a$, and compute the weight $\alpha_t$ of the weak classifier;
3) update the sample weights, where $Z_t$ is the normalization factor and $\lambda \in [0.5, 12]$ is the update step size of the penalty term.
After the T iterations are complete, the imbalanced classifier H is computed.
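The training loop of step S15 can be sketched as below. The formulas for the ambiguity term, the weak classifier weight, the weight update and the final vote are not recoverable from this text, so the sketch follows the commonly published form of AdaBoost.NC as an assumption: uniform initial weights, $amb$ as the average difference between ensemble and individual correctness, $\alpha_t$ as half the log ratio of penalized weight on correct versus incorrect samples, and a weighted vote for the final output. The decision stump weak learner, the unweighted majority vote used for the intermediate ensemble, and all function names are likewise illustrative choices.

```python
import math
from collections import defaultdict

def train_stump(X, y, w):
    """Weighted decision stump: one feature, one threshold, one label per side."""
    best_err, best = float("inf"), None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            votes = (defaultdict(float), defaultdict(float))
            for x, label, weight in zip(X, y, w):
                votes[x[f] > thr][label] += weight      # index 0: left, 1: right
            left = max(votes[0], key=votes[0].get)
            right = max(votes[1], key=votes[1].get) if votes[1] else left
            err = sum(w) - votes[0][left] - votes[1].get(right, 0.0)
            if err < best_err:
                best_err, best = err, (f, thr, left, right)
    return best

def stump_predict(stump, x):
    f, thr, left, right = stump
    return right if x[f] > thr else left

def vote(stumps, alphas, x):
    """Weighted vote of the weak classifiers: the combined classifier H."""
    score = defaultdict(float)
    for stump, alpha in zip(stumps, alphas):
        score[stump_predict(stump, x)] += alpha
    return max(score, key=score.get)

def adaboost_nc(X, y, T=10, lam=2.0, eps=1e-10):
    A = len(X)
    D = [1.0 / A] * A                                   # assumed uniform initial weights
    stumps, alphas, hits = [], [], [0] * A
    for t in range(1, T + 1):
        h = train_stump(X, y, D)
        stumps.append(h)
        correct = [stump_predict(h, x) == label for x, label in zip(X, y)]
        hits = [n + c for n, c in zip(hits, correct)]   # correct weak classifiers so far
        # Intermediate ensemble: unweighted majority vote of h_1..h_t (assumption).
        ens = [vote(stumps, [1.0] * t, x) == label for x, label in zip(X, y)]
        amb = [e - n / t for e, n in zip(ens, hits)]    # ambiguity term per sample
        p = [(1.0 - abs(a)) ** lam for a in amb]        # penalty p_t raised to lam
        good = sum(d * pa for d, pa, c in zip(D, p, correct) if c)
        bad = sum(d * pa for d, pa, c in zip(D, p, correct) if not c)
        alpha = 0.5 * math.log((good + eps) / (bad + eps))
        alphas.append(alpha)
        D = [pa * d * math.exp(-alpha * c) for pa, d, c in zip(p, D, correct)]
        Z = sum(D)                                      # normalization factor Z_t
        D = [d / Z for d in D]
    return stumps, alphas

# Toy usage: feature vectors with a single column and rating labels 1 and 5.
X_toy = [[1], [2], [3], [4], [5], [6]]
y_toy = [1, 1, 1, 5, 5, 5]
stumps, alphas = adaboost_nc(X_toy, y_toy, T=3)
print([vote(stumps, alphas, x) for x in X_toy])
```

The default `lam=2.0` lies inside the range [0.5, 12] stated above. The small constant `eps` only guards against a zero denominator when a weak classifier makes no mistakes and is not part of the algorithm itself.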
Step S16: predict the missing data of the user-item rating data with the imbalanced classifier and generate recommendations.
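One way step S16 might look in code is sketched below: every missing entry in a user's row is scored by the trained classifier and the highest predicted ratings become the recommendations. The application does not specify how the recommendation list is formed, so the top-N ranking, the function name `recommend` and the `predict` callable (standing in for the trained imbalanced classifier) are assumptions.

```python
def recommend(matrix, predict, user_row, top_n=2):
    """Rank the unrated items of one user by predicted rating.

    matrix: rating matrix with None for missing ratings
    predict: callable (L_u, L_i) -> predicted rating, a stand-in for the classifier
    user_row: 1-based row position L_u of the user
    """
    row = matrix[user_row - 1]
    candidates = [
        (predict(user_row, col), col)
        for col, rating in enumerate(row, start=1)
        if rating is None                         # only missing entries are predicted
    ]
    # Highest predicted rating first; ties broken by column position.
    candidates.sort(key=lambda pc: (-pc[0], pc[1]))
    return [col for _, col in candidates[:top_n]]

# Hypothetical predictor returning fixed ratings for the two missing items of user 1.
demo_matrix = [[5, None, 4, None]]
demo_predict = lambda l_u, l_i: {2: 3, 4: 5}[l_i]
print(recommend(demo_matrix, demo_predict, user_row=1))
```

In a full pipeline, `predict` would first build the extended feature vector for the pair and then pass it to the trained classifier.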
As can be seen from the above, in the cross-domain collaborative filtering method proposed by the application, each rating in the user-item rating matrix is converted into a training sample whose feature vector is the rating's position in the matrix. User latent vectors are then obtained by Funk-SVD decomposition from auxiliary domains that contain relatively rich information, and the training sample set is extended with these user latent vectors to obtain the first extended training sample set, which reduces the sparsity of the target domain. The item feature information of the auxiliary domains is then used to extend the first extended training sample set into the second extended training sample set. Finally, an imbalanced classifier is trained on the second extended training sample set, that is, the converted and extended training set is classified, the missing data of the target domain's user-item rating matrix is predicted, and recommendation data for the user is generated. By using an imbalanced classification model, the application addresses the data set imbalance of existing recommender systems and effectively overcomes the skewed distribution of ratings.
Based on the cross-domain collaborative filtering method set forth above, the application also proposes a cross-domain collaborative filtering system which, as shown in Fig. 2, comprises a training sample conversion module 21, a user latent vector generation module 22, a first training sample extension module 23, a second training sample extension module 24, an imbalanced classifier training module 25 and a recommendation module 26.
The training sample conversion module 21 converts the user-item rating data into a training sample set for a classification algorithm. The user latent vector generation module 22 performs Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain the user latent vectors. The first training sample extension module 23 extends the feature vectors of users in the training sample set with the user latent vectors, obtaining the first extended training sample set. The second training sample extension module 24 adds item feature information to extend the feature vectors of items in the first extended training sample set, obtaining the second extended training sample set. The imbalanced classifier training module 25 trains an imbalanced classifier on the second extended training sample set. The recommendation module 26 predicts the missing data of the user-item rating data with the imbalanced classifier and generates recommendations.
Specifically, the training sample conversion module 21 uses $L_u$ to denote the row of a user in the user-item rating matrix and $L_i$ to denote the column of an item in the user-item rating matrix, and generates, based on the feature vector $(L_u, L_i)$, the classification training sample set $\{(L_u, L_i, R_{ui}) \mid (u, i) \in \kappa\}$ of the user-item rating data, where $\kappa$ is the set of "user-item" pairs that have a rating in the rating matrix and $R_{ui}$ is the rating given by user u to item i.
In the embodiment of the application, the user latent vector generation module 22 comprises an objective function setting unit 221, an objective function optimization unit 222 and a user latent vector generation unit 223. The objective function setting unit 221 sets the objective function $\min_{p^*, q^*} \sum_{(u,i) \in \kappa} (R_{ui} - q_i^T p_u)^2 + \lambda(\|q_i\|^2 + \|p_u\|^2)$. The objective function optimization unit 222 updates $p_u$ and $q_i$ by $p_u \leftarrow p_u + \gamma(e_{ui} q_i - \lambda p_u)$ and $q_i \leftarrow q_i + \gamma(e_{ui} p_u - \lambda q_i)$ to optimize the objective function. The user latent vector generation unit 223 obtains, from the optimization result, the latent vector of user u on the j-th auxiliary domain, where j runs from 1 to K and K is the number of auxiliary domains.
The imbalanced classifier training module 25 comprises a sample weight initialization unit 251, a weak classifier training unit 252, a sample weight updating unit 253 and an imbalanced classifier generation unit 254. The sample weight initialization unit 251 initializes the sample weight of each sample $x_a$ in the second extended training sample set, where A is the number of samples and $1 \le a \le A$. The weak classifier training unit 252 trains, at the t-th iteration, a weak classifier $h_t$ according to all sample weights $\{D_t(x_a) \mid 1 \le a \le A\}$, where t runs from 1 to T. The sample weight updating unit 253 computes the penalty term $p_t = 1 - |amb|$ of each training sample $x_a$, computes the weight $\alpha_t$ of the weak classifier, and updates the sample weights, where $Z_t$ is the normalization factor and $\lambda \in [0.5, 12]$ is the update step size of the penalty term. The imbalanced classifier generation unit 254 computes the imbalanced classifier after the weak classifier training unit and the sample weight updating unit have repeated their computations T times.
The recommendation procedure of the cross-domain collaborative filtering system has been described in detail in the cross-domain collaborative filtering method set forth above and is not repeated here.
It should be noted that the above description does not limit the invention, and the invention is not restricted to the examples given above. Variations, modifications, additions or substitutions made by those skilled in the art within the essential scope of the invention also fall within the scope of protection of the invention.
Claims (8)
1. A cross-domain collaborative filtering method, characterized by comprising the following steps:
converting user-item rating data into a training sample set for a classification algorithm;
performing Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain user latent vectors;
extending the feature vectors of users in the training sample set with the user latent vectors to obtain a first extended training sample set;
adding item feature information to extend the feature vectors of items in the first extended training sample set, obtaining a second extended training sample set;
training an imbalanced classifier on the second extended training sample set;
predicting the missing data of the user-item rating data with the imbalanced classifier and generating recommendations.
2. The cross-domain collaborative filtering method according to claim 1, characterized in that converting the user-item rating data into a training sample set for a classification algorithm specifically comprises:
using $L_u$ to denote the row of a user in the user-item rating matrix and $L_i$ to denote the column of an item in the user-item rating matrix;
based on the feature vector $(L_u, L_i)$, constructing the classification training sample set $\{(L_u, L_i, R_{ui}) \mid (u, i) \in \kappa\}$ of the user-item rating data, where $\kappa$ is the set of "user-item" pairs that have a rating in the rating matrix and $R_{ui}$ is the rating given by user u to item i.
3. The cross-domain collaborative filtering method according to claim 2, characterized in that performing Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain the user latent vectors specifically comprises:
setting the objective function $\min_{p^*, q^*} \sum_{(u,i) \in \kappa} (R_{ui} - q_i^T p_u)^2 + \lambda(\|q_i\|^2 + \|p_u\|^2)$;
updating $p_u$ and $q_i$ by $p_u \leftarrow p_u + \gamma(e_{ui} q_i - \lambda p_u)$ and $q_i \leftarrow q_i + \gamma(e_{ui} p_u - \lambda q_i)$ to optimize the objective function, where $\lambda$ is the regularization parameter and $\gamma$ is the learning rate;
obtaining, from the optimization result, the latent vector of user u on the j-th auxiliary domain, where j runs from 1 to K and K is the number of auxiliary domains.
4. The cross-domain collaborative filtering method according to claim 1, characterized in that training the imbalanced classifier on the second extended training sample set specifically comprises:
initializing the sample weight of each sample $x_a$ in the second extended training sample set, where A is the number of samples and $1 \le a \le A$;
repeating the following steps T times:
1) at the t-th iteration, training a weak classifier $h_t$ according to all sample weights $\{D_t(x_a) \mid 1 \le a \le A\}$, where t runs from 1 to T;
2) computing the penalty term $p_t = 1 - |amb|$ of each training sample $x_a$, where $amb$ is the ambiguity term, and computing the weight $\alpha_t$ of the weak classifier;
3) updating the sample weights, where $Z_t$ is the normalization factor and $\lambda \in [0.5, 12]$ is the update step size of the penalty term;
computing the imbalanced classifier.
5. A cross-domain collaborative filtering system, characterized by comprising a training sample conversion module, a user latent variable generation module, a first training sample extension module, a second training sample extension module, an imbalanced classifier training module, and a recommendation module;
The training sample conversion module is configured to convert user-item rating data into a training sample set for a classification algorithm;
The user latent variable generation module is configured to perform Funk-SVD decomposition on the user-item rating matrix of each auxiliary domain to obtain user latent variables;
The first training sample extension module is configured to extend the feature vectors of users in the training sample set with the user latent variables to obtain a first extended training sample set;
The second training sample extension module is configured to add item feature information to extend the feature vectors of items in the first extended training sample set, obtaining a second extended training sample set;
The imbalanced classifier training module is configured to train an imbalanced classifier using the second extended training sample set;
The recommendation module is configured to predict the missing data of the user-item rating data based on the imbalanced classifier and to generate recommendations.
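The two extension modules can be pictured as successive concatenations onto each sample's feature vector. The sketch below assumes that each base sample is a triple (row index of the user, column index of the item, rating), that every user has a latent vector in each auxiliary domain, and that item feature information is available as a numeric list per item. The name `extend_samples` and the data layout are hypothetical.

```python
def extend_samples(samples, user_latents, item_features):
    """Build extended feature vectors for (user, item, rating) samples.

    samples:       list of (u, i, r) triples.
    user_latents:  list with one dict per auxiliary domain, mapping
                   user index -> latent vector (list of floats).
    item_features: dict mapping item index -> list of item features.
    Returns a list of (feature_vector, rating) pairs.
    """
    extended = []
    for u, i, r in samples:
        # First extension: append the user's latent vector from every
        # auxiliary domain to the user part of the feature vector.
        user_vec = [u]
        for domain in user_latents:
            user_vec.extend(domain[u])
        # Second extension: append item feature information to the item part.
        item_vec = [i] + list(item_features[i])
        extended.append((user_vec + item_vec, r))
    return extended


# Hypothetical data: one sample, two auxiliary domains, two item features.
samples = [(0, 1, 4)]
user_latents = [{0: [0.1, 0.2]}, {0: [0.3, 0.4]}]
item_features = {1: [1, 0]}
print(extend_samples(samples, user_latents, item_features))
```

The resulting pairs are the kind of input a classifier training step would consume, with the rating acting as the class label.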
6. The cross-domain collaborative filtering system according to claim 5, characterized in that the training sample conversion module is specifically configured to use $L_u$ to denote the row of a user in the user-item rating matrix and $L_i$ to denote the column of an item in the user-item rating matrix, and to construct, based on the feature vector $(L_u, L_i)$, the classification-algorithm training sample set $\{(L_u, L_i, R_{ui}) \mid (u, i) \in \kappa\}$ for the user-item rating data, where $\kappa$ is the set of "user-item" pairs that have a rating in the rating matrix and $R_{ui}$ denotes the rating given by user u to item i.
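The conversion described in this claim amounts to listing every rated cell of the matrix as a (row, column, rating) triple. A minimal sketch follows, assuming the rating matrix is a list of rows in which `None` marks a missing rating; the name `to_training_samples` is hypothetical.

```python
def to_training_samples(rating_matrix):
    """Return [(L_u, L_i, R_ui)] for every rated (user, item) pair.

    rating_matrix: list of rows (one per user); None marks a missing rating.
    """
    return [(u, i, r)
            for u, row in enumerate(rating_matrix)
            for i, r in enumerate(row)
            if r is not None]


# Hypothetical 2x3 rating matrix with three observed ratings.
R = [[5, None, 3],
     [None, 2, None]]
print(to_training_samples(R))
```

The cells skipped here (the `None` entries) are exactly the missing data that the trained classifier is later asked to predict.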
7. The cross-domain collaborative filtering system according to claim 6, characterized in that the user latent variable generation module comprises an objective function setting unit, an objective function optimization unit, and a user latent variable generation unit;
The objective function setting unit is configured to set the objective function […];
The objective function optimization unit is configured to update $p_u$ and $q_i$ using $p_u + \gamma(e_{ui} q_i - \lambda p_u)$ and $q_i + \gamma(e_{ui} p_u - \lambda q_i)$, respectively, so as to optimize the objective function, where $\lambda$ is the regularization parameter and $\gamma$ is the learning rate;
The user latent variable generation unit is configured to obtain, based on the optimization result, the latent variable […] of user u on the j-th auxiliary domain, where j ranges from 1 to K and K is the number of auxiliary domains.
8. The cross-domain collaborative filtering system according to claim 5, characterized in that the imbalanced classifier training module comprises a sample weight initialization unit, a weak classifier training unit, a sample weight updating unit, and an imbalanced classifier generation unit;
The sample weight initialization unit is configured to initialize the sample weight of each sample in the second extended training sample set to […], where A is the number of samples and $1 \le a \le A$;
The weak classifier training unit is configured to train a weak classifier $h_t$ according to all sample weights $\{D_t(x_a) \mid 1 \le a \le A\}$ at the t-th iteration, where t ranges from 1 to T;
The sample weight updating unit is configured to compute the penalty term $p_t = 1 - |amb|$ of each training sample $x_a$, […], where […] is the weight of the weak classifier, and to update the sample weights using […], where $Z_t$ is the normalization factor and $\lambda \in [0.5,12]$ is the update step size of the penalty term;
The imbalanced classifier generation unit is configured to compute the imbalanced classifier […] after the weak classifier training unit and the sample weight updating unit have repeated their computations T times.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811209371.5A CN109446420B (en) | 2018-10-17 | 2018-10-17 | Cross-domain collaborative filtering method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109446420A (en) | 2019-03-08 |
CN109446420B (en) | 2022-01-25 |
Family
ID=65546951
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811209371.5A Active CN109446420B (en) | 2018-10-17 | 2018-10-17 | Cross-domain collaborative filtering method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109446420B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102385586A (en) * | 2010-08-27 | 2012-03-21 | 日电(中国)有限公司 | Multiparty cooperative filtering method and system |
EP2837199A1 (en) * | 2012-04-12 | 2015-02-18 | MOVIRI S.p.A. | Client-side recommendations on one-way broadcast networks |
CN102930341A (en) * | 2012-10-15 | 2013-02-13 | 罗辛 | Optimal training method of collaborative filtering recommendation model |
CN105447145A (en) * | 2015-11-25 | 2016-03-30 | 天津大学 | Item-based transfer learning recommendation method and recommendation apparatus thereof |
Non-Patent Citations (2)
Title |
---|
XU YU et al.: "A User-Based Cross Domain Collaborative Filtering Algorithm Based on a Linear Decomposition Model", 《IEEE》 * |
Liu Qingwen: "Cross-Domain Collaborative Filtering System", 《China Doctoral Dissertations Full-text Database》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110119465A (en) * | 2019-05-17 | 2019-08-13 | 哈尔滨工业大学 | Merge the mobile phone application user preferences search method of LFM latent factor and SVD |
CN110119465B (en) * | 2019-05-17 | 2023-06-13 | 哈尔滨工业大学 | Mobile phone application user preference retrieval method integrating LFM potential factors and SVD |
CN110264274A (en) * | 2019-06-21 | 2019-09-20 | 深圳前海微众银行股份有限公司 | Objective group's division methods, model generating method, device, equipment and storage medium |
CN110264274B (en) * | 2019-06-21 | 2023-12-29 | 深圳前海微众银行股份有限公司 | Guest group dividing method, model generating method, device, equipment and storage medium |
CN110297848A (en) * | 2019-07-09 | 2019-10-01 | 深圳前海微众银行股份有限公司 | Recommended models training method, terminal and storage medium based on federation's study |
CN110297848B (en) * | 2019-07-09 | 2024-02-23 | 深圳前海微众银行股份有限公司 | Recommendation model training method, terminal and storage medium based on federal learning |
CN112214682A (en) * | 2019-07-11 | 2021-01-12 | 中移(苏州)软件技术有限公司 | Recommendation method, device and equipment based on field and storage medium |
CN112214682B (en) * | 2019-07-11 | 2023-04-07 | 中移(苏州)软件技术有限公司 | Recommendation method, device and equipment based on field and storage medium |
CN110543597A (en) * | 2019-08-30 | 2019-12-06 | 北京奇艺世纪科技有限公司 | Grading determination method and device and electronic equipment |
CN110543597B (en) * | 2019-08-30 | 2022-06-03 | 北京奇艺世纪科技有限公司 | Grading determination method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109446420B (en) | 2022-01-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111428147B (en) | Social recommendation method of heterogeneous graph volume network combining social and interest information | |
CN109446420A (en) | A kind of cross-domain collaborative filtering method and system | |
CN103353872B (en) | A kind of teaching resource personalized recommendation method based on neutral net | |
CN104462383B (en) | A kind of film based on a variety of behavior feedbacks of user recommends method | |
CN102591915B (en) | Recommending method based on label migration learning | |
CN111222332A (en) | Commodity recommendation method combining attention network and user emotion | |
CN106708953A (en) | Discrete particle swarm optimization based local community detection collaborative filtering recommendation method | |
CN103399858A (en) | Socialization collaborative filtering recommendation method based on trust | |
CN107808278A (en) | A kind of Github open source projects based on sparse self-encoding encoder recommend method | |
CN106874355A (en) | The collaborative filtering method of social networks and user's similarity is incorporated simultaneously | |
CN112699310A (en) | Cold start cross-domain hybrid recommendation method and system based on deep neural network | |
CN113190751A (en) | Recommendation algorithm for generating fused keywords | |
CN109086463A (en) | A kind of Ask-Answer Community label recommendation method based on region convolutional neural networks | |
CN104572915B (en) | One kind is based on the enhanced customer incident relatedness computation method of content environment | |
Topal et al. | Hybrid artificial intelligence based automatic determination of travel preferences of chinese tourists | |
Dwivedi et al. | Time-series data prediction problem analysis through multilayered intuitionistic fuzzy sets | |
CN112560105B (en) | Joint modeling method and device for protecting multi-party data privacy | |
Hua et al. | Social media based simulation models for understanding disease dynamics | |
CN108491477A (en) | Neural network recommendation method based on multidimensional cloud and user's dynamic interest | |
CN109344319B (en) | Online content popularity prediction method based on ensemble learning | |
CN112148994A (en) | Information push effect evaluation method and device, electronic equipment and storage medium | |
Nguyen et al. | A variational autoencoder mixture model for online behavior recommendation | |
Zahari et al. | Evaluation of sustainable development indicators with fuzzy TOPSIS based on subjective and objective weights | |
Ali et al. | Applications of Soft Computing for the Web | |
CN112559905B (en) | Conversation recommendation method based on dual-mode attention mechanism and social similarity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||