CN103488705A - User interest model incremental update method of personalized recommendation system - Google Patents

User interest model incremental update method of personalized recommendation system

Info

Publication number
CN103488705A
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310403293.3A
Other languages
Chinese (zh)
Other versions
CN103488705B (en)
Inventor
姚兴苗
夏春燕
伍盛
胡光岷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201310403293.3A
Publication of CN103488705A
Application granted
Publication of CN103488705B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an incremental update method for the user interest model of a personalized recommendation system. The basic principle is to store the intermediate result produced while computing the current user interest model and, when the model is updated, to perform an incremental calculation on top of that intermediate result. Provided that no interest information is lost during the update, the method allows the user interest model to be updated quickly and continuously even with large volumes of data, which improves the performance of the recommendation system and provides users with higher-quality service.

Description

Incremental update method for the user interest model of a personalized recommendation system

Technical field

The invention relates to the field of computer application technology, and in particular to a method for incrementally updating the user interest model of a personalized recommendation system.

Background

A personalized recommendation system establishes a binary relationship between users and recommended objects, uses existing selection histories or similarity relationships to mine the objects each user is potentially interested in, and then makes personalized recommendations (Liu Jianguo, Zhou Tao, Wang Binghong. Research progress of personalized recommendation systems [J]. Progress in Natural Science, 2009, 19(1), 1-15.). As user needs diversify, personalized recommendation systems are applied more widely: not only in e-commerce, but also to recommend web pages, documents, and so on. Copywriters and researchers frequently need to consult large volumes of literature. A personalized recommendation system based on document content collects and analyzes the content of documents the user has read and found interesting, in order to learn the user's reading interests and build a user interest model. It then compares document content against the user interest model and recommends the documents with a high degree of match. Such a system has three important modules: a user interest modeling module, a recommended object modeling module, and a recommendation algorithm module. The system model is shown in Figure 1.

In a recommendation system based on document content, the user interest modeling module is one of the core modules. Its role is to extract a user interest model from the documents of interest that the user has read, and to update the model as the user's interests change. To achieve high-precision recommendation, the user interest model must accurately describe the user's current interests, and updates to the model must track changes in those interests quickly.

There are currently two main methods for updating a user interest model: the time window method and the forgetting function method. The time window method uses a sliding time window to filter out outdated interests, and the forgetting function method uses a forgetting function to decay the weights of interests (Fei Hongxiao, Dai Yi, Mu Jun, et al. User interest drift method based on an optimized time window [J]. Computer Engineering, 2008, 34(16), 210-214.). The literature (SHIN H., CHO S. Neighborhood Property Based Pattern Selection for Support Vector Machines [J]. Neural Computation, 2007, 19(3), 816-855.) updates the user interest model with the time window method. The literature (KEERTHI S.S., SHEVADE S.K., BHATTACHARYYA., et al. A Fast Iterative Nearest Point Algorithm for Support Vector Machine Classifier Design [J]. IEEE Transactions on Neural Networks, 2000, 11(1), 124-136.) updates the user interest model with the forgetting function method. Shan Rong (Shan Rong. Research on the update and forgetting mechanism of user interest models [J]. Microcomputer Applications, 2011, 27(7), 10-11, 69) updates the interest model according to the characteristics of HTML documents and the user's browsing speed, and corrects the weights of feature words with a forgetting factor to implement forgetting. The literature (Li Feng, Pei Jun, You Zhiyang. Adaptive user interest model based on implicit feedback [J]. Computer Engineering and Applications, 2008, 44(9), 76-79.) divides user interests into short-term and long-term interests: short-term interests are updated with a time window mechanism, and long-term interests with a time-based forgetting function.

Existing update methods focus on how to remove documents that have drifted from the user's interests and how to add new documents of interest, so that the documents used to build the user interest model better reflect the user's current interests. They ignore the computational efficiency of the update. As the user reads more documents, the number of documents marked as interesting grows as well, and the cost of updating the user interest model becomes increasingly significant. The update eventually becomes too slow to meet the user's needs.

Summary of the invention

The technical problem to be solved by the invention is to address the shortcomings of the prior art by providing an incremental update method for the user interest model of a personalized recommendation system. While ensuring that no interest information is lost during the update, the method improves the computational efficiency of updating the user interest model, so that the model can be updated quickly and continuously even with very large volumes of data. This improves the performance of the personalized recommendation system and provides users with higher-quality service.

To solve the above technical problem, the invention adopts the following technical solution: an incremental update method for the user interest model of a personalized recommendation system, comprising:

1) Construct a user interest vector space model $U_0$ based on document content;

2) Let $D_0=\{d_{01},d_{02},\dots,d_{0m}\}$ be the set of documents of interest to the user from which the user interest vector space model $U_0$ was built, and let $D=\{d_1,d_2,\dots,d_n\}$ be the set of documents to be recommended, where the feature vector of document $d_i$ is $W_{d_i}=\{(t_{i1},w_{i1}),(t_{i2},w_{i2}),\dots,(t_{ia},w_{ia})\}$. Here $d_{0e}$ denotes a document in $D_0$, $e=1,2,\dots,m$, and $m$ is the total number of documents in $D_0$; $t_{ik}$ denotes the $k$-th feature word of document $d_i$; $w_{ik}$ denotes the weight of the $k$-th feature word of document $d_i$; $i=1,2,\dots,n$; $k=1,2,\dots,a$; and $a$ is the total number of feature words of document $d_i$. The set of documents to be recommended is generally collected from the network or obtained from literature sources;

3) When recommending documents, compute the similarity $r$ between the feature vector of every document in the candidate set $D$ and the user interest vector space model $U_0$, and recommend the documents whose similarity $r$ is greater than a threshold $\alpha$. The user then feeds back to the personalized recommendation system the new documents of interest; denote this new document set $D'=\{d_{p1},d_{p2},\dots,d_{pq}\}$. The threshold $\alpha$ lies between 0 and 1 and is adjusted to the user's needs: a value closer to 0 yields more recommendations, and a value closer to 1 yields more accurate recommendations. To select the documents in the user's set of interest that are outdated or have drifted from the user's interests, compute the similarity $r'$ between the feature vector of each document in $D_0$ and the user interest vector space model $U_0$, and select the documents whose $r'$ is less than the threshold $\alpha$; denote this set of outdated or drifted documents $D''=\{d_{b1},d_{b2},\dots,d_{bc}\}$. Here $d_{pf}$ is a document in the new document set $D'$, $f=1,2,\dots,q$, and $q$ is the total number of documents in $D'$; $d_{bh}$ is a document in the set $D''$ of outdated or drifted documents, $h=1,2,\dots,c$, and $c$ is the total number of documents in $D''$;

4) To add documents of interest, add the new document set $D'$ to the user's document set $D_0$, forming a new first set of documents of interest $D_1$. To remove outdated or drifted documents, remove the set $D''$ from the user's document set $D_0$, forming a new second set of documents of interest $D_2$;

5) Compute the center vector $\overline{W_{D_1}}$ of the new first set of documents of interest $D_1$ according to:

$$\overline{W_{D_1}} = \frac{\sum_{e=1}^{m} W_{d_{0e}} + \sum_{f=1}^{q} W_{d_{pf}}}{m+q} = \frac{m\,\overline{W_{D_0}} + \sum_{f=1}^{q} W_{d_{pf}}}{m+q};$$

where $W_{d_{0e}}$ is the feature vector of the $e$-th document in the user's document set $D_0$; $W_{d_{pf}}$ is the feature vector of the $f$-th document in the new document set $D'$; $q$ is the total number of documents in $D'$; $\overline{W_{D_0}}$ is the center vector of $D_0$; $m$ is the total number of documents in $D_0$; $e=1,2,\dots,m$; $f=1,2,\dots,q$;

Compute the center vector $\overline{W_{D_2}}$ of the new second set of documents of interest $D_2$ according to:

$$\overline{W_{D_2}} = \frac{\sum_{e=1}^{m} W_{d_{0e}} - \sum_{h=1}^{c} W_{d_{bh}}}{m-c} = \frac{m\,\overline{W_{D_0}} - \sum_{h=1}^{c} W_{d_{bh}}}{m-c};$$

where $W_{d_{0e}}$ is the feature vector of the $e$-th document in the user's document set $D_0$; $W_{d_{bh}}$ is the feature vector of the $h$-th document in the set $D''$ of outdated or drifted documents; $c$ is the total number of documents in $D''$; $\overline{W_{D_0}}$ is the center vector of $D_0$; $m$ is the total number of documents in $D_0$; $e=1,2,\dots,m$; $h=1,2,\dots,c$;

6) Sort the dimensions of $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$ by weight in descending order, and select the first $N$ dimensions of $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$ to construct the new user interest vector space model $U_1$ or $U_2$. At the same time, store $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$ in the personalized recommendation system. Here $N$ does not exceed the dimension of $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$. Use the new user interest vector space model $U_1$ or $U_2$ in place of $U_0$ in step 1) for a new round of recommendation.

In step 1), the user interest vector space model $U_0$ based on document content is constructed as follows:

1) Perform feature word selection and feature word weight calculation on all documents of interest to the user. Feature words and their weights can be obtained with the keyword extraction function of the ICTCLAS Chinese word segmentation software (http://ictclas.nlpir.org/), or with a feature word selection method based on word frequency;

2) Extract the feature vectors of all documents of interest to the user to form the document feature vector set $D_3$;

3) Compute the center vector of the document feature vector set $D_3$, sort its dimensions by weight in descending order, and select the first $M$ dimensions as the user interest vector space model $U_0$, where $M$ does not exceed the dimension of the center vector of $D_3$.

The center vector $\overline{W_{D_3}}$ of the document feature vector set $D_3=\{d_{31},d_{32},\dots,d_{3x}\}$ is computed as:

$$\overline{W_{D_3}} = \frac{\sum_{y=1}^{x} W_{d_{3y}}}{x};$$

where $x$ is the number of elements in the document feature vector set $D_3$, and $W_{d_{3y}}$ is the feature vector of the $y$-th document in $D_3$; $y=1,2,\dots,x$.

The similarity $r$ between the feature vector of document $d_i$ in the candidate set $D$ and the user interest vector space model $U_0$ is computed as:

$$r = \cos(W_{d_i}, U_0) = \frac{W_{d_i}\cdot U_0}{\|W_{d_i}\|_2 \times \|U_0\|_2};$$

where $\|\cdot\|_2$ denotes the 2-norm.

The similarity $r'$ between the feature vector of the $e$-th document in the user's document set $D_0$ and the user interest vector space model $U_0$ is computed as:

$$r' = \cos(W_{d_{0e}}, U_0) = \frac{W_{d_{0e}}\cdot U_0}{\|W_{d_{0e}}\|_2 \times \|U_0\|_2}.$$
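The two similarity formulas above can be sketched in code. This is a minimal illustration, assuming each feature vector is stored as a sparse Python dictionary mapping feature words to weights; the function name `cosine_similarity` and the choice to return 0.0 for an empty vector are assumptions made here, not part of the patent text.

```python
import math


def cosine_similarity(doc_vec, model_vec):
    """Cosine of the angle between two sparse term -> weight vectors."""
    # Only feature words present in both vectors contribute to the dot product.
    dot = sum(w * model_vec[t] for t, w in doc_vec.items() if t in model_vec)
    norm_doc = math.sqrt(sum(w * w for w in doc_vec.values()))
    norm_model = math.sqrt(sum(w * w for w in model_vec.values()))
    if norm_doc == 0.0 or norm_model == 0.0:
        return 0.0  # assumed convention for an empty vector
    return dot / (norm_doc * norm_model)
```

The same function serves for both $r$ (candidate documents against $U_0$) and $r'$ (documents already in $D_0$ against $U_0$), since the two formulas differ only in which document vector is passed in.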

The basic idea of the incremental update method proposed by the invention is to store the intermediate result produced while computing the current user interest model and, when the model is updated, to perform an incremental calculation on top of that intermediate result.

Compared with the prior art, the invention has the following beneficial effects. It addresses the efficiency of updating the user interest model in a recommendation system based on document content. While keeping the user's information complete, the update method reduces the amount of computation required when the user interest model is updated, so that the model can be updated quickly and frequently. This improves the performance of the personalized recommendation system, lets it track user interests quickly and adapt to changes in them, and provides users with higher-quality service.

Description of drawings

Figure 1 shows a recommendation system based on document content information;

Figure 2 shows the construction flow of the user interest model of the invention.

Detailed description

The flow for constructing the user interest vector space model based on document content is shown in Figure 2. First, feature word selection and feature word weight calculation are performed on each document of interest to the user, producing a document feature vector that consists of a set of feature words and their weights. Document feature vectors can be extracted with the feature word extraction function of the ICTCLAS Chinese word segmentation software (http://ictclas.nlpir.org/), or with a feature word selection method based on word frequency. Multiple document feature vectors form a document feature vector set. After the center vector of the document feature vector set is computed, its dimensions are sorted by weight in descending order, and the first M dimensions are selected as the user's interest model vector.

The center vector of the document feature vector set is computed as follows:

For a document set $D_3=\{d_{31},d_{32},\dots,d_{3x}\}$, the feature vector of document $d_{3i}$ is $W_{d_{3i}}=\{(t_{3i1},w_{3i1}),(t_{3i2},w_{3i2}),\dots,(t_{3im},w_{3im})\}$, where $t_{3ik}$ denotes the $k$-th feature word of document $d_{3i}$ and $w_{3ik}$ denotes the weight of the $k$-th feature word of document $d_{3i}$. The center vector $\overline{W_{D_3}}$ is then computed as:

$$\overline{W_{D_3}} = \frac{\sum_{y=1}^{x} W_{d_{3y}}}{x} \qquad (1)$$

In this formula, document feature vectors are summed by matching the feature word of each dimension: where feature words are the same, the corresponding weights are added. After the dimensions of the center vector are sorted by weight, the first M items form the user's interest model U. M does not exceed the dimension of the center vector and is generally determined empirically from training samples.
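Formula (1) and the top-M selection can be sketched as follows. This is an illustrative sketch only, assuming sparse dictionaries of feature word to weight; the names `center_vector` and `top_terms` are hypothetical helpers introduced here, and a feature word absent from a document is treated as having weight 0 in that document.

```python
from collections import defaultdict


def center_vector(doc_vectors):
    """Formula (1): sum weights of matching feature words, divide by x."""
    if not doc_vectors:
        raise ValueError("at least one document vector is required")
    totals = defaultdict(float)
    for vec in doc_vectors:
        for term, weight in vec.items():
            totals[term] += weight  # same feature word: weights are added
    x = len(doc_vectors)
    return {term: total / x for term, total in totals.items()}


def top_terms(center, m):
    """Keep the m heaviest dimensions of the center vector as the model."""
    ranked = sorted(center.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:m])
```

For example, `top_terms(center_vector(vectors), 5)` would yield a five-term interest model, while the untruncated result of `center_vector` is the intermediate result that the method stores for later updates.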

Suppose the documents of interest to the user are $\{d_1,d_2,d_3\}$. The process of building the user interest model is shown in Table 1.

Table 1. User interest model building process

[Table 1: provided as an image in the original publication]

The center vector in the table is computed with formula (1). Here the first 5 feature items of the center vector are selected as the user interest model U.

The incremental update method proposed by the invention is implemented in the following steps:

Let $U_0$ be the user interest model already established for the user, and let $D_0=\{d_{01},d_{02},\dots,d_{0m}\}$ be the set of documents of interest from which it was built. The document set $D=\{d_1,d_2,\dots,d_n\}$ is the set of documents to be recommended, and the feature vector of document $d_i$ is $W_{d_i}=\{(t_{i1},w_{i1}),(t_{i2},w_{i2}),\dots,(t_{ia},w_{ia})\}$.

(1) When recommending documents, compute the similarity $r$ between the feature vector of every document in $D$ and the user model $U_0$ with the cosine formula, and recommend the documents whose similarity $r$ is greater than the threshold $\alpha$. After browsing, the user feeds back to the system the new documents of interest; denote this document set $D'=\{d_{p1},d_{p2},\dots,d_{pq}\}$. To select the documents in the user's set of interest that are outdated or have drifted from the user's interests, compute the similarity $r'$ between the feature vector of each document in $D_0$ and the user interest vector space model $U_0$, and select the documents whose $r'$ is less than the threshold $\alpha$; denote this set $D''=\{d_{b1},d_{b2},\dots,d_{bc}\}$;

(2) To add documents of interest, add the new document set $D'$ to the user's document set $D_0$, forming a new set of documents of interest $D_1$. To remove outdated or drifted documents, remove the set $D''$ from the user's document set $D_0$, forming a new set of documents of interest $D_2$;
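Steps (1) and (2) can be sketched as a pair of threshold filters. The sketch below is illustrative only: documents are assumed to be held in dictionaries keyed by a document identifier, and `cosine` and `select_documents` are hypothetical names. Note that the first filter yields the recommended documents; the new document set $D'$ is whichever of those the user then marks as interesting, which code cannot decide.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse term -> weight vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_documents(candidates, interest_docs, model, alpha):
    """Return (ids to recommend from D, ids of stale documents in D0)."""
    # Recommend candidates whose similarity r exceeds the threshold.
    recommended = [doc_id for doc_id, vec in candidates.items()
                   if cosine(vec, model) > alpha]
    # Flag documents of interest whose similarity r' falls below it.
    stale = [doc_id for doc_id, vec in interest_docs.items()
             if cosine(vec, model) < alpha]
    return recommended, stale
```

Raising `alpha` toward 1 shrinks the recommended list and enlarges the stale list, matching the trade-off between more results and more accurate results described for the threshold.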

(3) To retain the user's interests completely, avoid repeated computation, and improve performance, the system has already stored the center vector $\overline{W_{D_0}}$ of the document set $D_0$ that was computed when building the user interest model $U_0$. Formula (1) is rearranged into formula (2) to compute the center vector of the new interest model after new documents are added:

$$\overline{W_{D_1}} = \frac{\sum_{e=1}^{m} W_{d_{0e}} + \sum_{f=1}^{q} W_{d_{pf}}}{m+q} = \frac{m\,\overline{W_{D_0}} + \sum_{f=1}^{q} W_{d_{pf}}}{m+q} \qquad (2)$$

Formula (2) is adapted into formula (3) to compute the center vector of the new interest model after outdated or drifted documents are removed:

$$\overline{W_{D_2}} = \frac{\sum_{e=1}^{m} W_{d_{0e}} - \sum_{h=1}^{c} W_{d_{bh}}}{m-c} = \frac{m\,\overline{W_{D_0}} - \sum_{h=1}^{c} W_{d_{bh}}}{m-c} \qquad (3)$$

(4) Sort the dimensions of $\overline{W_{D_1}}$ ($\overline{W_{D_2}}$) by weight in descending order, select the first N dimensions to construct the new user interest model $U_1$ ($U_2$), and store $\overline{W_{D_1}}$ ($\overline{W_{D_2}}$) in the system. Use the new user interest model $U_1$ ($U_2$) in place of $U_0$ in step (1) for the next stage of recommendation.

Formulas (2) and (3) both contain the center vector $\overline{W_{D_0}}$. This center vector is an intermediate result of the previous computation of the user interest model. The core of the invention is to save the corresponding center vector each time the user interest model is updated, so that the next update does not need to recompute it, which improves update efficiency.
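The incremental updates of formulas (2) and (3) can be sketched as below. This is a minimal sketch under stated assumptions: the stored intermediate result is the full, untruncated center vector together with the document count m, both kept as plain Python values; `add_documents`, `remove_documents`, and `_update_center` are hypothetical names, not from the patent.

```python
def _update_center(center, m, vectors, sign):
    """Shared arithmetic for formulas (2) (sign=+1) and (3) (sign=-1)."""
    new_count = m + sign * len(vectors)
    if new_count <= 0:
        raise ValueError("the updated document set must be non-empty")
    # m * center recovers the sum of the m original document vectors.
    totals = {term: weight * m for term, weight in center.items()}
    for vec in vectors:
        for term, weight in vec.items():
            totals[term] = totals.get(term, 0.0) + sign * weight
    new_center = {term: total / new_count for term, total in totals.items()}
    return new_center, new_count


def add_documents(center, m, new_vectors):
    """Formula (2): center of D1 from the stored center of D0 and D'."""
    return _update_center(center, m, new_vectors, +1)


def remove_documents(center, m, stale_vectors):
    """Formula (3): center of D2 from the stored center of D0 and D''."""
    return _update_center(center, m, stale_vectors, -1)
```

Each update touches only the stored center vector and the q added or c removed documents, rather than all m original document vectors. One caveat of this floating-point sketch: after a removal, a feature word that no longer occurs in any remaining document may be left with a weight very close to, but not exactly, zero.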

Take the example in Table 2. When the user interest model is updated by adding document $d_4$, let $d_4$ = {{car, 4.0}, {insurance, 3.6}, {domestic, 2.5}, {price increase, 2.0}}. The update is computed on the basis of the stored center vector. For the feature word "car", the weight $w_1$ is computed as in formula (4):

$$w_1 = \frac{3.2 \times 3 + 4.0}{3+1} = 3.4 \qquad (4)$$

When the model is updated by removing document $d_1$, the weight $w_2$ of the feature word "car" is computed as in formula (5):

$$w_2 = \frac{3.2 \times 3 - 5.3}{3-1} = 2.15 \qquad (5)$$

The remaining weights of the new center vector are obtained in the same way; the update results are shown in Table 2. This example performs the incremental calculation on only 3 documents of interest, so the gain in computational efficiency is not obvious here. The example serves only to illustrate the incremental update algorithm. In practice, the number of documents a user has marked as interesting is relatively large, while the number of documents added or removed is relatively small, and the efficiency gain of the incremental update algorithm is then more pronounced.
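The arithmetic of formulas (4) and (5) can be checked with a few lines of code. The per-term helper functions below are hypothetical and introduced only for this check; the input values (stored weight 3.2 over 3 documents, weight 4.0 in $d_4$, weight 5.3 in $d_1$) are the ones given in the example.

```python
import math


def weight_after_adding(old_weight, m, added_weights):
    """Per-term form of formula (2)."""
    return (old_weight * m + sum(added_weights)) / (m + len(added_weights))


def weight_after_removing(old_weight, m, removed_weights):
    """Per-term form of formula (3)."""
    return (old_weight * m - sum(removed_weights)) / (m - len(removed_weights))


w1 = weight_after_adding(3.2, 3, [4.0])    # formula (4)
w2 = weight_after_removing(3.2, 3, [5.3])  # formula (5)

# Compare with a tolerance because of floating-point rounding.
assert math.isclose(w1, 3.4)
assert math.isclose(w2, 2.15)
```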

[Table 2: incremental update results, provided as an image in the original publication]

Comparing the user interest model extraction in Table 1 with the incremental update process proposed by the present invention in Table 2, it can be seen that the center vector is an intermediate result of the previous creation or update of the user interest model. The present invention performs the incremental update on the basis of this intermediate result, thereby avoiding a large amount of vector summation. It can also be seen that the center vector obtained by the proposed incremental update method is the same as the one extracted directly from the updated document set. In general, the documents used to build a new user interest model consist of two parts: the first is the newly added documents of interest; the second is what remains of the original documents of interest after the documents that deviate from the user's current interests have been removed, and this second part accounts for the vast majority of the documents. The significance of the proposed incremental update is that it avoids recomputing over the second part, which effectively reduces the computation needed to update the user interest model.
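The claim that the incremental result equals the center vector recomputed from the updated document set can be illustrated with a minimal sketch over sparse term-weight dictionaries. Everything here is an assumption made for illustration: the function names, the dictionary representation, and the weights in `d1`, `d2`, `d3` other than the 5.3 for "automobile" in d1 (chosen so that the "automobile" center weight is 3.2, as in the example). Only `d4` is taken from the text. The patent treats addition and removal as separate cases; the sketch accepts both in one call for brevity.

```python
import math


def center(docs):
    """Batch center vector: per-term mean over all documents (absent term = 0)."""
    total = {}
    for doc in docs:
        for term, weight in doc.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: s / len(docs) for term, s in total.items()}


def update_center(old_center, m, added=(), removed=()):
    """Incremental center: (m * old_center + sum(added) - sum(removed)) / (m + q - c)."""
    new_count = m + len(added) - len(removed)
    if new_count <= 0:
        raise ValueError("the updated document set would be empty")
    total = {term: weight * m for term, weight in old_center.items()}
    for doc in added:
        for term, weight in doc.items():
            total[term] = total.get(term, 0.0) + weight
    for doc in removed:
        for term, weight in doc.items():
            total[term] = total.get(term, 0.0) - weight
    return {term: s / new_count for term, s in total.items()}


def same_vector(a, b, tol=1e-9):
    """Compare two sparse vectors, treating missing terms as weight 0."""
    return all(math.isclose(a.get(t, 0.0), b.get(t, 0.0), abs_tol=tol)
               for t in set(a) | set(b))


# Hypothetical document set; only d4 comes from the worked example.
d1 = {"automobile": 5.3, "engine": 2.0}
d2 = {"automobile": 2.1, "insurance": 3.0}
d3 = {"automobile": 2.2, "domestic": 1.5}
d4 = {"automobile": 4.0, "insurance": 3.6, "domestic": 2.5, "gain": 2.0}

c0 = center([d1, d2, d3])  # stored intermediate result
after_add = update_center(c0, 3, added=[d4])
after_remove = update_center(c0, 3, removed=[d1])

print(same_vector(after_add, center([d1, d2, d3, d4])))
print(same_vector(after_remove, center([d2, d3])))
```

The incremental path touches only the stored center vector and the added or removed documents, whereas the batch path re-sums every document in the set.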

Claims (5)

1. A method for incrementally updating the user interest model of a personalized recommendation system, characterized by comprising the following steps:
1) constructing a user interest vector space model $U_0$ based on document content;
2) establishing the user interest document set $D_0 = \{d_{01}, d_{02}, \ldots, d_{0m}\}$ of the user interest vector space model $U_0$, and letting $D = \{d_1, d_2, \ldots, d_n\}$ be the set of documents to be recommended, wherein the feature vector of document $d_i$ is $W_{d_i} = \{(t_{i1}, w_{i1}), (t_{i2}, w_{i2}), \ldots, (t_{ia}, w_{ia})\}$; wherein $d_{0e}$ denotes a document in the user interest document set $D_0$, $e = 1, 2, \ldots, m$, and $m$ is the total number of documents in $D_0$; $t_{ik}$ denotes the $k$-th feature word of document $d_i$; $w_{ik}$ denotes the weight of the $k$-th feature word of document $d_i$; $i = 1, 2, \ldots, n$; $k = 1, 2, \ldots, a$; $a$ denotes the total number of feature words of document $d_i$;
3) when recommending documents, calculating the similarity $r$ between the feature vector of each document in the set of documents to be recommended $D$ and the user interest vector space model $U_0$, recommending the documents whose similarity $r$ is greater than a threshold $\alpha$, and feeding the new documents of interest back to the personalized recommendation system, the new document set being $D' = \{d_{p1}, d_{p2}, \ldots, d_{pq}\}$; when selecting the documents in the user interest document set that are outdated or deviate from the user's interests, calculating the similarity $r'$ between the feature vector of each document in the user interest document set $D_0$ and the user interest vector space model $U_0$, and selecting the documents whose $r'$ is smaller than the threshold $\alpha$ as the outdated or deviating documents, the set of these documents being $D'' = \{d_{b1}, d_{b2}, \ldots, d_{bc}\}$; the threshold $\alpha$ ranges from 0 to 1; $d_{pf}$ is a document in the new document set $D'$, $f = 1, 2, \ldots, q$, and $q$ is the total number of documents in $D'$; $d_{bh}$ is a document in the set $D''$ of outdated or deviating documents, $h = 1, 2, \ldots, c$, and $c$ is the total number of documents in $D''$;
4) when adding to the user interest document set, adding the new document set $D'$ to the user interest document set $D_0$ to form a new first user interest document set $D_1$; or, when removing the outdated or deviating documents from the user interest document set $D_0$, removing the set $D''$ from $D_0$ to form a new second user interest document set $D_2$;
5) calculating the center vector $\overline{W_{D_1}}$ of the new first user interest document set $D_1$ according to the following formula:
<math> <mrow> <mover> <msub> <mi>W</mi> <msub> <mi>D</mi> <mn>1</mn> </msub> </msub> <mo>&OverBar;</mo> </mover> <mo>=</mo> <mfrac> <mrow> <munderover> <mi>&Sigma;</mi> <mrow> <mi>e</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>m</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mrow> <mn>0</mn> <mi>e</mi> </mrow> </msub> </msub> <mo>+</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>f</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>q</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>pf</mi> </msub> </msub> </mrow> <mrow> <mi>m</mi> <mo>+</mo> <mi>q</mi> </mrow> </mfrac> <mo>=</mo> <mfrac> <mrow> <mi>m</mi> <mover> <msub> <mi>W</mi> <msub> <mi>D</mi> <mn>0</mn> </msub> </msub> <mo>&OverBar;</mo> </mover> <mo>+</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>f</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>q</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>pf</mi> </msub> </msub> </mrow> <mrow> <mi>m</mi> <mo>+</mo> <mi>q</mi> </mrow> </mfrac> <mo>;</mo> </mrow> </math>
wherein $W_{d_{0e}}$ is the feature vector of the $e$-th document in the user interest document set $D_0$; $W_{d_{pf}}$ is the feature vector of the $f$-th document in the new document set $D'$; and $\overline{W_{D_0}}$ is the center vector of the user interest document set $D_0$;
calculating the center vector $\overline{W_{D_2}}$ of the new second user interest document set $D_2$ according to the following formula:
<math> <mrow> <mover> <msub> <mi>W</mi> <msub> <mi>D</mi> <mn>2</mn> </msub> </msub> <mo>&OverBar;</mo> </mover> <mo>=</mo> <mfrac> <mrow> <munderover> <mi>&Sigma;</mi> <mrow> <mi>e</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>m</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mrow> <mn>0</mn> <mi>e</mi> </mrow> </msub> </msub> <mo>-</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>h</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>bh</mi> </msub> </msub> </mrow> <mrow> <mi>m</mi> <mo>-</mo> <mi>c</mi> </mrow> </mfrac> <mo>=</mo> <mfrac> <mrow> <mi>m</mi> <mover> <msub> <mi>W</mi> <msub> <mi>D</mi> <mn>0</mn> </msub> </msub> <mo>&OverBar;</mo> </mover> <mo>-</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>h</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>c</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>bh</mi> </msub> </msub> </mrow> <mrow> <mi>m</mi> <mo>-</mo> <mi>c</mi> </mrow> </mfrac> <mo>;</mo> </mrow> </math>
wherein $W_{d_{bh}}$ is the feature vector of the $h$-th document in the set $D''$ of outdated or deviating documents;
6) sorting all dimensions of $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$ by weight in descending order, selecting the first $N$ dimensions of $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$ to construct a new user interest vector space model $U_1$ or $U_2$, and at the same time storing $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$ in the personalized recommendation system; wherein $N$ does not exceed the dimension of $\overline{W_{D_1}}$ or $\overline{W_{D_2}}$; and replacing $U_0$ in step 1) with the new user interest vector space model $U_1$ or $U_2$ to carry out a new round of recommendation.
2. The method for incrementally updating the user interest model of a personalized recommendation system according to claim 1, wherein constructing the user interest vector space model $U_0$ based on document content in step 1) comprises the following specific steps:
1) performing feature word selection and feature word weight calculation on all documents in which the user is interested;
2) extracting the feature vectors of all documents in which the user is interested to form a document feature vector set $D_3$;
3) computing the center vector of the document feature vector set $D_3$, sorting the dimensions of this center vector by weight in descending order, and selecting the first $M$ dimensions as the user interest vector space model $U_0$; wherein $M$ does not exceed the dimension of the center vector of $D_3$.
3. The method according to claim 2, wherein the center vector $\overline{W_{D_3}}$ of the document feature vector set $D_3 = \{d_{31}, d_{32}, \ldots, d_{3x}\}$ is calculated as follows:
<math> <mrow> <mover> <msub> <mi>W</mi> <msub> <mi>D</mi> <mn>3</mn> </msub> </msub> <mo>&OverBar;</mo> </mover> <mo>=</mo> <mfrac> <mrow> <munderover> <mi>&Sigma;</mi> <mrow> <mi>y</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>x</mi> </munderover> <msub> <mi>W</mi> <msub> <mi>d</mi> <mrow> <mn>3</mn> <mi>y</mi> </mrow> </msub> </msub> </mrow> <mi>x</mi> </mfrac> <mo>;</mo> </mrow> </math>
wherein $x$ is the number of elements in the document feature vector set $D_3$; $W_{d_{3y}}$ is the feature vector of the $y$-th document in $D_3$; $y = 1, 2, \ldots, x$.
4. The method for incrementally updating the user interest model of a personalized recommendation system according to any one of claims 1 to 3, wherein the similarity $r$ between the feature vector of document $d_i$ in the set of documents to be recommended $D$ and the user interest vector space model $U_0$ is calculated as follows:
<math> <mrow> <mi>r</mi> <mo>=</mo> <mi>cos</mi> <mrow> <mo>(</mo> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>i</mi> </msub> </msub> <mo>,</mo> <msub> <mi>U</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> <mo>=</mo> <mfrac> <mrow> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>i</mi> </msub> </msub> <mo>&CenterDot;</mo> <msub> <mi>U</mi> <mn>0</mn> </msub> </mrow> <mrow> <msub> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>W</mi> <msub> <mi>d</mi> <mi>i</mi> </msub> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mn>2</mn> </msub> <mo>&times;</mo> <msub> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>U</mi> <mn>0</mn> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mn>2</mn> </msub> </mrow> </mfrac> <mo>;</mo> </mrow> </math>
wherein $\|\cdot\|_2$ denotes the two-norm.
5. The method for incrementally updating the user interest model of a personalized recommendation system according to claim 4, wherein the similarity $r'$ between the feature vector of the $e$-th document in the user interest document set $D_0$ and the user interest vector space model $U_0$ is calculated as follows:
<math> <mrow> <msup> <mi>r</mi> <mo>&prime;</mo> </msup> <mo>=</mo> <mi>cos</mi> <mrow> <mo>(</mo> <msub> <mi>W</mi> <msub> <mi>d</mi> <mrow> <mn>0</mn> <mi>e</mi> </mrow> </msub> </msub> <mo>,</mo> <msub> <mi>U</mi> <mn>0</mn> </msub> <mo>)</mo> </mrow> <mo>=</mo> <mfrac> <mrow> <msub> <mi>W</mi> <msub> <mi>d</mi> <mrow> <mn>0</mn> <mi>e</mi> </mrow> </msub> </msub> <mo>&CenterDot;</mo> <msub> <mi>U</mi> <mn>0</mn> </msub> </mrow> <mrow> <msub> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>W</mi> <msub> <mi>d</mi> <mrow> <mn>0</mn> <mi>e</mi> </mrow> </msub> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mn>2</mn> </msub> <mo>&times;</mo> <msub> <mrow> <mo>|</mo> <mo>|</mo> <msub> <mi>U</mi> <mn>0</mn> </msub> <mo>|</mo> <mo>|</mo> </mrow> <mn>2</mn> </msub> </mrow> </mfrac> <mo>.</mo> </mrow> </math>
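The similarity test, the threshold filtering, and the top-N selection described in the claims can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: vectors are sparse term-weight dictionaries, the function names `cosine`, `recommend`, `find_stale`, and `top_n` are hypothetical, all sample weights are made up, and returning 0.0 for a zero vector is a choice made here that the claims do not specify.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors: (u . v) / (||u||_2 * ||v||_2)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar
    return dot / (norm_u * norm_v)


def recommend(candidates, model, alpha):
    """Candidate documents whose similarity r to the model exceeds alpha."""
    return [doc for doc in candidates if cosine(doc, model) > alpha]


def find_stale(interest_docs, model, alpha):
    """Documents of interest whose similarity r' to the model is below alpha."""
    return [doc for doc in interest_docs if cosine(doc, model) < alpha]


def top_n(center_vector, n):
    """Keep the n highest-weighted dimensions of a center vector."""
    ranked = sorted(center_vector.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


# Hypothetical model and documents, for illustration only.
model = {"automobile": 3.2, "insurance": 1.0}
candidates = [{"automobile": 4.0, "insurance": 3.6}, {"football": 2.0}]

recommended = recommend(candidates, model, alpha=0.5)
stale = find_stale(candidates, model, alpha=0.5)
new_model = top_n({"automobile": 3.4, "insurance": 1.65, "gain": 0.5}, 2)
print(len(recommended), len(stale), sorted(new_model))
```

In a full update cycle, the documents returned by `recommend` that the user confirms would form D', the documents returned by `find_stale` would form D'', and `top_n` would be applied to the incrementally updated center vector to build the next model.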
CN201310403293.3A 2013-09-06 2013-09-06 The user interest model increment updating method of personalized recommendation system Expired - Fee Related CN103488705B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310403293.3A CN103488705B (en) 2013-09-06 2013-09-06 The user interest model increment updating method of personalized recommendation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310403293.3A CN103488705B (en) 2013-09-06 2013-09-06 The user interest model increment updating method of personalized recommendation system

Publications (2)

Publication Number Publication Date
CN103488705A true CN103488705A (en) 2014-01-01
CN103488705B CN103488705B (en) 2016-06-22

Family

ID=49828931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310403293.3A Expired - Fee Related CN103488705B (en) 2013-09-06 2013-09-06 The user interest model increment updating method of personalized recommendation system

Country Status (1)

Country Link
CN (1) CN103488705B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239512A (en) * 2014-09-16 2014-12-24 电子科技大学 Text recommendation method
CN104268760A (en) * 2014-09-24 2015-01-07 同济大学 User interest obtaining and transmitting method and system
CN105260481A (en) * 2015-11-13 2016-01-20 合一网络技术(北京)有限公司 Evaluation method and system of push list diversity
CN106055661A (en) * 2016-06-02 2016-10-26 福州大学 Multi-interest resource recommendation method based on multi-Markov-chain model
WO2017088587A1 (en) * 2015-11-24 2017-06-01 华为技术有限公司 Data processing method and device
CN107562912A (en) * 2017-09-12 2018-01-09 电子科技大学 Sina Weibo event recommendation method
CN107635004A (en) * 2017-09-26 2018-01-26 义乌控客科技有限公司 A kind of personalized service method for customizing in intelligent domestic system
WO2018028326A1 (en) * 2016-08-08 2018-02-15 华为技术有限公司 Model updating method and apparatus
CN108446350A (en) * 2018-03-09 2018-08-24 华中科技大学 A kind of recommendation method based on topic model analysis and user's length interest
CN110287202A (en) * 2019-05-16 2019-09-27 北京百度网讯科技有限公司 Data-updating method, device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101339562A (en) * 2008-08-15 2009-01-07 北京航空航天大学 A Portal Personalized Recommendation Service System Introducing an Interest Model Feedback and Update Mechanism
CN101477554A (en) * 2009-01-16 2009-07-08 西安电子科技大学 User interest based personalized meta search engine and search result processing method
CN102890689A (en) * 2011-07-22 2013-01-23 北京百度网讯科技有限公司 Method and system for building user interest model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
INGRID ZUKERMAN AND DAVID W. ALBRECHT: "Predictive Statistical Models for User Modeling", 《USER MODELING AND USER-ADAPTED INTERACTION》, 31 December 2001 (2001-12-31), pages 5 - 18, XP008026202, DOI: doi:10.1023/A:1011175525451 *
李峰,等。: "基于隐式反馈的自适应用户兴趣模型", 《计算机工程与应用》, 31 December 2008 (2008-12-31) *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239512A (en) * 2014-09-16 2014-12-24 电子科技大学 Text recommendation method
CN104239512B (en) * 2014-09-16 2017-06-06 电子科技大学 A kind of text recommends method
CN104268760A (en) * 2014-09-24 2015-01-07 同济大学 User interest obtaining and transmitting method and system
CN104268760B (en) * 2014-09-24 2017-06-13 同济大学 A kind of user interest is obtained and transmission method and its system
CN105260481A (en) * 2015-11-13 2016-01-20 合一网络技术(北京)有限公司 Evaluation method and system of push list diversity
CN105260481B (en) * 2015-11-13 2019-09-17 优酷网络技术(北京)有限公司 A kind of multifarious evaluating method of push list and system
WO2017088587A1 (en) * 2015-11-24 2017-06-01 华为技术有限公司 Data processing method and device
CN106055661A (en) * 2016-06-02 2016-10-26 福州大学 Multi-interest resource recommendation method based on multi-Markov-chain model
CN106055661B (en) * 2016-06-02 2017-11-17 福州大学 More interest resource recommendations based on more Markov chain models
WO2018028326A1 (en) * 2016-08-08 2018-02-15 华为技术有限公司 Model updating method and apparatus
CN107704929A (en) * 2016-08-08 2018-02-16 华为技术有限公司 A kind of model update method and device
CN107704929B (en) * 2016-08-08 2020-10-23 华为技术有限公司 Model updating method and device
CN107562912A (en) * 2017-09-12 2018-01-09 电子科技大学 Sina Weibo event recommendation method
CN107562912B (en) * 2017-09-12 2021-08-27 电子科技大学 Sina microblog event recommendation method
CN107635004A (en) * 2017-09-26 2018-01-26 义乌控客科技有限公司 A kind of personalized service method for customizing in intelligent domestic system
CN107635004B (en) * 2017-09-26 2020-12-08 杭州控客信息技术有限公司 Personalized service customization method in intelligent home system
CN108446350A (en) * 2018-03-09 2018-08-24 华中科技大学 A kind of recommendation method based on topic model analysis and user's length interest
CN108446350B (en) * 2018-03-09 2020-05-19 华中科技大学 Recommendation method based on topic model analysis and long and short interests of user
CN110287202A (en) * 2019-05-16 2019-09-27 北京百度网讯科技有限公司 Data-updating method, device, electronic equipment and storage medium
CN110287202B (en) * 2019-05-16 2022-02-15 北京百度网讯科技有限公司 Data updating method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN103488705B (en) 2016-06-22

Similar Documents

Publication Publication Date Title
CN103488705B (en) The user interest model increment updating method of personalized recommendation system
CN108334574B (en) A Cross-modal Retrieval Method Based on Collaborative Matrix Decomposition
CN103886048B (en) Cluster-based increment digital book recommendation method
Xi et al. Training transformers with 4-bit integers
CN111523051A (en) A method and system for social interest recommendation based on graph convolution matrix decomposition
CN105701191A (en) Push information click rate estimation method and device
JP2013534334A (en) Method and apparatus for sorting query results
CN103034726B (en) Text filtering system and method
CN108399268B (en) Incremental heterogeneous graph clustering method based on game theory
CN104239373A (en) Document tag adding method and document tag adding device
CN109492678A (en) A kind of App classification method of integrated shallow-layer and deep learning
CN103425677A (en) Method for determining classified models of keywords and method and device for classifying keywords
Fitriyani et al. The K-means with mini batch algorithm for topics detection on online news
CN110717103B (en) Improved collaborative filtering method based on stack noise reduction encoder
CN105912524A (en) Article topic keyword extraction method and apparatus based on low-rank matrix decomposition
CN106294418A (en) Search method and searching system
WO2020147259A1 (en) User portait method and apparatus, readable storage medium, and terminal device
CN113282756A (en) Text clustering intelligent evaluation method based on hybrid clustering
CN109800853B (en) Matrix decomposition method and device fusing convolutional neural network and explicit feedback and electronic equipment
Dong et al. A convex adaptive total variation model based on the gray level indicator for multiplicative noise removal
CN107506871A (en) A kind of method and system of interval prediction
CN109325511B (en) A method to improve feature selection
Fienberg Introduction to papers on the modeling and analysis of network data
Zhu Analysis of a multigrid preconditioner for Crouzeix–Raviart discretization of elliptic partial differential equation with jump coefficients
CN112396137A (en) Point cloud semantic segmentation method fusing context semantics

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160622

Termination date: 20190906