CN105426550A - Collaborative filtering tag recommendation method and system based on user quality model - Google Patents

Collaborative filtering tag recommendation method and system based on user quality model

Info

Publication number
CN105426550A
Authority
CN
China
Prior art keywords
user
label
model
prof
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201511018787.5A
Other languages
Chinese (zh)
Other versions
CN105426550B (en
Inventor
冯研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Corp
Original Assignee
TCL Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Corp filed Critical TCL Corp
Priority to CN201511018787.5A priority Critical patent/CN105426550B/en
Publication of CN105426550A publication Critical patent/CN105426550A/en
Application granted granted Critical
Publication of CN105426550B publication Critical patent/CN105426550B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562 Bookmark management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a collaborative filtering tag recommendation method and system based on a user quality model. The method comprises the following steps: perfecting the tag system according to how resources and users have appeared in the existing system; mapping the information of the users in the system into a two-dimensional matrix to construct user models, and storing the user models in the form of a user-tag two-dimensional matrix; acquiring the model vector of the current user, and calculating the similarity between the current user and the neighbor users in the system; calculating the model quality of the neighbor users; generating the best recommendation according to the model quality of the neighbor users in the system and a modified collaborative filtering recommendation algorithm; and returning the best recommendation result to the user interface through a WEB server. The method and system optimize the selection of the best recommending users in the conventional algorithm, improve recommendation precision and recall, and facilitate the evolution and updating of the tag system. In addition, suitable tag sources are selected according to how users and resources have appeared in the system, which addresses the cold start problem and the problem of a single tag source.

Description

Collaborative filtering label recommendation method and system based on a user quality model
Technical field
The present invention relates to the technical field of WEB applications, and in particular to a collaborative filtering label recommendation method and system based on a user quality model.
Background art
With the continued development of network technology, labels have become a standard way of organizing information on the Internet and are widely used in free classification. Free classification is a method that gives users free access to information: it lets users mark the characteristics of information in their own words, in the form of "labels". Labels are used to classify, organize and retrieve text, picture, video and audio resources so that information can be searched and shared; they are a distinctive information-organization tool in the Internet environment. In recent years, labeling systems in which users create and share metadata have been explored and applied on the Internet. Websites such as Flickr, Del.icio.us, Connotea and LibraryThing are all regarded as examples of Web 2.0 technology because they use the network to collect and organize information. Systems of this kind provide a "community-driven" and "organic" way of classifying network information resources, which makes information easier to discover, browse and reuse.
The category terms in traditional classification systems often lack popularity and relevance, and the vocabulary is relatively outdated, so professionals find it difficult to obtain relevant information and the expected results by searching with traditional category terms. The metadata used in a traditional classification structure is also relatively costly, because defining and classifying metadata takes a great deal of professionals' time and effort. In a labeling system, the tedious task of defining metadata is handed over by the system to its users: label definition is the collective behavior of users toward resources. A labeling system is therefore closer to its users than a traditional fixed hierarchical classification system, adapts better, and fits current trends more closely. Labeling lets labels display and highlight the key points of a search. A keyword search can only find articles whose content contains the keyword, whereas a label may contain keywords that do not appear in the text. Searching by label can therefore find articles described by words other than the keywords, which widens the breadth and scope of the search.
Although labels show clear advantages for retrieving information resources and navigating web pages, using them requires people to define the labels in advance, and defining labels manually is often time-consuming and tedious. To free people from this work and allow free classification to be applied more widely, a label recommendation service is urgently needed. Such a service recommends to the user a number of labels that the user may be interested in and lets the user choose among them, which makes label definition more convenient.
Label recommendation is an emerging field that has arisen with the application of network technology, but on the whole it has the following problems:
1. Outdated labels. The recommended labels come from a fixed label system. As time passes and the amount of data keeps growing, labels that the original label system lacks and that suit new resources must be added. A fixed label system cannot evolve over time, which inevitably lowers recommendation quality.
2. Cold start. Users, labels and resources are the three key elements of a label recommendation system, and how each of them has appeared in the system should be fully considered when recommending. Most existing label recommendation systems, however, only extract information from existing user models and resource models, and ignore the data-mining problem the system must solve when it faces a new user or a new resource.
3. A single label source. Resource content, user history labels (also called user interest labels) and resource history labels are the three main label sources for label recommendation, and each has its own strengths and weaknesses. Most existing label recommendation systems focus on only one of them and do not combine several label sources.
Therefore, the prior art still needs to be improved and developed.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a collaborative filtering label recommendation method and system based on a user quality model, intended to solve the problems of an outdated label space, cold start and an overly narrow label source that exist in prior-art collaborative filtering recommendation algorithms and in most existing label recommendation algorithms.
The technical solution of the present invention is as follows:
A collaborative filtering label recommendation method based on a user quality model, wherein the method comprises:
A. when user input information is detected, obtaining a training set from the labeling information database, extracting all labels in the training set to form the label system of the existing system, and perfecting the label system according to how resources and users have appeared in the existing system;
B. mapping the information of the users in the system to a two-dimensional matrix to build user models, and storing them in the form of a user-label two-dimensional matrix;
C. obtaining the model vector of the current user, and calculating the similarity between the current user and the neighbor users in the system;
D. calculating the model quality of the neighbor users in the system;
E. producing the best recommendation with the improved collaborative filtering recommendation algorithm, according to the model quality of the neighbor users in the system;
F. returning the best recommendation result to the user interface through the WEB server.
The collaborative filtering label recommendation method based on a user quality model, wherein step A specifically comprises:
A1. when user input information is detected, obtaining a training set from the labeling information database, and extracting all labels in the training set to form the label system $C = \{t_1, t_2, \ldots, t_n\}$ of the existing system $S$;
A2. determining how resource $R_i$ and user $U_i$ have appeared in the existing system $S$;
A3. if $U_i \notin S$ and $R_i \notin S$, or $U_i \in S$ and $R_i \notin S$, that is, the resource has not appeared in the existing system, extracting the X highest-weighted keywords in the title of resource $R_i$ and adding them to the system label system $C$;
A4. if $U_i \notin S$ and $R_i \in S$, that is, the resource has appeared in the system but the user has not, extracting the Y most frequently used labels of resource $R_i$ and the X highest-weighted keywords in the resource title, and adding them to the system label system $C$;
A5. if $U_i \in S$ and $R_i \in S$, that is, both the user and the resource have appeared in the system, using the history label information.
The collaborative filtering label recommendation method based on a user quality model, wherein step B specifically comprises:
B1. mapping the information of the K users in the system to a two-dimensional matrix to build user models, the mapping result being stored as a user-label feature matrix;
B2. each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ of the matrix represents the user model of one user, where $T_i$ denotes the i-th label associated with user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in vector $VU_k$,
$$w(T_i) = tf(T_i, U_k) \times \log\left(\frac{N}{N_{T_i}}\right)$$
where $tf(T_i, U_k)$ is the number of times label $T_i$ has been used by user $U_k$, $N$ is the total number of labels in the system, and $N_{T_i}$ is the number of users who have used label $T_i$ at least once.
The collaborative filtering label recommendation method based on a user quality model, wherein step C is specifically: obtaining the model vector of the current user, and calculating the similarity $sim(prof_u, prof_v)$ between the current user and each neighbor user in the system,
$$sim(prof_u, prof_v) := \frac{\langle prof_u, prof_v \rangle}{\lVert prof_u \rVert \, \lVert prof_v \rVert}$$
where $prof_u$ and $prof_v$ are the user model vectors of the current user $u$ and the neighbor user $v$ respectively.
The collaborative filtering label recommendation method based on a user quality model, wherein step D specifically comprises: calculating the model quality $Q_u(v)$ of each neighbor user in the system,
$$Q_u(v) = \frac{\sum_{i=1}^{|P_l(v)|} \frac{|u_{l,k_i}|}{N_u} \times avgU_{sim,l,k_i} \times \frac{N_{k_i,l}}{N_l} \times w(l,k_i)}{|P_l(v)|}$$
where:
$$avgU_{sim,l,k_i} = \frac{1}{|u_{l,k_i}|\,|u_{l,k_j}|} \sum_{u_{simx} \in U_{l,k_i}} \sum_{u_{simy} \in U_{l,k_j}} sim(u_{simx}, u_{simy})$$
$$w(l,k_i) = \frac{\sum_{U_{rec} \in U_{l,k_i}} sim(U_{rec}, v) \times kf_{k_i,v,l} \times \log\left(\frac{N}{N_{k_i}}\right)}{\max_{k \in k_{rec}} w_{l,k}}$$
In the above formulas, $k_i$ is the i-th label of user $v$, $\frac{|u_{l,k_i}|}{N_u}$ is the normalized number of users of $k_i$, $avgU_{sim,l,k_i}$ is the average similarity of the users of $k_i$, $\frac{N_{k_i,l}}{N_l}$ is the term frequency of $k_i$, and $w(l,k_i)$ is the specificity value of $k_i$. The model quality of a neighbor user is the average label quality of that neighbor user.
The collaborative filtering label recommendation method based on a user quality model, wherein the best recommendation result of the improved collaborative filtering recommendation algorithm in step E is denoted $T(u, l)$ and is calculated as:
$$N_u := \operatorname*{argmax}^{k}_{v \in U} \; Q_u(v)\, sim(prof_u, prof_v)$$
$$T(u, l) := \operatorname*{argmax}^{n}_{t \in T} \sum_{v \in N_u} Q_u(v)\, sim(prof_u, prof_v)\, \delta(v, l, t)$$
$$\delta(v, l, t) := 1 \text{ if } (v, l, t) \in U \times L \times T, \text{ else } 0$$
In the above formulas, $N_u$ is the set of the k neighbor users closest to the current user $u$, $T(u, l)$ is the best recommendation result of the algorithm, $sim(prof_u, prof_v)$ is the similarity between the current user $u$ and the neighbor user $v$, and $\delta(v, l, t) = 1$ indicates that user $v$ has defined label $t$ for resource $l$.
A collaborative filtering label recommendation system based on a user quality model, wherein the system comprises:
a label system perfecting module, configured to, when user input information is detected, obtain a training set from the labeling information database, extract all labels in the training set to form the label system of the existing system, and perfect the label system according to how resources and users have appeared in the existing system;
a user model building module, configured to map the information of the users in the system to a two-dimensional matrix to build user models, and store them in the form of a user-label two-dimensional matrix;
a similarity calculation module, configured to obtain the model vector of the current user and calculate the similarity between the current user and the neighbor users in the system;
a model quality calculation module, configured to calculate the model quality of the neighbor users in the system;
a best recommendation generation module, configured to generate the best recommendation with the improved collaborative filtering recommendation algorithm, according to the model quality of the neighbor users in the system;
a result feedback module, configured to return the best recommendation result to the user interface through the WEB server.
The collaborative filtering label recommendation system based on a user quality model, wherein the label system perfecting module specifically comprises:
a label system forming unit, configured to, when user input information is detected, obtain a training set from the labeling information database and extract all labels in the training set to form the label system $C = \{t_1, t_2, \ldots, t_n\}$ of the existing system $S$;
a judging unit, configured to determine how resource $R_i$ and user $U_i$ have appeared in the existing system $S$;
a first processing unit, configured to, if $U_i \notin S$ and $R_i \notin S$, or $U_i \in S$ and $R_i \notin S$, that is, the resource has not appeared in the existing system, extract the X highest-weighted keywords in the title of resource $R_i$ and add them to the system label system $C$;
a second processing unit, configured to, if $U_i \notin S$ and $R_i \in S$, that is, the resource has appeared in the system but the user has not, extract the Y most frequently used labels of resource $R_i$ and the X highest-weighted keywords in the resource title, and add them to the system label system $C$;
a third processing unit, configured to, if $U_i \in S$ and $R_i \in S$, that is, both the user and the resource have appeared in the system, use the history label information.
The collaborative filtering label recommendation system based on a user quality model, wherein the user model building module specifically comprises:
a storage unit, configured to map the information of the K users in the system to a two-dimensional matrix to build user models, the mapping result being stored as a user-label feature matrix;
a user model construction unit, in which each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ of the matrix represents the user model of one user, where $T_i$ denotes the i-th label associated with user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in vector $VU_k$,
$$w(T_i) = tf(T_i, U_k) \times \log\left(\frac{N}{N_{T_i}}\right)$$
where $tf(T_i, U_k)$ is the number of times label $T_i$ has been used by user $U_k$, $N$ is the total number of labels in the system, and $N_{T_i}$ is the number of users who have used label $T_i$ at least once.
The collaborative filtering label recommendation system based on a user quality model, wherein the similarity calculation module is specifically configured to: obtain the model vector of the current user, and calculate the similarity $sim(prof_u, prof_v)$ between the current user and each neighbor user in the system,
$$sim(prof_u, prof_v) := \frac{\langle prof_u, prof_v \rangle}{\lVert prof_u \rVert \, \lVert prof_v \rVert}$$
where $prof_u$ and $prof_v$ are the user model vectors of the current user $u$ and the neighbor user $v$ respectively.
The present invention provides a collaborative filtering label recommendation method and system based on a user quality model. The invention applies user model quality evaluation theory to traditional collaborative filtering label recommendation and optimizes the process by which the traditional algorithm selects the best recommending users, which improves recommendation precision and recall. The system can evolve and update its label system, which solves the problem of an outdated label space. The invention also analyzes the advantages of the various label sources and chooses a suitable label source according to how users and resources have appeared in the system, which solves the cold start problem and the problem of an overly narrow label source.
Brief description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the collaborative filtering label recommendation method based on a user quality model of the present invention.
Fig. 2 is a schematic diagram of a specific application embodiment of the collaborative filtering label recommendation method based on a user quality model of the present invention.
Fig. 3 is a functional block diagram of a preferred embodiment of the collaborative filtering label recommendation system based on a user quality model of the present invention.
Detailed description of the embodiments
To make the object, technical solution and effects of the present invention clearer, the present invention is described in more detail below. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
Traditional collaborative filtering recommender systems based on nearest-neighbor sets have been widely and successfully used, and applying the same approach to the label recommendation problem is a natural choice. Label recommendation systems have their own particularity, however: they contain no ratings, and labels take the place of ratings.
A labeling system usually consists of three kinds of elements: users, resources and labels. Users can define labels for the resources in the system, and the type of resource is determined by the type of system. A label recommendation system can be made up of the following 4 parts:
1. the set U of all users in the system;
2. the set R of all resources in the system;
3. the set T of all labels in the system;
4. a relation over U × R × T, which records the sets of labels that users in U have defined for resources in R.
Given a user u ∈ U and a resource r ∈ R as input, the system produces a label set T(u, r) with score values and recommends the n highest-scoring labels in the set.
Similar to collaborative filtering, a label recommendation system also maps user information to two-dimensional matrices for storage. The mapping yields two user model matrices: a user-resource matrix of size K × M, denoted X, and a user-label matrix of size K × L, denoted Y, where K = |U|, M = |R| and L = |T|. A collaborative filtering labeling system records no rating information, only the associations between users and resources and between users and labels. These are recorded in encoded form in the binary matrices X and Y, where $X \in \{0,1\}^{K \times M}$ and $Y \in \{0,1\}^{K \times L}$. For example, an element $X_{k,m} = 1$ of matrix X means that the k-th user is associated with the m-th resource, and 0 means there is no association. Likewise, an element $Y_{k,l} = 1$ of matrix Y means that the k-th user is associated with the l-th label, and 0 means there is no association.
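The encoding described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the tagging records are assumed to be available as (user, resource, label) triples, and the function name `build_binary_matrices` and the sample data are hypothetical, not part of the patent.

```python
def build_binary_matrices(assignments):
    """Encode (user, resource, label) triples as binary matrices X (K x M) and Y (K x L)."""
    users = sorted({u for u, _, _ in assignments})
    resources = sorted({r for _, r, _ in assignments})
    labels = sorted({t for _, _, t in assignments})
    u_idx = {u: i for i, u in enumerate(users)}
    r_idx = {r: i for i, r in enumerate(resources)}
    t_idx = {t: i for i, t in enumerate(labels)}
    # X[k][m] = 1 if user k is associated with resource m, else 0.
    X = [[0] * len(resources) for _ in users]
    # Y[k][l] = 1 if user k is associated with label l, else 0.
    Y = [[0] * len(labels) for _ in users]
    for u, r, t in assignments:
        X[u_idx[u]][r_idx[r]] = 1
        Y[u_idx[u]][t_idx[t]] = 1
    return users, resources, labels, X, Y


# Hypothetical sample data.
sample = [("alice", "r1", "python"), ("alice", "r2", "web"), ("bob", "r1", "python")]
users, resources, labels, X, Y = build_binary_matrices(sample)
```

Rows follow the sorted user list, so here the first row of each matrix belongs to `alice` and the second to `bob`.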
For a given user u and resource r, the algorithm first finds the users who have defined labels for resource r, then uses the similarity formula of user-based collaborative filtering to calculate the similarity between the current user and each of these users, which yields the neighbor user set of the current user. (The neighbor users differ with the model used for the similarity calculation, because the similarity may be computed on either of two matrix models: the user-resource matrix model or the user-label matrix model.) The labels used by the neighbors are then given recommendation scores according to the similarity between each neighbor and the current user; a label shared by several neighbor users receives a higher recommendation score.
In a label recommendation system with user set U, label set T and resource set R, the collaborative filtering label recommendation algorithm is as follows:
$$N_u := \operatorname*{argmax}^{k}_{v \in U} \; sim(prof_u, prof_v)$$
$$T(u, r) := \operatorname*{argmax}^{n}_{t \in T} \sum_{v \in N_u} sim(prof_u, prof_v)\, \delta(v, r, t)$$
$$\delta(v, r, t) := 1 \text{ if } (v, r, t) \in U \times R \times T, \text{ else } 0$$
In the above formulas, $N_u$ is the set of the k neighbor users closest to the current user u, $T(u, r)$ is the recommendation result of the algorithm, $sim(prof_u, prof_v)$ is the similarity between the current user u and the neighbor user v, and $\delta(v, r, t) = 1$ indicates that user v has defined label t for resource r. The := operator denotes dynamic assignment: each time a parameter value on the right-hand side changes, the value on the left-hand side automatically overwrites the previous value.
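A minimal Python sketch of this baseline scoring, assuming the similarities between the current user and the other users have already been computed. The function name `recommend_baseline`, the data layout and the sample values are hypothetical; ties are broken alphabetically only to keep the output deterministic.

```python
from collections import defaultdict


def recommend_baseline(resource, similarity, assignments, k=2, n=2):
    """Similarity-only label recommendation.

    similarity:  user v -> sim(prof_u, prof_v) for the current user u (assumed precomputed)
    assignments: distinct (user, resource, label) triples
    """
    # Candidate neighbours: users who have defined a label for this resource.
    taggers = {v for v, r, _ in assignments if r == resource and v in similarity}
    # N_u: the k candidates most similar to the current user.
    neighbours = set(sorted(taggers, key=lambda v: (-similarity[v], v))[:k])
    # Score of label t: sum of similarities of the neighbours who used t on the resource.
    scores = defaultdict(float)
    for v, r, t in assignments:
        if r == resource and v in neighbours:
            scores[t] += similarity[v]
    return sorted(scores, key=lambda t: (-scores[t], t))[:n]


# Hypothetical sample data.
similarity = {"a": 1.0, "b": 0.5, "c": 0.1}
assignments = [("a", "r", "x"), ("b", "r", "x"), ("b", "r", "y"), ("c", "r", "z")]
top = recommend_baseline("r", similarity, assignments)
```

In this sample, label `x` is shared by both selected neighbors and therefore outranks `y`, which matches the remark that shared labels receive higher scores.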
In a label recommendation system, the model of a user u ∈ U is usually written $P_l(u) = \bigcup_{l \in R} D(u, l)$, where $D(u, l)$ is the set of labels that user u has defined for resource l. The user model thus describes the set of labels the user has defined in the system, so the quality of the labels directly determines the quality of the user model. A label is a keyword that the user defines according to personal interest and the resource content, so a good label should be both personal and specific: it should match the user's vocabulary habits, describe the resource well, and reflect the user's interests.
After user u defines label $k_i$ for resource l, the quality of label $k_i$ can be measured by parameters such as number of users, user similarity, term frequency and label specificity.
The number of users of label $k_i$ is the number of users who have used $k_i$ to define resource l. The larger the number of users of $k_i$, the higher its quality. It can be written $|u_{l,k_i}|$, where $u_{l,k_i}$ is the set of all users who have used label $k_i$ to define resource l, and it is normalized by the total number of system users $N_{all}$: $|u_{l,k_i}| / N_{all}$.
The user similarity of label $k_i$ is the average similarity of the users who have used $k_i$ to define resource l. The larger the average user similarity, the higher the quality of label $k_i$. The average user similarity is calculated as follows:
$$avgU_{sim,l,k_i} = \frac{1}{|u_{l,k_i}|\,|u_{l,k_j}|} \sum_{u_{simx} \in U_{l,k_i}} \sum_{u_{simy} \in U_{l,k_j}} sim(u_{simx}, u_{simy})$$
where $U_{l,k_i}$ is the set of all users who have used label $k_i$ to define resource l, $|u_{l,k_i}|$ is the number of users in this user group, $u_{simx}$ and $u_{simy}$ are any two distinct users in the group, and $sim(u_{simx}, u_{simy})$ is the user model similarity of the two users, obtained as the cosine of the angle between their user model feature vectors.
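The averaging above can be sketched in Python as follows. This is one reading of the formula, not a definitive implementation: the double sum is taken over pairs of distinct users, as the text states, and divided by the squared group size, as the printed normalization suggests. The function name, the `sim` callback and the sample values are hypothetical.

```python
def avg_user_similarity(group, sim):
    """Average pairwise similarity of the users who used label k_i on resource l.

    group: list of user ids
    sim:   callable (x, y) -> similarity of the two users' models (assumed given)
    """
    size = len(group)
    if size == 0:
        return 0.0
    # Sum over ordered pairs of distinct users in the group.
    total = sum(sim(x, y) for x in group for y in group if x != y)
    # Normalise by |U| * |U|, following the printed formula.
    return total / (size * size)


# Hypothetical pairwise similarities.
pair_sim = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "c")): 0.0,
    frozenset(("b", "c")): 0.0,
}
value = avg_user_similarity(["a", "b", "c"], lambda x, y: pair_sim[frozenset((x, y))])
```

Dividing by the number of distinct pairs instead would give a true mean; the choice between the two is an assumption the source text does not settle.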
The term frequency of label $k_i$ is defined as the proportion of all label definitions of resource l in which $k_i$ was used. The higher the term frequency of $k_i$, the higher its quality. The term frequency of $k_i$ can be expressed as $N_{k_i,l} / N_l$, where $N_{k_i,l}$ is the number of times label $k_i$ has been used to define resource l and $N_l$ is the total number of times resource l has been defined by all labels.
The label specificity of label $k_i$ is an important indicator of how well $k_i$ characterizes resource l; it reflects how widely $k_i$ is used to define different resources. The higher the specificity, the better the label quality. Label specificity is calculated with the TF-IDF algorithm:
$$w_{l,k_i} = kf_{k_i,l} \times \log\left(\frac{N}{N_{k_i}}\right)$$
In the above formula, $kf_{k_i,l}$ is the frequency with which label $k_i$ is used to define resource l, $N$ is the total number of resources, and $N_{k_i}$ is the number of resources that have been defined with label $k_i$ at least once.
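A small Python sketch of this TF-IDF style specificity. It assumes that the frequency term is the share of the resource's label definitions that use the label, and that the logarithm is natural; the text fixes neither. The function name and the sample triples are hypothetical.

```python
import math


def label_specificity(label, resource, assignments):
    """Specificity of `label` for `resource` from (user, resource, label) triples."""
    on_resource = [t for _, r, t in assignments if r == resource]
    if not on_resource:
        return 0.0
    # kf: share of the label definitions of this resource that use the label (assumption).
    kf = on_resource.count(label) / len(on_resource)
    # N: total number of resources; N_k: resources defined with the label at least once.
    all_resources = {r for _, r, _ in assignments}
    resources_with_label = {r for _, r, t in assignments if t == label}
    if not resources_with_label:
        return 0.0
    return kf * math.log(len(all_resources) / len(resources_with_label))


# Hypothetical sample data: "x" is used on r1 only, "y" on both resources.
triples = [("a", "r1", "x"), ("b", "r1", "x"), ("a", "r1", "y"), ("a", "r2", "y")]
w_x = label_specificity("x", "r1", triples)
w_y = label_specificity("y", "r1", triples)
```

A label used on every resource gets specificity 0, so it contributes nothing to label quality under this measure.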
The higher the overall quality of a user's labels, the higher the quality of the user model. User model quality reflects how accurate and how worth adopting the user's label-defining behavior is; in other words, the higher a user's model quality, the more suitable the labels he or she defines are for recommendation.
The traditional collaborative filtering label recommendation algorithm looks for label recommendations among neighbor users but considers only the similarity between the user models of the neighbor users and the current user, ignoring the user model quality of the neighbors, so its recommendations are of low quality. A label recommendation algorithm based on user model quality can therefore be adopted.
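As a rough illustration of why weighting by model quality can change the outcome, the Python sketch below scores each candidate label by the sum of quality times similarity over the neighbors who used it on the resource, following the scoring formula stated in the claims. The function name, the data layout and the sample numbers are hypothetical, and the per-neighbor similarity and quality values are assumed to have been computed elsewhere.

```python
from collections import defaultdict


def recommend_with_quality(resource, neighbours, assignments, n=2):
    """Quality-weighted label scoring.

    neighbours:  user v -> (similarity to the current user, model quality Q_u(v))
    assignments: distinct (user, resource, label) triples
    """
    scores = defaultdict(float)
    for v, r, t in assignments:
        if r == resource and v in neighbours:
            sim, quality = neighbours[v]
            # Each neighbour contributes quality * similarity to the labels it used.
            scores[t] += quality * sim
    return sorted(scores, key=lambda t: (-scores[t], t))[:n]


# Hypothetical values: "a" is more similar but has a low-quality model.
neighbours = {"a": (0.9, 0.1), "b": (0.6, 0.8)}
assignments = [("a", "r", "misc"), ("b", "r", "python")]
ranked = recommend_with_quality("r", neighbours, assignments)
```

With similarity alone, `misc` (0.9) would rank above `python` (0.6); with quality weighting the scores become 0.09 and 0.48, so the order is reversed.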
The present invention provides a preferred embodiment of the collaborative filtering label recommendation method based on a user quality model. As shown in Fig. 1, the method comprises:
Step S100: when user input information is detected, obtaining a training set from the labeling information database, extracting all labels in the training set to form the label system of the existing system, and perfecting the label system according to how resources and users have appeared in the existing system.
In a specific implementation, a training set is selected from the labeling information database, all labels in the training set are extracted to form the label system $C = \{t_1, t_2, \ldots, t_n\}$ of the existing system S, and the label system is perfected according to how resources and users have appeared in the existing system. Further, when user input information is detected, a test set is also obtained from the labeling information database; the test set is a sample of the labels in the labeling information database. After all labels in the training set have been extracted to form the label system C, the test set is used to check whether the current label system is complete. Specifically, if every label in the test set is in the current label system, the label system is judged complete; if some label in the test set is not in the current label system, the label system is judged incomplete and needs to be perfected further. To perfect the label system, the training set can be chosen again, or the labels of the test set that do not appear in the label system can be added to it.
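The completeness check can be sketched in Python as follows. The sketch covers only the second remedy mentioned in the text (adding the missing test-set labels); the function name `check_and_complete` and the sample labels are hypothetical.

```python
def check_and_complete(label_system, test_labels):
    """Check a label system against a test set and add any missing labels.

    Returns (is_complete, completed_label_system).
    """
    # Labels sampled in the test set that the current label system lacks.
    missing = sorted(set(test_labels) - set(label_system))
    is_complete = not missing
    completed = list(label_system) + missing
    return is_complete, completed


# Hypothetical label system and test set.
ok, system = check_and_complete(["news", "sport"], ["sport", "travel"])
```

Here `travel` is missing, so the label system is reported as incomplete and `travel` is appended to it.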
In a specific implementation, step S100 specifically comprises:
Step S101: when user input information is detected, obtaining a training set from the labeling information database, and extracting all labels in the training set to form the label system $C = \{t_1, t_2, \ldots, t_n\}$ of the existing system $S$;
Step S102: determining how resource $R_i$ and user $U_i$ have appeared in the existing system $S$;
Step S103: if $U_i \notin S$ and $R_i \notin S$, or $U_i \in S$ and $R_i \notin S$, that is, the resource has not appeared in the existing system, extracting the X highest-weighted keywords in the title of resource $R_i$ and adding them to the system label system $C$;
Step S104: if $U_i \notin S$ and $R_i \in S$, that is, the resource has appeared in the system but the user has not, extracting the Y most frequently used labels of resource $R_i$ and the X highest-weighted keywords in the resource title, and adding them to the system label system $C$;
Step S105: if $U_i \in S$ and $R_i \in S$, that is, both the user and the resource have appeared in the system, using the history label information.
In a specific implementation, when analyzing how resource $R_i$ and user $U_i$ have appeared in the existing system $S$, the following 4 cases can occur:
(1) $U_i \notin S$ and $R_i \notin S$: a complete cold start, with a new user and a new resource;
(2) $U_i \in S$ and $R_i \notin S$: the user has appeared in the system but the resource has not;
(3) $U_i \notin S$ and $R_i \in S$: the resource has appeared in the system but the user has not;
(4) $U_i \in S$ and $R_i \in S$: both the user and the resource have appeared in the system.
The labels used in each case are obtained as follows:
In cases (1) and (2), the X highest-weighted keywords {key1, key2, key3} in the title of resource $R_i$ are extracted and added to the system label system C, i.e. C ← {key1, key2, key3};
In case (3), the Y most popular labels of resource $R_i$ and the X highest-weighted keywords in the resource title are extracted and added to the system label system C;
In case (4), the history label information is used.
In a specific implementation, X can be preset and is preferably 3; Y can also be preset and is preferably 2.
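The four cases can be sketched as a single selection function in Python. This is an illustration under assumptions: title keywords are assumed to arrive with precomputed weights, label popularity as usage counts, and the function name, parameter names and sample data are all hypothetical. The defaults follow the preferred values X = 3 and Y = 2.

```python
def select_label_source(user_known, resource_known, title_keywords,
                        resource_label_counts, history_labels, x=3, y=2):
    """Pick candidate labels according to whether the user and resource are known.

    title_keywords:        list of (keyword, weight) pairs from the resource title
    resource_label_counts: label -> number of times it was used on the resource
    history_labels:        labels from the existing history of user and resource
    """
    ranked = sorted(title_keywords, key=lambda p: (-p[1], p[0]))
    top_keywords = [kw for kw, _ in ranked[:x]]
    if not resource_known:
        # Cases (1) and (2): new resource, fall back to the title keywords.
        return top_keywords
    if not user_known:
        # Case (3): known resource, new user: popular labels plus title keywords.
        by_count = sorted(resource_label_counts.items(), key=lambda p: (-p[1], p[0]))
        popular = [t for t, _ in by_count[:y]]
        return popular + [kw for kw in top_keywords if kw not in popular]
    # Case (4): both known, use the history label information.
    return list(history_labels)


# Hypothetical sample data.
keywords = [("tv", 0.9), ("smart", 0.7), ("4k", 0.5), ("sale", 0.1)]
counts = {"tv": 5, "hdr": 3, "oled": 1}
history = ["tv", "review"]
cold = select_label_source(False, False, keywords, counts, history)
new_user = select_label_source(False, True, keywords, counts, history)
known = select_label_source(True, True, keywords, counts, history)
```

Removing duplicates between the popular labels and the title keywords in case (3) is a design choice of this sketch; the text does not say how overlaps are handled.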
Step S200: the information of the users in the system is mapped to a two-dimensional matrix to build the user models, which are stored in the form of a user-label two-dimensional matrix.
In a specific implementation, the user model is built by mapping the information of the K users in the system to a two-dimensional matrix, and the mapping result is represented as a user-label feature matrix QT. Each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ in the matrix represents the user model of one user, where $T_i$ denotes the i-th resource related to user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in the vector $VU_k$.
Step S200 specifically comprises:
Step S201: the information of the K users in the system is mapped to a two-dimensional matrix to build the user models, and the mapping result is stored as a user-label feature matrix;
Step S202: each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ in the matrix represents the user model of one user, where $T_i$ denotes the i-th resource related to user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in the vector $VU_k$,
$$w(T_i) = tf(T_i, U_k) \times \log\left(\frac{N}{N_{T_i}}\right)$$
where $tf(T_i, U_k)$ is the number of times label $T_i$ has been used by user $U_k$, N is the total number of labels in the system, and $N_{T_i}$ is the number of users who have used label $T_i$ at least once.
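The weighting in step S202 can be illustrated with a short Python sketch. The function name `build_user_label_matrix` and the input format (one `(user, label)` pair per label use) are assumptions made for the example; N is taken as the total number of labels, as the text defines it.

```python
import math
from collections import Counter

def build_user_label_matrix(assignments):
    """Build the user-label matrix from (user, label) pairs, one pair per label use.

    Returns (labels, matrix), where matrix[user] is the row vector VU_k
    aligned with the sorted label list.
    """
    tf = {}  # tf[user][label] = number of times the user used the label
    for user, label in assignments:
        tf.setdefault(user, Counter())[label] += 1

    labels = sorted({label for counts in tf.values() for label in counts})
    n_total = len(labels)  # N: total number of labels in the system
    # N_{T_i}: number of users who used the label at least once
    users_per_label = Counter(label for counts in tf.values() for label in counts)

    matrix = {
        user: [
            counts[label] * math.log(n_total / users_per_label[label])
            if counts[label] else 0.0
            for label in labels
        ]
        for user, counts in tf.items()
    }
    return labels, matrix
```

A label that a user never applied gets weight 0, and a label used by many users is down-weighted by the logarithmic factor.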
Step S300: the model vector of the active user is obtained, and the similarity between the active user and the neighbor users in the system is calculated.
In a specific implementation, a neighbor user is a user highly correlated with the active user, for example a user in the same region. The user models in the label recommendation system are stored in the form of a user-label two-dimensional matrix, so the similarity between the active user and another user in the system can be obtained by computing the cosine similarity of their corresponding user-model vectors in the matrix. Specifically, the model vector of the active user is obtained, and the similarity $sim(prof_u, prof_v)$ between the active user and each neighbor user in the system is calculated:
$$sim(prof_u, prof_v) := \frac{\langle prof_u, prof_v \rangle}{\|prof_u\| \, \|prof_v\|}$$
where $prof_u$ and $prof_v$ are the user-model vectors of the active user u and the neighbor user v, respectively.
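The cosine similarity above can be sketched in Python as follows. The function name is hypothetical, and returning 0 for an all-zero vector is an assumption of this sketch, since the text does not say how an empty user model is handled.

```python
import math

def cosine_similarity(prof_u, prof_v):
    """Cosine of the angle between two user-model vectors of equal length."""
    dot = sum(a * b for a, b in zip(prof_u, prof_v))
    norm_u = math.sqrt(sum(a * a for a in prof_u))
    norm_v = math.sqrt(sum(b * b for b in prof_v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a user with no label weights is treated as dissimilar to everyone.
        return 0.0
    return dot / (norm_u * norm_v)
```

Because the measure depends only on direction, two users whose label weights differ by a constant factor are treated as fully similar.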
Step S400: the model quality of the neighbor users in the system is calculated.
In a specific implementation, according to user-model quality theory, the quality of a user model is affected by user usage frequency, user-group similarity, label characterization frequency and label specificity. The model quality $Q_u(v)$ of a neighbor user in the system is calculated as
$$Q_u(v) = \frac{\sum_{i=1}^{|P_l(v)|} \frac{|u_{l,k_i}|}{N_u} \times avgU_{sim,l,k_i} \times \frac{N_{k_i,l}}{N_l} \times w(l,k_i)}{|P_l(v)|}$$
where:
$$avgU_{sim,l,k_i} = \frac{1}{|u_{l,k_i}|\,|u_{l,k_j}|} \sum_{u_{sim_x} \in U_{l,k_i}} \sum_{u_{sim_y} \in U_{l,k_j}} sim(u_{sim_x}, u_{sim_y})$$
$$w(l,k_i) = \frac{\sum_{U_{rec} \in U_{l,k_i}} sim(U_{rec}, v) \times kf_{k_i,v,l} \times \log\left(\frac{N}{N_{k_i}}\right)}{\max_{k \in k_{rec}} w_{l,k}}$$
In the above formulas, $k_i$ is the i-th label of user v, $|u_{l,k_i}|/N_u$ is the normalized number of users of $k_i$, $avgU_{sim,l,k_i}$ is the average similarity of the users of $k_i$, $N_{k_i,l}/N_l$ is the term frequency of $k_i$, and $w(l,k_i)$ is the specificity value of $k_i$. The model quality of a neighbor user is the average label quality of that neighbor user.
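Since the model quality is described as the average label quality, the outer structure of $Q_u(v)$ can be sketched as an average of per-label products. This Python sketch assumes the four factors have already been computed for each label; the class `LabelFactors`, its field names, and the choice of 0 for a user with no labels are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class LabelFactors:
    user_share: float      # normalized number of users of the label
    avg_similarity: float  # average similarity of the users of the label
    term_frequency: float  # term frequency of the label
    specificity: float     # specificity value w(l, k_i) of the label

def model_quality(factors):
    """Average, over a neighbor's labels, of the product of the four factors."""
    if not factors:
        # Assumption: a neighbor with no labels has quality 0.
        return 0.0
    total = sum(
        f.user_share * f.avg_similarity * f.term_frequency * f.specificity
        for f in factors
    )
    return total / len(factors)
```

Because the factors are multiplied, a label that scores near zero on any one of them contributes almost nothing to the neighbor's quality.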
Step S500: according to the model quality of the neighbor users in the system, the best recommendation is produced by the improved collaborative filtering recommendation algorithm.
In a specific implementation, in the label recommendation system, the user-model quality of a neighbor user who acts as a recommender for the active user has an important influence on the recommendation effect. The collaborative filtering label recommendation algorithm is therefore improved. The best recommendation result of the improved collaborative filtering recommendation algorithm is denoted T(u, r) and is computed as:
$$N_u := \operatorname*{argmax}_{v \in U}^{k}\; Q_u(v)\, sim(prof_u, prof_v)$$
$$T(u,l) := \operatorname*{argmax}_{t \in N_u}^{n} \sum_{v \in U} Q_u(v)\, sim(prof_u, prof_v)\, \delta(v,l,t)$$
$$\delta(v,l,t) := 1 \text{ if } \delta(v,l,t) \in U \times L \times T, \text{ else } 0$$
In the above formulas, $N_u$ is the set of the k neighbor users closest to the active user u; T(u, r) is the best recommendation result of the algorithm (the formulas write the resource r as l); $sim(prof_u, prof_v)$ is the similarity between the active user u and the neighbor user v; and δ(v, r, t) ∈ U × R × T indicates that user v has a label-assignment relation with resource r.
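The quality-weighted recommendation step can be sketched in Python as follows. The sketch adopts the usual reading of user-based collaborative filtering, which is an assumption here: the k neighbors are chosen by the product of quality and similarity, and each candidate label is then scored by summing that product over the chosen neighbors who assigned it to the resource. The function name, parameters and data layout are all hypothetical.

```python
def recommend_labels(active, resource, users, quality, similarity,
                     assignments, k=2, n=3):
    """Return up to n labels for (active, resource).

    quality[v]:    Q_u(v) for each candidate neighbor v.
    similarity[v]: sim(prof_u, prof_v) for each candidate neighbor v.
    assignments:   iterable of (user, resource, label) triples.
    """
    def weight(v):
        return quality[v] * similarity[v]

    # k neighbors with the largest quality-weighted similarity (ties broken by id).
    candidates = [v for v in users if v != active]
    neighbors = set(sorted(candidates, key=lambda v: (-weight(v), v))[:k])

    # Sum the neighbor weights over every label a neighbor assigned to the resource.
    label_scores = {}
    for v, r, t in set(assignments):
        if r == resource and v in neighbors:
            label_scores[t] = label_scores.get(t, 0.0) + weight(v)

    ranked = sorted(label_scores, key=lambda t: (-label_scores[t], t))
    return ranked[:n]
```

Compared with plain similarity weighting, a highly similar neighbor with a low-quality model has less influence on the ranking.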
Step S600: the best recommendation result is returned to the user interface through the WEB server.
In a specific implementation, the best recommendation result is returned to the user interface through the WEB server. The user may use different interfaces; if the user uses a television interface, the result is returned to the user's television interface.
The present invention also provides a flowchart of a specific application embodiment of the collaborative filtering label recommendation method based on the user quality model. Taking the user's television interface as an example, as shown in Fig. 2, the method comprises:
Specifically, the television is connected to the WEB server, and the WEB server is also connected to the database. The database comprises a user information base storing user history information, a resource information base storing resource information, and a label information base storing label information.
When the user watches television through the television's user interface, the user's viewing information is sent to the WEB server. The WEB server preprocesses the viewing information and obtains the user history information from the user information base. The active user's quality model is generated from the user history information. The core recommendation model is generated from the active user's quality model, the resource information in the resource information base and the label information in the label information base. The recommendation result is generated from the core recommendation model and sent to the WEB server, which returns it through the recommendation page to the user's television interface for the user to view.
The present invention also provides a functional block diagram of a collaborative filtering label recommendation system based on a user quality model. As shown in Fig. 3, the system comprises:
A label system improvement module 100, configured to, upon detecting user input information, obtain the training set in the annotation information database, extract all labels in the training set to form the label system of the existing system, and improve the label system according to the occurrence of resources and users in the existing system, as detailed above.
A user model building module 200, configured to map the information of the users in the system to a two-dimensional matrix to build the user models and store them in the form of a user-label two-dimensional matrix, as detailed above.
A similarity calculation module 300, configured to obtain the model vector of the active user and calculate the similarity between the active user and the neighbor users in the system, as detailed above.
A model quality calculation module 400, configured to calculate the model quality of the neighbor users in the system, as detailed above.
A best recommendation generation module 500, configured to generate the best recommendation by the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system, as detailed above.
A result feedback module 600, configured to return the best recommendation result to the user interface through the WEB server, as detailed above.
In the collaborative filtering label recommendation system based on the user quality model, the label system improvement module specifically comprises:
A label system forming unit, configured to, upon detecting user input information, obtain the training set in the annotation information database and extract all labels in the training set to form the label system C = {t1, t2, ..., tn} of the existing system S;
A judging unit, configured to determine how resource $R_i$ and user $U_i$ occur in the existing system S, as detailed above.
A first processing unit, configured to, if the resource has not appeared in the existing system (whether or not the user has), extract the X highest-weighted keywords in the title of resource $R_i$ and add them to the system label system C, as detailed above.
A second processing unit, configured to, if the resource has appeared in the system but the user has not, extract the Y most frequently used labels of resource $R_i$ and the X highest-weighted keywords in its title and add them to the system label system C, as detailed above.
A third processing unit, configured to, if $U_i \in S$ and $R_i \in S$, i.e. both the user and the resource have appeared in the system, use the historical label information, as detailed above.
In the collaborative filtering label recommendation system based on the user quality model, the user model building module specifically comprises:
A storage unit, configured to map the information of the K users in the system to a two-dimensional matrix to build the user models, the mapping result being stored as a user-label feature matrix, as detailed above.
A user model construction unit, in which each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ in the matrix represents the user model of one user, where $T_i$ denotes the i-th resource related to user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in the vector $VU_k$,
$$w(T_i) = tf(T_i, U_k) \times \log\left(\frac{N}{N_{T_i}}\right)$$
where $tf(T_i, U_k)$ is the number of times label $T_i$ has been used by user $U_k$, N is the total number of labels in the system, and $N_{T_i}$ is the number of users who have used label $T_i$ at least once, as detailed above.
In the collaborative filtering label recommendation system based on the user quality model, the similarity calculation module is specifically configured to obtain the model vector of the active user and calculate the similarity $sim(prof_u, prof_v)$ between the active user and each neighbor user in the system:
$$sim(prof_u, prof_v) := \frac{\langle prof_u, prof_v \rangle}{\|prof_u\| \, \|prof_v\|}$$
where $prof_u$ and $prof_v$ are the user-model vectors of the active user u and the neighbor user v, respectively, as detailed above.
In summary, the present invention provides a collaborative filtering label recommendation method and system based on a user quality model. The method comprises: improving the label system according to the occurrence of resources and users in the existing system; mapping the information of the users in the system to a two-dimensional matrix to build user models, and storing them in the form of a user-label two-dimensional matrix; obtaining the model vector of the active user and calculating the similarity between the active user and the neighbor users in the system; calculating the model quality of the neighbor users in the system; producing the best recommendation by the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users; and returning the best recommendation result to the user interface through the WEB server. The present invention optimizes the selection of the best recommending users in the traditional algorithm, improves the accuracy and recall of the recommendation, and facilitates the evolution and updating of the system's existing label system. In addition, a suitable label source is chosen according to how the user and the resource occur in the system, which addresses both the cold-start problem and the problem of a single label source.
It should be understood that the application of the present invention is not limited to the above examples. Those of ordinary skill in the art can make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the protection scope of the appended claims of the present invention.

Claims (10)

1. A collaborative filtering label recommendation method based on a user quality model, characterized in that the method comprises:
A. upon detecting user input information, obtaining the training set in the annotation information database, extracting all labels in the training set to form the label system of the existing system, and improving the label system according to the occurrence of resources and users in the existing system;
B. mapping the information of the users in the system to a two-dimensional matrix to build user models, and storing them in the form of a user-label two-dimensional matrix;
C. obtaining the model vector of the active user, and calculating the similarity between the active user and the neighbor users in the system;
D. calculating the model quality of the neighbor users in the system;
E. producing the best recommendation by the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system;
F. returning the best recommendation result to the user interface through the WEB server.
2. The collaborative filtering label recommendation method based on a user quality model according to claim 1, characterized in that step A specifically comprises:
A1. upon detecting user input information, obtaining the training set in the annotation information database, and extracting all labels in the training set to form the label system C = {t1, t2, ..., tn} of the existing system S;
A2. determining how resource $R_i$ and user $U_i$ occur in the existing system S;
A3. if the resource has not appeared in the existing system (whether or not the user has), extracting the X highest-weighted keywords in the title of resource $R_i$ and adding them to the system label system C;
A4. if the resource has appeared in the system but the user has not, extracting the Y most frequently used labels of resource $R_i$ and the X highest-weighted keywords in its title and adding them to the system label system C;
A5. if $U_i \in S$ and $R_i \in S$, i.e. both the user and the resource have appeared in the system, using the historical label information.
3. The collaborative filtering label recommendation method based on a user quality model according to claim 2, characterized in that step B specifically comprises:
B1. mapping the information of the K users in the system to a two-dimensional matrix to build the user models, the mapping result being stored as a user-label feature matrix;
B2. each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ in the matrix represents the user model of one user, where $T_i$ denotes the i-th resource related to user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in the vector $VU_k$,
$$w(T_i) = tf(T_i, U_k) \times \log\left(\frac{N}{N_{T_i}}\right)$$
where $tf(T_i, U_k)$ is the number of times label $T_i$ has been used by user $U_k$, N is the total number of labels in the system, and $N_{T_i}$ is the number of users who have used label $T_i$ at least once.
4. The collaborative filtering label recommendation method based on a user quality model according to claim 3, characterized in that step C is specifically: obtaining the model vector of the active user, and calculating the similarity $sim(prof_u, prof_v)$ between the active user and each neighbor user in the system,
$$sim(prof_u, prof_v) = \frac{\langle prof_u, prof_v \rangle}{\|prof_u\| \, \|prof_v\|}$$
where $prof_u$ and $prof_v$ are the user-model vectors of the active user u and the neighbor user v, respectively.
5. The collaborative filtering label recommendation method based on a user quality model according to claim 4, characterized in that step D specifically comprises: calculating the model quality $Q_u(v)$ of a neighbor user in the system,
$$Q_u(v) = \frac{\sum_{i=1}^{|P_l(v)|} \frac{|u_{l,k_i}|}{N_u} \times avgU_{sim,l,k_i} \times \frac{N_{k_i,l}}{N_l} \times w(l,k_i)}{|P_l(v)|}$$
where:
$$avgU_{sim,l,k_i} = \frac{1}{|u_{l,k_i}|\,|u_{l,k_j}|} \sum_{u_{sim_x} \in U_{l,k_i}} \sum_{u_{sim_y} \in U_{l,k_j}} sim(u_{sim_x}, u_{sim_y})$$
$$w(l,k_i) = \frac{\sum_{U_{rec} \in U_{l,k_i}} sim(U_{rec}, v) \times kf_{k_i,v,l} \times \log\left(\frac{N}{N_{k_i}}\right)}{\max_{k \in k_{rec}} w_{l,k}}$$
In the above formulas, $k_i$ is the i-th label of user v, $|u_{l,k_i}|/N_u$ is the normalized number of users of $k_i$, $avgU_{sim,l,k_i}$ is the average similarity of the users of $k_i$, $N_{k_i,l}/N_l$ is the term frequency of $k_i$, and $w(l,k_i)$ is the specificity value of $k_i$; the model quality of a neighbor user is the average label quality of that neighbor user.
6. The collaborative filtering label recommendation method based on a user quality model according to claim 5, characterized in that the best recommendation result of the improved collaborative filtering recommendation algorithm in step E is denoted T(u, r) and is computed as:
$$N_u = \operatorname*{argmax}_{v \in U}^{k}\; Q_u(v)\, sim(prof_u, prof_v)$$
$$T(u,l) = \operatorname*{argmax}_{t \in N_u}^{n} \sum_{v \in U} Q_u(v)\, sim(prof_u, prof_v)\, \delta(v,l,t)$$
$$\delta(v,l,t) = 1 \text{ if } \delta(v,l,t) \in U \times L \times T, \text{ else } 0,$$
In the above formulas, $N_u$ is the set of the k neighbor users closest to the active user u; T(u, r) is the best recommendation result of the algorithm (the formulas write the resource r as l); $sim(prof_u, prof_v)$ is the similarity between the active user u and the neighbor user v; and δ(v, r, t) ∈ U × R × T indicates that user v has a label-assignment relation with resource r.
7. A collaborative filtering label recommendation system based on a user quality model, characterized in that the system comprises:
a label system improvement module, configured to, upon detecting user input information, obtain the training set in the annotation information database, extract all labels in the training set to form the label system of the existing system, and improve the label system according to the occurrence of resources and users in the existing system;
a user model building module, configured to map the information of the users in the system to a two-dimensional matrix to build user models and store them in the form of a user-label two-dimensional matrix;
a similarity calculation module, configured to obtain the model vector of the active user and calculate the similarity between the active user and the neighbor users in the system;
a model quality calculation module, configured to calculate the model quality of the neighbor users in the system;
a best recommendation generation module, configured to generate the best recommendation by the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system;
a result feedback module, configured to return the best recommendation result to the user interface through the WEB server.
8. The collaborative filtering label recommendation system based on a user quality model according to claim 7, characterized in that the label system improvement module specifically comprises:
a label system forming unit, configured to, upon detecting user input information, obtain the training set in the annotation information database and extract all labels in the training set to form the label system C = {t1, t2, ..., tn} of the existing system S;
a judging unit, configured to determine how resource $R_i$ and user $U_i$ occur in the existing system S;
a first processing unit, configured to, if the resource has not appeared in the existing system (whether or not the user has), extract the X highest-weighted keywords in the title of resource $R_i$ and add them to the system label system C;
a second processing unit, configured to, if the resource has appeared in the system but the user has not, extract the Y most frequently used labels of resource $R_i$ and the X highest-weighted keywords in its title and add them to the system label system C;
a third processing unit, configured to, if $U_i \in S$ and $R_i \in S$, i.e. both the user and the resource have appeared in the system, use the historical label information.
9. The collaborative filtering label recommendation system based on a user quality model according to claim 8, characterized in that the user model building module specifically comprises:
a storage unit, configured to map the information of the K users in the system to a two-dimensional matrix to build the user models, the mapping result being stored as a user-label feature matrix;
a user model construction unit, in which each row vector $VU_k = (w(T_1); w(T_2); \ldots; w(T_i); \ldots; w(T_n))$ in the matrix represents the user model of one user, where $T_i$ denotes the i-th resource related to user $U_k$ and $w(T_i)$ denotes the weight of label $T_i$ in the vector $VU_k$,
$$w(T_i) = tf(T_i, U_k) \times \log\left(\frac{N}{N_{T_i}}\right)$$
where $tf(T_i, U_k)$ is the number of times label $T_i$ has been used by user $U_k$, N is the total number of labels in the system, and $N_{T_i}$ is the number of users who have used label $T_i$ at least once.
10. The collaborative filtering label recommendation system based on a user quality model according to claim 9, characterized in that the similarity calculation module is specifically configured to obtain the model vector of the active user and calculate the similarity $sim(prof_u, prof_v)$ between the active user and each neighbor user in the system,
$$sim(prof_u, prof_v) := \frac{\langle prof_u, prof_v \rangle}{\|prof_u\| \, \|prof_v\|}$$
where $prof_u$ and $prof_v$ are the user-model vectors of the active user u and the neighbor user v, respectively.
CN201511018787.5A 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model Active CN105426550B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511018787.5A CN105426550B (en) 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201511018787.5A CN105426550B (en) 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model

Publications (2)

Publication Number Publication Date
CN105426550A true CN105426550A (en) 2016-03-23
CN105426550B CN105426550B (en) 2020-02-07

Family

ID=55504762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511018787.5A Active CN105426550B (en) 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model

Country Status (1)

Country Link
CN (1) CN105426550B (en)

Also Published As

Publication number Publication date
CN105426550B (en) 2020-02-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant