CN105426550B - Collaborative filtering label recommendation method and system based on user quality model - Google Patents

Info

Publication number
CN105426550B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201511018787.5A
Other languages
Chinese (zh)
Other versions
CN105426550A (en)
Inventor
冯研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Corp
Original Assignee
TCL Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Corp filed Critical TCL Corp
Priority to CN201511018787.5A priority Critical patent/CN105426550B/en
Publication of CN105426550A publication Critical patent/CN105426550A/en
Application granted granted Critical
Publication of CN105426550B publication Critical patent/CN105426550B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562Bookmark management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a collaborative filtering label recommendation method and system based on a user quality model. The method comprises the following steps: perfecting the label system according to how resources and users appear in the existing system; mapping the information of users in the system to a two-dimensional matrix to construct a user model, stored as a user-label two-dimensional matrix; obtaining the model vector of the current user and calculating the similarity between the current user and the neighbor users in the system; calculating the model quality of the neighbor users; generating the optimal recommendation from the model quality of the neighbor users and an improved collaborative filtering recommendation algorithm; and returning the optimal recommendation result to the user interface through the WEB server. The method optimizes the traditional process of selecting the best recommending users, improves recommendation accuracy and recall, and lets the system's label system evolve and update. By selecting a suitable label source according to whether the user and the resource have appeared in the system, it also addresses the cold-start and single-label-source problems.

Description

Collaborative filtering label recommendation method and system based on user quality model
Technical Field
The invention relates to the technical field of WEB application, in particular to a collaborative filtering tag recommendation method and system based on a user quality model.
Background
With the development of network technology, labels have become a standard way of organizing information on the internet. They are widely used in free classification, a method that lets users access information freely and describe its characteristics with labels in their own words. Labels are used to classify, organize and retrieve text, picture, video and audio resources and to search for and share information; they are an information organization tool particular to the internet environment. In the past few years, tagging systems in which users create and share metadata have been explored and applied on the internet, and websites such as Flickrtll, del.
The category words of a traditional classification system often lack popularity and relevance and are relatively outdated, so that even professionals find it difficult to obtain relevant information and the expected results by searching with them. The metadata used in a traditional classification structure is also expensive: professionals must spend a great deal of time and energy defining and classifying it. In a label system, the complex task of defining metadata is handed over to the users, and label definition becomes a group behavior of users toward resources. Compared with a traditional fixed hierarchical classification system, a label system is therefore more compact, adapts better to its users and follows current trends more closely. Label classification lets the key points of a search be displayed and highlighted through labels. Labels differ from general keywords: a keyword search finds only articles whose content contains the keyword, whereas tags may contain words that do not occur in the text, so a tag search can also find articles that do not contain the keyword, which widens the scope of the search.
Although tags have clear advantages for information retrieval and web page navigation, they must be defined by people in advance, and manual tag definition is time-consuming and tedious. To free people from this work and to let free classification be applied more widely, a tag recommendation service is urgently needed. Such a service recommends potential tags that may interest the user, so that the user can simply select from them, which makes tag definition more convenient and faster.
The label recommendation is an emerging field accompanying the popularization and application of network technology, but the following problems exist in the overall view:
1. Outdated label space. The recommended labels are derived from a fixed label system. As time goes on and the data volume grows, labels suited to new resources and missing from the original label system need to be added, but a fixed label system cannot evolve over time, so recommendation quality inevitably declines.
2. Cold start problems. The user, the label and the resource are the three major elements of a label recommendation system, and how each of them appears in the system should be fully considered during recommendation. Most existing label recommendation systems, however, only extract information from existing user models and existing resource models, and ignore the data mining problem that must be solved when the system faces a new user or a new resource.
3. Single label source. Resource content, user history labels (also called user interest labels) and resource history labels are the three main label sources for label recommendation, each with its own advantages and disadvantages. Most existing label recommendation systems focus on only one of these sources and do not combine them.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a collaborative filtering tag recommendation method and system based on a user quality model, and aims to solve the problems that a collaborative filtering recommendation algorithm and most of existing tag recommendation algorithms in the prior art have old tag space, cold start, single tag source and the like.
The technical scheme of the invention is as follows:
a collaborative filtering label recommendation method based on a user quality model comprises the following steps:
A. detecting user input information, acquiring a training set in a label classification information database, extracting all labels in the training set to form a label system of the existing system, and perfecting the label system according to resources and the condition of a user in the existing system;
B. mapping information of users in the system to a two-dimensional matrix to construct a user model, and storing the user model in a user-label two-dimensional matrix form;
C. obtaining a model vector of a current user, and calculating the similarity between the current user and a neighbor user in a system;
D. calculating the model quality of neighbor users in the system;
E. generating optimal recommendation according to the model quality of neighbor users in the system and an improved collaborative filtering recommendation algorithm;
F. and returning the optimal recommendation result to the user interface through the WEB server.
The collaborative filtering label recommendation method based on the user quality model, wherein the step A specifically includes:
a1, detecting user input information, acquiring a training set in a tag classification information database, and extracting all tags in the training set to form a tag system C { t1, t2, …, tn } of the existing system S;
a2, determining how resource R_i and user U_i appear in the existing system S;
a3, if R_i ∉ S, i.e. the resource has not appeared in the existing system, extracting the X highest-weighted title keywords of resource R_i and adding them to the system label system C;
a4, if R_i ∈ S and U_i ∉ S, i.e. the resource appears in the system and the user does not, extracting the Y most frequently used labels of resource R_i and its X highest-weighted title keywords and adding them to the system label system C;
a5, if U_i ∈ S and R_i ∈ S, i.e. both the user and the resource appear in the system, using the history label information.
The collaborative filtering label recommendation method based on the user quality model, wherein the step B specifically includes:
b1, mapping the information of K users in the system to a two-dimensional matrix to construct a user model, and storing the mapping result in a user-label characteristic matrix;
b2, each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the model of one user, where T_i denotes the i-th label related to user U_k and w(T_i) denotes the weight of label T_i in the vector VU_k,
Figure GDA0002245046970000041
wherein tf(T_i, U_k) is the number of times label T_i is used by user U_k, N is the total number of system tags, and
Figure GDA0002245046970000042
is the number of users who have used label T_i at least once.
The collaborative filtering label recommendation method based on the user quality model, wherein the step C specifically includes: obtaining the model vector of the current user, and calculating the similarity sim(prof_u, prof_v) between the current user and a neighbor user in the system as the cosine similarity of their user model vectors,
$$\mathrm{sim}(prof_u, prof_v) = \frac{prof_u \cdot prof_v}{\lVert prof_u \rVert \, \lVert prof_v \rVert}$$
wherein prof_u and prof_v are the user model vectors of the current user u and the neighbor user v respectively.
The collaborative filtering label recommendation method based on the user quality model, wherein the step D specifically includes: calculating the model quality Q_u(v) of the neighbor users in the system,
Figure GDA0002245046970000044
wherein:
Figure GDA0002245046970000051
Figure GDA0002245046970000052
in the above formulas, k_i is the i-th label of user v,
Figure GDA0002245046970000053
is the normalized number of users of k_i, a further term (its figure is missing from this text) is the average similarity of the users of k_i,
Figure GDA0002245046970000055
is the word frequency of k_i, w(l, k_i) is the specificity of k_i, and the model quality of a neighbor user is the average label quality of that neighbor user.
The collaborative filtering label recommendation method based on the user quality model is characterized in that the optimal recommendation result in the improved collaborative filtering recommendation algorithm in the step E is denoted as T (u, r), and the calculation formula is as follows:
Figure GDA0002245046970000056
Figure GDA0002245046970000057
δ(v, r, t) := 1 if the triple (v, r, t) is a recorded label assignment within U × R × T, else 0,
in the above formulas, N_u is the set of the k nearest neighbor users of the current user u, T(u, r) is the best recommendation result of the algorithm, sim(prof_u, prof_v) is the similarity between the current user u and the neighbor user v, and δ(v, r, t) = 1 indicates that user v has defined label t for resource r.
A collaborative filtering tag recommendation system based on a user quality model, wherein the system comprises:
the label system perfecting module is used for detecting user input information, acquiring a training set in the label classification information database, extracting all labels in the training set to form the label system of the existing system, and perfecting the label system according to how the resource and the user appear in the existing system;
the user model building module is used for mapping the information of the users in the system to a two-dimensional matrix to build a user model and storing the user model in a user-label two-dimensional matrix form;
the similarity calculation module is used for acquiring a model vector of the current user and calculating the similarity between the current user and a neighbor user in the system;
the model quality calculation module is used for calculating the model quality of the neighbor users in the system;
the optimal recommendation generation module is used for generating optimal recommendations according to the model quality of neighbor users in the system and an improved collaborative filtering recommendation algorithm;
and the result feedback module is used for returning the optimal recommendation result to the user interface through the WEB server.
The collaborative filtering label recommendation system based on the user quality model, wherein the label system perfecting module specifically comprises:
a label system forming unit, used for detecting user input information, acquiring a training set in the label classification information database, and extracting all labels in the training set to form the label system C {t1, t2, …, tn} of the existing system S;
a judging unit, used for determining how resource R_i and user U_i appear in the existing system S;
a first processing unit, used for, if R_i ∉ S, i.e. the resource has not appeared in the existing system, extracting the X highest-weighted title keywords of resource R_i and adding them to the system label system C;
a second processing unit, used for, if R_i ∈ S and U_i ∉ S, i.e. the resource appears in the system and the user does not, extracting the Y most frequently used labels of resource R_i and its X highest-weighted title keywords and adding them to the system label system C;
a third processing unit, used for, if U_i ∈ S and R_i ∈ S, i.e. both the user and the resource appear in the system, using the history label information.
The collaborative filtering label recommendation system based on the user quality model is characterized in that the user model construction module specifically comprises:
the storage unit is used for mapping the information of K users in the system to a two-dimensional matrix to construct a user model, and storing the mapping result in a user-label characteristic matrix;
a user model building unit, wherein each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the model of one user, T_i denotes the i-th label related to user U_k, and w(T_i) denotes the weight of label T_i in the vector VU_k,
wherein tf(T_i, U_k) is the number of times label T_i is used by user U_k, N is the total number of system tags, and
Figure GDA0002245046970000072
is the number of users who have used label T_i at least once.
The collaborative filtering label recommendation system based on the user quality model, wherein the similarity calculation module is specifically used for: obtaining the model vector of the current user, and calculating the similarity sim(prof_u, prof_v) between the current user and a neighbor user in the system as the cosine similarity of their user model vectors,
$$\mathrm{sim}(prof_u, prof_v) = \frac{prof_u \cdot prof_v}{\lVert prof_u \rVert \, \lVert prof_v \rVert}$$
wherein prof_u and prof_v are the user model vectors of the current user u and the neighbor user v respectively.
The invention provides a collaborative filtering label recommendation method and system based on a user quality model, wherein a user model quality judgment theory is applied to the traditional collaborative filtering label recommendation, the optimal recommended user selection process in the traditional algorithm is optimized, the recommendation accuracy and recall rate are further improved, the system can realize the evolution and the update of a label system, and the problem of label space obsolescence is solved; meanwhile, the advantages of various label sources are analyzed, and a proper label source is selected according to the appearance conditions of users and resources in the system, so that the problem of cold start and the problem of single label source are solved.
Drawings
Fig. 1 is a flowchart of a collaborative filtering label recommendation method based on a user quality model according to a preferred embodiment of the present invention.
Fig. 2 is a schematic diagram of a specific application embodiment of the collaborative filtering label recommendation method based on the user quality model according to the present invention.
FIG. 3 is a functional block diagram of a preferred embodiment of the collaborative filtering tag recommendation system based on a user quality model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer, the present invention is described in further detail below. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Traditional collaborative filtering recommendation based on the nearest-neighbor set has been applied widely and successfully, so solving the tag recommendation problem in the same way is a natural choice. A tag recommendation system has its own particularity, however: it contains no scores, and tags take the place of scores.
The label system is generally composed of three elements of a user, a resource and a label, the user can define the label for the resource in the system, the type of the resource is determined by the type of the system, and a label recommendation system can be composed of the following 4 parts:
1. set U formed by all users in system
2. Set R of all resources in the system
3. Set T composed of all tags in system
4. Function of relationship
Figure GDA0002245046970000081
The relation function expresses the label set that a user defines for a resource, wherein
Figure GDA0002245046970000083
For a user u ∈ U and a resource r ∈ R, a scored label set T(u, r) is generated, and the top n labels with the highest scores in the recommendation set are input into the system.
Similar to the collaborative filtering algorithm, the label recommendation system also maps user information to two-dimensional matrices for storage, which yields two user model matrices: a user-resource matrix of size K × M, denoted X, and a user-label matrix of size K × L, denoted Y, where K := |U|, M := |R| and L := |T|. No scoring information is recorded in the collaborative filtering label system; only the user-resource and user-label associations are recorded, encoded in the binary matrices X ∈ {0,1}^{K×M} and Y ∈ {0,1}^{K×L}. For example, X_{k,m} = 1 means that the k-th user is associated with the m-th resource, and X_{k,m} = 0 means that there is no association. Similarly, Y_{k,l} = 1 indicates that the k-th user and the l-th label are associated, and Y_{k,l} = 0 indicates no association.
For a given user u and resource r, the algorithm first finds the users who have defined a label for resource r and then, using the similarity formula of the user-based collaborative filtering algorithm, calculates the similarity between the current user and each of them to obtain the current user's neighbor set. The neighbors depend on the model used for the similarity calculation, which may be based on either of two matrix models: the user-resource matrix or the user-label matrix. The labels of the neighbor users are then scored for recommendation according to each neighbor's similarity to the current user, so that a label shared by several neighbor users receives a higher recommendation score.
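The mapping to the binary matrices X and Y can be sketched as follows. This is a minimal illustration only, assuming the label assignments are available as (user, resource, label) triples; the function name `build_binary_matrices` and the sample data are hypothetical and do not come from the patent.

```python
def build_binary_matrices(assignments, users, resources, tags):
    """Build the binary user-resource matrix X and user-label matrix Y.

    assignments: iterable of (user, resource, tag) triples (assumed input format).
    Returns X with shape K x M and Y with shape K x L as nested lists of 0/1.
    """
    u_idx = {u: i for i, u in enumerate(users)}
    r_idx = {r: i for i, r in enumerate(resources)}
    t_idx = {t: i for i, t in enumerate(tags)}
    X = [[0] * len(resources) for _ in users]
    Y = [[0] * len(tags) for _ in users]
    for u, r, t in assignments:
        X[u_idx[u]][r_idx[r]] = 1  # user u is associated with resource r
        Y[u_idx[u]][t_idx[t]] = 1  # user u is associated with label t
    return X, Y


# Hypothetical sample data
assignments = [
    ("alice", "doc1", "python"),
    ("bob", "doc1", "code"),
    ("bob", "doc2", "python"),
]
X, Y = build_binary_matrices(
    assignments, ["alice", "bob"], ["doc1", "doc2"], ["python", "code"]
)
```

Nested lists keep the sketch dependency-free; a sparse matrix type would be the more likely choice at realistic sizes, since most entries are 0.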
In a label recommendation system with a user set of U, a label set of T and a resource set of R, the collaborative filtering label recommendation algorithm is as follows:
Figure GDA0002245046970000091
Figure GDA0002245046970000092
δ(v, r, t) := 1 if the triple (v, r, t) is a recorded label assignment within U × R × T, else 0
in the above formulas, N_u is the set of the k nearest neighbor users of the current user u, T(u, r) is the recommendation result of the algorithm, sim(prof_u, prof_v) is the similarity between the current user u and the neighbor user v, and δ(v, r, t) = 1 indicates that user v has defined label t for resource r. The operator := denotes dynamic assignment: whenever a parameter value on the right-hand side of a formula changes, the value on the left-hand side is automatically overwritten.
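The traditional algorithm described above can be sketched in a few lines. This is an illustrative reading of the prose, not the patent's exact formulas (which are only available as figures): it assumes user models are plain numeric vectors, that each (user, resource, label) triple occurs at most once, and that a label's score is the sum of the similarities of the neighbors who used it on the resource. The names `cosine` and `recommend_tags` and the sample data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend_tags(u, r, profiles, assignments, k=2, n=2):
    """Recommend up to n labels for user u on resource r from the k most similar users."""
    # Candidate neighbors: other users who have defined a label for resource r
    candidates = {v for (v, res, _t) in assignments if res == r and v != u}
    neighbors = sorted(
        candidates, key=lambda v: cosine(profiles[u], profiles[v]), reverse=True
    )[:k]
    # Score each label by the summed similarity of the neighbors who used it on r
    scores = {}
    for v, res, t in assignments:
        if res == r and v in neighbors:
            scores[t] = scores.get(t, 0.0) + cosine(profiles[u], profiles[v])
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:n]]


# Hypothetical sample data: user-label model vectors and label assignments
profiles = {"u": [1, 1, 0], "a": [1, 1, 0], "b": [1, 0, 0], "c": [0, 0, 1]}
assignments = [
    ("a", "r1", "python"),
    ("b", "r1", "python"),
    ("b", "r1", "code"),
    ("c", "r1", "music"),
]
top = recommend_tags("u", "r1", profiles, assignments, k=2, n=2)
```

In this example "python" is used by both selected neighbors, so it outranks "code", which matches the statement that a label shared by several neighbors scores higher.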
In a label recommendation system, the model of a user u ∈ U is usually represented by P_l(u) = ∪_{r∈R} D(u, l), where D(u, l) is the label set defined by user u for resource l. The user model describes the label sets the user has defined in the system, so the quality of the labels directly determines the quality of the user model. A label is a keyword that a user defines according to personal interest and resource content, so a good label is both personal and specific: it matches the user's vocabulary habits, describes the resource well, and reflects the user's interests.
If user u defines a label k_i for resource l, the quality of label k_i can be measured by parameters such as its number of users, the similarity of its users, its word frequency and its specificity.
The number of users of label k_i is the number of users who use k_i to define resource l. The larger the number of users of k_i, the higher its quality. The number of users of k_i can be expressed as |u_{l,k_i}|, where u_{l,k_i} is the set of all users who use label k_i to define resource l. Normalizing it by the total number of system users N_all gives:
Figure GDA0002245046970000101
The user similarity of label k_i is the average similarity among the users who use k_i to define resource l. The greater the average user similarity, the higher the quality of label k_i. The average user similarity is calculated as follows:
Figure GDA0002245046970000102
wherein
Figure GDA0002245046970000103
denotes the set of all users who use label k_i to define resource l, |u_{l,k_i}| is the number of users in that set, u_simx and u_simy are any two different users in the set, and sim(u_simx, u_simy) is the similarity of the user models of the two users, which can be obtained as the cosine of the angle between their user model feature vectors.
The word frequency of label k_i is the number of times k_i is used to define resource l, as a proportion of the number of times resource l is defined by all labels. The higher the word frequency of k_i, the higher its quality. The word frequency of k_i can be expressed by
Figure GDA0002245046970000104
where the numerator term (its figure is missing from this text) is the number of times label k_i is used to define resource l, and N_l is the total number of times resource l is defined by all labels.
The specificity of label k_i is an important index of how well k_i characterizes resource l; it reflects the breadth of the different resources that k_i is used to define. The higher the specificity, the better the label quality. Label specificity can be calculated by the TF-IDF algorithm:
Figure GDA0002245046970000105
in the above formula,
Figure GDA0002245046970000107
is the frequency with which label k_i is used to define resource l, N is the total number of resources, and
Figure GDA0002245046970000108
is the number of resources that have been defined by label k_i at least once.
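The four label quality parameters described above might be computed as in the sketch below. Because the original formulas survive only as figure placeholders, everything here is an assumption drawn from the prose: the user count is normalized by simple division, the average similarity is the mean cosine similarity over all unordered user pairs, the word frequency is a plain ratio, and the specificity uses the common TF-IDF form tf × log(N / n). All function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def normalized_user_count(tag_users, n_all_users):
    """|u_{l,k_i}| divided by the total number of system users N_all (assumed normalization)."""
    return len(tag_users) / n_all_users


def average_user_similarity(tag_users, profiles):
    """Mean pairwise cosine similarity among the users of a label (assumed averaging)."""
    pairs = list(combinations(sorted(tag_users), 2))
    if not pairs:
        return 0.0  # fewer than two users: no pair to compare
    return sum(cosine(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)


def word_frequency(times_tag_on_resource, times_resource_tagged):
    """Share of all label uses on resource l that are uses of label k_i."""
    return times_tag_on_resource / times_resource_tagged


def specificity(tf, n_resources, n_resources_with_tag):
    """TF-IDF style specificity, assuming the common form tf * log(N / n)."""
    return tf * math.log(n_resources / n_resources_with_tag)


# Hypothetical sample data
profiles = {"a": [1, 0], "b": [1, 0], "c": [0, 1]}
users_of_tag = {"a", "b"}
count_norm = normalized_user_count(users_of_tag, 4)
avg_sim = average_user_similarity(users_of_tag, profiles)
freq = word_frequency(3, 12)
spec = specificity(freq, 100, 100)
```

A label used on every resource gets specificity 0 under this form, which agrees with the statement that a label spread across many different resources characterizes any one of them poorly.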
The higher the overall quality of the user label is, the higher the quality of the user model is, and the user model quality reflects the accuracy and the advisability of the user label definition behavior, that is, the higher the quality of the user model of a user is, the more suitable the label defined by the user is for recommendation.
The traditional collaborative filtering label recommendation algorithm seeks label recommendation from a neighbor user, only considers the similarity of user models of the neighbor user and the current user, but ignores the quality of the user models of the neighbor user, and has low recommendation quality, so that the label recommendation algorithm based on the user model quality can be adopted.
The invention provides a flow chart of a preferred embodiment of a collaborative filtering label recommendation method based on a user quality model, as shown in fig. 1, the method comprises the following steps:
step S100, detecting user input information, acquiring a training set in a label classification information database, extracting all labels in the training set to form a label system of the existing system, and perfecting the label system according to resources and the condition of the user in the existing system.
In specific implementation, a training set is selected from the label classification information database, all labels in the training set are extracted to form the label system C {t1, t2, …, tn} of the existing system S, and the label system is perfected according to how the resource and the user appear in the existing system. Further, user input information is detected and a test set, which is a sample of the labels in the label classification information database, is obtained from the database. Once the label system C {t1, t2, …, tn} has been formed from the training set, the test set is used to check whether it is complete: if every label in the test set is in the current label system, the label system is judged complete; if some labels in the test set are not in the current label system, the label system is judged incomplete and is further improved. Specifically, the training set may be reselected to improve the label system, or the test-set labels that do not appear in the label system may be added to it.
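The completeness check described above reduces to a set difference. The sketch below is illustrative only and assumes labels are plain strings; the function names `check_tag_system` and `complete_tag_system` are hypothetical, and only the second improvement option (adding the missing test-set labels) is shown.

```python
def check_tag_system(tag_system, test_set_tags):
    """Return (is_complete, missing), where missing holds test-set labels absent from C."""
    missing = set(test_set_tags) - set(tag_system)
    return (not missing), missing


def complete_tag_system(tag_system, test_set_tags):
    """Improve the label system by adding the test-set labels it lacks."""
    _is_complete, missing = check_tag_system(tag_system, test_set_tags)
    return set(tag_system) | missing


# Hypothetical sample data
C = {"python", "code"}
test_tags = ["python", "music"]
is_complete, missing = check_tag_system(C, test_tags)
C_improved = complete_tag_system(C, test_tags)
```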
In specific implementation, the step S100 specifically includes:
s101, detecting user input information, acquiring a training set in a label classification information database, and extracting all labels in the training set to form a label system C { t1, t2, …, tn } of the existing system S;
step S102, determining how resource R_i and user U_i appear in the existing system S;
step S103, if R_i ∉ S, i.e. the resource has not appeared in the existing system, extracting the X highest-weighted title keywords of resource R_i and adding them to the system label system C;
step S104, if R_i ∈ S and U_i ∉ S, i.e. the resource appears in the system and the user does not, extracting the Y most frequently used labels of resource R_i and its X highest-weighted title keywords and adding them to the system label system C;
step S105, if U_i ∈ S and R_i ∈ S, i.e. both the user and the resource appear in the system, using the history label information.
In specific implementation, resource R is analyzediAnd user UiSituation occurring in the existing system S, user UiAnd resource RiThe following 4 cases can occur:
(1)
Figure GDA0002245046970000122
completely in a cold start situation, new user, new resource;
(2)
Figure GDA0002245046970000123
the user appears in the system, and the resource does not appear;
(3)
Figure GDA0002245046970000124
resources appear in the system, and users do not appear;
(4)Ui∈S and Riboth the S-user and the resource are present in the system.
The label perfection measures for different situations are as follows:
when the situations (1) and (2) occur, extracting the resource RiAdding the top X resource title keywords { key1, key2, key3} with the highest weight into a system label system C, namely C ← { key1, key2, key3 };
when the situation (3) occurs, the resource R is extractediAdding Y most popular labels and X resource title keywords with the highest weight into a system label system C;
when the situation (4) occurs, the history tag information is employed.
In specific implementation, X may be preset, preferably 3, and Y may also be preset, preferably 2.
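The case analysis above can be sketched as a small dispatch function. This is an illustration under stated assumptions: the title keywords arrive already sorted by descending weight and the resource's labels already sorted by descending use frequency, how those rankings are produced is outside the sketch, and the function name `extend_tag_system` is hypothetical. The defaults x=3 and y=2 follow the preferred values given in the text.

```python
def extend_tag_system(tag_system, user_known, resource_known,
                      title_keywords, popular_tags, x=3, y=2):
    """Return the label system C extended according to cases (1) to (4).

    title_keywords: title keywords of the resource, highest weight first (assumed).
    popular_tags: labels of the resource, most frequently used first (assumed).
    """
    extended = set(tag_system)
    if not resource_known:
        # Cases (1) and (2): new resource, add the top X title keywords
        extended.update(title_keywords[:x])
    elif not user_known:
        # Case (3): known resource, new user, add top Y labels and top X keywords
        extended.update(popular_tags[:y])
        extended.update(title_keywords[:x])
    # Case (4): both known, history label information is used and C is unchanged
    return extended


# Hypothetical sample data
keywords = ["k1", "k2", "k3", "k4"]
popular = ["p1", "p2", "p3"]
cold_start = extend_tag_system({"a"}, False, False, keywords, popular)
new_user = extend_tag_system({"a"}, False, True, keywords, popular)
both_known = extend_tag_system({"a"}, True, True, keywords, popular)
```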
And S200, mapping the information of the users in the system to a two-dimensional matrix to construct a user model, and storing the user model in a user-label two-dimensional matrix form.
In specific implementation, a user model is constructed by mapping the information of K users in the system to a two-dimensional matrix, and the mapping result is shown in the user-tag characteristic matrix QT. Each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the model of one user, where T_i denotes the i-th label related to user U_k and w(T_i) denotes the weight of label T_i in the vector VU_k.
The step S200 specifically includes:
step S201, mapping the information of K users in the system to a two-dimensional matrix to construct a user model, and storing the mapping result in a user-label feature matrix;
step S202, each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the user model of one user, where T_i denotes the i-th label related to user U_k, and w(T_i) denotes the weight of label T_i in the vector VU_k,
Figure GDA0002245046970000131
where tf(T_i, U_k) denotes the number of times label T_i is used by user U_k, N denotes the total number of system labels, and
Figure GDA0002245046970000132
denotes the number of users who have used label T_i at least once.
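As a non-limiting illustration, the construction of the user-label matrix can be sketched in Python as follows. The weight formula itself is available in the source only as an image, so this sketch assumes a TF-IDF-style product tf(T_i, U_k) · log(N / n_i), where n_i is the number of users who have used label T_i at least once; that form, the function name build_user_tag_matrix, and the parameter n_total (standing in for N, supplied by the caller) are assumptions, not the formula of the invention.

```python
import math


def build_user_tag_matrix(usage, tags, n_total):
    """Build one weight vector per user over a fixed list of labels.

    usage: dict user -> dict label -> number of times the user used the label.
    tags: ordered list of labels (the columns of the matrix).
    n_total: the value N in the text; passed in by the caller.
    Returns dict user -> list of weights, in the order of `tags`.
    """
    # Number of users who used each label at least once.
    users_per_tag = {
        t: sum(1 for counts in usage.values() if counts.get(t, 0) > 0)
        for t in tags
    }
    matrix = {}
    for user, counts in usage.items():
        row = []
        for t in tags:
            tf = counts.get(t, 0)
            n_t = users_per_tag[t]
            # Assumed TF-IDF-style weight; the source formula is an image.
            row.append(tf * math.log(n_total / n_t) if n_t else 0.0)
        matrix[user] = row
    return matrix
```

Each row of the returned mapping plays the role of the vector VU_k for one user.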
Step S300, obtaining the model vector of the current user, and calculating the similarity between the current user and the neighbor users in the system.
In a specific implementation, a neighbor user is a user with a high correlation with the current user, such as a user in the same region. Because the user models in the label recommendation system are stored in the form of a user-label two-dimensional matrix, the similarity between the current user and another user in the system can be obtained by calculating the cosine similarity of the two users' model vectors in the matrix. Specifically, the model vector of the current user is obtained, and the similarity sim(prof_u, prof_v) between the current user and a neighbor user in the system is calculated as
Figure GDA0002245046970000133
where prof_u and prof_v are the user model vectors of the current user u and the neighbor user v, respectively.
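As a non-limiting illustration, the cosine similarity of two user model vectors can be computed in Python as follows. The function name cosine_similarity and the convention of returning 0.0 when either vector is all zeros are assumptions of this sketch.

```python
import math


def cosine_similarity(prof_u, prof_v):
    """Cosine similarity of two equal-length user model vectors."""
    dot = sum(a * b for a, b in zip(prof_u, prof_v))
    norm_u = math.sqrt(sum(a * a for a in prof_u))
    norm_v = math.sqrt(sum(b * b for b in prof_v))
    if norm_u == 0 or norm_v == 0:
        # Assumed convention: a user with an empty model is similar to no one.
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, the vectors (3, 4) and (4, 3) have a dot product of 24 and norms of 5 each, which gives a similarity of 0.96.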
Step S400, calculating the model quality of the neighbor users in the system.
In a specific implementation, user model quality theory indicates that the quality of a user model is influenced by the usage frequency of the users, the similarity of the user group, the frequency of the labels, and the specificity of the labels. The model quality Q_u(v) of a neighbor user in the system is calculated as
Figure GDA0002245046970000141
wherein:
Figure GDA0002245046970000143
in the above formulas, k_i is the i-th label of user v, and the terms shown denote the normalized value of the number of users of k_i and the average similarity of the users of k_i,
Figure GDA0002245046970000146
is the word frequency of k_i, and w(l, k_i) is the specificity of k_i; the model quality of a neighbor user is the average label quality of that neighbor user.
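As a non-limiting illustration, two parts of this calculation that are stated in words can be sketched in Python as follows: the average similarity within a group of users, and the model quality as the average of per-label qualities. How the individual factors are combined into a per-label quality is given in the source only as an image and is therefore not reproduced here; the function names, the averaging over unordered pairs of distinct users, and the value 0.0 for empty inputs are assumptions of this sketch.

```python
from itertools import combinations


def average_group_similarity(profiles, sim):
    """Mean similarity over all unordered pairs of distinct users in a group.

    profiles: list of user model vectors for the users of one label.
    sim: a similarity function taking two vectors.
    """
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 0.0  # assumed value when the group has fewer than two users
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def model_quality(tag_qualities):
    """Model quality of a neighbor user as the average of its label qualities.

    tag_qualities: per-label quality values, computed elsewhere.
    """
    if not tag_qualities:
        return 0.0  # assumed value for a user without labels
    return sum(tag_qualities) / len(tag_qualities)
```

Any pairwise similarity function, such as the cosine similarity of user model vectors, can be passed as the sim argument.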
Step S500, generating the optimal recommendation with the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system.
In a specific implementation, in the label recommendation system the neighbor users act as recommenders for the current user, so the quality of their user models has an important influence on the recommendation effect. The collaborative filtering label recommendation algorithm is therefore improved. The optimal recommendation result of the improved collaborative filtering recommendation algorithm is denoted T(u, r), and the calculation formula is as follows:
Figure GDA0002245046970000148
δ(v, r, t) := 1 if δ(v, r, t) ∈ U×R×T else 0,
in the above formula, N_u is the set of k nearest neighbor users of the current user u, T(u, r) is the optimal recommendation result of the algorithm, sim(prof_u, prof_v) is the similarity between the current user u and the neighbor user v, and δ(v, r, t) ∈ U×R×T indicates that user v has a label definition relation to resource r.
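As a non-limiting illustration, a quality-weighted neighbor vote can be sketched in Python as follows. The calculation formula of T(u, r) is available in the source only as an image, so this sketch assumes that the score of a label t is the sum, over the neighbors v in N_u, of sim(prof_u, prof_v) multiplied by the model quality of v and by δ(v, r, t); that combination, the function name recommend_tags, and the parameter top_n are assumptions, not the formula of the invention.

```python
from collections import defaultdict


def recommend_tags(neighbors, assignments, resource, top_n=5):
    """Rank candidate labels for one resource from the neighbors' assignments.

    neighbors: list of (neighbor_id, similarity, model_quality) for N_u.
    assignments: set of (user, resource, label) triples, i.e. where delta = 1.
    Returns up to top_n labels, highest score first.
    """
    scores = defaultdict(float)
    for v, sim, quality in neighbors:
        for user, res, tag in assignments:
            if user == v and res == resource:  # delta(v, r, t) = 1
                scores[tag] += sim * quality   # assumed combination
    # Sort by descending score, breaking ties by label name.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

Under this assumption a neighbor with a low model quality contributes little to the vote even when its similarity to the current user is high.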
Step S600, returning the optimal recommendation result to the user interface through the WEB server.
In a specific implementation, the optimal recommendation result is returned to the user interface through the WEB server. Users may use different interfaces; if the user is using a television interface, the result is returned to the user's television interface.
The invention also provides a flow chart of a specific application embodiment of the collaborative filtering label recommendation method based on the user quality model, described by taking a user television interface as an example. As shown in fig. 2, the embodiment is as follows:
Specifically, the television is connected to a WEB server, and the WEB server is in turn connected to a database. The database comprises a user information base in which user history information is stored, a resource information base in which resource information is stored, and a label information base in which label information is stored.
When a user watches television through the user television interface of the television, the viewing information is sent to the WEB server. The WEB server preprocesses the viewing information; the user history information is obtained from the user information base, and the current user quality model is generated from it. A core recommendation model is then generated from the current user quality model, the resource information in the resource information base, and the label information in the label information base. A recommendation result is generated from the core recommendation model and sent to the WEB server, and the WEB server returns the recommendation result to the user television interface through a recommendation page for the user to view.
The invention also provides a functional block diagram of a collaborative filtering label recommendation system based on a user quality model. As shown in fig. 3, the system comprises:
a label system perfecting module 100, configured to detect information input by a user, acquire a training set from the label classification information database, extract all labels in the training set to form the label system of the existing system, and perfect the label system according to whether the resource and the user have appeared in the existing system; the details are as described above.
a user model building module 200, configured to map the information of the users in the system to a two-dimensional matrix to build a user model, and store the user model in the form of a user-label two-dimensional matrix; the details are as described above.
a similarity calculation module 300, configured to obtain the model vector of the current user and calculate the similarity between the current user and the neighbor users in the system; the details are as described above.
a model quality calculation module 400, configured to calculate the model quality of the neighbor users in the system; the details are as described above.
an optimal recommendation generation module 500, configured to generate the optimal recommendation with the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system; the details are as described above.
a result feedback module 600, configured to return the optimal recommendation result to the user interface through the WEB server; the details are as described above.
In the collaborative filtering label recommendation system based on the user quality model, the label system perfecting module specifically comprises:
a label system forming unit, configured to detect user input information, acquire a training set from the label classification information database, and extract all labels in the training set to form the label system C {t1, t2, …, tn} of the existing system S;
a judging unit, configured to determine whether resource R_i and user U_i have appeared in the existing system S; the details are as described above.
a first processing unit, configured to, if
Figure GDA0002245046970000161
that is, the resource has not appeared in the existing system, extract the top X highest-weighted resource title keywords of resource R_i and add them to the system label system C; the details are as described above.
a second processing unit, configured to, if
Figure GDA0002245046970000162
that is, the resource has appeared in the system but the user has not, extract the Y most frequently used labels of resource R_i and the X highest-weighted resource title keywords and add them to the system label system C; the details are as described above.
a third processing unit, configured to use the historical label information if U_i ∈ S and R_i ∈ S, that is, both the user and the resource have appeared in the system; the details are as described above.
In the collaborative filtering label recommendation system based on the user quality model, the user model building module specifically comprises:
a storage unit, configured to map the information of K users in the system to a two-dimensional matrix to construct a user model, and store the mapping result in a user-label feature matrix; the details are as described above.
a user model building unit, configured such that each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the user model of one user, where T_i denotes the i-th label related to user U_k, and w(T_i) denotes the weight of label T_i in the vector VU_k,
Figure GDA0002245046970000171
where tf(T_i, U_k) denotes the number of times label T_i is used by user U_k, N denotes the total number of system labels, and
Figure GDA0002245046970000172
denotes the number of users who have used label T_i at least once; the details are as described above.
In the collaborative filtering label recommendation system based on the user quality model, the similarity calculation module is specifically configured to: obtain the model vector of the current user, and calculate the similarity sim(prof_u, prof_v) between the current user and a neighbor user in the system,
Figure GDA0002245046970000173
where prof_u and prof_v are the user model vectors of the current user u and the neighbor user v, respectively; the details are as described above.
In summary, the present invention provides a collaborative filtering label recommendation method and system based on a user quality model. The method includes: perfecting the label system according to the situations that occur in the existing system; mapping the information of the users in the system to a two-dimensional matrix to construct a user model, and storing the user model in the form of a user-label two-dimensional matrix; obtaining the model vector of the current user, and calculating the similarity between the current user and the neighbor users in the system; calculating the model quality of the neighbor users in the system; generating the optimal recommendation with the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system; and returning the optimal recommendation result to the user interface through the WEB server. The method optimizes the selection of the best recommending users in the traditional algorithm, improves the accuracy and recall of recommendation, and evolves and updates the current label system of the system. By selecting an appropriate label source according to whether the user and the resource have appeared in the system, it also addresses the problems of cold start and of a single label source.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (8)

1. A collaborative filtering label recommendation method based on a user quality model, characterized by comprising the following steps:
A. detecting user input information, acquiring a training set from a label classification information database, extracting all labels in the training set to form the label system of the existing system, and perfecting the label system according to whether the resource and the user have appeared in the existing system;
B. mapping information of users in the system to a two-dimensional matrix to construct a user model, and storing the user model in a user-label two-dimensional matrix form;
C. obtaining a model vector of a current user, and calculating the similarity between the current user and a neighbor user in a system;
D. calculating the model quality of neighbor users in the system;
E. generating optimal recommendation according to the model quality of neighbor users in the system and an improved collaborative filtering recommendation algorithm;
F. returning the optimal recommendation result to the user interface through the WEB server;
step D specifically comprises: calculating the model quality Q_u(v) of the neighbor users in the system,
Figure FDA0002245046960000011
wherein:
Figure FDA0002245046960000012
Figure FDA0002245046960000013
in the above formulas, k_i is the i-th label of user v,
Figure FDA0002245046960000014
is the normalized value of the number of users of k_i,
Figure FDA0002245046960000015
is the average similarity of the users of k_i,
Figure FDA0002245046960000016
is the word frequency of k_i, and w(l, k_i) is the specificity of k_i; the model quality of a neighbor user is the average label quality of that neighbor user;
Figure FDA0002245046960000021
denotes the set of all users who use label k_i to define resource l, |u_{l,k_i}| denotes the number of users in that user group, u_simx and u_simy denote any two different users in the user group, and sim(u_simx, u_simy) denotes the user model similarity of the two users; N is the total number of resources,
Figure FDA0002245046960000022
is the number of resources that have been defined at least once by label k_i;
the optimal recommendation result in the improved collaborative filtering recommendation algorithm in the step E is denoted as T (u, r), and the calculation formula is as follows:
Figure FDA0002245046960000024
δ(v, r, t) := 1 if δ(v, r, t) ∈ U×R×T else 0,
in the above formula, U is the user set, T is the label set, R is the resource set, N_u is the set of k nearest neighbor users of the current user u, T(u, r) is the optimal recommendation result of the algorithm, sim(prof_u, prof_v) is the similarity between the current user u and the neighbor user v, and δ(v, r, t) ∈ U×R×T indicates that user v has a label definition relation to resource r; the operator := denotes dynamic assignment.
2. The collaborative filtering label recommendation method based on the user quality model according to claim 1, wherein step A specifically comprises:
A1, detecting user input information, acquiring a training set from the label classification information database, and extracting all labels in the training set to form the label system C {t1, t2, …, tn} of the existing system S;
A2, determining whether resource R_i and user U_i have appeared in the existing system S;
A3, if
Figure FDA0002245046960000025
that is, the resource has not appeared in the existing system, extracting the top X highest-weighted resource title keywords of resource R_i and adding them to the system label system C;
A4, if
Figure FDA0002245046960000026
that is, the resource has appeared in the system but the user has not, extracting the Y most frequently used labels of resource R_i and the X highest-weighted resource title keywords and adding them to the system label system C;
A5, if U_i ∈ S and R_i ∈ S, that is, both the user and the resource have appeared in the system, using the historical label information.
3. The collaborative filtering label recommendation method based on the user quality model according to claim 2, wherein step B specifically comprises:
B1, mapping the information of K users in the system to a two-dimensional matrix to construct a user model, and storing the mapping result in a user-label feature matrix;
B2, each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the user model of one user, where T_i denotes the i-th label related to user U_k, and w(T_i) denotes the weight of label T_i in the vector VU_k,
Figure FDA0002245046960000031
where tf(T_i, U_k) denotes the number of times label T_i is used by user U_k, N denotes the total number of system labels, and
Figure FDA0002245046960000033
denotes the number of users who have used label T_i at least once.
4. The collaborative filtering label recommendation method based on the user quality model according to claim 3, wherein step C specifically comprises: obtaining the model vector of the current user, and calculating the similarity sim(prof_u, prof_v) between the current user and a neighbor user in the system,
Figure FDA0002245046960000032
where prof_u and prof_v are the user model vectors of the current user u and the neighbor user v, respectively.
5. A collaborative filtering label recommendation system based on a user quality model, the system comprising:
a label system perfecting module, configured to detect information input by a user, acquire a training set from a label classification information database, extract all labels in the training set to form the label system of the existing system, and perfect the label system according to whether the resource and the user have appeared in the existing system;
a user model building module, configured to map the information of the users in the system to a two-dimensional matrix to build a user model, and store the user model in the form of a user-label two-dimensional matrix;
a similarity calculation module, configured to obtain the model vector of the current user and calculate the similarity between the current user and the neighbor users in the system;
a model quality calculation module, configured to calculate the model quality of the neighbor users in the system, and specifically to calculate the model quality Q_u(v) of the neighbor users in the system,
Figure FDA0002245046960000041
wherein:
Figure FDA0002245046960000043
in the above formulas, k_i is the i-th label of user v,
Figure FDA0002245046960000044
is the normalized value of the number of users of k_i,
Figure FDA0002245046960000045
is the average similarity of the users of k_i, the term shown is the word frequency of k_i, and w(l, k_i) is the specificity of k_i; the model quality of a neighbor user is the average label quality of that neighbor user;
Figure FDA0002245046960000047
denotes the set of all users who use label k_i to define resource l, |u_{l,k_i}| denotes the number of users in that user group, u_simx and u_simy denote any two different users in the user group, and sim(u_simx, u_simy) denotes the user model similarity of the two users; N is the total number of resources,
Figure FDA0002245046960000048
is the number of resources that have been defined at least once by label k_i;
an optimal recommendation generation module, configured to generate the optimal recommendation with the improved collaborative filtering recommendation algorithm according to the model quality of the neighbor users in the system, wherein the optimal recommendation result of the improved collaborative filtering recommendation algorithm is denoted T(u, r) and the calculation formula is as follows:
Figure FDA0002245046960000049
δ(v, r, t) := 1 if δ(v, r, t) ∈ U×R×T else 0,
in the above formula, U is the user set, T is the label set, R is the resource set, N_u is the set of k nearest neighbor users of the current user u, T(u, r) is the optimal recommendation result of the algorithm, sim(prof_u, prof_v) is the similarity between the current user u and the neighbor user v, and δ(v, r, t) ∈ U×R×T indicates that user v has a label definition relation to resource r; the operator := denotes dynamic assignment;
and a result feedback module, configured to return the optimal recommendation result to the user interface through the WEB server.
6. The collaborative filtering label recommendation system based on the user quality model according to claim 5, wherein the label system perfecting module specifically comprises:
a label system forming unit, configured to detect user input information, acquire a training set from the label classification information database, and extract all labels in the training set to form the label system C {t1, t2, …, tn} of the existing system S;
a judging unit, configured to determine whether resource R_i and user U_i have appeared in the existing system S;
a first processing unit, configured to, if
Figure FDA0002245046960000051
that is, the resource has not appeared in the existing system, extract the top X highest-weighted resource title keywords of resource R_i and add them to the system label system C;
a second processing unit, configured to, if
Figure FDA0002245046960000052
that is, the resource has appeared in the system but the user has not, extract the Y most frequently used labels of resource R_i and the X highest-weighted resource title keywords and add them to the system label system C;
a third processing unit, configured to use the historical label information if U_i ∈ S and R_i ∈ S, that is, both the user and the resource have appeared in the system.
7. The collaborative filtering label recommendation system based on the user quality model according to claim 6, wherein the user model building module specifically comprises:
a storage unit, configured to map the information of K users in the system to a two-dimensional matrix to construct a user model, and store the mapping result in a user-label feature matrix;
a user model building unit, configured such that each row vector VU_k = (w(T_1); w(T_2); …; w(T_i); w(T_n)) in the matrix represents the user model of one user, where T_i denotes the i-th label related to user U_k, and w(T_i) denotes the weight of label T_i in the vector VU_k,
Figure FDA0002245046960000061
where tf(T_i, U_k) denotes the number of times label T_i is used by user U_k, N denotes the total number of system labels, and
Figure FDA0002245046960000062
denotes the number of users who have used label T_i at least once.
8. The collaborative filtering label recommendation system based on the user quality model according to claim 7, wherein the similarity calculation module is specifically configured to: obtain the model vector of the current user, and calculate the similarity sim(prof_u, prof_v) between the current user and a neighbor user in the system,
where prof_u and prof_v are the user model vectors of the current user u and the neighbor user v, respectively.
CN201511018787.5A 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model Active CN105426550B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511018787.5A CN105426550B (en) 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201511018787.5A CN105426550B (en) 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model

Publications (2)

Publication Number Publication Date
CN105426550A CN105426550A (en) 2016-03-23
CN105426550B true CN105426550B (en) 2020-02-07

Family

ID=55504762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511018787.5A Active CN105426550B (en) 2015-12-28 2015-12-28 Collaborative filtering label recommendation method and system based on user quality model

Country Status (1)

Country Link
CN (1) CN105426550B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant