CN113868545A - Project recommendation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113868545A
Authority
CN
China
Prior art keywords
user
users
determining
similarity
recommended
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111473818.1A
Other languages
Chinese (zh)
Other versions
CN113868545B (en
Inventor
陈程
王贺
石奕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Zhuoer Digital Media Technology Co ltd
Original Assignee
Wuhan Zhuoer Digital Media Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Zhuoer Digital Media Technology Co ltd
Priority to CN202111473818.1A
Publication of CN113868545A
Application granted
Publication of CN113868545B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an item recommendation method, an item recommendation device, electronic equipment and a storage medium, wherein the method comprises the following steps: determining a first similarity matrix based on a user item scoring matrix; clustering M users based on the user item scoring matrix to obtain M1 user clusters; determining, among the M1 user clusters, a target user cluster to which a target user belongs, and determining, based on the first similarity matrix, M2 users in the target user cluster whose similarity to the target user meets a second condition; and predicting the score of the target user for each of K items to be recommended based on the scores of the M2 users for the K items to be recommended, and recommending some or all of the K items to be recommended to the target user based on the predicted score of each item.

Description

Project recommendation method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a project recommendation method and device, electronic equipment and a storage medium.
Background
With the development of internet technology, information recommendation is applied ever more widely and deeply across websites as users browse their content. An information recommendation system changes how users interact with information: instead of users actively seeking information, information is actively pushed to them. How to recommend information to users so as to improve their browsing experience has therefore become a problem worth studying.
Disclosure of Invention
In order to solve the technical problem, embodiments of the present application provide a project recommendation method and apparatus, an electronic device, and a storage medium.
The embodiment of the application provides a project recommendation method, which comprises the following steps:
determining a first similarity matrix based on the user item scoring matrix; the user item scoring matrix is used for indicating the scoring of each item in the N items by each user in the M users; the first similarity matrix is used for indicating the similarity between any two users in the M users;
clustering the M users based on the user item scoring matrix to obtain M1 user clusters; the similarity between users in each of the M1 user clusters meets a first condition;
determining, among the M1 user clusters, a target user cluster to which a target user belongs, and determining, based on the first similarity matrix, M2 users in the target user cluster whose similarity to the target user meets a second condition;
and predicting the score of the target user for each item to be recommended in the K items to be recommended based on the scores of the M2 users for the K items to be recommended, and recommending part or all of the K items to be recommended to the target user based on the score of each item to be recommended.
In an optional embodiment of the present application, the clustering the M users based on the user item scoring matrix to obtain M1 user clusters includes:
clustering the M users based on the user item scoring matrix to obtain M1 user clusters and M3 noise points; each of the M3 noise points represents a first user;
determining a center point of each of the M1 user clusters; the central point of each user cluster corresponds to a second user;
for any first user among the M3 noise points, determining a first weighting factor and a second weighting factor between the first user and each second user corresponding to a center point of the M1 user clusters; wherein the first weighting factor between the first user and each second user is determined based on the proportion of items scored by both the first user and the second user among the items scored by the first user, and the second weighting factor is determined based on the proportion of items scored by both the first user and the second user in the union of the items scored by the first user and the items scored by the second user;
and determining the minimum distance between the first user and each second user based on the first weighting factor and the second weighting factor, and determining the user cluster to which the first user belongs based on the minimum distance.
In an optional embodiment of the present application, the determining a minimum distance between the first user and each of the second users based on the first weighting factor and the second weighting factor includes:
determining a corresponding cosine similarity distance and variance distance between the first user and each second user based on the first weighting factor and the second weighting factor;
and determining the minimum distance between the first user and each second user based on the corresponding cosine similarity distance and variance distance between the first user and each second user.
In an optional embodiment of the present application, the determining, based on the minimum distance, a user cluster to which the first user belongs includes:
determining a minimum distance value of the minimum distance between the first user and each second user, and determining whether the minimum distance value meets a third condition;
and determining the user cluster to which the second user corresponding to the minimum distance value belongs as the user cluster to which the first user belongs under the condition that the minimum distance value meets a third condition.
In an optional embodiment of the application, the determining, based on the first similarity matrix, M2 users in the target user cluster whose similarity to the target user satisfies a second condition includes:
determining the similarity between each user belonging to the target user cluster and the target user from the first similarity matrix;
and determining the M2 users with the highest similarity values based on the similarity between each user belonging to the target user cluster and the target user.
In an optional embodiment of the present application, the determining a first similarity matrix based on a user item scoring matrix includes:
determining a second similarity matrix based on the user item scoring matrix; the second similarity matrix is used for indicating the similarity between any two users in the M users; wherein missing values exist in the second similarity matrix;
filling missing values in the second similarity matrix by using the average value of the similarities of the M users to obtain a filled second similarity matrix;
and performing dimensionality reduction on the filled second similarity matrix to obtain a first similarity matrix.
In an optional embodiment of the application, the predicting, based on the scores of the M2 users for the K items to be recommended, the score of the target user for each item to be recommended in the K items to be recommended includes:
determining the average scores of the M2 users for each item to be recommended in the K items to be recommended;
determining the average score corresponding to each item to be recommended in the K items to be recommended as the score of the target user for each item to be recommended in the K items to be recommended.
An embodiment of the present application further provides an item recommendation device, where the device includes:
the first determining unit is used for determining a first similarity matrix based on the user item scoring matrix; the user item scoring matrix is used for indicating the scoring of each item in the N items by each user in the M users; the first similarity matrix is used for indicating the similarity between any two users in the M users;
the clustering unit is used for clustering the M users based on the user item scoring matrix to obtain M1 user clusters; the similarity between users in each of the M1 user clusters meets a first condition;
a second determining unit, configured to determine a target user cluster in the M1 user clusters to which a target user belongs, and determine, based on the first similarity matrix, M2 users in the target user cluster whose similarities with the target user satisfy a second condition;
and the recommending unit is used for predicting the score of the target user for each item to be recommended in the K items to be recommended based on the scores of the M2 users for the K items to be recommended and recommending part or all of the K items to be recommended to the target user based on the scores of the items to be recommended.
In an optional embodiment of the present application, the clustering unit is specifically configured to: cluster the M users based on the user item scoring matrix to obtain M1 user clusters and M3 noise points, each of the M3 noise points representing a first user; determine a center point of each of the M1 user clusters, the center point of each user cluster corresponding to a second user; for any first user among the M3 noise points, determine a first weighting factor and a second weighting factor between the first user and each second user corresponding to a center point of the M1 user clusters, wherein the first weighting factor is determined based on the proportion of items scored by both the first user and the second user among the items scored by the first user, and the second weighting factor is determined based on the proportion of items scored by both the first user and the second user in the union of the items scored by the first user and the items scored by the second user; and determine the minimum distance between the first user and each second user based on the first weighting factor and the second weighting factor, and determine the user cluster to which the first user belongs based on the minimum distance.
In an optional embodiment of the present application, the clustering unit is specifically configured to: determining a corresponding cosine similarity distance and variance distance between the first user and each second user based on the first weighting factor and the second weighting factor; and determining the minimum distance between the first user and each second user based on the corresponding cosine similarity distance and variance distance between the first user and each second user.
In an optional embodiment of the present application, the clustering unit is specifically configured to: determining a minimum distance value of the minimum distance between the first user and each second user, and determining whether the minimum distance value meets a third condition; and determining the user cluster to which the second user corresponding to the minimum distance value belongs as the user cluster to which the first user belongs under the condition that the minimum distance value meets a third condition.
In an optional embodiment of the present application, the second determining unit is specifically configured to: determine the similarity between each user belonging to the target user cluster and the target user from the first similarity matrix; and determine the M2 users with the highest similarity values based on the similarity between each user belonging to the target user cluster and the target user.
In an optional embodiment of the present application, the first determining unit is specifically configured to: determine a second similarity matrix based on the user item scoring matrix, the second similarity matrix indicating the similarity between any two of the M users, wherein missing values exist in the second similarity matrix; fill the missing values in the second similarity matrix with the average value of the similarities of the M users to obtain a filled second similarity matrix; and perform dimensionality reduction on the filled second similarity matrix to obtain the first similarity matrix.
In an optional embodiment of the present application, the recommending unit is specifically configured to: determining the average scores of the M2 users for each item to be recommended in the K items to be recommended; determining the average score corresponding to each item to be recommended in the K items to be recommended as the score of the target user for each item to be recommended in the K items to be recommended.
The embodiment of the present application further provides an electronic device, where the electronic device includes a memory and a processor, wherein the memory stores computer-executable instructions, and the processor can implement the method of the above-mentioned embodiment when executing the computer-executable instructions stored in the memory.
The embodiment of the present application further provides a computer storage medium, where the storage medium stores executable instructions, and the executable instructions, when executed by a processor, implement the method according to the foregoing embodiment.
According to the technical scheme of the embodiment of the application, a first similarity matrix is determined based on a user item scoring matrix; the user item scoring matrix indicates the score given by each of M users to each of N items; the first similarity matrix indicates the similarity between any two of the M users; the M users are clustered based on the user item scoring matrix to obtain M1 user clusters; the similarity between users in each of the M1 user clusters meets a first condition; a target user cluster to which a target user belongs is determined among the M1 user clusters, and M2 users in the target user cluster whose similarity to the target user meets a second condition are determined based on the first similarity matrix; the score of the target user for each of K items to be recommended is predicted based on the scores of the M2 users for the K items to be recommended, and some or all of the K items to be recommended are recommended to the target user based on the predicted scores. In this way, users can be divided into different user groups, and the degree of interest of the user to be recommended in each item to be recommended can be predicted from the scores given to those items by users who belong to the same user group as, and are highly similar to, the user to be recommended, so that items are recommended to that user accordingly.
Drawings
Fig. 1 is a schematic flowchart of an item recommendation method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an item recommendation device according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the features and technical content of the embodiments of the present application easier to understand, the embodiments are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of an item recommendation method provided in an embodiment of the present application, and as shown in fig. 1, the item recommendation method provided in the embodiment of the present application includes the following steps:
Step 101: a first similarity matrix is determined based on the user item scoring matrix.
In the embodiment of the application, the user item scoring matrix indicates the score given by each of M users to each of N items; the first similarity matrix indicates the similarity between any two of the M users. Here, M and N are both integers greater than or equal to 1, and the first similarity matrix may also be referred to as a user similarity matrix.
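The user item scoring matrix may be represented in the form of a matrix:
$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1N} \\ r_{21} & r_{22} & \cdots & r_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ r_{M1} & r_{M2} & \cdots & r_{MN} \end{bmatrix} \tag{1}$$
In matrix (1), the user item scoring matrix R has M rows and N columns, where M is the number of users and N is the number of items; each entry is the score given by the user of that row to the item of that column, and the users and items belong to the set U and the set S, respectively.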
In an optional embodiment of the present application, the step 101 specifically includes the following steps:
determining a second similarity matrix based on the user item scoring matrix; the second similarity matrix is used for indicating the similarity between any two users in the M users; wherein missing values exist in the second similarity matrix;
filling missing values in the second similarity matrix by using the average value of the similarities of the M users to obtain a filled second similarity matrix;
and performing dimensionality reduction on the filled second similarity matrix to obtain a first similarity matrix.
Specifically, as an alternative implementation manner, the second similarity matrix may be obtained based on the user item scoring matrix in the following manner.
Each user's item scoring vector is treated as a point in a high-dimensional space, and the Euclidean distance between the scoring vectors of two users over the same items is used to obtain the similarity of the two users. The Euclidean distance between two users x and y is calculated as follows:
$$d(x, y) = \sqrt{\sum_{i=1}^{N} \left(r_{x,i} - r_{y,i}\right)^2} \tag{2}$$
where $r_{x,i}$ is the score of user x for item i, $r_{y,i}$ is the score of user y for item i, and N is the number of items.
In formula (2), a larger $d(x, y)$ means a greater distance between the two users, i.e. the two users are less similar. The similarity between the two users is calculated as follows:
$$sim(x, y) = \frac{1}{1 + d(x, y)} \tag{3}$$
where a larger $sim(x, y)$ indicates that the two users are more similar, and $sim(x, y)$ takes values in the range (0, 1].
As an optional implementation manner, the missing values in the second similarity matrix may be filled with the average value of the similarities between all users, and the filled second similarity matrix may then be subjected to dimensionality reduction to obtain the first similarity matrix. For example, the first similarity matrix may be obtained by reducing the dimensionality of the second similarity matrix using Non-negative Matrix Factorization (NMF).
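The filling and dimensionality-reduction step can be sketched as follows. This is an illustration under assumptions: the mean is taken over all known entries of the matrix, NMF is implemented with the standard multiplicative update rules, and the rank, iteration count, and seed are arbitrary hypothetical parameters. The application does not state which NMF factor serves as the reduced matrix, so the sketch simply returns both factors.

```python
import random

def fill_missing(sim):
    """Replace None entries with the mean of all known similarity values."""
    known = [v for row in sim for v in row if v is not None]
    mean = sum(known) / len(known)
    return [[mean if v is None else v for v in row] for row in sim]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (m x n) into w (m x rank) and h (rank x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h

second_sim = [
    [1.0, 0.5, None],
    [0.5, 1.0, 0.25],
    [None, 0.25, 1.0],
]
filled = fill_missing(second_sim)
w, h = nmf(filled, rank=2)
```

Each row of `w` can be read as a reduced, non-negative representation of one user; whether that or the product of the two factors is what the application calls the first similarity matrix is left open here.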
Step 102: and clustering the M users based on the user item scoring matrix to obtain M1 user clusters.
In this embodiment of the present application, the similarity between users in each of the M1 user clusters satisfies a first condition.
Here, the similarity between users satisfying the first condition may be understood as the similarity between the users being greater than or equal to a first similarity threshold; for example, the first similarity threshold may be 0.5.
In an optional embodiment of the present application, the step 102 specifically includes the following steps:
step 2-1): clustering the M users based on the user item scoring matrix to obtain M1 user clusters and M3 noise points; each of the M3 noise points represents a first user;
step 2-2): determining a center point of each of the M1 user clusters; the central point of each user cluster corresponds to a second user;
step 2-3): for any first user among the M3 noise points, determining a first weighting factor and a second weighting factor between the first user and each second user corresponding to a center point of the M1 user clusters; wherein the first weighting factor between the first user and each second user is determined based on the proportion of items scored by both the first user and the second user among the items scored by the first user, and the second weighting factor is determined based on the proportion of items scored by both the first user and the second user in the union of the items scored by the first user and the items scored by the second user;
step 2-4): and determining the minimum distance between the first user and each second user based on the first weighting factor and the second weighting factor, and determining the user cluster to which the first user belongs based on the minimum distance.
Specifically, in the embodiment of the present application, a main process of clustering M users based on a user item scoring matrix is as follows:
1) First, a small Eps (neighborhood distance) value is selected, and M1 user clusters and M3 noise points are obtained by applying the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm to the user item scoring matrix. Each point in a user cluster represents a user, and the points within a user cluster are highly correlated; each noise point also represents a user.
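For illustration, a compact DBSCAN can be written as below. This is a generic textbook version rather than the application's implementation: each point stands for one user's scoring vector with missing scores assumed to be already filled, plain Euclidean distance is used, and `min_pts` (the minimum neighborhood size, counting the point itself) is a standard DBSCAN parameter that the text above does not mention.

```python
import math

NOISE = -1

def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including idx itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be absorbed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels

# Seven users described by two item scores each.
points = [(1, 1), (1.2, 1.1), (0.9, 1.0),
          (5, 5), (5.1, 5.2), (4.9, 5.0),
          (9, 1)]
labels = dbscan(points, eps=0.5, min_pts=2)
```

With a small `eps`, the isolated last user is left as a noise point, which is the kind of point the following steps try to assign to a cluster.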
2) The center point of each user cluster is calculated. The average of all points in the user cluster may be used as the center point, or the last point of each user cluster after the DBSCAN iteration in step 1) is completed may be used directly as the center point of that user cluster.
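The first of the two options, taking the mean of all points in a cluster, might look like the sketch below. The function name and the label convention (`-1` for noise) are assumptions made for the example.

```python
def cluster_centers(points, labels, noise_label=-1):
    """Mean point of every cluster; noise points are ignored."""
    groups = {}
    for point, label in zip(points, labels):
        if label == noise_label:
            continue
        groups.setdefault(label, []).append(point)
    return {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in groups.items()
    }

points = [(1, 2), (3, 4), (10, 10)]
labels = [0, 0, -1]
centers = cluster_centers(points, labels)
```

A mean point does not generally coincide with an actual user's scoring vector, which is one reason the second option (using an existing point of the cluster) may be preferred when the center must correspond to a real user.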
3) A first weighting factor and a second weighting factor between each of the M3 noise points and a center point of each user cluster are determined.
Here, the first weighting factor accounts for the proportion of items scored by both users among the items scored by the target user, and may be calculated using the following formula (4):
$$w_1(u, v) = \frac{|I_u \cap I_v|}{|I_u|} \tag{4}$$
where $I_u \cap I_v$ is the set of items scored by both target user u and user v, and $I_u$ is the set of items scored by target user u.
The first weighting factor accounts for the proportion of commonly scored items among the target user's scored items, but ignores their proportion in the union of the two users' scored items. The application therefore introduces a second weighting factor, which represents the proportion of the two users' commonly scored items in the union of their scored items, and is calculated as follows:
$$w_2(u, v) = \frac{|I_u \cap I_v|}{|I_u \cup I_v|} \tag{5}$$
where $I_u \cap I_v$ is the set of items scored by both target user u and user v, and $I_u \cup I_v$ is the union of the items scored by users u and v.
In the embodiment of the present application, a combined weighting factor for calculating the distance between a noise point and the center point of a user cluster may be determined from the first weighting factor and the second weighting factor, as given by formula (6).
[Formula (6): image not reproduced in this text]
With this combined weighting factor, the distance between each of the M3 noise points and the center point of each user cluster can be calculated.
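The two ratio-based weighting factors described above can be sketched with Python sets. The function names and the handling of empty sets are assumptions; how the two factors are combined in formula (6) is not reproduced in this text, so the sketch stops at the two ratios.

```python
def first_weight(items_u, items_v):
    """Share of the target user's scored items that user v has also scored."""
    if not items_u:
        return 0.0  # assumption: no scored items -> weight 0
    return len(items_u & items_v) / len(items_u)

def second_weight(items_u, items_v):
    """Share of commonly scored items in the union of both users' scored items."""
    union = items_u | items_v
    if not union:
        return 0.0  # assumption: both sets empty -> weight 0
    return len(items_u & items_v) / len(union)

items_u = {"a", "b", "c", "d"}   # items scored by target user u
items_v = {"c", "d", "e"}        # items scored by user v
w1 = first_weight(items_u, items_v)
w2 = second_weight(items_u, items_v)
```

Note that the first factor is asymmetric in u and v, while the second (a Jaccard-style ratio) is symmetric, which is why the two carry different information.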
4) The distance between each of the M3 noise points and the center point of each user cluster is determined based on the first weighting factor and the second weighting factor, and whether each noise point can be assigned to one of the user clusters is determined based on these distances.
In an optional embodiment of the present application, the step 2-4) specifically includes the following steps:
determining a corresponding cosine similarity distance and variance distance between the first user and each second user based on the first weighting factor and the second weighting factor;
and determining the minimum distance between the first user and each second user based on the corresponding cosine similarity distance and variance distance between the first user and each second user.
Specifically, in the embodiment of the present application, a combined weighting factor for calculating the distance between a noise point and the center point of a user cluster may be determined from the first weighting factor and the second weighting factor. Using this combined weighting factor, the cosine similarity distance and the variance distance between each of the M3 noise points and the center point of each user cluster can be calculated.
Here, the cosine similarity distance between users is calculated using the first and second weighting factors as in the following formula (7):
[Formula (7): image not reproduced in this text]
where $r_{u,a}$ is the score of target user u for item a, and $r_{v,a}$ is the score of user v for item a.
The variance distance between users is calculated using the first weighting factor and the second weighting factor as in the following formula (8):
[Formula (8): image not reproduced in this text]
where $|I_u \cap I_v|$ is the number of items scored by both target user u and user v.
In the embodiment of the present application, based on formulas (7) and (8), the cosine similarity distance and the variance distance between each of the M3 noise points and the center points of all user clusters can be calculated. For each noise point, the minimum of the products of the cosine similarity distance and the variance distance over all user clusters is determined as the minimum distance between that noise point and the corresponding user cluster.
In an optional embodiment of the present application, the step 2-4) further includes the following steps:
determining a minimum distance value of the minimum distance between the first user and each second user, and determining whether the minimum distance value meets a third condition;
and determining the user cluster to which the second user corresponding to the minimum distance value belongs as the user cluster to which the first user belongs under the condition that the minimum distance value meets a third condition.
Specifically, if the minimum distance is within a preset threshold, the noise point is assigned to the user cluster corresponding to the minimum distance; otherwise, the noise point is discarded. The division into user clusters is complete once all samples have been processed and no further noise points are added to any user cluster.
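The assignment rule for a single noise point can be sketched as follows, taking the per-cluster cosine similarity distances and variance distances as given inputs, since formulas (7) and (8) are not reproduced in this text. The function name, the dictionary layout, and the use of "less than or equal to" for "within the threshold" are assumptions.

```python
def assign_noise_point(cos_dist, var_dist, threshold):
    """Pick the cluster with the smallest product of the two distances.

    cos_dist, var_dist: dict mapping cluster id -> distance from the noise
    point to that cluster's center point.
    Returns (cluster id or None if discarded, minimum distance).
    """
    products = {c: cos_dist[c] * var_dist[c] for c in cos_dist}
    best = min(products, key=products.get)
    if products[best] <= threshold:   # assumption: "within" means <=
        return best, products[best]
    return None, products[best]       # noise point is discarded

cos_dist = {0: 0.2, 1: 0.6}
var_dist = {0: 0.5, 1: 0.1}
assigned, d_min = assign_noise_point(cos_dist, var_dist, threshold=0.1)
discarded, _ = assign_noise_point(cos_dist, var_dist, threshold=0.05)
```

Because the two distances are multiplied, a cluster that is far in one measure can still win if it is very close in the other, as cluster 1 does in this example.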
Step 103: determining, among the M1 user clusters, a target user cluster to which a target user belongs, and determining, based on the first similarity matrix, M2 users in the target user cluster whose similarity to the target user meets a second condition.
Specifically, in the embodiment of the application, after the user group is divided based on the user item scoring matrix, the user group to which the target user belongs may be determined, and based on the user group to which the target user belongs, a user similar to the target user may be determined.
In an optional embodiment of the present application, the step 103 specifically includes the following steps:
determining the similarity between each user belonging to the target user cluster and the target user from the first similarity matrix;
and determining the M2 users with the highest similarity values based on the similarity between each user belonging to the target user cluster and the target user.
Specifically, once the user group to which the target user belongs has been determined, the M2 users that belong to the same user group as the target user and have the highest similarity to the target user can be found from the first similarity matrix; the M2 users are selected based on the values in the first similarity matrix.
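Selecting the M2 most similar users within the target user's cluster can be sketched as below. The sketch assumes the first similarity matrix can be indexed as `sim[target][other]` and that cluster membership is available as one label per user; both the function name and these data layouts are hypothetical.

```python
def top_similar_users(sim, labels, target, m2):
    """Users in the target's cluster, ranked by similarity to the target."""
    cluster = labels[target]
    candidates = [u for u in range(len(labels))
                  if u != target and labels[u] == cluster]
    candidates.sort(key=lambda u: sim[target][u], reverse=True)
    return candidates[:m2]

sim = [
    [1.0, 0.3, 0.8, 0.9],
    [0.3, 1.0, 0.4, 0.2],
    [0.8, 0.4, 1.0, 0.5],
    [0.9, 0.2, 0.5, 1.0],
]
labels = [0, 0, 0, 1]            # user 3 belongs to a different cluster
nearest_one = top_similar_users(sim, labels, target=0, m2=1)
nearest_two = top_similar_users(sim, labels, target=0, m2=2)
```

User 3 has the highest similarity to user 0 overall but is excluded because it lies outside the target user cluster, which is the effect of restricting the search to the cluster.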
Step 104: predicting the score of the target user for each of K items to be recommended based on the scores of the M2 users for the K items to be recommended, and recommending some or all of the K items to be recommended to the target user based on the predicted score of each item to be recommended.
In the embodiments of the application, the determined M2 users belong to the same user group as the target user and have a high similarity to the target user, so the historical item scores of the M2 users can represent the interest preferences of the target user more accurately, which enables item recommendation for the target user.
In an optional embodiment of the present application, the step 104 specifically includes the following steps:
determining the average scores of the M2 users for each item to be recommended in the K items to be recommended;
determining the average score corresponding to each item to be recommended in the K items to be recommended as the score of the target user for each item to be recommended in the K items to be recommended.
For the M2 users with the highest similarity to the target user u found in the first similarity matrix, when the score of the target user u for an item i is predicted, the average of the scores of the M2 users for the item i is taken as the predicted score of the target user u for the item i. When a plurality of items need to be recommended to the target user u, the same method can be used to predict the score of the target user u for each of the items, and the items are recommended in order according to their predicted score values.
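The averaging and ranking in step 104 can be sketched as below. Several details are assumptions made for the sketch and are not stated in the application: ratings are stored as `ratings[user][item]`, a missing rating is represented by `None` and skipped when averaging, items with no neighbor ratings are left out, and items are ranked from the highest predicted score to the lowest.

```python
from typing import List, Optional, Sequence, Tuple


def predict_and_rank(ratings: Sequence[Sequence[Optional[float]]],
                     neighbors: Sequence[int],
                     items: Sequence[int],
                     top_n: int) -> List[Tuple[int, float]]:
    """Predict each item's score as the neighbors' mean rating, then rank."""
    predictions: List[Tuple[int, float]] = []
    for item in items:
        # Collect the ratings that the M2 neighbors actually gave this item.
        scores = [ratings[u][item] for u in neighbors
                  if ratings[u][item] is not None]
        if scores:  # assumption: skip items that no neighbor has rated
            predictions.append((item, sum(scores) / len(scores)))
    # Highest predicted score first (assumed ranking direction).
    predictions.sort(key=lambda p: p[1], reverse=True)
    return predictions[:top_n]
```

With neighbors who rated item 0 as 4.0 and 5.0, the predicted score for item 0 is their mean, 4.5.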
According to the technical scheme of the embodiments of the application, users are divided into different user groups, and the degree of interest of the user to be recommended in each item to be recommended can be predicted based on the scores given to the items to be recommended by the users who belong to the same user group as the user to be recommended and have a high similarity to that user, so that the items to be recommended are recommended to the user to be recommended.
In addition, when the users are divided into different user groups, the cosine similarity distance and the variance distance between two users can be calculated by taking into account the proportion of the items scored by both users in the union of the items scored by either user. Noise points are assigned during clustering based on these two distance values, so the user groups can be divided more accurately. With accurately divided user groups, the users with a high similarity to the user to be recommended can be determined, and the score of the user to be recommended for each item to be recommended can be predicted from the item scores of those users, which finally achieves more accurate item recommendation for the user to be recommended.
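The two proportions that the application uses as weighting factors between a noise-point user (the first user) and a cluster-center user (the second user) can be computed from the sets of items each user has scored. The sketch below only computes the proportions; the function name `weighting_factors` and the handling of empty sets are assumptions, and the way the factors enter the cosine similarity distance and the variance distance is not reproduced here.

```python
from typing import Set, Tuple


def weighting_factors(first_user_items: Set[int],
                      second_user_items: Set[int]) -> Tuple[float, float]:
    """Return (first weighting factor, second weighting factor).

    first  = |common scored items| / |items scored by the first user|
    second = |common scored items| / |union of both users' scored items|
    """
    common = first_user_items & second_user_items
    union = first_user_items | second_user_items
    # Assumption: a user with no scored items yields a factor of 0.0.
    first = len(common) / len(first_user_items) if first_user_items else 0.0
    second = len(common) / len(union) if union else 0.0
    return first, second
```

For instance, if the first user scored items {1, 2, 3, 4} and the second user scored items {3, 4, 5, 6, 7, 8}, the two users share 2 items, so the first factor is 2/4 = 0.5 and the second factor is 2/8 = 0.25.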
An embodiment of the present application further provides an item recommendation device. FIG. 2 is a schematic structural diagram of the item recommendation device provided in the embodiment of the present application. As shown in FIG. 2, the device includes:
a first determining unit 201, configured to determine a first similarity matrix based on the user item scoring matrix; the user item scoring matrix is used for indicating the scoring of each item in the N items by each user in the M users; the first similarity matrix is used for indicating the similarity between any two users in the M users;
a clustering unit 202, configured to cluster the M users based on the user item scoring matrix, to obtain M1 user clusters; the similarity between users in each of the M1 user clusters meets a first condition;
a second determining unit 203, configured to determine a target user cluster in the M1 user clusters to which a target user belongs, and determine, based on the first similarity matrix, M2 users in the target user cluster whose similarities with the target user satisfy a second condition;
a recommending unit 204, configured to predict, based on the scores of the M2 users for the K items to be recommended, the score of the target user for each item to be recommended in the K items to be recommended, and recommend some or all of the K items to be recommended to the target user based on the score of each item to be recommended.
In an optional embodiment of the present application, the clustering unit 202 is specifically configured to: cluster the M users based on the user item scoring matrix to obtain M1 user clusters and M3 noise points, where each of the M3 noise points represents a first user; determine a center point of each of the M1 user clusters, where the center point of each user cluster corresponds to a second user; for any first user in the M3 noise points, determine a first weighting factor and a second weighting factor between the first user and each second user corresponding to the center points of the M1 user clusters, where the first weighting factor between the first user and each second user is determined based on the proportion of the common scored items of the first user and the second user in the scored items of the first user, and the second weighting factor is determined based on the proportion of the common scored items of the first user and the second user in the union of the scored items of the first user and the scored items of the second user; and determine the minimum distance between the first user and each second user based on the first weighting factor and the second weighting factor, and determine the user cluster to which the first user belongs based on the minimum distance.
In an optional embodiment of the present application, the clustering unit 202 is specifically configured to: determining a corresponding cosine similarity distance and variance distance between the first user and each second user based on the first weighting factor and the second weighting factor; and determining the minimum distance between the first user and each second user based on the corresponding cosine similarity distance and variance distance between the first user and each second user.
In an optional embodiment of the present application, the clustering unit 202 is specifically configured to: determining a minimum distance value of the minimum distance between the first user and each second user, and determining whether the minimum distance value meets a third condition; and determining the user cluster to which the second user corresponding to the minimum distance value belongs as the user cluster to which the first user belongs under the condition that the minimum distance value meets a third condition.
In an optional embodiment of the present application, the second determining unit 203 is specifically configured to: determine the similarity between each user belonging to the target user cluster and the target user from the first similarity matrix; and determine the M2 users with the highest similarity values based on the similarity between each user belonging to the target user cluster and the target user.
In an optional embodiment of the present application, the first determining unit 201 is specifically configured to: determine a second similarity matrix based on the user item scoring matrix, where the second similarity matrix is used for indicating the similarity between any two users in the M users and missing values exist in the second similarity matrix; fill the missing values in the second similarity matrix with the average value of the similarities of the M users to obtain a filled second similarity matrix; and perform dimensionality reduction on the filled second similarity matrix to obtain the first similarity matrix.
In an optional embodiment of the present application, the recommending unit 204 is specifically configured to: determining the average scores of the M2 users for each item to be recommended in the K items to be recommended; determining the average score corresponding to each item to be recommended in the K items to be recommended as the score of the target user for each item to be recommended in the K items to be recommended.
It will be appreciated by those skilled in the art that the functions performed by the elements of the item recommendation apparatus shown in FIG. 2 may be understood by reference to the foregoing description of the item recommendation method. The functions of the units in the item recommendation device shown in fig. 2 may be implemented by a program running on a processor, or may be implemented by specific logic circuits.
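As one example of such a program, the missing-value filling performed by the first determining unit 201 might look like the sketch below. It assumes that a missing similarity is represented by `None` and that the "average value of the similarities of the M users" means the mean of all known entries of the second similarity matrix; both are assumptions. The subsequent dimensionality reduction step is not shown.

```python
from typing import List, Optional, Sequence


def fill_missing_with_mean(
        similarity: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Replace every missing similarity with the mean of the known ones."""
    # Gather all known similarity values across the whole matrix.
    known = [v for row in similarity for v in row if v is not None]
    if not known:
        raise ValueError("similarity matrix has no known entries")
    mean = sum(known) / len(known)
    # Build a new matrix so that the input is left unchanged.
    return [[mean if v is None else v for v in row] for row in similarity]
```

The function returns a new matrix of the same shape, which could then be passed to whichever dimensionality reduction technique is chosen to obtain the first similarity matrix.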
An embodiment of the present application further provides an electronic device. FIG. 3 is a schematic diagram of a hardware structure of the electronic device according to the embodiment of the present application. As shown in FIG. 3, the electronic device includes: a communication component 303 for data transmission, at least one processor 301, and a memory 302 for storing computer programs capable of running on the processor 301. The various components in the electronic device are coupled together by a bus system 304. It will be appreciated that the bus system 304 is used to enable communication among the components. The bus system 304 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as bus system 304 in FIG. 3.
Wherein the processor 301 executes the computer program to perform at least the steps of the method shown in fig. 1.
It will be appreciated that the memory 302 can be volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory can be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 302 described in the embodiments herein is intended to include, without being limited to, these and any other suitable types of memory.
The method disclosed in the embodiment of the present application may be applied to the processor 301, or implemented by the processor 301. The processor 301 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 301. The processor 301 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 301 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 302, and the processor 301 reads the information in the memory 302 and performs the steps of the aforementioned methods in conjunction with its hardware.
In an exemplary embodiment, the electronic device may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general purpose processors, controllers, MCUs, microprocessors, or other electronic components for performing the aforementioned item recommendation method.
An embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is configured to, when executed by a processor, perform at least the steps of the method shown in fig. 1. The computer readable storage medium may be specifically a memory. The memory may be memory 302 as shown in fig. 3.
The technical solutions described in the embodiments of the present application can be arbitrarily combined without conflict.
In the several embodiments provided in the present application, it should be understood that the disclosed method and device may be implemented in other ways. The above-described device embodiments are merely illustrative. For example, the division of the units is only a logical functional division, and there may be other ways of division in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may separately serve as one unit, or two or more units may be integrated into one unit; the integrated unit can be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
The above description covers only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the scope of protection of the present application.

Claims (10)

1. A method for recommending items, the method comprising:
determining a first similarity matrix based on the user item scoring matrix; the user item scoring matrix is used for indicating the scoring of each item in the N items by each user in the M users; the first similarity matrix is used for indicating the similarity between any two users in the M users;
clustering the M users based on the user item scoring matrix to obtain M1 user clusters; the similarity between users in each of the M1 user clusters meets a first condition;
determining a target user cluster in the M1 user clusters to which a target user belongs, and determining, based on the first similarity matrix, M2 users in the target user cluster whose similarities with the target user satisfy a second condition;
and predicting the score of the target user for each item to be recommended in the K items to be recommended based on the scores of the M2 users for the K items to be recommended, and recommending part or all of the K items to be recommended to the target user based on the score of each item to be recommended.
2. The method of claim 1, wherein said clustering said M users based on said user item scoring matrix, resulting in M1 user clusters, comprises:
clustering the M users based on the user item scoring matrix to obtain M1 user clusters and M3 noise points; each of the M3 noise points represents a first user;
determining a center point of each of the M1 user clusters; the central point of each user cluster corresponds to a second user;
for any first user in the M3 noise points, determining a first weighting factor and a second weighting factor between the first user and each second user corresponding to the center points of the M1 user clusters; wherein the first weighting factor between the first user and each of the second users is determined based on a proportion of common scored items between the first user and the second user to scored items of the first user, and the second weighting factor is determined based on a proportion of common scored items between the first user and the second user to a union of scored items of the first user and scored items of the second user;
and determining the minimum distance between the first user and each second user based on the first weighting factor and the second weighting factor, and determining the user cluster to which the first user belongs based on the minimum distance.
3. The method of claim 2, wherein determining the minimum distance between the first user and the second users based on the first weighting factor and the second weighting factor comprises:
determining a corresponding cosine similarity distance and variance distance between the first user and each second user based on the first weighting factor and the second weighting factor;
and determining the minimum distance between the first user and each second user based on the corresponding cosine similarity distance and variance distance between the first user and each second user.
4. The method of claim 3, wherein the determining the cluster of users to which the first user belongs based on the minimum distance comprises:
determining a minimum distance value of the minimum distance between the first user and each second user, and determining whether the minimum distance value meets a third condition;
and determining the user cluster to which the second user corresponding to the minimum distance value belongs as the user cluster to which the first user belongs under the condition that the minimum distance value meets a third condition.
5. The method according to claim 1, wherein the determining, based on the first similarity matrix, M2 users in the target user cluster whose similarities with the target user satisfy a second condition includes:
determining the similarity between each user belonging to the target user cluster and the target user from the first similarity matrix;
and determining the M2 users with the highest similarity values based on the similarity between each user belonging to the target user cluster and the target user.
6. The method of any of claims 1-5, wherein determining a first similarity matrix based on a user item scoring matrix comprises:
determining a second similarity matrix based on the user item scoring matrix; the second similarity matrix is used for indicating the similarity between any two users in the M users; wherein missing values exist in the second similarity matrix;
filling missing values in the second similarity matrix by using the average value of the similarities of the M users to obtain a filled second similarity matrix;
and performing dimensionality reduction on the filled second similarity matrix to obtain a first similarity matrix.
7. The method according to any one of claims 1 to 5, wherein the predicting the score of the target user for each item to be recommended in the K items to be recommended based on the scores of the M2 users for the K items to be recommended comprises:
determining the average scores of the M2 users for each item to be recommended in the K items to be recommended;
determining the average score corresponding to each item to be recommended in the K items to be recommended as the score of the target user for each item to be recommended in the K items to be recommended.
8. An item recommendation apparatus, characterized in that the apparatus comprises:
the first determining unit is used for determining a first similarity matrix based on the user item scoring matrix; the user item scoring matrix is used for indicating the scoring of each item in the N items by each user in the M users; the first similarity matrix is used for indicating the similarity between any two users in the M users;
the clustering unit is used for clustering the M users based on the user item scoring matrix to obtain M1 user clusters; the similarity between users in each of the M1 user clusters meets a first condition;
a second determining unit, configured to determine a target user cluster in the M1 user clusters to which a target user belongs, and determine, based on the first similarity matrix, M2 users in the target user cluster whose similarities with the target user satisfy a second condition;
and the recommending unit is used for predicting the score of the target user for each item to be recommended in the K items to be recommended based on the scores of the M2 users for the K items to be recommended and recommending part or all of the K items to be recommended to the target user based on the scores of the items to be recommended.
9. An electronic device, characterized in that the electronic device comprises: a memory having computer-executable instructions stored thereon and a processor operable to implement the method of any of claims 1 to 7 when executing the computer-executable instructions on the memory.
10. A computer storage medium having stored thereon executable instructions that when executed by a processor implement the method of any one of claims 1 to 7.
CN202111473818.1A 2021-11-30 2021-11-30 Project recommendation method and device, electronic equipment and storage medium Active CN113868545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111473818.1A CN113868545B (en) 2021-11-30 2021-11-30 Project recommendation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111473818.1A CN113868545B (en) 2021-11-30 2021-11-30 Project recommendation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113868545A true CN113868545A (en) 2021-12-31
CN113868545B CN113868545B (en) 2022-02-22

Family

ID=78986085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111473818.1A Active CN113868545B (en) 2021-11-30 2021-11-30 Project recommendation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113868545B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100125585A1 (en) * 2008-11-17 2010-05-20 Yahoo! Inc. Conjoint Analysis with Bilinear Regression Models for Segmented Predictive Content Ranking
CN104899232A (en) * 2014-03-07 2015-09-09 华为技术有限公司 Cooperative clustering method and cooperative clustering equipment
CN107391582A (en) * 2017-06-21 2017-11-24 浙江工商大学 The information recommendation method of user preference similarity is calculated based on context ontology tree
CN109241415A (en) * 2018-08-20 2019-01-18 平安科技(深圳)有限公司 Item recommendation method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113868545B (en) 2022-02-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant