CN112307332B - Collaborative filtering recommendation method and system based on user portrait clustering and storage medium - Google Patents


Info

Publication number
CN112307332B
Authority
CN
China
Prior art keywords
user
data
behavior
clustering
adopting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011114490.XA
Other languages
Chinese (zh)
Other versions
CN112307332A (en)
Inventor
尚天淇
彭德中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202011114490.XA
Publication of CN112307332A
Application granted
Publication of CN112307332B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The collaborative filtering recommendation method, system and storage medium based on user portrait clustering obtain user data comprising attribute data and behavior data; characterize the user data to form user characterization information; perform dimension reduction compression on the user characterization information to form a low-dimensional user portrait; cluster the low-dimensional user portraits with a clustering method to form user interest clusters; and make recommendations to a target user within the interest cluster in which the target user is located, using a user-based collaborative filtering method. The change of user behavior over time is taken into account, and the user's inherent attribute information, historical behavior and recent behavior are adaptively fused. Users are clustered according to their low-dimensional user portraits, and user-based collaborative filtering recommendation is performed within the resulting cluster. Efficiency and accuracy are thereby balanced: the computational complexity is reduced, higher recommendation speed and higher recommendation accuracy are ensured, and the recommendation adapts to changes in user behavior.

Description

Collaborative filtering recommendation method and system based on user portrait clustering and storage medium
Technical Field
The invention relates to the technical field of big data and artificial intelligence, and in particular to a collaborative filtering recommendation method and system based on user portrait clustering, and a storage medium.
Background
With the rapid development of the Internet and automation technology, more and more people own smartphones, tablet computers and other intelligent terminals. The data generated by production and daily life has grown explosively, causing the problem of information overload. When searching for information of interest, a user spends a great deal of time and effort filtering out useless information, and the result is often still unsatisfactory; personalized recommendation technology emerged in response. Personalized recommendation technology recommends content of interest to a user based on the user's interests and purchase characteristics, and is an effective way to address information overload. Among personalized recommendation technologies, collaborative filtering is the most mature and the most widely applied. In brief, collaborative filtering predicts the information a user is interested in from a group of users with related interests and recommends it to the target user. However, with the rapid growth in the number of users and commodities, traditional collaborative filtering recommendation methods suffer from cold start, data sparsity, low efficiency and similar problems.
To improve the performance of traditional collaborative filtering recommendation, researchers have studied each of the above problems.
For the data sparsity problem, the sparse user rating matrix is usually filled in, and similarity calculation factors are introduced to compute user similarity; alternatively, high-dimensional sparse data can be preprocessed with a matrix decomposition algorithm to reduce sparsity.
For the cold start problem, a collaborative filtering algorithm is generally adopted that fuses extended information on the user's inherent attributes (such as social information and attribute information) with user behavior, which effectively alleviates the user cold start problem.
For the low efficiency problem, the user rating matrix is analyzed and a K-means clustering algorithm groups users with highly similar interests and preferences into the same cluster, reducing the time needed to search for nearest neighbors.
Although these methods solve some of the problems to a considerable extent, they lack an integrated view and pursue either efficiency or accuracy one-sidedly.
For example, patent publication No. CN106548255A proposes a commodity recommendation method based on massive user behaviors. Although that application adopts a clustering method, the similarity calculation used for clustering becomes ineffective because of the very high dimensionality of the massive behavior data. As another example, collaborative filtering recommendation algorithms based on dimensionality reduction and clustering analyze the user rating matrix with, for instance, a combination of PCA and K-means; although efficiency is taken into account, the cold start problem remains and changes in user behavior are not considered.
Characterizing users well is the key to, and a prerequisite of, collaborative filtering recommendation algorithms. Exploiting as much of the user data as possible is a well-established idea, but it is difficult to exploit this data well. First, different data must be treated differently; in particular, because user interests may change, data types need to be distinguished and processed appropriately. Second, data fusion cannot be achieved by simple data splicing or a simple weighting algorithm. In addition, improper data fusion may increase the dimensionality of the user data, which may not only cause the curse of dimensionality and render the computation ineffective, but may also make a carefully designed algorithm inefficient.
Disclosure of Invention
The invention provides a collaborative filtering recommendation method, system and storage medium based on user portrait clustering. The main technical problem addressed is how to process user data so as to reduce the complexity of data processing and improve the accuracy of recommendation.
To solve this technical problem, the invention provides a collaborative filtering recommendation method based on user portrait clustering, comprising the following steps:
S1: acquiring user data comprising attribute data and behavior data;
S2: characterizing the user data to form user characterization information;
S3: performing dimension reduction compression on the user characterization information to form a low-dimensional user portrait;
S4: clustering the low-dimensional user portraits with a clustering method to form user interest clusters;
S5: making recommendations to a target user within the interest cluster in which the target user is located, using a user-based collaborative filtering method.
Optionally, the behavior data is divided into historical behavior data and recent behavior data according to the time at which each behavior occurred.
Optionally, characterizing the user data in S2 to form the user characterization information comprises:
S21: encoding the attribute data with the One-Hot method and fusing the encodings by concatenation (Concat) to form the characterized user attributes;
S22: encoding the behavior data with an LSTM network and performing adaptive fusion with an Attention neural network to form the characterized user behavior;
S23: adaptively fusing the characterized user attributes and the characterized user behavior with an Attention neural network to form the user characterization information.
Optionally, encoding the behavior data with an LSTM network in S22 comprises:
encoding the recent behavior data and the historical behavior data separately with two LSTM networks connected in parallel, where the operation of each LSTM network can be described as follows:
$f_k = \sigma(x_k W_f + h_{k-1} U_f + b_f)$
$i_k = \sigma(x_k W_i + h_{k-1} U_i + b_i)$
$c_k = f_k \odot c_{k-1} + i_k \odot \phi(x_k W_c + h_{k-1} U_c + b_c)$
$o_k = \sigma(x_k W_o + h_{k-1} U_o + b_o)$
$h_k = o_k \odot \phi(c_k)$
where $h_k$ is the hidden state for the k-th item, $W_*$ are the weights applied to the input, $U_*$ are the weights applied to the hidden state, $f_k$, $i_k$ and $o_k$ are the forget gate, input gate and output gate respectively, $c_k$ is the cell state, $x_k$ is the input, $\odot$ denotes the element-wise product, $b_*$ are the network bias terms, $\sigma$ is the activation function and $\phi$ is the tanh function.
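The gate equations above can be sketched in NumPy as a single recurrence step. This is a minimal illustration rather than the patented implementation: the logistic sigmoid is assumed for σ, and the function names `lstm_step`, `init_params` and `encode_sequence`, the random initialization and the toy dimensions are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Assumed form of the activation sigma (logistic sigmoid).
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_k, h_prev, c_prev, params):
    """One LSTM step in the row-vector convention x W + h U + b."""
    W, U, b = params["W"], params["U"], params["b"]
    f_k = sigmoid(x_k @ W["f"] + h_prev @ U["f"] + b["f"])  # forget gate
    i_k = sigmoid(x_k @ W["i"] + h_prev @ U["i"] + b["i"])  # input gate
    g_k = np.tanh(x_k @ W["c"] + h_prev @ U["c"] + b["c"])  # candidate state
    c_k = f_k * c_prev + i_k * g_k                          # new cell state
    o_k = sigmoid(x_k @ W["o"] + h_prev @ U["o"] + b["o"])  # output gate
    h_k = o_k * np.tanh(c_k)                                # new hidden state
    return h_k, c_k

def init_params(input_dim, hidden_dim, seed=0):
    # Hypothetical small random initialization, one weight set per gate.
    rng = np.random.default_rng(seed)
    gates = "fico"
    return {
        "W": {g: rng.normal(0, 0.1, (input_dim, hidden_dim)) for g in gates},
        "U": {g: rng.normal(0, 0.1, (hidden_dim, hidden_dim)) for g in gates},
        "b": {g: np.zeros(hidden_dim) for g in gates},
    }

def encode_sequence(xs, params, hidden_dim):
    """Run the recurrence over a sequence and return the last hidden state."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x_k in xs:
        h, c = lstm_step(x_k, h, c, params)
    return h
```

Two such encoders with separate parameters, one fed the recent behavior sequence and one the historical sequence, would correspond to the two parallel LSTM networks described above.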
Optionally, performing adaptive fusion with the Attention neural network to form the user characterization information comprises:
the Attention neural network adaptively fuses its two inputs $p_1$ and $p_2$; the network has N layers, and the fusion formula is:
$\alpha = \sigma(W_m [p_1, p_2] + b_m)$
$p = \alpha \cdot p_1 + (1 - \alpha) p_2$
where $W_m$ is the weight of the m-th layer of the network, $b_m$ is the bias term of the m-th layer, and $m \le N$; $\sigma$ is the activation function; and $p_1$ and $p_2$ are the inputs of the Attention neural network. When the behavior data is fused, $p_1$ and $p_2$ are the recent behavior data and the historical behavior data respectively; when the characterized user attributes and the characterized user behavior are fused, $p_1$ and $p_2$ are the characterized user attributes and the characterized user behavior respectively. $p$ is the user characterization information, and $\alpha$ is the output of the Attention network.
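The fusion formula can be illustrated with a single-layer gate in NumPy. This sketch makes several assumptions that the text does not state: σ is taken to be the logistic sigmoid, α is a vector with the same dimension as the inputs, $p_1$ and $p_2$ have equal dimension, and only one layer (N = 1) is shown. The name `gated_fusion` and the toy values are hypothetical.

```python
import numpy as np

def gated_fusion(p1, p2, W, b):
    """alpha = sigma(W [p1, p2] + b); p = alpha * p1 + (1 - alpha) * p2."""
    concat = np.concatenate([p1, p2])                 # [p1, p2]
    alpha = 1.0 / (1.0 + np.exp(-(concat @ W + b)))   # gate values in (0, 1)
    return alpha * p1 + (1.0 - alpha) * p2, alpha

# Toy inputs (hypothetical): two 4-dimensional encodings to be fused.
rng = np.random.default_rng(0)
d = 4
p1 = rng.normal(size=d)
p2 = rng.normal(size=d)
W = rng.normal(scale=0.1, size=(2 * d, d))  # untrained gate weights
b = np.zeros(d)

p, alpha = gated_fusion(p1, p2, W, b)
```

Because every component of α lies strictly between 0 and 1, each component of the fused vector lies between the corresponding components of the two inputs; training the gate decides how far it leans toward either one.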
Optionally, performing dimension reduction compression on the user characterization information in S3 comprises:
performing dimension reduction compression on the user characterization information with an Auto-Encoder neural network, which consists of coding layers, decoding layers and a hidden layer. The hidden layer holds the low-dimensional data; the coding layers and the decoding layers are equal in number and are arranged symmetrically about the hidden layer; and the number of layers of the Auto-Encoder neural network depends on the compression ratio of the data. The loss function used to train the Auto-Encoder neural network is $L = \lVert p - D(E(p)) \rVert^{2}$, where $E(\cdot)$ is the coding layer transform, $D(\cdot)$ is the decoding layer transform, and $E(p)$ is the low-dimensional user portrait.
Optionally, the clustering method adopted in S4 is a K-means clustering method.
Optionally, S5 comprises recommending to the target user u the objects liked by users v in the same interest cluster, based on the user-based collaborative filtering recommendation method:
S51: calculating the cosine similarity of user u and user v:
$w_{uv} = \dfrac{|N(u) \cap N(v)|}{\sqrt{|N(u)|\,|N(v)|}}$
where $N(u)$ is the set of objects on which the target user u has performed positive feedback behavior, and $N(v)$ is the set of objects on which user v has performed positive feedback behavior;
S52: calculating the possible rating of the target user u for the j-th object: $p(u, j) = \sum_{v \in i} w_{uv} r_{v,j}$, where $r_{v,j}$ represents the behavior of user v on the j-th object and equals 1 if such behavior exists and 0 otherwise.
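A small worked example may help with steps S51 and S52. It assumes the usual set-based cosine similarity, $w_{uv} = |N(u) \cap N(v)| / \sqrt{|N(u)|\,|N(v)|}$, and uses hypothetical object sets for a target user $u$ and two cluster neighbors $v$ and $v'$:

```latex
\begin{align*}
N(u) &= \{1,2,3,4\}, \quad N(v) = \{3,4,5\}, \quad N(v') = \{4,5,6,7\} \\
w_{uv} &= \frac{|\{3,4\}|}{\sqrt{4 \cdot 3}} = \frac{2}{\sqrt{12}} \approx 0.577 \\
w_{uv'} &= \frac{|\{4\}|}{\sqrt{4 \cdot 4}} = \frac{1}{4} = 0.25 \\
p(u,5) &= w_{uv} \cdot 1 + w_{uv'} \cdot 1 \approx 0.827 \\
p(u,6) &= w_{uv} \cdot 0 + w_{uv'} \cdot 1 = 0.25
\end{align*}
```

Object 5 is liked by both neighbors and therefore scores higher than object 6, which is liked only by the less similar neighbor $v'$.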
The invention also provides a collaborative filtering recommendation system based on user portrait clustering, which comprises:
the data acquisition module is used for acquiring user data comprising attribute data and behavior data;
the characterization module is used for characterizing the user data to form user characterization information;
the compression module is used for carrying out dimension reduction compression on the user representation information to form a low-dimensional user portrait;
the clustering module is used for clustering the low-dimensional user portrait by adopting a clustering method to form a user interest cluster;
and the recommending module is used for recommending the target user in the interest cluster in which the target user is positioned by adopting a user-based collaborative filtering method.
The present invention also provides a storage medium storing one or more programs executable by one or more processors to implement the steps of the collaborative filtering recommendation method based on user portrait clustering as described above.
The invention has the beneficial effects that:
The collaborative filtering recommendation method, system and storage medium based on user portrait clustering provided by the invention obtain user data comprising attribute data and behavior data; characterize the user data to form user characterization information; perform dimension reduction compression on the user characterization information to form a low-dimensional user portrait; cluster the low-dimensional user portraits with a clustering method to form user interest clusters; and make recommendations to a target user within the interest cluster in which the target user is located, using a user-based collaborative filtering method. The change of user behavior over time is taken into account, and the user's inherent attribute information, historical behavior and recent behavior are adaptively fused. Users are clustered according to their low-dimensional user portraits, and user-based collaborative filtering recommendation is performed within the resulting cluster. Efficiency and accuracy are thereby balanced: the computational complexity is reduced, higher recommendation speed and higher recommendation accuracy are ensured, and the recommendation adapts to changes in user behavior.
Drawings
FIG. 1 is a schematic flow chart of a collaborative filtering recommendation method based on user portrait clustering according to an embodiment of the present invention;
FIG. 2 is a block diagram of a collaborative filtering recommendation according to a first embodiment of the present invention;
FIG. 3 is a structure diagram of the user data encoding and fusion framework according to the first embodiment of the present invention;
FIG. 4 is a schematic diagram of an Auto-Encoder network according to a first embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a collaborative filtering recommendation system based on user portrait clustering according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following detailed description and accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example one:
To address the problems that traditional recommendation methods process data inappropriately and pursue efficiency or accuracy one-sidedly, this embodiment provides a collaborative filtering recommendation method based on user portrait clustering.
In this embodiment, the base data for analysis is taken from the MovieLens movie recommendation system. It comprises 1,000,209 rating records, with rating times, given by 6,040 users to 3,900 movies. Ratings range from 1 to 5, and the higher a user's rating of a movie, the more the user likes it. The user information includes attribute information such as gender, age, occupation and zip code.
Referring to FIGS. 1-3, the method comprises the following steps:
Step S1: acquiring user attribute data and behavior data from the user movie data set to form a user information base.
The user attributes include gender, age, occupation and geographic location; the user behavior comprises rating data and the corresponding rating times.
The user behavior can be divided into recent behavior data and historical behavior data according to the time at which each behavior occurred. This embodiment divides by rating count: the most recent 5% of the rating data is taken as the recent behavior data, and the remaining rating data as the historical behavior data. Three pieces of user information are thus obtained: the user's attribute data U, historical behavior data R and recent behavior data Rn.
In an alternative embodiment of the invention, the rating data of the last month may be taken as the recent behavior data, and the rating data from before that month as the historical behavior data.
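Both ways of splitting a user's ratings can be sketched in a few lines of Python. The tuple layout `(item_id, score, timestamp)`, the function names, the rounding of the 5% share and the treatment of a month as 30 days are assumptions made for illustration only.

```python
def split_by_count(ratings, recent_fraction=0.05):
    """Split one user's ratings into (historical, recent) by rating count.

    ratings: list of (item_id, score, timestamp) tuples (hypothetical layout).
    The newest `recent_fraction` share, rounded, becomes the recent data.
    """
    ordered = sorted(ratings, key=lambda r: r[2])   # oldest first
    n_recent = round(len(ordered) * recent_fraction)
    cut = len(ordered) - n_recent
    return ordered[:cut], ordered[cut:]

def split_by_window(ratings, now, window_seconds=30 * 24 * 3600):
    """Alternative split: ratings inside the last window are recent.

    The 30-day window stands in for "the last month" (an assumption).
    """
    threshold = now - window_seconds
    historical = [r for r in ratings if r[2] < threshold]
    recent = [r for r in ratings if r[2] >= threshold]
    return historical, recent

# Toy data: 40 ratings, one per day, timestamps in seconds.
DAY = 24 * 3600
ratings = [(item, 4, item * DAY) for item in range(40)]

hist_c, recent_c = split_by_count(ratings)
hist_w, recent_w = split_by_window(ratings, now=40 * DAY)
```

The count-based split keeps the recent share proportional to each user's activity, while the window-based split can leave inactive users with no recent data at all, which a real pipeline would have to handle.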
Step S2: characterizing the user data to form user characterization information, and storing it in a user characterization information base. This specifically comprises:
Step S21: encoding the user attribute data U with the One-Hot (one-bit-effective encoding) method and fusing the encodings by concatenation (Concat) to form the characterized user attributes. Under One-Hot encoding, gender is 2-dimensional, age is 7-dimensional, occupation is 21-dimensional and geographic location is 99-dimensional, giving a vector of dimension 129, denoted Uc.
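The One-Hot and Concat step can be sketched as follows. The field sizes come from the text; the field names, the integer category indices and the function names are hypothetical.

```python
# Field sizes as stated in the embodiment; names and order are assumptions.
FIELD_SIZES = {"gender": 2, "age": 7, "occupation": 21, "location": 99}

def one_hot(index, size):
    """Return a list of length `size` with a single 1 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0] * size
    vec[index] = 1
    return vec

def encode_attributes(user):
    """Concatenate the one-hot encodings of all attribute fields (Concat)."""
    vec = []
    for field, size in FIELD_SIZES.items():
        vec.extend(one_hot(user[field], size))
    return vec

# Hypothetical user whose categories are already mapped to integer indices.
uc = encode_attributes({"gender": 1, "age": 3, "occupation": 20, "location": 0})
```

The resulting vector has 2 + 7 + 21 + 99 = 129 entries, exactly four of which are 1.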
Both the historical behavior data and the recent behavior data of a user are rating data, each a vector of dimension 3900 (one rating entry per user for each movie).
Step S22: encoding the behavior data with LSTM (long short-term memory) networks and performing adaptive fusion with an Attention neural network to form the characterized user behavior.
Two parallel LSTM networks encode the user's recent behavior data and historical behavior data respectively, producing a 40-dimensional vector Rnc and a 40-dimensional vector Rc. The operation can be described by the following formulas:
$f_k = \sigma(x_k W_f + h_{k-1} U_f + b_f)$
$i_k = \sigma(x_k W_i + h_{k-1} U_i + b_i)$
$c_k = f_k \odot c_{k-1} + i_k \odot \phi(x_k W_c + h_{k-1} U_c + b_c)$
$o_k = \sigma(x_k W_o + h_{k-1} U_o + b_o)$
$h_k = o_k \odot \phi(c_k)$
where $h_k$ is the hidden state for the k-th item, $W_*$ are the weights applied to the input, $U_*$ are the weights applied to the hidden state, $f_k$, $i_k$ and $o_k$ are the forget gate, input gate and output gate respectively, $c_k$ is the cell state, $x_k$ is the input, $\odot$ denotes the element-wise product, $b_*$ are the network bias terms, $\sigma$ is the activation function and $\phi$ is the tanh function.
Then an Attention network adaptively fuses Rc and Rnc to obtain the characterized user behavior, an 80-dimensional vector denoted by the symbol shown in Figure BDA0002728510520000071.
The adaptive fusion formula of the Attention network is:
$\alpha = \sigma(W_m [Rc, Rnc] + b_m)$
[Figure BDA0002728510520000072: fusion equation image, combining Rc and Rnc with the weights $\alpha$ and $1 - \alpha$]
where $W_m$ is the weight of the m-th layer of the network, $b_m$ is the bias term of the m-th layer, $\sigma$ is the activation function, and Rc and Rnc are the inputs of the Attention network.
Step S23: the characterized user attributes Uc (129-dimensional) and the characterized user behavior (80-dimensional, symbol shown in Figure BDA0002728510520000073) are adaptively fused with an Attention network to form the user characterization information, a 209-dimensional data vector denoted by the symbol shown in Figure BDA0002728510520000074.
The adaptive fusion formula of the Attention network is:
$\alpha = \sigma(W_m [Uc, Rc] + b_m)$
[Figure BDA0002728510520000075: fusion equation image]
where $W_m$ is the weight of the m-th layer of the network, $b_m$ is the bias term of the m-th layer, $\sigma$ is the activation function, and Uc and the characterized user behavior (symbol shown in Figure BDA0002728510520000076) are the inputs of the Attention network.
Step S3: representing information to users by adopting dimension reduction compression method
Figure BDA0002728510520000077
Performing dimension reduction to obtain a low-dimensional user portrait
Figure BDA0002728510520000078
Its dimension is 20.
This embodiment performs the dimension reduction with an Auto-Encoder neural network (a neural network trained with the back propagation algorithm so that its output equals its input).
With reference to Table 1, the Auto-Encoder consists of 4 coding layers, 4 decoding layers and 1 hidden layer. The hidden layer holds the low-dimensional data; the coding layers and the decoding layers are equal in number and arranged symmetrically about the hidden layer, as shown in FIG. 4. During training and validation, both the coding layers and the decoding layers of the Auto-Encoder network are trained, and the loss function for the training process is:
[Figure BDA0002728510520000081: loss function image]
where $E(\cdot)$ is the coding layer transform, $D(\cdot)$ is the decoding layer transform, and the quantity shown in Figure BDA0002728510520000082 is the low-dimensional user portrait. In normal operation, the Auto-Encoder network uses only the 4 coding layers and the 1 hidden layer.
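The symmetric structure and the reconstruction loss can be sketched in NumPy. This is an untrained forward pass only: the intermediate layer widths, the tanh activation, the squared-error form of the loss and all names are assumptions, since the actual configuration is given in Table 1; training by back propagation is omitted.

```python
import numpy as np

def build_layers(widths, rng):
    """One (weight, bias) pair per consecutive pair of layer widths."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(widths[:-1], widths[1:])]

def forward(x, layers):
    # tanh is an assumed activation; the text does not specify one.
    for W, b in layers:
        x = np.tanh(x @ W + b)
    return x

def reconstruction_loss(p, encoder, decoder):
    """Squared reconstruction error between p and D(E(p)), plus E(p)."""
    portrait = forward(p, encoder)          # E(p): low-dimensional portrait
    p_hat = forward(portrait, decoder)      # D(E(p)): reconstruction
    return float(np.sum((p - p_hat) ** 2)), portrait

rng = np.random.default_rng(0)
# 209 -> 20 as in the embodiment; the widths in between are hypothetical.
enc_widths = [209, 128, 64, 32, 20]
encoder = build_layers(enc_widths, rng)
decoder = build_layers(enc_widths[::-1], rng)   # mirror image of the encoder

p = rng.normal(size=209)                        # stand-in characterization vector
loss, portrait = reconstruction_loss(p, encoder, decoder)
```

After training, only `forward(p, encoder)` would be needed to produce portraits, which matches the remark that normal operation uses only the coding side of the network.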
TABLE 1 Auto-Encoder algorithm of this embodiment
Figure BDA0002728510520000083
It should be understood that the number of layers of the Auto-Encoder neural network depends on the compression ratio of the data. Here the user characterization information (symbol shown in Figure BDA0002728510520000084) is a 209-dimensional (about $2^8$) data vector, and the low-dimensional user portrait (symbol shown in Figure BDA0002728510520000085) needs to have dimension 20 (about $2^4$), so the compression ratio is about 4, that is, four halvings from about $2^8$ to about $2^4$.
In other embodiments of the present invention, the dimension reduction in step S3 may instead use PCA (Principal Component Analysis), an MLP (Multi-Layer Perceptron) network, or SVD (Singular Value Decomposition).
Step S4: method for collecting low-dimensional user images U by adopting K-means clustering methodfAnd clustering to form a user interest cluster.
The K-manes method is a well established clustering algorithm, as shown in Table 2 below.
TABLE 2K-means Algorithm of this example
Figure BDA0002728510520000091
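For readers without access to the table image, a plain Lloyd-style K-means can be sketched in NumPy as follows. It is a generic version of the algorithm, not a transcription of Table 2; the random initialization, the stopping rule and the synthetic 20-dimensional portraits are assumptions.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center (Euclidean distance) for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and center update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):   # centers stopped moving
            break
        centers = new_centers
    return assign(X, centers), centers

# Synthetic portraits: two well-separated groups of 20-dimensional vectors.
rng = np.random.default_rng(1)
portraits = np.vstack([rng.normal(0.0, 0.1, (20, 20)),
                       rng.normal(5.0, 0.1, (20, 20))])
labels, centers = kmeans(portraits, k=2)
```

Each label then identifies the interest cluster in which that user's neighbors are searched in step S5.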
Provided the accuracy requirement is met, the larger the number of clusters the better, since this reduces the computational complexity and improves computational efficiency. In this example k = 4: training shows that the accuracy requirement is met with 4 clusters. The data of the 6040 users is divided into 4 groups containing 1373, 2520, 749 and 1398 users respectively. The original computational complexity is:
[Figure BDA0002728510520000092: original computational complexity image]
The new computational complexity after clustering is:
[Figure BDA0002728510520000093: computational complexity after clustering image]
The complexity is thus reduced by a factor of about 4. The estimation formula for the computational complexity is:
[Figure BDA0002728510520000094: complexity estimation formula image]
where num(i) is the number of users in the i-th user interest cluster.
In other alternative embodiments of the present invention, the clustering method in step S4 may also use a density peak method.
Step S5: for the target user, performing Top-N recommendation within the corresponding interest cluster using a user-based collaborative filtering method.
The user-based collaborative filtering recommendation method recommends to the target user u the movies liked by users v who are in the same cluster as u and share common interests. The specific calculation steps are as follows:
1) Calculate the cosine similarity of user u and user v:
$w_{uv} = \dfrac{|N(u) \cap N(v)|}{\sqrt{|N(u)|\,|N(v)|}}$
where $N(u)$ is the set of movies on which user u has performed positive feedback behavior, and $N(v)$ is the set of movies on which user v has performed positive feedback behavior.
The categories of user behavior need to be defined and screened in advance; behaviors such as malicious negative reviews must not be designated as positive feedback behaviors.
2) Calculate the possible rating of user u for the j-th movie: $p(u, j) = \sum_{v \in i} w_{uv} r_{v,j}$, where $r_{v,j}$ represents the behavior of user v on the j-th movie and equals 1 if such behavior exists and 0 otherwise.
The k movies with the highest scores are recommended to the target user u.
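The two calculation steps and the final ranking can be sketched in plain Python. The sketch assumes the usual set-based cosine similarity, binary feedback ($r_{v,j}$ is 1 for every movie in a neighbor's set), and that movies the target user has already rated are excluded from the result; the data layout and the function names are hypothetical.

```python
import math

def cosine_similarity(items_u, items_v):
    """Set-based cosine similarity |N(u) & N(v)| / sqrt(|N(u)| * |N(v)|)."""
    if not items_u or not items_v:
        return 0.0
    return len(items_u & items_v) / math.sqrt(len(items_u) * len(items_v))

def recommend_top_n(target, cluster_users, liked, n):
    """Rank unseen movies for `target` using only users in the same cluster.

    liked: dict mapping user -> set of movies with positive feedback.
    """
    scores = {}
    for v in cluster_users:
        if v == target:
            continue
        w_uv = cosine_similarity(liked[target], liked[v])
        if w_uv == 0.0:
            continue                      # no shared interests, no contribution
        for movie in liked[v] - liked[target]:
            scores[movie] = scores.get(movie, 0.0) + w_uv   # r_{v,j} = 1
    # Highest score first; ties broken by movie id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]

# Toy cluster (hypothetical): target "u" and three neighbors.
liked = {
    "u": {"a", "b", "c"},
    "v1": {"b", "c", "d"},
    "v2": {"c", "e"},
    "v3": {"x"},
}
top = recommend_top_n("u", ["u", "v1", "v2", "v3"], liked, n=2)
```

Restricting `cluster_users` to the target's interest cluster is what keeps the similarity computation small; the scoring itself is unchanged from ordinary user-based collaborative filtering.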
Further, the clustering method and the user-based collaborative filtering recommendation method need to be trained jointly, so as to determine the number i of cluster categories that ensures model accuracy while reducing computational complexity.
It should be understood that the collaborative filtering recommendation method based on user portrait clustering provided by the present embodiment is not limited to recommendation of movies, and is also applicable to other objects (e.g., commodities, scenic spots, etc.).
Example two:
In this embodiment, on the basis of the first embodiment, a collaborative filtering recommendation system based on user portrait clustering is provided, which has functional modules capable of implementing the steps of the collaborative filtering recommendation method based on user portrait clustering in the first embodiment. Referring to fig. 5, the system includes:
the data acquisition module 51 is used for acquiring user data, including attribute data and behavior data;
the representation module 52 is configured to represent the user data to form user representation information;
the compression module 53 is used for performing dimension reduction compression on the user representation information to form a low-dimensional user portrait;
the clustering module 54 is configured to cluster the low-dimensional user portraits by using a clustering method to form a user interest cluster;
the recommending module 55 is configured to make recommendations to the target user within the interest cluster where the target user is located by using a user-based collaborative filtering method.
For specific functions of the collaborative filtering recommendation system based on user portrait clustering provided in this embodiment, reference may be made to the description of the relevant steps in the first embodiment, which is not repeated here.
Example three:
This embodiment provides a storage medium storing one or more programs, which are executable by one or more processors to implement the steps of the collaborative filtering recommendation method based on user portrait clustering as described in the first embodiment. For details, reference is made to the description of the relevant steps in the first embodiment, which is not repeated here.
It will be apparent to those skilled in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and optionally they may be implemented in program code executable by a computing device, such that they may be stored on a computer storage medium (ROM/RAM, magnetic disks, optical disks) and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing is a more detailed description of the present invention that is presented in conjunction with specific embodiments, and the practice of the invention is not to be considered limited to those descriptions. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (7)

1. A collaborative filtering recommendation method based on user portrait clustering is characterized by comprising the following steps:
s1: acquiring user data comprising attribute data, historical behavior data and recent behavior data;
s21: encoding the attribute data by adopting a One-Hot method, and fusing by adopting Concat to form a representation user attribute;
s22: coding the recent behavior data and the historical behavior data by adopting two LSTM networks connected in parallel, and performing adaptive fusion by adopting an Attention neural network to form a representation user behavior;
s23: adopting an Attention neural network to perform self-adaptive fusion on the attribute of the representation user and the behavior of the representation user to form user representation information;
in steps S22 and S23, the adaptive fusion using the Attention neural network includes:
the Attention neural network performs adaptive data fusion on two inputs $p_1$ and $p_2$, wherein the number of network layers is N, and the fusion formula is as follows:
$\alpha = \sigma(W_m [p_1, p_2] + b_m)$
$p = \alpha \cdot p_1 + (1 - \alpha) p_2$
in the formula, $W_m$ is the weight of the m-th layer of the network, $b_m$ is the bias term of the m-th layer of the network, $m \le N$, $\sigma$ is the activation function, $p_1$ and $p_2$ are the inputs of the Attention neural network, and $\alpha$ is the output of the Attention neural network; when fusing behavior data, $p_1$ and $p_2$ are the recent behavior data and the historical behavior data respectively, and $p$ is the representation user behavior; when fusing the representation user attribute and the representation user behavior, $p_1$ and $p_2$ are the representation user attribute and the representation user behavior respectively, and $p$ is the user representation information;
s3: performing dimension reduction compression on the user representation information to form a low-dimensional user portrait;
s4: clustering the low-dimensional user portrait by adopting a clustering method to form a user interest cluster;
s5: and recommending the target user in the interest cluster in which the target user is positioned by adopting a collaborative filtering method.
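The adaptive fusion formula used in steps S22 and S23 can be illustrated with a minimal Python sketch. It assumes a single layer (N = 1), a sigmoid activation, and a scalar gate α computed from the concatenation [p1, p2]. These choices, together with the function and parameter names, are illustrative assumptions and are not fixed by the claim.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def attention_fuse(
    p1: List[float], p2: List[float], w: List[float], b: float
) -> Tuple[List[float], float]:
    """Return (p, alpha) with alpha = sigma(w . [p1, p2] + b)
    and p = alpha * p1 + (1 - alpha) * p2."""
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must have the same dimension")
    if len(w) != len(p1) + len(p2):
        raise ValueError("w must match the concatenated input length")
    concat = p1 + p2  # [p1, p2]
    alpha = sigmoid(sum(wi * xi for wi, xi in zip(w, concat)) + b)
    fused = [alpha * a + (1.0 - alpha) * c for a, c in zip(p1, p2)]
    return fused, alpha


# With zero weights and zero bias the gate is 0.5, so p is the plain average.
fused, alpha = attention_fuse([1.0, 0.0], [0.0, 1.0], [0.0] * 4, 0.0)
```

In a trained network the weights would be learned, so α shifts toward whichever input (for example recent or historical behavior) is more informative for a given user.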
2. The collaborative filtering recommendation method based on user portrait clustering of claim 1, wherein the S22 employs two LSTM networks connected in parallel to encode the recent behavior data and the historical behavior data respectively, including:
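the working process of the LSTM network can be described by the formulas:
$f_k = \sigma(x_k W_f + h_{k-1} U_f + b_f)$
$i_k = \sigma(x_k W_i + h_{k-1} U_i + b_i)$
$c_k = f_k \odot c_{k-1} + i_k \odot \phi(x_k W_c + h_{k-1} U_c + b_c)$
$o_k = \sigma(x_k W_o + h_{k-1} U_o + b_o)$
$h_k = o_k \odot \phi(c_k)$
wherein $h_k$ is the hidden state for the k-th item, $W_*$ are the weights applied to the input, $U_*$ are the weights applied to the hidden state, $f_k$, $i_k$ and $o_k$ are the forget gate, the input gate and the output gate respectively, $c_k$ is the cell state, $x_k$ is the input, $\odot$ is the element-wise product, $b_*$ are the network bias terms, $\sigma$ is the activation function and $\phi$ is the tanh function.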
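As an illustration of the LSTM formulas in claim 2, the following pure-Python sketch performs one time step. It assumes that σ is the logistic sigmoid and that inputs and states are row vectors multiplied on the left of the weight matrices. The parameter dictionary layout and the helper names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(x: Vector, W: Matrix, h: Vector, U: Matrix, b: Vector) -> Vector:
    """Return x W + h U + b, treating x and h as row vectors."""
    return [
        sum(x[i] * W[i][j] for i in range(len(x)))
        + sum(h[i] * U[i][j] for i in range(len(h)))
        + b[j]
        for j in range(len(b))
    ]


def lstm_step(
    x: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, list]
) -> Tuple[Vector, Vector]:
    """One LSTM step; returns the new hidden state h_k and cell state c_k."""

    def pre(name: str) -> Vector:
        return affine(x, params["W" + name], h_prev,
                      params["U" + name], params["b" + name])

    f = [sigmoid(z) for z in pre("f")]    # forget gate f_k
    i = [sigmoid(z) for z in pre("i")]    # input gate i_k
    o = [sigmoid(z) for z in pre("o")]    # output gate o_k
    g = [math.tanh(z) for z in pre("c")]  # candidate cell value
    c = [fk * cp + ik * gk for fk, cp, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Zero parameters: every gate equals 0.5 and the candidate value is 0.
zero_params: Dict[str, list] = {}
for name in "fico":
    zero_params["W" + name] = [[0.0], [0.0]]  # input dim 2, hidden dim 1
    zero_params["U" + name] = [[0.0]]
    zero_params["b" + name] = [0.0]
h, c = lstm_step([1.0, -1.0], [0.0], [2.0], zero_params)
```

Encoding a behavior sequence would amount to calling `lstm_step` once per item and keeping the final hidden state. Under the scheme of claim 1, two such encoders with separate parameters would process the recent and the historical sequences.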
3. The collaborative filtering recommendation method based on user portrait clustering of claim 1, wherein the S3 performing dimension reduction compression on the user representation information comprises:
performing dimensionality reduction compression on the user representation information by adopting an Auto-Encoder neural network, wherein the Auto-Encoder neural network is composed of a coding layer, a decoding layer and a hidden layer; the hidden layer is the low-dimensional data, the coding layer and the decoding layer have the same number of layers and are symmetrically distributed about the hidden layer, and the number of layers of the Auto-Encoder neural network depends on the compression ratio of the data; the loss function adopted in the training process of the Auto-Encoder neural network dimension reduction method is: $L = \| r_u - D(E(r_u)) \|^2$, wherein $r_u$ is the data of user u, $E(\cdot)$ is the coding layer transform, $D(\cdot)$ is the decoding layer transform, and $E(r_u)$ is the low-dimensional user portrait.
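The reconstruction loss in claim 3 can be illustrated with a deliberately simplified sketch in which the coding and decoding transforms are each a single linear map without activation. This simplification, the matrices, and the function names are assumptions for illustration. The claim itself describes a symmetric multi-layer network whose weights are learned by minimizing the loss.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(x: Vector, W: Matrix) -> Vector:
    """Return x W for a row vector x."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]


def reconstruction_loss(r_u: Vector, W_enc: Matrix, W_dec: Matrix) -> float:
    """Squared error ||r_u - D(E(r_u))||^2 for linear E and D."""
    z = linear(r_u, W_enc)    # E(r_u): the low-dimensional user portrait
    r_hat = linear(z, W_dec)  # D(E(r_u)): the reconstruction
    return sum((a - b) ** 2 for a, b in zip(r_u, r_hat))


# Hypothetical 4-d to 2-d compression that keeps the first two coordinates.
W_enc = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
W_dec = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
r_u = [1.0, 2.0, 3.0, 4.0]
portrait = linear(r_u, W_enc)
loss = reconstruction_loss(r_u, W_enc, W_dec)
```

Here the loss equals the squared norm of the two discarded coordinates, which is what training would try to reduce by choosing a better encoder and decoder.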
4. The collaborative filtering recommendation method based on user portrait clustering of claim 1, wherein the clustering method adopted in S4 is a K-means clustering method.
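A minimal sketch of K-means (Lloyd's algorithm) applied to low-dimensional user portraits is given below. Passing the initial centroids explicitly, using squared Euclidean distance, and the stopping rule are assumptions made so that the example is deterministic. The claim does not specify initialization or distance.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[Vector], init_centroids: List[Vector], max_iter: int = 100
) -> Tuple[List[int], List[Vector]]:
    """Alternate assignment and centroid update until labels stop changing."""
    centroids = [list(c) for c in init_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # assignment step: nearest centroid for every point
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # update step: mean of the members (empty clusters keep old centroid)
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Four hypothetical 2-d user portraits forming two obvious interest clusters.
portraits = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centroids = kmeans(portraits, [[0.0, 0.0], [10.0, 10.0]])
```

The label of the target user then selects the interest cluster inside which collaborative filtering is run.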
5. The collaborative filtering recommendation method based on user portrait clustering of claim 1, wherein the S5 includes: recommending, based on the user-based collaborative filtering recommendation method, objects liked by a user v in the interest cluster to the target user u:
s51, calculating cosine similarity of the user u and the user v:
Figure FDA0003161984530000021
wherein N(u) represents the set of objects on which the target user u has positive feedback behavior, and N(v) represents the set of objects on which the user v has positive feedback behavior;
s52, calculating the behavior similarity of the target user u to the jth object: $p(u, j) = \sum_{v \in i} w_{uv} r_{v,j}$, wherein $r_{v,j}$ represents the behavior of the user v on the jth object, which is 1 if the behavior exists and 0 otherwise.
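The similarity formula of step S51 appears only as an image reference in this text. The sketch below therefore assumes the form commonly used for cosine similarity over positive-feedback sets, $w_{uv} = |N(u) \cap N(v)| / \sqrt{|N(u)| \, |N(v)|}$. This form, the function name, and the convention of returning 0 for an empty set are assumptions and may differ from the formula in the original figure.

```python
import math
from typing import Set


def cosine_similarity(n_u: Set[str], n_v: Set[str]) -> float:
    """Assumed set-based cosine similarity: |N(u) & N(v)| / sqrt(|N(u)| |N(v)|)."""
    if not n_u or not n_v:
        return 0.0  # assumption: users without feedback are not similar
    return len(n_u & n_v) / math.sqrt(len(n_u) * len(n_v))


# Hypothetical feedback sets sharing one object out of two each.
w_uv = cosine_similarity({"a", "b"}, {"b", "c"})
```

The resulting weights $w_{uv}$ are the quantities summed in step S52 over the users v of the target user's interest cluster.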
6. A collaborative filtering recommendation system based on user portrait clustering, comprising:
the data acquisition module is used for acquiring user data comprising attribute data, historical behavior data and recent behavior data;
the characterization module is used for encoding the attribute data by adopting a One-Hot method and fusing by adopting Concat to form a representation user attribute; coding the recent behavior data and the historical behavior data by adopting two LSTM networks connected in parallel, and performing adaptive fusion by adopting an Attention neural network to form a representation user behavior; adopting an Attention neural network to perform adaptive fusion on the representation user attribute and the representation user behavior to form user representation information; the adaptive fusion performed by the characterization module using the Attention neural network includes:
the Attention neural network performs adaptive data fusion on two inputs $p_1$ and $p_2$, wherein the number of network layers is N, and the fusion formula is as follows:
$\alpha = \sigma(W_m [p_1, p_2] + b_m)$
$p = \alpha \cdot p_1 + (1 - \alpha) p_2$
in the formula, $W_m$ is the weight of the m-th layer of the network, $b_m$ is the bias term of the m-th layer of the network, $m \le N$, $\sigma$ is the activation function, $p_1$ and $p_2$ are the inputs of the Attention neural network, and $\alpha$ is the output of the Attention neural network; when fusing behavior data, $p_1$ and $p_2$ are the recent behavior data and the historical behavior data respectively, and $p$ is the representation user behavior; when fusing the representation user attribute and the representation user behavior, $p_1$ and $p_2$ are the representation user attribute and the representation user behavior respectively, and $p$ is the user representation information;
the compression module is used for carrying out dimension reduction compression on the user representation information to form a low-dimensional user portrait;
the clustering module is used for clustering the low-dimensional user portrait by adopting a clustering method to form a user interest cluster;
and the recommending module is used for recommending the target user in the interest cluster in which the target user is positioned by adopting a user-based collaborative filtering method.
7. A storage medium storing one or more programs, the one or more programs being executable by one or more processors to perform the steps of the collaborative filtering recommendation method based on user portrait clustering of claims 1 to 5.
CN202011114490.XA 2020-10-16 2020-10-16 Collaborative filtering recommendation method and system based on user portrait clustering and storage medium Active CN112307332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011114490.XA CN112307332B (en) 2020-10-16 2020-10-16 Collaborative filtering recommendation method and system based on user portrait clustering and storage medium

Publications (2)

Publication Number Publication Date
CN112307332A CN112307332A (en) 2021-02-02
CN112307332B true CN112307332B (en) 2021-08-24

Family

ID=74327695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011114490.XA Active CN112307332B (en) 2020-10-16 2020-10-16 Collaborative filtering recommendation method and system based on user portrait clustering and storage medium

Country Status (1)

Country Link
CN (1) CN112307332B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560831B (en) * 2021-03-01 2021-05-04 四川大学 Pedestrian attribute identification method based on multi-scale space correction
CN113343127B (en) * 2021-04-25 2023-03-21 武汉理工大学 Tourism route recommendation method, system, server and storage medium
CN113515697A (en) * 2021-05-27 2021-10-19 武汉理工大学 Group dynamic tour route recommendation method and system based on multiple intentions of user
CN113255801A (en) * 2021-06-02 2021-08-13 北京字节跳动网络技术有限公司 Data processing method and device, computer equipment and storage medium
CN115017419A (en) * 2022-08-10 2022-09-06 玫斯江苏宠物食品科技有限公司 Customized pet food method and system based on personalized recommendation

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103279552A (en) * 2013-06-06 2013-09-04 浙江大学 Collaborative filtering recommendation method based on user interest groups
CN104391849B (en) * 2014-06-30 2017-12-15 浙江大学苏州工业技术研究院 Incorporate the collaborative filtering recommending method of time contextual information
CN107391713B (en) * 2017-07-29 2020-04-28 内蒙古工业大学 Method and system for solving cold start problem in collaborative filtering recommendation technology
CN107423442B (en) * 2017-08-07 2020-09-25 火烈鸟网络(广州)股份有限公司 Application recommendation method and system based on user portrait behavior analysis, storage medium and computer equipment
CN107818306B (en) * 2017-10-31 2020-08-07 天津大学 Video question-answering method based on attention model
CN110543603B (en) * 2019-09-06 2023-06-30 上海喜马拉雅科技有限公司 Collaborative filtering recommendation method, device, equipment and medium based on user behaviors
CN111079056A (en) * 2019-10-11 2020-04-28 深圳壹账通智能科技有限公司 Method, device, computer equipment and storage medium for extracting user portrait
CN110851718B (en) * 2019-11-11 2022-06-28 重庆邮电大学 Movie recommendation method based on long and short term memory network and user comments

Also Published As

Publication number Publication date
CN112307332A (en) 2021-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant