CN111737427A - MOOC forum post recommendation method integrating forum interaction behavior and user reading preference
- Publication number: CN111737427A
- Application number: CN202010391330.3A
- Authority: CN (China)
- Prior art keywords: user, matrix, interaction, post, forum
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F16/3329: Natural language query formulation or dialogue systems
- G06F16/3346: Query execution using probabilistic model
- G06F16/335: Filtering based on additional data, e.g. user or group profiles
- G06F16/9535: Search customisation based on user profiles and personalisation
Abstract
The invention discloses a MOOC forum post recommendation method that fuses forum interaction behavior and user reading preference, comprising the following steps: 1) constructing the user-post target scoring matrix, the user interaction scoring matrix and the user interaction frequency matrix; 2) decomposing the user interaction scoring matrix to obtain a user objective function and a user behavior feature matrix; 3) counting the user's interaction behaviors, extracting embedded features, and obtaining a user matrix from the user behavior feature matrix and the user embedding matrix; 4) extracting post topics with a denoising autoencoder to obtain an item objective function, and decomposing the target scoring matrix to obtain a scoring matrix objective function; 5) optimizing the scoring matrix objective function, the user objective function and the item objective function, and providing a recommendation list for the user. Through deep learning and probabilistic matrix factorization, the method integrates the subtle interactions between users and posts and among users into the model, alleviates the cold start problem, and achieves accurate recommendation by incorporating the users' reading preferences.
Description
Technical Field
The invention relates to the technical field of educational data mining and natural language processing, and in particular to a MOOC forum post recommendation method that integrates forum interaction behavior and user reading preference.
Background
In an era of high-speed network transmission and rapid development of computer technology, online learning platforms enable a new generation of internet users to take courses from many universities at any time and in any place. Massive Open Online Courses (MOOCs) attract very large numbers of students. By the beginning of 2018, MOOC enrollments in China had exceeded 70 million. Although the number of learners is large, the openness of MOOCs means that students' levels are uneven and their learning goals differ, so MOOCs suffer from high dropout rates and low participation. The MOOC forum, as an important module for promoting knowledge exchange among students and raising course participation, helps to reduce the dropout rate.
MOOC forums suffer from unbalanced information load and information disorientation. Although most current MOOC platforms organize forums into sub-sections, students often fail to choose the appropriate sub-section for the content they publish, so information remains disorganized. In addition, the forum's single ordering mechanism causes many new questions to be buried under other posts before they are fully resolved, and the unanswered rate is high. Answering forum posts one by one places a heavy information load on teachers and teaching assistants. Ideally, the information load of the forum is balanced and students discuss actively without relying on teachers to solve problems; students become resources for answering one another's questions, help each other in the forum, and form learning communities that share knowledge. MOOC forums therefore need personalized recommendation technology that combines students' reading preferences and interactions to route information. Existing MOOC forum recommendation algorithms extract user reading preferences with traditional methods such as LDA topic models, word co-occurrence statistics and associated words; however, such methods are difficult to combine with user behavior information for end-to-end recommendation, and their recommendation accuracy is low. Moreover, the prior art does not consider low-activity users in MOOC forum post recommendation and provides no solution to the cold start problem.
The invention provides a recommendation method that fuses forum interaction behavior and user reading preference, aiming to remedy two shortcomings of existing MOOC forum post recommendation techniques: low recommendation accuracy and insufficient consideration of user cold start.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing MOOC forum post recommendation techniques, namely low recommendation accuracy and insufficient consideration of user cold start, and provides a MOOC forum post recommendation method integrating forum interaction behavior and user reading preference.
To achieve this purpose, the technical solution provided by the invention is as follows: a MOOC forum post recommendation method integrating forum interaction behavior and user reading preference, comprising the following steps:
1) constructing the user-post target scoring matrix, the user interaction scoring matrix and the user interaction frequency matrix from the users' browsing records in the forum;
2) introducing matrices U_a and U_b, decomposing the user interaction scoring matrix with the user interaction frequency matrix as a constraint term to obtain a user objective function, and adding U_a and U_b to obtain a user behavior feature matrix;
3) counting the number of each type of interaction behavior of a user, constructing a user embedding matrix together with the corresponding user ID, and concatenating the user behavior feature matrix and the user embedding matrix to obtain a user matrix U;
4) merging the historical posts, introducing an item matrix V, extracting post topics with a denoising autoencoder to obtain an item objective function, and decomposing the target scoring matrix with the user matrix U and the item matrix V to obtain a scoring matrix objective function;
5) optimizing the scoring matrix objective function, the user objective function and the item objective function, and providing a recommendation list for the user.
In step 1), the target scoring matrix is a two-dimensional matrix whose rows correspond to user IDs and whose columns correspond to post IDs; each entry indicates whether a user is interested in a post. A score of 1 means that the user has browsed the corresponding post and is assumed by default to be possibly interested in its topic; a score of 0 means that the user has not browsed the post and is assumed by default to be possibly not interested in its topic. The user interaction scoring matrix and the user interaction frequency matrix are also two-dimensional matrices, identifying respectively whether two users have interacted and the total number of their interactions, where interactions include replies, comments, likes and browsing between users. If two users have interacted, the corresponding entry of the user interaction scoring matrix is 1, otherwise it is 0; the corresponding entry of the user interaction frequency matrix is the number of interactions between the two users.
In step 2), obtaining the user objective function and the user behavior feature matrix comprises the following steps:
2.1) introducing matrices U_a and U_b, and assuming that U_a and U_b obey the following Gaussian distributions:
where i indexes the rows of the matrix, λ_a^{-1} is the reciprocal of the hyperparameter λ_a, and λ_b^{-1} is the reciprocal of the hyperparameter λ_b;
2.2) letting the user interaction scoring matrix be Q and the user interaction frequency matrix be C, and assuming that each entry of Q obeys the following Gaussian distribution:
where Q_ij is the entry in row i, column j of the user interaction scoring matrix, U_{a,i} is the vector in row i of matrix U_a, U_{b,j} is the vector in column j of matrix U_b, and C^{-1}, the inverse of the user interaction frequency matrix C, is the variance of the Gaussian distribution;
2.3) obtaining the following derivation according to probabilistic matrix factorization:
where P(·|·) denotes the posterior probability and ∝ denotes proportionality;
substituting the Gaussian distributions of Q, U_a and U_b, and deriving the user objective function;
2.4) adding U_a and U_b to obtain the user behavior feature matrix.
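The formulas referred to in steps 2.1) to 2.3) are not reproduced in this text. The block below is only a sketch of the form they would take under standard probabilistic matrix factorization. The zero-mean priors, the inner-product mean, the element-wise reading of C^{-1} as a per-entry variance 1/C_ij, and the name L_user are all assumptions rather than statements from the source. U_{b,j} here denotes the latent vector associated with column j of Q.

```latex
\begin{align}
P(U_a \mid \lambda_a^{-1}) &= \prod_i \mathcal{N}\!\left(U_{a,i} \mid 0,\ \lambda_a^{-1} I\right), \qquad
P(U_b \mid \lambda_b^{-1}) = \prod_i \mathcal{N}\!\left(U_{b,i} \mid 0,\ \lambda_b^{-1} I\right) \\
P(Q \mid U_a, U_b, C) &= \prod_{i,j} \mathcal{N}\!\left(Q_{ij} \mid U_{a,i}^{\top} U_{b,j},\ C_{ij}^{-1}\right) \\
P(U_a, U_b \mid Q, C, \lambda_a, \lambda_b) &\propto
P(Q \mid U_a, U_b, C)\, P(U_a \mid \lambda_a^{-1})\, P(U_b \mid \lambda_b^{-1}) \\
\mathcal{L}_{\mathrm{user}} &= \frac{1}{2}\sum_{i,j} C_{ij}\left(Q_{ij} - U_{a,i}^{\top} U_{b,j}\right)^2
+ \frac{\lambda_a}{2}\sum_i \lVert U_{a,i} \rVert^2
+ \frac{\lambda_b}{2}\sum_i \lVert U_{b,i} \rVert^2
\end{align}
```

Under these assumptions, minimizing L_user is the same as maximizing the log posterior: each squared error is weighted by its interaction count C_ij, which matches the description of the frequency matrix acting as a constraint term.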
In step 3), counting the number of each type of interaction behavior of a user (likes received, views received, replies received, replies made, posts made and followers), constructing the user embedding matrix together with the corresponding user ID, and concatenating the user behavior feature matrix and the user embedding matrix to obtain the user matrix U comprises the following steps:
3.1) discretizing the counts of the interaction behaviors to obtain 6 discretized features;
3.2) feeding the 6 discretized features and the user ID into the embedding layer of the model to obtain 7 embedding vectors;
3.3) concatenating the 7 embedding vectors to form the user's embedding-layer vector, the embedding-layer vectors of all users forming the user embedding matrix U_c;
3.4) concatenating the user behavior feature matrix and the user embedding matrix U_c, and constructing the user matrix U according to the following formula:
where the symbol [ ; ] denotes concatenation, U_a + U_b is the user behavior feature matrix, U_c is the user embedding matrix, and λ_u^{-1}, the reciprocal of the hyperparameter λ_u, is the variance of the Gaussian distribution.
In step 4), obtaining the item objective function and the scoring matrix objective function comprises the following steps:
4.1) merging the title of a forum post, its detailed description and its series of historical replies into a single content post, and preprocessing the text;
4.2) converting the content of each post into an index encoding and feeding it into an embedding layer, or directly initializing the text embedding layer with pre-trained word vectors, and outputting the embedded representation of the text;
4.3) feeding the embedded representation of the text into a denoising autoencoder, which reconstructs the input, and extracting the topic vector of the text from the middle layer of the denoising autoencoder; the weights of the denoising autoencoder network obey the following Gaussian distribution:
where W denotes the weights of each layer of the denoising autoencoder, and λ_w^{-1}, the reciprocal of the hyperparameter λ_w, is the variance of the Gaussian distribution;
introducing user preferences, and constructing the item matrix V by combining the user preferences with the topic vector, so that the item matrix V contains both topic information and user preference information:
which obeys the following Gaussian distribution:
where λ_v^{-1}, the reciprocal of the hyperparameter λ_v, is the variance of the Gaussian distribution;
4.4) reconstructing the input with the denoising autoencoder, and substituting the expressions of W and V to obtain the item objective function;
4.5) decomposing the target scoring matrix with the user matrix U and the item matrix V according to the following formula:
where R_ij is the entry in row i, column j of the target scoring matrix, U_i denotes the i-th row of the user matrix U, V_j denotes the j-th row of the item matrix V, and λ_r^{-1}, the reciprocal of the hyperparameter λ_r, is the variance of the Gaussian distribution;
and obtaining the scoring matrix objective function according to probabilistic matrix factorization.
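The formulas referred to in steps 3.4) and 4.3) to 4.5) are likewise not reproduced in this text. The block below sketches one plausible form, assuming the usual construction in which an item vector is the topic vector plus a Gaussian preference offset. The symbols θ_j (middle-layer topic vector of post j), ε_j (preference offset), X_j and X̂_j (autoencoder input and reconstruction), the unit weight on the reconstruction error, and the names L_item and L_rating are all introduced here for illustration and do not come from the source.

```latex
\begin{align}
U_i &\sim \mathcal{N}\!\left([\,U_a + U_b\,;\,U_c\,]_i,\ \lambda_u^{-1} I\right), \qquad
W \sim \mathcal{N}\!\left(0,\ \lambda_w^{-1} I\right) \\
V_j &= \theta_j + \epsilon_j, \qquad \epsilon_j \sim \mathcal{N}\!\left(0,\ \lambda_v^{-1} I\right) \\
R_{ij} &\sim \mathcal{N}\!\left(U_i^{\top} V_j,\ \lambda_r^{-1}\right) \\
\mathcal{L}_{\mathrm{item}} &= \frac{\lambda_v}{2}\sum_j \lVert V_j - \theta_j \rVert^2
+ \frac{\lambda_w}{2}\lVert W \rVert^2
+ \frac{1}{2}\sum_j \lVert \hat{X}_j - X_j \rVert^2 \\
\mathcal{L}_{\mathrm{rating}} &= \frac{\lambda_r}{2}\sum_{i,j}\left(R_{ij} - U_i^{\top} V_j\right)^2
\end{align}
```

Each term is the negative log of the corresponding Gaussian density with constants dropped, so a larger hyperparameter λ (smaller variance) pulls the associated quantity more strongly toward its mean.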
In step 5), the scoring matrix objective function, the user objective function and the item objective function are optimized, and the representations of the user matrix U and the item matrix V are refined until a preset training threshold is reached; the dot product of the final user matrix and item matrix is computed, the target user's scores for the items are retrieved from the result, and each user's score list is sorted to obtain the M highest-scoring posts, which form the recommendation list provided to the user.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention uses a denoising autoencoder to extract the topic information of the text; compared with related methods, it better mines users' reading preferences by exploiting the strengths of deep learning in natural language processing.
2. The method decomposes the users' forum interaction scoring matrix by probabilistic matrix factorization, so that users' behavior features can be extracted from the sparse matrix even when only a few interactions exist, which alleviates the cold start problem to some extent.
3. To address the difficulty of combining text topic models with user behavior information for end-to-end recommendation, the invention fuses forum interaction information with the topic model through deep learning and probabilistic matrix factorization, achieving accurate end-to-end recommendation.
4. The model of the invention is extensible and supports embedding many kinds of user information.
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
FIG. 2 is a diagram of the decomposition of the user interaction matrix according to the present invention.
FIG. 3 is a diagram of the forum recommendation model according to the present invention.
FIG. 4 is a diagram of the denoising autoencoder extracting the hidden vector of a text according to the present invention.
Detailed Description
The present invention will be further described with reference to the following specific examples.
As shown in FIG. 1, the MOOC forum post recommendation method integrating forum interaction behavior and user reading preference provided by this embodiment includes the following steps:
1) Construct the target scoring matrix. The target scoring matrix is a two-dimensional matrix whose rows correspond to user IDs and whose columns correspond to post IDs; each entry indicates whether a user is interested in a post. A score of 1 means that the user has browsed the corresponding post and is assumed by default to be possibly interested in its topic; a score of 0 means that the user has not browsed the post and is assumed by default to be possibly not interested in its topic.
Because the forum contains many unanswered posts, the target scoring matrix is very sparse, that is, most entries of the matrix are 0. The raw data therefore needs to be filtered: posts with too few words or many special symbols are removed, and a user browsing threshold is set so that users who have browsed fewer posts than the threshold are removed; this eliminates some "zombie" users and "garbage" posts. The scoring matrix is then constructed, and entries with value 0 are randomly deleted according to a chosen negative sampling ratio. Specifically, a mask matrix with the same numbers of rows and columns as the original matrix can be used, in which 1 marks entries of the original matrix to be kept and 0 marks zero-score entries to be deleted. The mask matrix is computed in advance according to the negative sampling ratio, and an AND operation between the mask matrix and the original scoring matrix yields the downsampled matrix. The downsampled matrix is the actual data subsequently fed to the model.
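The mask-based negative sampling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `keep_ratio` parameter and the toy matrix are hypothetical. Because a kept 0 and a deleted 0 look identical in a dense matrix after an AND operation, the sketch returns the retained entries as (user, post, score) triples; that representation is an assumption.

```python
import random

def build_mask(ratings, keep_ratio, seed=0):
    """Keep every 1-entry; keep each 0-entry with probability keep_ratio."""
    rng = random.Random(seed)
    return [[1 if r == 1 or rng.random() < keep_ratio else 0 for r in row]
            for row in ratings]

def retained_entries(ratings, mask):
    """Return (user, post, score) triples for the entries the mask keeps."""
    return [(i, j, ratings[i][j])
            for i, row in enumerate(mask)
            for j, keep in enumerate(row) if keep == 1]

# Toy target scoring matrix: 2 users x 4 posts (hypothetical data).
ratings = [[1, 0, 0, 0],
           [0, 1, 0, 1]]
mask = build_mask(ratings, keep_ratio=0.5)
samples = retained_entries(ratings, mask)
```

With `keep_ratio=0.0` only browsed posts survive, and with `keep_ratio=1.0` the full matrix is kept; intermediate values control how many negatives the model sees.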
Construct the user interaction scoring matrix and the user interaction frequency matrix. These are also two-dimensional matrices, identifying respectively whether two users have interacted and the total number of their interactions, where interactions include replies, comments, likes and browsing between users. If two users have interacted, the corresponding entry of the user interaction scoring matrix is 1, otherwise it is 0; the corresponding entry of the user interaction frequency matrix is the number of interactions between the two users.
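A minimal sketch of building both matrices from an interaction log is shown below. The event format `(actor, target)` and the function name are hypothetical. The orientation (row = user receiving the interaction, column = user initiating it) is an assumption drawn from the description of FIG. 2, where columns represent active behavior and rows represent passive behavior.

```python
def build_interaction_matrices(n_users, events):
    """events: (actor, target) user-index pairs, one per reply, comment, like or browse."""
    C = [[0] * n_users for _ in range(n_users)]
    for actor, target in events:
        # Assumed orientation: row = passive (target) user, column = active (actor) user.
        C[target][actor] += 1
    # Q marks whether any interaction happened; C counts how many.
    Q = [[1 if count > 0 else 0 for count in row] for row in C]
    return Q, C

# Hypothetical log: user 0 interacts with user 1 twice, user 2 with user 1 once,
# and user 1 with user 0 once.
events = [(0, 1), (0, 1), (2, 1), (1, 0)]
Q, C = build_interaction_matrices(3, events)
```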
2) Obtain the user objective function and the user behavior feature matrix, comprising the following steps:
2.1) Introduce matrices U_a and U_b, and assume that U_a and U_b obey the following Gaussian distributions:
where i indexes the rows of the matrix, λ_a^{-1} is the reciprocal of the hyperparameter λ_a, and λ_b^{-1} is the reciprocal of the hyperparameter λ_b.
2.2) Let the user interaction scoring matrix be Q and the user interaction frequency matrix be C, and assume that each entry of Q obeys the following Gaussian distribution:
where Q_ij is the entry in row i, column j of the user interaction scoring matrix, U_{a,i} is the vector in row i of matrix U_a, U_{b,j} is the vector in column j of matrix U_b, and C^{-1}, the inverse of the user interaction frequency matrix C, is the variance of the Gaussian distribution. Using the user interaction frequency matrix in the variance means that the more interactions there are between users, the greater the penalty on the corresponding entry; that is, the more active a user is in the forum, the higher the accuracy required of the feature representation.
2.3) Obtain the following derivation according to the maximum posterior probability:
where P(·|·) denotes the posterior probability and ∝ denotes proportionality.
Note that Q is obtained from statistics of user interaction behavior and is a known observation, so the problem is equivalent to using the known observation Q to infer the model parameters U_a and U_b most likely to have produced it. Thus, by introducing the Gaussian prior distributions of U_a and U_b, the maximum posterior probability of U_a and U_b can be computed using maximum likelihood estimation, which yields the user objective function.
2.4) Since U_a and U_b are the results of decomposing the user interaction scoring matrix, they contain the information of the matrix rows and columns respectively. As shown in FIG. 2, the columns of the user interaction scoring matrix represent a user's active reply behavior and describe the user's active behavior in the forum, while the rows represent other users' replies to the user and describe the user's passive behavior in the forum. Adding U_a and U_b gives the user behavior feature matrix, so that the result contains information on both the active and the passive behavior of the user.
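Step 2) can be illustrated with a small count-weighted factorization in plain Python. This is a sketch under stated assumptions: the squared-error objective with L2 penalties is the standard probabilistic matrix factorization form, C^{-1} is read as a per-entry variance 1/C_ij (so entries with no interaction carry zero weight), and the function names, learning rate, latent size and epoch count are all hypothetical.

```python
import random

def user_loss(Q, C, Ua, Ub, lam_a, lam_b):
    """Count-weighted squared error plus L2 penalties (assumed objective form)."""
    n = len(Q)
    fit = sum(C[i][j] * (Q[i][j] - sum(a * b for a, b in zip(Ua[i], Ub[j]))) ** 2
              for i in range(n) for j in range(n))
    reg = (lam_a * sum(x * x for row in Ua for x in row)
           + lam_b * sum(x * x for row in Ub for x in row))
    return 0.5 * (fit + reg)

def factorize(Q, C, k=2, lam_a=0.1, lam_b=0.1, lr=0.05, epochs=200, seed=0):
    """Gradient-descent factorization Q ~ Ua @ Ub^T, weighted by interaction counts."""
    rng = random.Random(seed)
    n = len(Q)
    Ua = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n)]
    Ub = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(n) if C[i][j] > 0]
    for _ in range(epochs):
        for i, j in pairs:
            pred = sum(a * b for a, b in zip(Ua[i], Ub[j]))
            err = C[i][j] * (Q[i][j] - pred)      # error scaled by interaction count
            for d in range(k):
                a, b = Ua[i][d], Ub[j][d]
                Ua[i][d] += lr * err * b
                Ub[j][d] += lr * err * a
        for M, lam in ((Ua, lam_a), (Ub, lam_b)):  # L2 shrinkage from the Gaussian priors
            for row in M:
                for d in range(k):
                    row[d] -= lr * lam * row[d]
    # User behavior feature matrix: element-wise sum U_a + U_b.
    behavior = [[a + b for a, b in zip(Ua[i], Ub[i])] for i in range(n)]
    return Ua, Ub, behavior

Q = [[0, 1, 0], [1, 0, 1], [0, 0, 0]]
C = [[0, 1, 0], [2, 0, 1], [0, 0, 0]]
Ua0, Ub0, _ = factorize(Q, C, epochs=0)   # untrained baseline, same seed
Ua, Ub, behavior = factorize(Q, C)
```

Comparing `user_loss` on the untrained and trained factors shows the fit improving; user 2, who has no interactions, keeps a near-zero vector, which is where the additional user embedding of step 3) becomes useful.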
3) Count the number of each type of interaction behavior of a user (likes received, views received, replies received, replies made, posts made and followers), construct the user embedding matrix together with the corresponding user ID, and concatenate the user behavior feature matrix and the user embedding matrix to obtain the user matrix U, comprising the following steps:
3.1) Discretize the counts (numerical values) of the interaction behaviors to obtain 6 discretized features. Taking binning as an example, assume that the sampled like counts of users are {12, 13, 24, 4, 5, 34, 56, 98, 8} and the bin interval is 5; the discretized data is then {3, 3, 5, 1, 1, 7, 12, 20, 2}, that is, each value is divided by the interval and rounded up. The bin interval can be determined from a histogram of the data and is typically set to a value that makes the data distribution more uniform.
3.2) Feed the 6 discretized features and the user ID into the embedding layer of the model to obtain 7 embedding vectors. The output dimension of the embedding layer can be adjusted according to the range of the data; if the discrete values are not large, a smaller dimension can be used. See FIG. 3 for specific dimension values.
3.3) After the 7 embedding vectors are concatenated, the result is used as the user's embedding-layer vector, and the embedding-layer vectors of all users form the user embedding matrix U_c.
3.4) Concatenate the user behavior feature matrix and the user embedding matrix U_c, and construct the user matrix U according to the following formula:
where the symbol [ ; ] denotes concatenation, U_a + U_b is the user behavior feature matrix, U_c is the user embedding matrix, and λ_u^{-1}, the reciprocal of the hyperparameter λ_u, is the variance of the Gaussian distribution.
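The binning in step 3.1) and the concatenation in step 3.4) can be sketched as below. The ceiling rule is inferred from the worked example (interval 5 maps 12 to 3 and 5 to 1), and the helper names and the toy vectors are hypothetical.

```python
import math

def bin_counts(counts, interval):
    """Map each raw count to its bin index, ceil(count / interval)."""
    return [math.ceil(c / interval) for c in counts]

def build_user_row(behavior_vec, embedding_vecs):
    """Concatenate [U_a + U_b ; U_c] for a single user."""
    row = list(behavior_vec)
    for vec in embedding_vecs:
        row.extend(vec)
    return row

# Like counts from the example in step 3.1), with a bin interval of 5.
likes = [12, 13, 24, 4, 5, 34, 56, 98, 8]
binned = bin_counts(likes, 5)

# Toy user: a 2-dimensional behavior vector and two 2-dimensional embedding vectors.
user_row = build_user_row([0.3, -0.1], [[0.5, 0.2], [0.1, 0.4]])
```

Note that a count of 0 falls into bin 0 under this rule, so users with no activity of a given type still receive a valid embedding index.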
4) Obtain the item objective function and the scoring matrix objective function, comprising the following steps:
4.1) Merge the title of a forum post, its detailed description and its series of historical replies into a single content post, and then preprocess the text, including separating punctuation from characters, converting case, and removing meaningless punctuation characters and links.
4.2) Convert the text into an index encoding and feed it into an embedding layer, or directly initialize the text embedding layer with pre-trained word vectors. Taking index encoding as an example, the principle of the embedding layer is as follows: suppose the words of a post X have the following indices in the vocabulary: {How: 899, many: 1456, layer: 600, does: 3245, the: 6723, OSI: 28, model: 876, relationship: 547, of: 2323}; post X is then converted into the index sequence {899, 1456, 600, 3245, 6723, 28, 876, 547, 2323}. Assuming that the dimension of the embedding layer is d and the vocabulary size is m, the embedding layer is randomly initialized as a matrix of size m × d, so that each word in post X can be looked up by its index to obtain the corresponding word vector (embedding vector). When pre-trained text vectors are used as input, the word vector of each word can be obtained from large-scale public word embeddings such as GloVe or Word2vec and used to initialize the embedding layer matrix. The embedding layer then outputs the embedded representation of the text. The output dimension can be user-defined or set with reference to FIG. 3.
4.3) Feed the embedded representation of the text into a denoising autoencoder. The denoising autoencoder consists of a multi-layer feedforward neural network comprising an encoding layer, a middle layer and a decoding layer, and the weights of each layer are randomly initialized from the following Gaussian distribution. The denoising autoencoder reconstructs the original text information through the decoder, and the middle layer, which has fewer neural units, captures an implicit description of the text, that is, a more abstract, lower-dimensional representation of it; from the standpoint of interpretability, the output vector of the middle layer contains the topic information of the text.
where W denotes the weights of each layer of the denoising autoencoder, and λ_w^{-1}, the reciprocal of the hyperparameter λ_w, is the variance of the Gaussian distribution;
User preferences are introduced, and the item matrix V is constructed by combining the user preferences with the topic vector, so that the item matrix contains both topic information and user preference information:
which obeys the following Gaussian distribution:
where λ_v^{-1}, the reciprocal of the hyperparameter λ_v, is the variance of the Gaussian distribution.
4.4) Reconstruct the input with the denoising autoencoder, and substitute the expressions of W and V to obtain the item objective function.
4.5) Decompose the target scoring matrix with the user matrix U and the item matrix V according to the following formula:
where R_ij is the entry in row i, column j of the target scoring matrix, U_i denotes the i-th row of the user matrix U, V_j denotes the j-th row of the item matrix V, and λ_r^{-1}, the reciprocal of the hyperparameter λ_r, is the variance of the Gaussian distribution;
Note that R is obtained statistically and is a known observation, so the problem is equivalent to using the known observation R to infer the model parameters U and V most likely to have produced it. Thus, by introducing the prior distributions of U and V, the maximum posterior probability of U and V can be computed using maximum likelihood estimation, which yields the scoring matrix objective function.
As can be seen from FIG. 2 and the above steps, the decomposition and prediction of the target scoring matrix involve user reading preference extraction, user feature extraction and user interaction matrix decomposition, and all components are fused into one model for end-to-end recommendation.
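The index encoding and embedding lookup of step 4.2) can be sketched as below, reusing the vocabulary indices from that example. The vocabulary size `m=7000`, the dimension `d=4`, the initialization scale and the function names are hypothetical choices for illustration; the denoising autoencoder itself is not shown.

```python
import random

# Word-to-index mapping taken from the example in step 4.2).
VOCAB = {"How": 899, "many": 1456, "layer": 600, "does": 3245, "the": 6723,
         "OSI": 28, "model": 876, "relationship": 547, "of": 2323}

def encode(tokens, vocab):
    """Replace each token by its vocabulary index."""
    return [vocab[t] for t in tokens]

def init_embedding(m, d, seed=0):
    """Randomly initialize an m x d embedding table (one row per vocabulary entry)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(m)]

def embed(indices, table):
    """Look up the embedding vector of each index."""
    return [table[i] for i in indices]

tokens = list(VOCAB)                  # dicts preserve insertion order
indices = encode(tokens, VOCAB)
table = init_embedding(m=7000, d=4)   # m must exceed the largest index (6723)
vectors = embed(indices, table)       # 9 word vectors of dimension 4
```

When pre-trained vectors are available, `init_embedding` would be replaced by loading those vectors into the same m × d table, and the lookup stays unchanged.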
5) The objective function of the scoring matrix, the objective function of the user and the objective function of the article are optimized, in order to minimize the objective function, the gradient of the objective function can be solved by utilizing an optimizer mature in the industry, such as Adagrad, RMSprop and SGD, so that the parameters of the model are updated through back propagation, the representation of the user matrix U and the article matrix V is corrected within a certain training threshold, and the training threshold can be generally set through observing the stability and convergence degree of the predicted result without a fixed setting value.
After the threshold is reached, the dot product of the final user matrix and item matrix is computed, the target user's scores for a series of items are retrieved from the result, and each user's score list is sorted to obtain the M highest-scoring posts, which are provided to the user as a recommendation list.
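The scoring and ranking step can be sketched as follows. The function name `recommend_top_m` is hypothetical, and the optional `seen` mask, which excludes posts a user has already browsed, is an assumption that the text does not state.

```python
import numpy as np

def recommend_top_m(U, V, m, seen=None):
    """Return, for every user, the indices of the M highest-scoring posts.

    U: (n_users, k) user matrix; V: (n_posts, k) item matrix.
    seen: optional boolean (n_users, n_posts) mask of posts to exclude
          (an assumption, not part of the described method).
    """
    scores = U @ V.T                              # predicted score of each post per user
    if seen is not None:
        scores = np.where(seen, -np.inf, scores)  # push excluded posts to the end
    order = np.argsort(-scores, axis=1, kind="stable")  # descending by score
    return order[:, :m]

if __name__ == "__main__":
    U = np.array([[1.0, 0.0], [0.0, 1.0]])
    V = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
    print(recommend_top_m(U, V, m=2))
```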
The above embodiments are merely preferred embodiments of the present invention and do not limit its scope of protection; any change made according to the shape and principle of the present invention shall fall within the scope of protection of the present invention.
Claims (6)
1. A MOOC forum post recommendation method integrating forum interaction behavior and user reading preference, characterized by comprising the following steps:
1) constructing, from the users' browsing records in the forum, a target scoring matrix of the users' ratings of posts, a user interaction scoring matrix and a user interaction frequency matrix;
2) introducing matrices $U_a$ and $U_b$, decomposing the user interaction scoring matrix with the user interaction frequency matrix as a constraint term to obtain a user objective function, and adding $U_a$ and $U_b$ to obtain a user behavior feature matrix;
3) counting the number of times each user performs each type of interaction behavior, constructing a user embedding matrix together with the corresponding user IDs, and concatenating the user behavior feature matrix with the user embedding matrix to obtain a user matrix U;
4) integrating the users' historical posts, introducing an item matrix V, extracting the topics of the posts with a denoising autoencoder to obtain an item objective function, and decomposing the target scoring matrix using the user matrix U and the item matrix V to obtain a scoring matrix objective function;
5) optimizing the scoring matrix objective function, the user objective function and the item objective function, and providing a recommendation list for the user.
2. The MOOC forum post recommendation method integrating forum interaction behavior and user reading preference according to claim 1, wherein in step 1), the target scoring matrix is a two-dimensional matrix whose rows are user IDs and whose columns are post IDs, and whose values indicate whether the user is interested in the post: a score of 1 indicates that the user has browsed the corresponding post and is assumed by default to be possibly interested in its topic, and a score of 0 indicates that the user has not browsed the corresponding post and is assumed by default to be possibly not interested in its topic; the user interaction scoring matrix and the user interaction frequency matrix are two-dimensional matrices that respectively indicate whether interaction exists between users and the total number of interactions, the interactions comprising replying, commenting, liking and browsing between users; if interaction has occurred between two users, the corresponding value in the user interaction scoring matrix is 1, and otherwise it is 0; the corresponding value in the user interaction frequency matrix is the number of interactions that have occurred between the users.
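The three matrices described in this claim can be built from raw logs as in the following sketch. The log format (pairs of integer indices), the function name `build_matrices`, and the choice to treat interactions as directed from the acting user to the other user are assumptions made for illustration.

```python
import numpy as np

def build_matrices(n_users, n_posts, browse_log, interaction_log):
    """Build the target scoring matrix R and the interaction matrices Q and C.

    browse_log:      iterable of (user_index, post_index) browsing events.
    interaction_log: iterable of (acting_user_index, other_user_index) events
                     covering replies, comments, likes and browsing.
    """
    R = np.zeros((n_users, n_posts), dtype=np.int64)   # 1 if the user browsed the post
    C = np.zeros((n_users, n_users), dtype=np.int64)   # number of interactions
    for user, post in browse_log:
        R[user, post] = 1
    for src, dst in interaction_log:
        C[src, dst] += 1
    Q = (C > 0).astype(np.int64)                       # 1 if any interaction occurred
    return R, Q, C

if __name__ == "__main__":
    R, Q, C = build_matrices(
        n_users=3, n_posts=2,
        browse_log=[(0, 1), (0, 1), (2, 0)],
        interaction_log=[(0, 1), (0, 1), (1, 2)],
    )
    print(R, Q, C, sep="\n")
```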
3. The MOOC forum post recommendation method integrating forum interaction behavior and user reading preference according to claim 1, wherein in step 2), obtaining the user objective function and the user behavior feature matrix comprises the following steps:
2.1) introducing matrices $U_a$ and $U_b$, and assuming that $U_a$ and $U_b$ obey the following Gaussian distributions:
where i denotes each row of the matrix, $\lambda_a^{-1}$ is the reciprocal of the hyperparameter $\lambda_a$, and $\lambda_b^{-1}$ is the reciprocal of the hyperparameter $\lambda_b$;
2.2) letting the user interaction scoring matrix be Q and the user interaction frequency matrix be C, and assuming that each value in Q obeys the following Gaussian distribution:
where $Q_{ij}$ is the entry in row i, column j of the user interaction scoring matrix, $U_{a,i}$ is the vector of the i-th row of the matrix $U_a$, $U_{b,j}$ is the vector of the j-th column of the matrix $U_b$, and $C^{-1}$, the inverse of the user interaction frequency matrix C, is the variance of the Gaussian distribution;
2.3) obtaining the following derivation according to the probability-based matrix factorization method:
where $P(\cdot \mid \cdot)$ denotes the posterior probability and $\propto$ denotes proportionality;
substituting the Gaussian distributions of Q, $U_a$ and $U_b$ and deriving the user objective function;
2.4) adding $U_a$ and $U_b$ to obtain the user behavior feature matrix.
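The display formulas of steps 2.1 to 2.3 are not reproduced in this text. The following is a hedged reconstruction that is consistent with the stated definitions and with the usual probabilistic matrix factorization derivation, not a verbatim copy of the claimed formulas. It assumes zero-mean isotropic priors, writes $U_{a,i}$ and $U_{b,j}$ for the latent vectors of users i and j, and reads the variance $C^{-1}$ element-wise as $C_{ij}^{-1}$, so that pairs with no recorded interaction ($C_{ij}=0$) carry no weight. The symbol $\mathcal{L}_U$ is introduced here for illustration.

```latex
\begin{aligned}
U_{a,i} &\sim \mathcal{N}\!\left(0,\ \lambda_a^{-1} I\right), \qquad
U_{b,i} \sim \mathcal{N}\!\left(0,\ \lambda_b^{-1} I\right) \\
Q_{ij} &\sim \mathcal{N}\!\left(U_{a,i}^{\top} U_{b,j},\ C_{ij}^{-1}\right) \\
P\!\left(U_a, U_b \mid Q, C, \lambda_a, \lambda_b\right)
  &\propto P\!\left(Q \mid U_a, U_b, C\right)\,
           P\!\left(U_a \mid \lambda_a\right)\,
           P\!\left(U_b \mid \lambda_b\right) \\
\mathcal{L}_U &= \frac{1}{2}\sum_{i,j} C_{ij}\left(Q_{ij} - U_{a,i}^{\top} U_{b,j}\right)^{2}
  + \frac{\lambda_a}{2}\sum_i \lVert U_{a,i}\rVert^{2}
  + \frac{\lambda_b}{2}\sum_i \lVert U_{b,i}\rVert^{2}
\end{aligned}
```

Under these assumptions, minimizing $\mathcal{L}_U$ is equivalent to maximizing the posterior, since $\mathcal{L}_U$ is the negative log-posterior up to an additive constant.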
4. The MOOC forum post recommendation method integrating forum interaction behavior and user reading preference according to claim 1, wherein in step 3), counting the number of times each user performs each type of interaction behavior comprises counting the user's number of likes, number of views received, number of replies received, number of replies made, number of posts and number of followers; and constructing the user embedding matrix together with the corresponding user IDs and concatenating the user behavior feature matrix with the user embedding matrix to obtain the user matrix U comprises the following steps:
3.1) discretizing the counts of the interaction behaviors to obtain 6 discretized features;
3.2) feeding the 6 discretized features and the user ID into the embedding layer of the model to obtain 7 embedding vectors;
3.3) concatenating the 7 embedding vectors into the user's embedding-layer vector, the embedding-layer vectors of all users forming the user embedding matrix $U_c$;
3.4) concatenating the user behavior feature matrix with the user embedding matrix $U_c$ to construct the user matrix U according to the following equation:
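Steps 3.1 to 3.4 can be sketched as follows. The sketch assumes quantile-based discretization and randomly initialized embedding tables, whereas in a trained model the embedding layer would be learned; the function name `build_user_matrix`, the bin count and the embedding size are hypothetical.

```python
import numpy as np

def build_user_matrix(U_a, U_b, counts, n_bins=4, emb_dim=3, seed=0):
    """Concatenate the behavior feature matrix (U_a + U_b) with the embedding matrix U_c.

    counts: (n_users, 6) integer array of per-user interaction counts.
    Embedding tables are random stand-ins for a trainable embedding layer.
    """
    rng = np.random.default_rng(seed)
    n_users, n_feats = counts.shape
    behavior = U_a + U_b                                  # user behavior feature matrix

    # 3.1) Discretize every count column into at most n_bins quantile bins.
    binned = np.empty((n_users, n_feats), dtype=np.int64)
    for f in range(n_feats):
        inner_q = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
        edges = np.unique(np.quantile(counts[:, f], inner_q))
        binned[:, f] = np.digitize(counts[:, f], edges)  # values in 0..n_bins-1

    # 3.2) One embedding table per discretized feature, plus one for the user ID.
    feat_tables = [rng.standard_normal((n_bins, emb_dim)) for _ in range(n_feats)]
    id_table = rng.standard_normal((n_users, emb_dim))
    parts = [feat_tables[f][binned[:, f]] for f in range(n_feats)]
    parts.append(id_table[np.arange(n_users)])

    # 3.3) Concatenate the 7 embedding vectors per user into U_c.
    U_c = np.concatenate(parts, axis=1)

    # 3.4) Concatenate the behavior features with U_c to form U.
    return np.concatenate([behavior, U_c], axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    U_a = rng.standard_normal((5, 4))
    U_b = rng.standard_normal((5, 4))
    counts = rng.integers(0, 50, size=(5, 6))
    print(build_user_matrix(U_a, U_b, counts).shape)
```

With 6 count features, one user ID and `emb_dim=3`, each user receives 21 embedding dimensions in addition to the behavior features.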
5. The MOOC forum post recommendation method integrating forum interaction behavior and user reading preference according to claim 1, wherein in step 4), obtaining the item objective function and the scoring matrix objective function comprises the following steps:
4.1) combining the title of each forum post, the detailed description of the post and its series of historical replies into a single content post, and preprocessing the text;
4.2) converting the content of each post into a bit-sequence encoding and feeding it into an embedding layer, or directly initializing the text embedding layer with pre-trained word vectors, and outputting the embedded representation of the text;
4.3) feeding the embedded representation of the text into a denoising autoencoder, restoring the input information through the denoising autoencoder, and extracting the topic vector of the text from the middle layer of the denoising autoencoder, wherein the weights of the denoising autoencoder network obey the following Gaussian distribution:
where W denotes the weights of each layer of the denoising autoencoder, and $\lambda_w^{-1}$, the reciprocal of the hyperparameter $\lambda_w$, is the variance of the Gaussian distribution;
introducing user preferences and constructing the item matrix V by combining the user preferences with the topic vector, so that the item matrix V contains both topic information and user preference information:
which obeys the following Gaussian distribution:
where $\lambda_v^{-1}$, the reciprocal of the hyperparameter $\lambda_v$, is the variance of the Gaussian distribution;
4.4) restoring the input information with the denoising autoencoder and substituting the expressions for W and V to obtain the item objective function;
4.5) decomposing the target scoring matrix using the user matrix U and the item matrix V according to the following formula:
where $R_{ij}$ is the entry in row i, column j of the target scoring matrix, $U_i$ is the i-th row of the user matrix U, $V_j$ is the j-th row of the item matrix V, and $\lambda_r^{-1}$, the reciprocal of the hyperparameter $\lambda_r$, is the variance of the Gaussian distribution;
and obtaining the scoring matrix objective function according to the probability-based matrix factorization method.
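The topic-extraction part of step 4.3 can be illustrated with a minimal forward pass of a denoising autoencoder. The claim does not specify the architecture, so the single hidden layer, the tanh activation, the masking noise and the function name `dae_forward` are all assumptions; the weights here are untrained random values used only to show the data flow.

```python
import numpy as np

def dae_forward(X, W_enc, b_enc, W_dec, b_dec, drop_prob=0.3, seed=0):
    """One forward pass of a single-hidden-layer denoising autoencoder.

    X: (n_posts, d) embedded text representations.
    Returns the middle-layer topic vectors, the reconstruction of the
    clean input, and the squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) >= drop_prob       # masking noise: drop some inputs
    X_noisy = X * mask
    theta = np.tanh(X_noisy @ W_enc + b_enc)      # middle layer = topic vectors
    X_rec = theta @ W_dec + b_dec                 # attempt to restore the clean input
    recon_loss = 0.5 * np.sum((X - X_rec) ** 2)   # compared against the uncorrupted X
    return theta, X_rec, recon_loss

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.standard_normal((3, 10))
    W_enc, b_enc = 0.1 * rng.standard_normal((10, 4)), np.zeros(4)
    W_dec, b_dec = 0.1 * rng.standard_normal((4, 10)), np.zeros(10)
    theta, X_rec, loss = dae_forward(X, W_enc, b_enc, W_dec, b_dec)
    print(theta.shape, X_rec.shape, loss)
```

In the described model the reconstruction error would be combined with the Gaussian priors on W and V to form the item objective function; that combination is not shown here.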
6. The MOOC forum post recommendation method integrating forum interaction behavior and user reading preference according to claim 1, wherein in step 5), the scoring matrix objective function, the user objective function and the item objective function are optimized; the representations of the user matrix U and the item matrix V are corrected within a preset training threshold; the dot product of the final user matrix and item matrix is computed; the target user's scores for a series of items are retrieved from the result; and each user's score list is sorted to obtain the posts with the top M scores, thereby providing a recommendation list for the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010391330.3A CN111737427B (en) | 2020-05-11 | 2020-05-11 | Method for recommending lesson forum posts by combining forum interaction behaviors and user reading preference |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111737427A (en) | 2020-10-02 |
CN111737427B (en) | 2024-03-22 |
Family
ID=72647039
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010391330.3A (Active, CN111737427B (en)) | Method for recommending lesson forum posts by combining forum interaction behaviors and user reading preference | 2020-05-11 | 2020-05-11 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111737427B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112966148A (en) * | 2021-03-05 | 2021-06-15 | 安徽师范大学 | Video recommendation method and system based on deep learning and feature fusion |
CN113449210A (en) * | 2021-07-01 | 2021-09-28 | 深圳市数字尾巴科技有限公司 | Personalized recommendation method and device based on space-time characteristics, electronic equipment and storage medium |
CN114996487A (en) * | 2022-05-24 | 2022-09-02 | 北京达佳互联信息技术有限公司 | Media resource recommendation method and device, electronic equipment and storage medium |
CN117312542A (en) * | 2023-11-29 | 2023-12-29 | 泰山学院 | Reading recommendation method and system based on artificial intelligence |
CN118096237A (en) * | 2024-03-08 | 2024-05-28 | 北京嘉华铭品牌策划有限公司广东分公司 | Deep learning driven customer behavior prediction model |
US20240281613A1 (en) * | 2022-09-14 | 2024-08-22 | iCIMS, Inc. | Methods and apparatus for analyzing internal communication within an organization using natural language processing to recommend improved interactions and identify key personnel |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090299996A1 (en) * | 2008-06-03 | 2009-12-03 | Nec Laboratories America, Inc. | Recommender system with fast matrix factorization using infinite dimensions |
CN106951547A (en) * | 2017-03-27 | 2017-07-14 | 西安电子科技大学 | A kind of cross-domain recommendation method based on intersection user |
CN107273438A (en) * | 2017-05-24 | 2017-10-20 | 深圳大学 | A kind of recommendation method, device, equipment and storage medium |
CN110807154A (en) * | 2019-11-08 | 2020-02-18 | 内蒙古工业大学 | Recommendation method and system based on hybrid deep learning model |
CN111127165A (en) * | 2019-12-26 | 2020-05-08 | 纪信智达(广州)信息技术有限公司 | Sequence recommendation method based on self-attention self-encoder |
Non-Patent Citations (3)
Title |
---|
XU CHONGHUAN: "A novel recommendation method based on social network using matrix factorization technique", INFORMATION PROCESSING AND MANAGEMENT, vol. 54, 27 February 2018 (2018-02-27), pages 463 - 474 * |
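Ou Huisi et al.: "Research status and trends of cross-domain recommender systems", Journal of Chinese Computer Systems, no. 07, 31 July 2016 (2016-07-31), pages 1411 - 1416 * |
Hu Sicai et al.: "Hybrid recommendation algorithm based on deep neural network and probabilistic matrix factorization", Journal of Sichuan University (Natural Science Edition), vol. 56, no. 06, 30 November 2019 (2019-11-30), pages 1033 - 1041 * |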
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112966148A (en) * | 2021-03-05 | 2021-06-15 | 安徽师范大学 | Video recommendation method and system based on deep learning and feature fusion |
CN113449210A (en) * | 2021-07-01 | 2021-09-28 | 深圳市数字尾巴科技有限公司 | Personalized recommendation method and device based on space-time characteristics, electronic equipment and storage medium |
CN113449210B (en) * | 2021-07-01 | 2023-01-31 | 深圳市数字尾巴科技有限公司 | Personalized recommendation method and device based on space-time characteristics, electronic equipment and storage medium |
CN114996487A (en) * | 2022-05-24 | 2022-09-02 | 北京达佳互联信息技术有限公司 | Media resource recommendation method and device, electronic equipment and storage medium |
CN114996487B (en) * | 2022-05-24 | 2023-04-07 | 北京达佳互联信息技术有限公司 | Media resource recommendation method and device, electronic equipment and storage medium |
US20240281613A1 (en) * | 2022-09-14 | 2024-08-22 | iCIMS, Inc. | Methods and apparatus for analyzing internal communication within an organization using natural language processing to recommend improved interactions and identify key personnel |
CN117312542A (en) * | 2023-11-29 | 2023-12-29 | 泰山学院 | Reading recommendation method and system based on artificial intelligence |
CN117312542B (en) * | 2023-11-29 | 2024-02-13 | 泰山学院 | Reading recommendation method and system based on artificial intelligence |
CN118096237A (en) * | 2024-03-08 | 2024-05-28 | 北京嘉华铭品牌策划有限公司广东分公司 | Deep learning driven customer behavior prediction model |
Also Published As
Publication number | Publication date |
---|---|
CN111737427B (en) | 2024-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108021616B (en) | Community question-answer expert recommendation method based on recurrent neural network | |
CN111737427B (en) | Method for recommending lesson forum posts by combining forum interaction behaviors and user reading preference | |
CN108363743B (en) | Intelligent problem generation method and device and computer readable storage medium | |
CN112084335B (en) | Social media user account classification method based on information fusion | |
CN109598995B (en) | Intelligent teaching system based on Bayesian knowledge tracking model | |
CN107766324B (en) | Text consistency analysis method based on deep neural network | |
CN110222163B (en) | Intelligent question-answering method and system integrating CNN and bidirectional LSTM | |
CN112199608B (en) | Social media rumor detection method based on network information propagation graph modeling | |
CN110516245A (en) | Fine granularity sentiment analysis method, apparatus, computer equipment and storage medium | |
CN108363790A (en) | For the method, apparatus, equipment and storage medium to being assessed | |
CN111753207B (en) | Collaborative filtering method for neural map based on comments | |
CN111831831A (en) | Knowledge graph-based personalized learning platform and construction method thereof | |
CN109726745A (en) | A kind of sensibility classification method based on target incorporating description knowledge | |
CN113283488B (en) | Learning behavior-based cognitive diagnosis method and system | |
CN107832295A (en) | The title system of selection of reading machine people and system | |
CN108364066B (en) | Artificial neural network chip and its application method based on N-GRAM and WFST model | |
CN113934846B (en) | Online forum topic modeling method combining behavior-emotion-time sequence | |
CN115238199A (en) | Knowledge graph-based online community learning path recommendation method, system and equipment | |
CN115510814A (en) | Chapter-level complex problem generation method based on double planning | |
CN115906816A (en) | Text emotion analysis method of two-channel Attention model based on Bert | |
CN113220964B (en) | Viewpoint mining method based on short text in network message field | |
CN113065342B (en) | Course recommendation method based on association relation analysis | |
CN113987124A (en) | Depth knowledge tracking method, system and storage medium | |
Aliyanto et al. | Supervised probabilistic latent semantic analysis (sPLSA) for estimating technology readiness level | |
CN113361615B (en) | Text classification method based on semantic relevance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||