CN111737427A - MOOC forum post recommendation method integrating forum interaction behavior and user reading preference


Info

Publication number
CN111737427A
Authority
CN
China
Prior art keywords
user
matrix
interaction
post
forum
Legal status
Granted
Application number
CN202010391330.3A
Other languages
Chinese (zh)
Other versions
CN111737427B (en)
Inventor
许卓佳
袁华
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Application filed by South China University of Technology SCUT
Priority to CN202010391330.3A
Publication of CN111737427A
Application granted
Publication of CN111737427B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a MOOC forum post recommendation method that fuses forum interaction behavior and user reading preference, comprising the following steps: 1) constructing the target scoring matrix of users on posts, a user interaction scoring matrix, and a user interaction frequency matrix; 2) decomposing the user interaction scoring matrix to obtain a user objective function and a user behavior feature matrix; 3) counting the user's interaction behaviors, extracting embedded features, and obtaining a user matrix from the user behavior feature matrix and the user embedding matrix; 4) extracting post topics with a denoising autoencoder to obtain an item objective function, and decomposing the target scoring matrix to obtain a scoring matrix objective function; 5) optimizing the scoring matrix objective function, the user objective function, and the item objective function, and providing a recommendation list for the user. Through deep learning and probabilistic matrix factorization, the method integrates the subtle interactions between users and posts and among users into the model, alleviates the cold start problem, and achieves accurate recommendation by incorporating users' reading preferences.

Description

MOOC forum post recommendation method integrating forum interaction behavior and user reading preference
Technical Field
The invention relates to the technical fields of educational data mining and natural language processing, and in particular to a MOOC forum post recommendation method that integrates forum interaction behavior and user reading preference.
Background
In an era of high-speed network information transmission and rapid development of computer technology, online learning platforms let a new generation of internet users take courses from a wide range of universities anytime and anywhere. Massive Open Online Courses (MOOCs) have attracted large numbers of students. By early 2018, MOOC enrollments in China had exceeded 70 million. Although the number of learners is large, the openness of MOOCs means that learners' levels are uneven and their learning goals differ, so MOOCs suffer from high dropout rates and low participation. The MOOC forum, as an important module for promoting knowledge exchange among students and boosting course participation, helps to some extent in reducing the dropout rate.
MOOC forums suffer from unbalanced information load and information loss. Although most current MOOC platforms organize their forums into sub-sections, students do not necessarily choose the sub-section that matches what they post, so the problem of disorganized information remains. In addition, the forum's single ordering mechanism causes many new questions to be buried under other posts before they are fully resolved, so the unanswered rate is high. Answering forum posts one by one places a heavy information load on teachers and teaching assistants. Ideally, the forum's information load is balanced and students discuss actively without relying on teachers to solve problems; students can serve as resources for answering one another's questions, help each other in the forum, and form learning communities that share knowledge. MOOC forums therefore need a personalized recommendation technique that combines students' reading preferences and interactions to route information. Existing MOOC forum recommendation algorithms extract user reading preferences with traditional methods such as LDA topic models, word co-occurrence statistics, and associated words. However, traditional methods such as topic models are difficult to combine with user behavior information for end-to-end recommendation, and their recommendation accuracy is low. Moreover, the prior art does not consider the scenario of low user activity in MOOC forum post recommendation and offers no solution to the cold start problem.
The invention provides a recommendation method that fuses forum interaction behavior and user reading preference, aiming to overcome the shortcomings of existing MOOC forum post recommendation techniques, namely low recommendation accuracy and insufficient consideration of user cold start.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing MOOC forum post recommendation techniques, namely low recommendation accuracy and insufficient consideration of user cold start, and provides a MOOC forum post recommendation method that integrates forum interaction behavior and user reading preference.
To achieve this purpose, the technical solution provided by the invention is as follows: a MOOC forum post recommendation method integrating forum interaction behavior and user reading preference, comprising the following steps:
1) constructing the target scoring matrix of users on posts, a user interaction scoring matrix, and a user interaction frequency matrix from the users' browsing records in the forum;
2) introducing matrices $U_a$ and $U_b$, decomposing the user interaction scoring matrix with the user interaction frequency matrix as a constraint term to obtain a user objective function, and adding $U_a$ and $U_b$ to obtain a user behavior feature matrix;
3) counting the number of each kind of interaction behavior of each user, constructing a user embedding matrix together with the corresponding user IDs, and concatenating the user behavior feature matrix and the user embedding matrix to obtain a user matrix $U$;
4) integrating the users' historical posts, introducing an item matrix $V$, extracting post topics with a denoising autoencoder to obtain an item objective function, and decomposing the target scoring matrix with the user matrix $U$ and the item matrix $V$ to obtain a scoring matrix objective function;
5) optimizing the scoring matrix objective function, the user objective function, and the item objective function, and providing a recommendation list for the user.
In step 1), the target scoring matrix is a two-dimensional matrix whose rows are user IDs and whose columns are post IDs; each entry indicates whether the user is interested in the post. A score of 1 means that the user has browsed the post and is assumed by default to be potentially interested in its topic; a score of 0 means that the user has not browsed the post and is assumed by default to be potentially uninterested in its topic. The user interaction scoring matrix and the user interaction frequency matrix are also two-dimensional matrices, which respectively record whether two users have interacted and the total number of their interactions. Interactions include replies, comments, likes, and browsing between users. If two users have interacted, the corresponding entry of the user interaction scoring matrix is 1, otherwise it is 0; the corresponding entry of the user interaction frequency matrix is the number of interactions between the two users.
In step 2), obtaining the user objective function and the user behavior feature matrix comprises the following steps:
2.1) introducing matrices $U_a$ and $U_b$, and assuming that $U_a$ and $U_b$ obey the following Gaussian distributions:
$$P(U_a \mid \lambda_a^{-1}) = \prod_i \mathcal{N}(U_{a,i} \mid 0,\ \lambda_a^{-1} I)$$
$$P(U_b \mid \lambda_b^{-1}) = \prod_i \mathcal{N}(U_{b,i} \mid 0,\ \lambda_b^{-1} I)$$
where $i$ indexes the rows of the matrix, $\lambda_a^{-1}$ is the reciprocal of the hyperparameter $\lambda_a$, and $\lambda_b^{-1}$ is the reciprocal of the hyperparameter $\lambda_b$;
2.2) letting the user interaction scoring matrix be $Q$ and the user interaction frequency matrix be $C$, and assuming that each value in $Q$ obeys the following Gaussian distribution:
$$Q_{ij} \sim \mathcal{N}(U_{a,i}^{\top} U_{b,j},\ C_{ij}^{-1})$$
where $Q_{ij}$ is the entry in row $i$, column $j$ of the user interaction scoring matrix, $U_{a,i}$ is the vector of the $i$-th row of $U_a$, $U_{b,j}$ is the vector of the $j$-th column of $U_b$, and $C_{ij}^{-1}$, the reciprocal of the corresponding entry of the user interaction frequency matrix $C$, is the variance of the Gaussian distribution;
2.3) obtaining the following derivation according to probabilistic matrix factorization:
$$P(U_a, U_b \mid Q, C, \lambda_a, \lambda_b) \propto P(Q \mid U_a, U_b, C)\, P(U_a \mid \lambda_a)\, P(U_b \mid \lambda_b)$$
where $P(\cdot \mid \cdot)$ denotes the posterior probability and $\propto$ denotes proportionality;
substituting the Gaussian distributions of $Q$, $U_a$, and $U_b$ and deriving the user objective function;
2.4) adding $U_a$ and $U_b$ to obtain the user behavior feature matrix.
In step 3), counting the number of each kind of interaction behavior of each user, including the number of likes received, views received, replies received, replies made, posts made, and followers, constructing a user embedding matrix together with the corresponding user IDs, and concatenating the user behavior feature matrix and the user embedding matrix to obtain the user matrix $U$ comprises the following steps:
3.1) discretizing the counts of each interaction behavior to obtain 6 discretized features;
3.2) taking the 6 discretized features and the user ID as the input of the model's embedding layer to obtain 7 embedding vectors;
3.3) concatenating the 7 embedding vectors to form the user's embedding-layer vector; the embedding-layer vectors of all users form the user embedding matrix $U_c$;
3.4) concatenating the user behavior feature matrix and the user embedding matrix $U_c$, and constructing the user matrix $U$ according to the following equation:
$$U \sim \mathcal{N}\big([\,U_a + U_b\,;\,U_c\,],\ \lambda_u^{-1} I\big)$$
where the symbol $[\,;\,]$ denotes the concatenation operation, $U_a + U_b$ is the user behavior feature matrix, $U_c$ is the user embedding matrix, and $\lambda_u^{-1}$, the reciprocal of the hyperparameter $\lambda_u$, is the variance of the Gaussian distribution.
In step 4), obtaining the item objective function and the scoring matrix objective function comprises the following steps:
4.1) merging the title of the forum post, the detailed description of the post, and its series of historical replies into one content post, and preprocessing the text;
4.2) converting the content of each post into index codes and feeding them into an embedding layer, or initializing the text embedding layer directly from pre-trained word vectors, and outputting the embedded representation of the text;
4.3) feeding the embedded representation of the text into a denoising autoencoder, restoring the input information through the denoising autoencoder, and extracting the topic vector $\theta_j$ of the text of post $j$ from the intermediate layer of the denoising autoencoder.
The weights of the denoising autoencoder network obey the following Gaussian distribution:
$$W \sim \mathcal{N}(0,\ \lambda_w^{-1} I)$$
where $W$ denotes the weights of each layer of the denoising autoencoder and $\lambda_w^{-1}$, the reciprocal of the hyperparameter $\lambda_w$, is the variance of the Gaussian distribution;
introducing user preferences, and constructing the item matrix $V$ by combining the user preferences with the topic vector, so that the item matrix $V$ contains both topic information and user preference information:
$$V_j = \epsilon_j + \theta_j$$
where the user preference term $\epsilon_j$ obeys the following Gaussian distribution:
$$\epsilon_j \sim \mathcal{N}(0,\ \lambda_v^{-1} I)$$
where $\lambda_v^{-1}$, the reciprocal of the hyperparameter $\lambda_v$, is the variance of the Gaussian distribution;
4.4) restoring the input information with the denoising autoencoder and substituting the expressions for $W$ and $V$ to obtain the item objective function;
4.5) decomposing the target scoring matrix with the user matrix $U$ and the item matrix $V$ according to the following formula:
$$R_{ij} \sim \mathcal{N}(U_i^{\top} V_j,\ \lambda_r^{-1})$$
where $R_{ij}$ is the entry in row $i$, column $j$ of the target scoring matrix, $U_i$ denotes the $i$-th row of the user matrix $U$, $V_j$ denotes the $j$-th row of the item matrix $V$, and $\lambda_r^{-1}$, the reciprocal of the hyperparameter $\lambda_r$, is the variance of the Gaussian distribution;
and obtaining the scoring matrix objective function by probabilistic matrix factorization.
In step 5), the scoring matrix objective function, the user objective function, and the item objective function are optimized, and the representations of the user matrix $U$ and the item matrix $V$ are refined until a preset training threshold is reached. The final user matrix is then multiplied with the item matrix (dot products of user and item vectors), the target user's scores for the items are retrieved from the result, and each user's score list is sorted to obtain the M highest-scoring posts, which form the recommendation list for that user.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention uses a denoising autoencoder to extract the topic information of the text. Compared with related methods, it exploits the strengths of deep learning in natural language processing to better mine users' reading preferences.
2. The invention decomposes the users' forum interaction scoring matrix by probabilistic matrix factorization, so that user behavior features can be extracted from the sparse matrix even when only a few interactions exist, which alleviates the cold start problem to some extent.
3. To address the difficulty of combining text topic models with user behavior information for end-to-end recommendation, the invention fuses forum interaction information with the topic model through deep learning and probabilistic matrix factorization, achieving accurate end-to-end recommendation.
4. The model of the invention is extensible and supports embedding various kinds of user information.
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
FIG. 2 is a diagram of the decomposition of the user interaction matrix according to the present invention.
FIG. 3 is a diagram of the forum recommendation model according to the present invention.
FIG. 4 is a diagram illustrating how the denoising autoencoder extracts the latent vector of a text according to the present invention.
Detailed Description
The present invention will be further described with reference to the following specific examples.
As shown in FIG. 1, the MOOC forum post recommendation method integrating forum interaction behavior and user reading preference provided by this embodiment includes the following steps:
1) Construct the target scoring matrix. The target scoring matrix is a two-dimensional matrix whose rows are user IDs and whose columns are post IDs; each entry indicates whether the user is interested in the post. A score of 1 means that the user has browsed the post and is assumed by default to be potentially interested in its topic; a score of 0 means that the user has not browsed the post and is assumed by default to be potentially uninterested in its topic.
Because the forum contains many unanswered posts, the target scoring matrix is very sparse, that is, most of its entries are 0. The raw data therefore needs to be filtered: posts with too few words or too many special symbols are removed, and a user browsing threshold is set so that users who have browsed fewer posts than the threshold are removed. This eliminates some "zombie" users and "garbage" posts. The scoring matrix is then constructed, and entries whose value is 0 are randomly deleted according to a chosen negative sampling ratio. Specifically, a mask matrix with the same number of rows and columns as the original matrix can be used, in which 1 marks entries to be retained and 0 marks zero-score entries to be deleted. The mask matrix is computed in advance according to the negative sampling ratio, and an AND operation between the mask matrix and the original scoring matrix yields the downsampled matrix. The downsampled matrix is the actual data subsequently fed to the model.
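The mask-based negative sampling described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the `keep_ratio` parameter (the fraction of zero entries retained), and the choice to represent deleted entries as `None` so that they remain distinguishable from retained zeros are all assumptions.

```python
import random

def negative_sampling_mask(scores, keep_ratio, seed=0):
    """Build a 0/1 mask: every 1-entry is kept, each 0-entry is kept
    with probability keep_ratio (hypothetical parameter name)."""
    rng = random.Random(seed)
    return [[1 if s == 1 or rng.random() < keep_ratio else 0 for s in row]
            for row in scores]

def apply_mask(scores, mask):
    """Combine mask and scores; masked-out zeros become None (assumption)
    so the model can skip them instead of treating them as negatives."""
    return [[s if m == 1 else None for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(scores, mask)]

scores = [[1, 0, 0, 1],
          [0, 0, 1, 0]]
mask = negative_sampling_mask(scores, keep_ratio=0.5)
downsampled = apply_mask(scores, mask)
```

Keeping the mask alongside the scores matters because a plain element-wise AND maps both retained and deleted zeros to 0.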
Construct the user interaction scoring matrix and the user interaction frequency matrix. These are also two-dimensional matrices, which respectively record whether two users have interacted and the total number of their interactions. Interactions include replies, comments, likes, and browsing between users. If two users have interacted, the corresponding entry of the user interaction scoring matrix is 1, otherwise it is 0; the corresponding entry of the user interaction frequency matrix is the number of interactions between the two users.
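A minimal sketch of building the two matrices from an interaction log is shown below. The log format `(actor, target, kind)` and the function name are hypothetical. The orientation (row = user receiving the interaction, column = user initiating it) follows the description of FIG. 2 in step 2.4 and is otherwise an assumption.

```python
def build_interaction_matrices(n_users, interactions):
    """Return (Q, C): Q[i][j] = 1 if user j interacted with user i,
    C[i][j] = how many times that happened."""
    Q = [[0] * n_users for _ in range(n_users)]
    C = [[0] * n_users for _ in range(n_users)]
    for actor, target, _kind in interactions:
        Q[target][actor] = 1      # an interaction exists
        C[target][actor] += 1     # count every reply/comment/like/browse
    return Q, C

# Hypothetical log: user 0 replied to and liked user 1; user 2 browsed user 1.
log = [(0, 1, "reply"), (0, 1, "like"), (2, 1, "browse")]
Q, C = build_interaction_matrices(3, log)
```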
2) Obtaining the user objective function and the user behavior feature matrix comprises the following steps:
2.1) Introduce matrices $U_a$ and $U_b$, and assume that $U_a$ and $U_b$ obey the following Gaussian distributions:
$$P(U_a \mid \lambda_a^{-1}) = \prod_i \mathcal{N}(U_{a,i} \mid 0,\ \lambda_a^{-1} I)$$
$$P(U_b \mid \lambda_b^{-1}) = \prod_i \mathcal{N}(U_{b,i} \mid 0,\ \lambda_b^{-1} I)$$
where $i$ indexes the rows of the matrix, $\lambda_a^{-1}$ is the reciprocal of the hyperparameter $\lambda_a$, and $\lambda_b^{-1}$ is the reciprocal of the hyperparameter $\lambda_b$.
2.2) Let the user interaction scoring matrix be $Q$ and the user interaction frequency matrix be $C$, and assume that each value in $Q$ obeys the following Gaussian distribution:
$$Q_{ij} \sim \mathcal{N}(U_{a,i}^{\top} U_{b,j},\ C_{ij}^{-1})$$
where $Q_{ij}$ is the entry in row $i$, column $j$ of the user interaction scoring matrix, $U_{a,i}$ is the vector of the $i$-th row of $U_a$, $U_{b,j}$ is the vector of the $j$-th column of $U_b$, and $C_{ij}^{-1}$, the reciprocal of the corresponding entry of the user interaction frequency matrix $C$, is the variance of the Gaussian distribution. Because the interaction frequency determines the variance, the more interactions a user has, the greater the penalty on the corresponding term; in other words, the more active a user is in the forum, the more accurate their feature representation is required to be.
2.3) Obtain the following derivation according to the maximum a posteriori probability:
$$P(U_a, U_b \mid Q, C, \lambda_a, \lambda_b) \propto P(Q \mid U_a, U_b, C)\, P(U_a \mid \lambda_a)\, P(U_b \mid \lambda_b)$$
where $P(\cdot \mid \cdot)$ denotes the posterior probability and $\propto$ denotes proportionality.
Note that $Q$ above is obtained from statistics of user interaction behavior and is a known observation, so the problem is equivalent to using the known observation $Q$ to infer the model parameters $U_a$ and $U_b$ most likely to have produced it. Therefore, by introducing Gaussian prior distributions for $U_a$ and $U_b$, the maximum a posteriori probability of $U_a$ and $U_b$ can be computed in the manner of maximum likelihood estimation, which yields the user objective function.
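As a hedged sketch of how the user objective function could follow from these assumptions, take the negative logarithm of the posterior, assuming zero-mean Gaussian priors with precisions $\lambda_a$ and $\lambda_b$ on the rows of $U_a$ and $U_b$ and a Gaussian likelihood for each $Q_{ij}$ with mean $U_{a,i}^{\top} U_{b,j}$ and variance $C_{ij}^{-1}$. The symbol $\mathcal{L}_{user}$ and the dropped constants are notational assumptions; the source does not print the final expression.

```latex
\begin{aligned}
-\log P(U_a, U_b \mid Q, C, \lambda_a, \lambda_b)
  &= -\log P(Q \mid U_a, U_b, C) - \log P(U_a \mid \lambda_a)
     - \log P(U_b \mid \lambda_b) + \text{const} \\
\mathcal{L}_{user}
  &= \frac{1}{2}\sum_{i,j} C_{ij}\,\bigl(Q_{ij} - U_{a,i}^{\top} U_{b,j}\bigr)^{2}
   + \frac{\lambda_a}{2}\sum_{i}\lVert U_{a,i}\rVert_2^{2}
   + \frac{\lambda_b}{2}\sum_{i}\lVert U_{b,i}\rVert_2^{2}
\end{aligned}
```

Under this reading, minimizing $\mathcal{L}_{user}$ is a weighted least-squares factorization of $Q$ in which frequently interacting user pairs carry larger weights.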
2.4) $U_a$ and $U_b$ are the results of decomposing the user interaction scoring matrix, and they carry the information of the rows and the columns of that matrix respectively. As shown in FIG. 2, a column of the user interaction scoring matrix represents a user's active reply behavior and describes how the user acts on their own initiative in the forum, whereas a row represents other users' replies to the user and describes the user's passive behavior in the forum. Adding $U_a$ and $U_b$ gives the user behavior feature matrix, so that the result contains information on both the active and the passive behavior of the user.
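Steps 2.1 to 2.4 can be illustrated with a small weighted factorization trained by stochastic gradient descent. This is a sketch under stated assumptions, not the patent's procedure: the function names, latent size `k`, learning rate, epoch count, and especially `base_weight` (a small weight given to pairs with zero interactions so that they still act as weak negatives) are all hypothetical.

```python
import random

def factorize_interactions(Q, C, k=4, lam_a=0.1, lam_b=0.1,
                           base_weight=0.1, lr=0.05, epochs=300, seed=0):
    """Weighted factorization Q ~ Ua * Ub^T; returns Ua, Ub and Ua + Ub."""
    rng = random.Random(seed)
    n = len(Q)
    # Small zero-mean Gaussian initialization for both factor matrices.
    Ua = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n)]
    Ub = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(n):
                w = C[i][j] if C[i][j] > 0 else base_weight
                err = Q[i][j] - sum(a * b for a, b in zip(Ua[i], Ub[j]))
                for f in range(k):
                    a, b = Ua[i][f], Ub[j][f]
                    # Gradient step; the L2 penalty is applied at every
                    # visited entry, a common SGD simplification.
                    Ua[i][f] += lr * (w * err * b - lam_a * a)
                    Ub[j][f] += lr * (w * err * a - lam_b * b)
    behavior = [[a + b for a, b in zip(Ua[i], Ub[i])] for i in range(n)]
    return Ua, Ub, behavior

def predict(Ua, Ub, i, j):
    """Reconstructed interaction score for the pair (i, j)."""
    return sum(a * b for a, b in zip(Ua[i], Ub[j]))

Q = [[0, 1, 0], [1, 0, 1], [0, 0, 0]]
C = [[0, 3, 0], [2, 0, 1], [0, 0, 0]]
Ua, Ub, behavior = factorize_interactions(Q, C)
```

Each row of `behavior` is one user's behavior feature vector, combining the row-side and column-side factors.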
3) Count the number of each kind of interaction behavior of each user, including the number of likes received, views received, replies received, replies made, posts made, and followers, construct a user embedding matrix together with the corresponding user IDs, and concatenate the user behavior feature matrix and the user embedding matrix to obtain the user matrix $U$. This comprises the following steps:
3.1) Discretize the (numerical) counts of each interaction behavior to obtain 6 discretized features. Taking binning as an example, suppose the sample values of the number of likes received are {12, 13, 24, 4, 5, 34, 56, 98, 8} and the bin width is 5; the discretized data is then {3, 3, 5, 1, 1, 7, 12, 20, 2}. The bin width can be determined from a histogram of the data and is typically set to a value that makes the data distribution more uniform.
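The worked example is consistent with rounding each count up to its bin index, that is, ceil(count / bin width). That rule is inferred from the numbers above rather than stated in the text, and the function name is hypothetical.

```python
def bin_discretize(values, interval):
    """Map each count to its 1-based bin index using ceiling division."""
    return [-(-v // interval) for v in values]  # integer ceil(v / interval)

likes = [12, 13, 24, 4, 5, 34, 56, 98, 8]
print(bin_discretize(likes, 5))  # [3, 3, 5, 1, 1, 7, 12, 20, 2]
```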
3.2) Take the 6 discretized features and the user ID as the input of the model's embedding layer to obtain 7 embedding vectors. The output dimension of each embedding layer can be adjusted to the range of the data; if the discretized values are small, the dimension can be set smaller. See FIG. 3 for specific dimension values.
3.3) Concatenate the 7 embedding vectors to form the user's embedding-layer vector; the embedding-layer vectors of all users form the user embedding matrix $U_c$.
3.4) Concatenate the user behavior feature matrix and the user embedding matrix $U_c$, and construct the user matrix $U$ according to the following equation:
$$U \sim \mathcal{N}\big([\,U_a + U_b\,;\,U_c\,],\ \lambda_u^{-1} I\big)$$
where the symbol $[\,;\,]$ denotes the concatenation operation, $U_a + U_b$ is the user behavior feature matrix, $U_c$ is the user embedding matrix, and $\lambda_u^{-1}$, the reciprocal of the hyperparameter $\lambda_u$, is the variance of the Gaussian distribution.
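The embedding lookup and concatenation of steps 3.2 to 3.4 can be sketched as below. The class and function names, the table sizes, and the embedding dimensions are illustrative assumptions; in a trained model the tables would be learned rather than left at their random initial values.

```python
import random

class EmbeddingTable:
    """A lookup table with one dense vector per discrete id."""
    def __init__(self, num_ids, dim, rng):
        self.rows = [[rng.gauss(0, 0.1) for _ in range(dim)]
                     for _ in range(num_ids)]

    def lookup(self, idx):
        return self.rows[idx]

def build_user_vector(behavior_vec, user_id, binned, feature_tables, id_table):
    """Return [U_a + U_b ; U_c] for one user: the behavior features
    followed by the 7 concatenated embeddings (user ID + 6 features)."""
    parts = [id_table.lookup(user_id)]
    parts += [t.lookup(v) for t, v in zip(feature_tables, binned)]
    embedded = [x for part in parts for x in part]   # one row of U_c
    return list(behavior_vec) + embedded

rng = random.Random(0)
id_table = EmbeddingTable(num_ids=10, dim=3, rng=rng)
feature_tables = [EmbeddingTable(num_ids=21, dim=2, rng=rng) for _ in range(6)]
user_vec = build_user_vector([0.5] * 4, 7, [3, 1, 0, 2, 5, 20],
                             feature_tables, id_table)
```

Here the resulting vector has 4 + 3 + 6 × 2 = 19 components.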
4) Obtaining the item objective function and the scoring matrix objective function comprises the following steps:
4.1) Merge the title of the forum post, the detailed description of the post, and its series of historical replies into one content post, and then preprocess the text. Preprocessing includes separating punctuation from characters, case conversion, and removal of meaningless punctuation characters and links.
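A minimal preprocessing sketch for step 4.1 is given below. The regular expressions, the choice of lowercasing as the case conversion, and the function names are assumptions made for illustration.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links to strip
PUNCT_RE = re.compile(r"[^\w\s]")               # anything but word chars/space

def build_content_post(title, description, replies):
    """Merge title, description and historical replies into one text."""
    return " ".join([title, description, *replies])

def preprocess(text):
    """Remove links, lowercase, drop punctuation, and split into tokens."""
    text = URL_RE.sub(" ", text)
    text = text.lower()
    text = PUNCT_RE.sub(" ", text)
    return text.split()

post = build_content_post("OSI Model?",
                          "See https://example.com/osi for details.",
                          ["Seven layers!"])
tokens = preprocess(post)
```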
4.2) Convert the text into index codes (each word replaced by its position in the vocabulary) and feed the codes into an embedding layer, or initialize the text embedding layer directly from pre-trained word vectors. Taking index coding as an example, the principle of the embedding layer is as follows. Suppose the words of post X have the indices {How: 899, many: 1456, layer: 600, does: 3245, the: 6723, OSI: 28, model: 876, relationship: 547, of: 2323} in the text vocabulary; post X is then converted into the index sequence {899, 1456, 600, 3245, 6723, 28, 876, 547, 2323}. Assuming the dimension of the embedding layer is d and the vocabulary size is m, the embedding layer is randomly initialized as a matrix of size m × d, so that each word in post X can be looked up by its index to obtain the corresponding word vector (embedding vector). When pre-trained text vectors are used as input, the word vector of each word can be obtained from large-scale public word embeddings such as GloVe or Word2vec and used to initialize the embedding layer matrix. The embedding layer then outputs the embedded representation of the text. The output dimension can be chosen freely or set with reference to FIG. 3.
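The index coding and embedding lookup in this example can be sketched as follows. The vocabulary entries are taken from the example above; the vocabulary size `m = 7000`, the dimension `d = 8`, and the function names are illustrative assumptions.

```python
import random

vocab = {"How": 899, "many": 1456, "layer": 600, "does": 3245, "the": 6723,
         "OSI": 28, "model": 876, "relationship": 547, "of": 2323}

def encode(tokens, vocab):
    """Replace each word by its index in the vocabulary."""
    return [vocab[t] for t in tokens]

def init_embedding(m, d, seed=0):
    """Randomly initialize an m x d embedding matrix."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(m)]

def embed(indices, table):
    """Look up one d-dimensional word vector per index."""
    return [table[i] for i in indices]

indices = encode(list(vocab), vocab)
table = init_embedding(m=7000, d=8)
vectors = embed(indices, table)   # 9 word vectors of length 8
```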
4.3) Feed the embedded representation of the text into a denoising autoencoder. The denoising autoencoder consists of a multi-layer feedforward neural network comprising an encoding layer, an intermediate layer, and a decoding layer, and the weights of each layer are randomly initialized from the Gaussian distribution given below. The denoising autoencoder restores the original information of the text through the decoder, while the intermediate layer, which has fewer neural units, captures an implicit description of the text, that is, a more abstract, lower-dimensional representation of its information. From the standpoint of interpretability, the output vector $\theta_j$ of the intermediate layer for post $j$ contains the topic information of the text.
$$W \sim \mathcal{N}(0,\ \lambda_w^{-1} I)$$
where $W$ denotes the weights of each layer of the denoising autoencoder and $\lambda_w^{-1}$, the reciprocal of the hyperparameter $\lambda_w$, is the variance of the Gaussian distribution.
Introduce user preferences, and construct the item matrix $V$ by combining the user preferences with the topic vector, so that the item matrix contains both topic information and user preference information:
$$V_j = \epsilon_j + \theta_j$$
where the user preference term $\epsilon_j$ obeys the following Gaussian distribution:
$$\epsilon_j \sim \mathcal{N}(0,\ \lambda_v^{-1} I)$$
where $\lambda_v^{-1}$, the reciprocal of the hyperparameter $\lambda_v$, is the variance of the Gaussian distribution.
4.4) Restore the input information with the denoising autoencoder and substitute the expressions for $W$ and $V$ to obtain the item objective function.
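To make the idea of a denoising autoencoder concrete, here is a deliberately small sketch with a single hidden layer standing in for the intermediate layer. It is a simplification of the multi-layer network described above: the masking noise, tanh activation, squared-error loss, plain SGD, and all names and hyperparameter values are assumptions.

```python
import math
import random

def train_denoising_autoencoder(data, hidden_dim, noise=0.2, lr=0.1,
                                epochs=200, seed=0):
    rng = random.Random(seed)
    d = len(data[0])
    # Weights drawn from a zero-mean Gaussian.
    W1 = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(hidden_dim)]
    W2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(d)]

    def encode(x):   # middle-layer output: the latent "topic" vector
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]

    def decode(h):   # linear reconstruction of the input
        return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            # Corrupt the input by randomly zeroing components.
            noisy = [0.0 if rng.random() < noise else xi for xi in x]
            h = encode(noisy)
            out = decode(h)
            err = [o - xi for o, xi in zip(out, x)]   # target is the CLEAN input
            total += sum(e * e for e in err)
            # Backpropagate 0.5 * ||out - x||^2 through both layers.
            grad_h = [sum(err[o] * W2[o][j] for o in range(d)) * (1 - h[j] ** 2)
                      for j in range(hidden_dim)]
            for o in range(d):
                for j in range(hidden_dim):
                    W2[o][j] -= lr * err[o] * h[j]
            for j in range(hidden_dim):
                for i in range(d):
                    W1[j][i] -= lr * grad_h[j] * noisy[i]
        losses.append(total / len(data))
    return encode, decode, losses

# Toy bag-of-words vectors standing in for embedded post texts.
posts = [[1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0],
         [0, 0, 0, 1, 1, 0], [0, 0, 0, 1, 1, 1]]
encode, decode, losses = train_denoising_autoencoder(posts, hidden_dim=3)
topic_vector = encode(posts[0])   # low-dimensional representation of post 0
```

The key property is that the network is trained to reproduce the clean input from a corrupted copy, which pushes the hidden layer toward features shared across related posts.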
4.5) Decompose the target scoring matrix with the user matrix $U$ and the item matrix $V$ according to the following formula:
$$R_{ij} \sim \mathcal{N}(U_i^{\top} V_j,\ \lambda_r^{-1})$$
where $R_{ij}$ is the entry in row $i$, column $j$ of the target scoring matrix, $U_i$ denotes the $i$-th row of the user matrix $U$, $V_j$ denotes the $j$-th row of the item matrix $V$, and $\lambda_r^{-1}$, the reciprocal of the hyperparameter $\lambda_r$, is the variance of the Gaussian distribution.
Note that $R$ above is obtained from statistics and is a known observation, so the problem is equivalent to using the known observation $R$ to infer the model parameters $U$ and $V$ most likely to have produced it. Therefore, by introducing prior distributions for $U$ and $V$, the maximum a posteriori probability of $U$ and $V$ can be computed in the manner of maximum likelihood estimation, which yields the scoring matrix objective function.
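One plausible form of the scoring matrix objective function, obtained by taking the negative log-posterior, is sketched below. It assumes a Gaussian likelihood for $R_{ij}$ with variance $\lambda_r^{-1}$ and a Gaussian prior on each user vector centered on the concatenation $[\,U_a + U_b\,;\,U_c\,]$ with variance $\lambda_u^{-1}$; the prior on $V$ is assumed to be handled in the item objective function. The symbol $\mathcal{L}_R$ and this grouping of terms are assumptions, since the source does not print the final expression.

```latex
\mathcal{L}_{R}
  = \frac{\lambda_r}{2}\sum_{i,j}\bigl(R_{ij} - U_i^{\top} V_j\bigr)^{2}
  + \frac{\lambda_u}{2}\sum_{i}\bigl\lVert U_i - [\,U_{a,i} + U_{b,i}\,;\,U_{c,i}\,]\bigr\rVert_2^{2}
```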
As can be seen from FIG. 2 and the steps above, decomposing and predicting the target scoring matrix involves extracting user reading preferences, extracting user features, and decomposing the user interaction matrix; all components are fused into a single model for end-to-end recommendation.
5) The objective function of the scoring matrix, the objective function of the user and the objective function of the article are optimized, in order to minimize the objective function, the gradient of the objective function can be solved by utilizing an optimizer mature in the industry, such as Adagrad, RMSprop and SGD, so that the parameters of the model are updated through back propagation, the representation of the user matrix U and the article matrix V is corrected within a certain training threshold, and the training threshold can be generally set through observing the stability and convergence degree of the predicted result without a fixed setting value.
After the threshold is reached, the final user matrix and item matrix are dot-multiplied, the scores of the target user on a series of items are retrieved from the product, and each user's score list is sorted to obtain the M posts with the highest scores, thereby providing a recommendation list for the user.
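The scoring and ranking step can be sketched as follows, assuming U and V are NumPy arrays whose rows are user and post vectors. The function name `recommend_top_m` and the optional `seen` mask, which excludes posts a user has already browsed, are illustrative additions that are not stated in the text above.

```python
import numpy as np


def recommend_top_m(U, V, m, seen=None):
    """Return, for every user, the indices of the m highest-scoring posts.

    U: (n_users, k) user matrix; V: (n_posts, k) item matrix.
    seen: optional boolean (n_users, n_posts) mask of posts to exclude.
    """
    scores = U @ V.T                      # predicted score of every user on every post
    if seen is not None:
        scores = np.where(seen, -np.inf, scores)
    # Sort each user's score list in descending order and keep the first m posts.
    order = np.argsort(-scores, axis=1, kind="stable")
    return order[:, :m], scores


U_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
V_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.5, 0.5]])
top, scores = recommend_top_m(U_demo, V_demo, m=2)

seen_demo = np.array([[True, False, False],
                      [False, True, False]])
top_unseen, _ = recommend_top_m(U_demo, V_demo, m=2, seen=seen_demo)
```

Whether browsed posts should be filtered out depends on the deployment; passing `seen=None` reproduces the plain top-M ranking described above.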
The above embodiments are merely preferred embodiments of the present invention and do not limit its scope; any change made according to the shape and principle of the present invention shall fall within the protection scope of the present invention.

Claims (6)

1. A mu lesson (MOOC) forum post recommendation method integrating forum interaction behavior and user reading preference, characterized by comprising the following steps:
1) constructing a target scoring matrix of users on posts, a user interaction scoring matrix and a user interaction frequency matrix from the browsing records of users in the forum;
2) introducing matrices $U_a$ and $U_b$, decomposing the user interaction scoring matrix with the user interaction frequency matrix as a constraint term to obtain a user objective function, and adding $U_a$ and $U_b$ to obtain a user behavior characteristic matrix;
3) counting the times of the various interactive behaviors of each user, constructing a user embedded matrix from these counts together with the corresponding user ID, and splicing the user behavior characteristic matrix and the user embedded matrix to obtain a user matrix U;
4) integrating the historical posts of users, introducing an item matrix V, extracting the subject of each post through a noise reduction self-encoder to obtain an item objective function, and decomposing the target scoring matrix by using the user matrix U and the item matrix V to obtain a scoring matrix objective function;
5) optimizing the scoring matrix objective function, the user objective function and the item objective function, and providing a recommendation list for the user.
2. The method as claimed in claim 1, wherein in step 1), the target scoring matrix is a two-dimensional matrix whose rows are user IDs and whose columns are post IDs, and each value of the matrix identifies whether the user is interested in the post: a score of 1 indicates that the user has browsed the corresponding post and is by default assumed to be interested in its subject, and a score of 0 indicates that the user has not browsed the corresponding post and is by default assumed not to be interested in its subject; the user interaction scoring matrix and the user interaction frequency matrix are two-dimensional matrices that respectively identify whether interaction exists between users and the total number of interactions, the interaction comprising reply, comment, praise and browse behaviors between users; if interaction has occurred between two users, the corresponding value of the user interaction scoring matrix is 1, otherwise it is 0, and the corresponding value of the user interaction frequency matrix is the number of interactions that have occurred between the users.
3. The mu lesson forum post recommendation method integrating forum interaction behavior and user reading preference as claimed in claim 1, wherein in step 2), obtaining the user objective function and the user behavior characteristic matrix comprises the following steps:
2.1) introducing matrices $U_a$ and $U_b$, and assuming that $U_a$ and $U_b$ obey the following Gaussian distributions:
$$U_{a,i} \sim \mathcal{N}\left(0,\ \lambda_a^{-1} I\right)$$
$$U_{b,i} \sim \mathcal{N}\left(0,\ \lambda_b^{-1} I\right)$$
where i indexes the rows of each matrix, $\lambda_a^{-1}$ is the reciprocal of the hyperparameter $\lambda_a$, and $\lambda_b^{-1}$ is the reciprocal of the hyperparameter $\lambda_b$;
2.2) letting the user interaction scoring matrix be Q and the user interaction frequency matrix be C, and assuming that each value in Q obeys the following Gaussian distribution:
Figure FDA0002485886310000025
where $Q_{ij}$ is the entry in row i, column j of the user interaction scoring matrix, $U_{a,i}$ is the vector of the i-th row of the matrix $U_a$, $U_{b,j}$ is the vector of the j-th column of the matrix $U_b$, and $C^{-1}$, the inverse of the user interaction frequency matrix C, is the variance of the Gaussian distribution;
2.3) deriving the following relation according to the probability-based matrix decomposition method:
Figure FDA0002485886310000028
where P(·|·) denotes the posterior probability and ∝ denotes proportionality;
substituting the Gaussian distributions of Q, $U_a$ and $U_b$ into the relation and deriving yields the user objective function;
2.4) adding $U_a$ and $U_b$ to obtain the user behavior characteristic matrix.
4. The mu lesson forum post recommendation method integrating forum interaction behavior and user reading preference according to claim 1, wherein in step 3), counting the times of the various interactive behaviors of a user comprises counting the user's number of praises, browse count, number of replies received, number of replies made, number of posts and number of follows; and constructing the user embedded matrix together with the corresponding user ID and splicing the user behavior characteristic matrix and the user embedded matrix to obtain the user matrix U comprises the following steps:
3.1) discretizing the times of various interactive behaviors to obtain 6 discretization characteristics;
3.2) taking 6 discretization features and the user ID as the input of a model embedding layer to obtain 7 embedding vectors;
3.3) splicing the 7 embedded vectors to serve as the embedded layer vector of a user, wherein the embedded layer vectors of all users form a user embedded matrix $U_c$;
3.4) splicing the user behavior characteristic matrix and the user embedded matrix $U_c$ to construct the user matrix U according to the following equation:
Figure FDA0002485886310000031
where the symbol [;] denotes the splicing operation, $U_a + U_b$ is the user behavior characteristic matrix, and $\lambda_u^{-1}$, the reciprocal of the hyperparameter $\lambda_u$, is the variance of the Gaussian distribution.
5. The mu lesson forum post recommendation method integrating forum interaction behavior and user reading preference of claim 1, wherein in step 4), obtaining the item objective function and the scoring matrix objective function comprises the following steps:
4.1) combining the title of a forum post, the detailed description of the post and its series of historical replies into the content of the post, and preprocessing the text;
4.2) converting the content of each post into a bit sequence code, inputting the bit sequence code into an embedding layer, or directly initializing the embedding layer of the text through a pre-training word vector, and outputting the embedded representation of the text;
4.3) inputting the embedded representation of the text into a noise reduction self-encoder, restoring the input information through the noise reduction self-encoder, and extracting the theme vector of the text by utilizing the middle layer of the noise reduction self-encoder
Figure FDA0002485886310000033
The weights of the noise reduction self-encoder network obey the following Gaussian distribution:
$$W \sim \mathcal{N}\left(0,\ \lambda_w^{-1} I\right)$$
where W is the weight of each layer of the noise reduction self-encoder, and $\lambda_w^{-1}$, the reciprocal of the hyperparameter $\lambda_w$, is the variance of the Gaussian distribution;
introducing user preferences, and constructing the item matrix V by combining the user preferences with the theme vector, so that the item matrix V contains both theme information and user preference information:
Figure FDA0002485886310000043
obey the following gaussian distribution:
Figure FDA0002485886310000044
where $\lambda_v^{-1}$, the reciprocal of the hyperparameter $\lambda_v$, is the variance of the Gaussian distribution;
4.4) restoring the input information by using the noise reduction self-encoder, and substituting into the expressions for W and V to obtain the item objective function;
4.5) decomposing the target scoring matrix by using the user matrix U and the item matrix V according to the following formula:
$$R_{ij} \sim \mathcal{N}\left(U_i V_j^{\top},\ \lambda_r^{-1}\right)$$
where $R_{ij}$ is the entry in row i, column j of the target scoring matrix, $U_i$ is the i-th row of the user matrix U, $V_j$ is the j-th row of the item matrix V, and $\lambda_r^{-1}$, the reciprocal of the hyperparameter $\lambda_r$, is the variance of the Gaussian distribution;
and obtaining the scoring matrix objective function according to the probability-based matrix decomposition method.
6. The mu lesson forum post recommendation method fusing forum interaction behavior and user reading preference according to claim 1, wherein in step 5), the scoring matrix objective function, the user objective function and the item objective function are optimized, the representation of the user matrix U and the representation of the item matrix V are modified within a preset training threshold, the final user matrix and the item matrix are dot-multiplied, the scores of the target user on a series of items are retrieved from the dot-multiplied result, the scoring lists of each user are sorted to obtain posts corresponding to the top M scores, and a recommendation list is provided for the user.
CN202010391330.3A 2020-05-11 2020-05-11 Method for recommending lesson forum posts by combining forum interaction behaviors and user reading preference Active CN111737427B (en)

Publications (2)

Publication Number Publication Date
CN111737427A (en) 2020-10-02
CN111737427B (en) 2024-03-22

Family

ID=72647039

Country Status (1)

Country Link
CN (1) CN111737427B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant