CN109033463B - Community question-answer content recommendation method based on end-to-end memory network - Google Patents
Community question-answer content recommendation method based on end-to-end memory network
- Publication number: CN109033463B
- Application number: CN201811008620.4A
- Authority
- CN
- China
- Prior art keywords
- title
- vector
- memory
- layer
- matrix
- Prior art date: 2018-08-28
- Legal status: Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a community question-answer content recommendation method based on an end-to-end memory network, which comprises: first obtaining titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set; then establishing an end-to-end memory network model from the data set; and finally optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule.
Description
Technical Field
The invention relates to the field of content recommendation, in particular to a community question and answer content recommendation method based on an end-to-end memory network.
Background
Online community question-answering platforms, such as Zhihu, are currently a primary venue for people to solve problems and share knowledge and experience. The information they cover is broad, but not all of it interests every user, so content that a user is interested in should be recommended to that user to increase user stickiness.
Disclosure of Invention
The invention aims to overcome one or more of the above-mentioned defects and provides a community question-answer content recommendation method based on an end-to-end memory network.
To achieve this purpose, the technical scheme is as follows:
A community question-answer content recommendation method based on an end-to-end memory network comprises the following steps:
S1: acquiring titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set;
S2: establishing an end-to-end memory network model from the data set;
S3: optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule.
Preferably, the data set of step S1 is divided evenly into the training set, the validation set and the test set.
Preferably, the titles in step S1 are the content titles of the user's browsing and historical behavior on the community question-and-answer platform.
Preferably, the end-to-end memory model comprises a single-layer model and a multi-layer model, wherein the single-layer model comprises a memory component, an input component and an output component;
the memory component: the title set $D = \{x_1, x_2, \ldots, x_n\}$ of historical behavior is stored; using an embedding matrix $A$ of size $d \times |V|$, each word $w_{ij} \in x_i$ is embedded into a $d$-dimensional memory vector $a_{ij}$ such that $a_{ij} = A w_{ij}$; the entire sentence set $\{x_i\}$ is thereby converted by the matrix $A$ into $d$-dimensional memory vectors $\{a_i\}$;
the input component: the currently browsed title $q$ is converted into a vector $b$ by the matrix $B$, and the matching degree between $b$ and each memory vector $a_i$ is computed as $p_i = \mathrm{Softmax}(b^{\top} a_i)$, where $\mathrm{Softmax}(z_i) = e^{z_i} / \sum_j e^{z_j}$ and $p$ is the probability vector over the inputs;
the output component: the title set $D = \{x_1, x_2, \ldots, x_n\}$ of historical behavior is converted by a matrix $C$ into $d$-dimensional output vectors $c_i$, and the output $o$ is the weighted sum of the output vectors $c_i$ with the probability vector: $o = \sum_i p_i c_i$;
the final prediction is $f = \mathrm{Softmax}(W(o + b))$;
in the multi-layer model, the input title representation of each layer is the sum of the previous hop's input $b$ and output $o$, i.e. the input of layer $k+1$ is the output $o^k$ of layer $k$ plus its input $b^k$: $b^{k+1} = o^k + b^k$;
each layer has its own embedding matrices $A^k$, $C^k$ for embedding the input $\{x_i\}$.
Preferably, the multi-layer model further comprises a sentence representation: for each sentence $x_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}$, every word is embedded and the resulting vectors are summed, and a temporal representation is added; each word vector is a 0-1 vector of length $|V|$, so that $a_i = \sum_j A x_{ij} + T_A(i)$, where $T_A(i)$ is the $i$-th row of a special matrix $T_A$ that encodes temporal information; similarly, a matrix $T_C$ is used for the output embedding, $c_i = \sum_j C x_{ij} + T_C(i)$; both $T_A$ and $T_C$ are learned during training.
Preferably, the multi-layer model further comprises word similarity: at the first layer, keywords in memory whose similarity with the keywords of the currently browsed title $q$ exceeds 0.8 are added to $q$; this avoids assigning too low a weight to titles in memory whose keywords are the same as or similar to those of $q$ but expressed with different words.
The keywords of the title being browsed are selected from a corpus consisting of all preprocessed titles, and pairwise similarity is computed between each of these keywords and every remaining keyword.
Preferably, the evaluation criteria of the model are precision, recall and F1 score.
Compared with the prior art, the invention has the beneficial effects that:
the end-to-end memory network can remember a large amount of user behaviors and add time, so that the interest prediction of the user is more accurate and reliable. And reducing supervision items by adopting end-to-end training. The attention mechanism is included, so that different titles have different weights, the predicted interest points can be sequenced, the recommended emphasis points are different, the interest points with large weights are ranked highly, and the recommended content of the interest points is more than that of other interest points. And the word similarity is added, so that the prediction is more accurate.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
the invention is further illustrated below with reference to the figures and examples.
Example 1
Referring to FIG. 1, a community question-answer content recommendation method based on an end-to-end memory network comprises the following steps:
S1: acquiring titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set;
S2: establishing an end-to-end memory network model from the data set;
S3: optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule.
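A minimal sketch of step S3, assuming plain NumPy; the learning rate and epsilon below are illustrative values, not taken from the patent:

```python
import numpy as np

def adagrad_step(param, grad, cache, lr=0.01, eps=1e-8):
    """One SGD step with the AdaGrad update rule.

    cache accumulates the squared gradients of `param`, so frequently
    updated entries receive smaller effective learning rates.
    """
    cache += grad ** 2
    param -= lr * grad / (np.sqrt(cache) + eps)
    return param, cache

# usage: W is one model matrix, dW its gradient from backpropagation
W = np.random.randn(100, 300) * 0.1
cache_W = np.zeros_like(W)
dW = np.random.randn(*W.shape)          # placeholder gradient
W, cache_W = adagrad_step(W, dW, cache_W)
```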
Take Zhihu as an example: compared with Baidu Zhidao, questions and their answers on Zhihu lean more toward sharing than toward simply being answered. Each question is short and descriptive, so the question itself serves as the title. All acquired titles must be preprocessed: each title is first segmented into words, then stop words and special characters are deleted; because words such as "reason", "how" and "experience" appear in a great many Zhihu questions, these words are deleted as well, which prevents common irrelevant words from carrying too much weight and drowning out the required keywords. The maximum sentence length is set to 50, and content beyond that is truncated. The data set is divided evenly into a training set, a validation set and a test set.
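A hypothetical sketch of this preprocessing pipeline; the stop-word and filler-word lists below are placeholders, and for Chinese titles a real segmenter (e.g. jieba) would replace the simple whitespace split used here:

```python
import random

STOP_WORDS = {"the", "a", "of"}                  # placeholder stop-word list
FILLER_WORDS = {"reason", "how", "experience"}   # frequent, uninformative question words
MAX_LEN = 50                                     # maximum title length from the text

def preprocess_title(title: str) -> list[str]:
    # For Chinese titles a segmenter such as jieba.lcut would replace split().
    words = title.lower().split()
    words = [w for w in words if w.isalnum()]                 # drop special characters
    words = [w for w in words if w not in STOP_WORDS | FILLER_WORDS]
    return words[:MAX_LEN]                                    # truncate overlong titles

def split_dataset(titles: list[list[str]], seed: int = 0):
    """Divide the data set evenly into training, validation and test sets."""
    random.Random(seed).shuffle(titles)
    n = len(titles) // 3
    return titles[:n], titles[n:2 * n], titles[2 * n:]
```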
Titles from the user's historical behavior are selected as the memory of the model. Apart from the most recently browsed title, the historical behavior comprises titles the user has browsed and upvoted, titles the user has answered, and titles the user follows; the 5 most recent titles are selected by time, since the recommended content should relate to the user's latest interests. The selected titles are sorted by the time of the user's operations to form the title set D. Experiments show better results when the embedding dimension of each title is between 300 and 500.
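Concretely, the title set D might be assembled as below; the Behavior record and its field names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Behavior:
    title: str         # preprocessed title text
    action: str        # e.g. 'upvote', 'answer' or 'follow' (hypothetical labels)
    timestamp: float   # time of the user's operation

def build_memory(behaviors: list[Behavior], size: int = 5) -> list[str]:
    """Keep the user's 5 most recent behavior titles, sorted by operation time."""
    recent = sorted(behaviors, key=lambda b: b.timestamp, reverse=True)[:size]
    return [b.title for b in sorted(recent, key=lambda b: b.timestamp)]
```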
In this embodiment, the end-to-end memory model comprises a single-layer model and a multi-layer model, wherein the single-layer model comprises a memory component, an input component and an output component;
the memory component: the title set $D = \{x_1, x_2, \ldots, x_n\}$ of historical behavior is stored; using an embedding matrix $A$ of size $d \times |V|$, each word $w_{ij} \in x_i$ is embedded into a $d$-dimensional memory vector $a_{ij}$ such that $a_{ij} = A w_{ij}$; the entire sentence set $\{x_i\}$ is thereby converted by the matrix $A$ into $d$-dimensional memory vectors $\{a_i\}$;
the input component: the currently browsed title $q$ is converted into a vector $b$ by the matrix $B$, and the matching degree between $b$ and each memory vector $a_i$ is computed as $p_i = \mathrm{Softmax}(b^{\top} a_i)$, where $\mathrm{Softmax}(z_i) = e^{z_i} / \sum_j e^{z_j}$ and $p$ is the probability vector over the inputs;
the output component: the title set $D = \{x_1, x_2, \ldots, x_n\}$ of historical behavior is converted by a matrix $C$ into $d$-dimensional output vectors $c_i$, and the output $o$ is the weighted sum of the output vectors $c_i$ with the probability vector: $o = \sum_i p_i c_i$;
the final prediction is $f = \mathrm{Softmax}(W(o + b))$.
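Under these definitions the single-layer forward pass reduces to a few matrix operations. The NumPy sketch below assumes bag-of-words title vectors (each title vector is the sum of its one-hot word vectors) and illustrative shapes:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())            # shift for numerical stability
    return e / e.sum()

def single_layer_forward(q_bow, D_bow, A, B, C, W):
    """One hop of the end-to-end memory network.

    q_bow : |V|          bag-of-words vector of the browsed title q
    D_bow : n x |V|      bag-of-words vectors of the n memory titles
    A, B, C : d x |V|    embedding matrices; W : num_labels x d
    """
    a = D_bow @ A.T                    # memory vectors a_i = A x_i
    b = B @ q_bow                      # input vector b = B q
    p = softmax(a @ b)                 # p_i = Softmax(b^T a_i)
    c = D_bow @ C.T                    # output vectors c_i = C x_i
    o = p @ c                          # o = sum_i p_i c_i
    return softmax(W @ (o + b))        # f = Softmax(W(o + b))
```

Here `f` is a probability distribution over candidate interest points; the multi-layer model below stacks this same computation.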
The multi-layer model stacks this structure: the input title representation of each layer is the sum of the previous hop's input $b$ and output $o$, i.e. the input of layer $k+1$ is the output $o^k$ of layer $k$ plus its input $b^k$: $b^{k+1} = o^k + b^k$; each layer has its own embedding matrices $A^k$, $C^k$ for embedding the input $\{x_i\}$.
In this embodiment, the multi-layer model further comprises a sentence representation: for each sentence $x_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}$, every word is embedded and the resulting vectors are summed, and a temporal representation is added; each word vector is a 0-1 vector of length $|V|$, so that $a_i = \sum_j A x_{ij} + T_A(i)$, where $T_A(i)$ is the $i$-th row of a special matrix $T_A$ that encodes temporal information; similarly, a matrix $T_C$ is used for the output embedding, $c_i = \sum_j C x_{ij} + T_C(i)$; both $T_A$ and $T_C$ are learned during training.
Each of the matrices $A$, $B$, $C$ and $W$ is likewise obtained by training. To reduce the number of parameters and ease training, the first-hop matrix is tied to the input embedding, $A^1 = B$; the last-hop matrix satisfies $W^{\top} = C^K$; and the memory matrix of every other hop equals the output matrix of the previous hop, i.e. $A^{k+1} = C^k$. For the same reason, the parameters of the temporal matrices $T_A$ and $T_C$ are reduced in the same way.
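Putting the multi-layer model, the temporal encoding and the weight tying together, a sketch under the same bag-of-words assumptions (matrix shapes and hop count are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_hop_forward(q_bow, D_bow, embeddings, T_A, T_C, hops=3):
    """Multi-hop forward pass with temporal encoding and adjacent weight tying.

    embeddings : list of hops+1 matrices of shape d x |V|; adjacent tying
                 A^{k+1} = C^k means hop k uses embeddings[k] as A^k and
                 embeddings[k+1] as C^k, with B = A^1 = embeddings[0].
    T_A, T_C   : n x d learned temporal matrices; row i encodes the age of
                 memory i (tied across hops here, matching the parameter
                 reduction described above).
    """
    b = embeddings[0] @ q_bow                    # b = B q with B = A^1
    for k in range(hops):
        a = D_bow @ embeddings[k].T + T_A        # a_i = sum_j A^k x_ij + T_A(i)
        c = D_bow @ embeddings[k + 1].T + T_C    # c_i = sum_j C^k x_ij + T_C(i)
        p = softmax(a @ b)                       # p_i = Softmax(b^T a_i)
        o = p @ c                                # o = sum_i p_i c_i
        b = o + b                                # b^{k+1} = o^k + b^k
    W = embeddings[-1].T                         # W^T = C^K  =>  W = (C^K)^T
    return softmax(W @ b)                        # f = Softmax(W(o + b))
```

With adjacent tying, $K$ hops cost only $K+1$ embedding matrices instead of $2K$, which is the parameter reduction the text describes.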
In this embodiment, the multi-layer model further comprises word similarity: at the first layer, keywords in memory whose similarity with the keywords of the currently browsed title $q$ exceeds 0.8 are added to $q$; this avoids assigning too low a weight to titles in memory whose keywords are the same as or similar to those of $q$ but expressed with different words.
The keywords of the title being browsed are selected from a corpus consisting of all preprocessed titles, and pairwise similarity is computed between each of these keywords and every remaining keyword.
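A sketch of this keyword expansion, assuming cosine similarity over pretrained word vectors (e.g. word2vec) as the pairwise similarity measure; only the 0.8 threshold comes from the description above:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def expand_query(q_words, memory_words, word_vecs, threshold=0.8):
    """Add memory keywords whose similarity with a keyword of q exceeds 0.8.

    word_vecs maps a word to its vector (assumed pretrained); this keeps
    titles that paraphrase q's keywords from being weighted too low.
    """
    expanded = list(q_words)
    for m in memory_words:
        if m in expanded or m not in word_vecs:
            continue
        for w in q_words:
            if w in word_vecs and cosine(word_vecs[w], word_vecs[m]) > threshold:
                expanded.append(m)
                break
    return expanded
```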
The prediction result of the model is taken as the user's latest interest points: for each browsed title, the top 5 predicted interest points are selected by rank. These interest points serve as tags, and popular content corresponding to the tags is recommended; for example, if a predicted result tag is "friends", popular content carrying the "friends" tag is recommended.
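A small sketch of this recommendation step, where the label list and the tag-to-popular-content store are hypothetical:

```python
import numpy as np

def recommend(f: np.ndarray, labels: list[str],
              hot_content: dict[str, list[str]], top_k: int = 5):
    """Rank the predicted interest points, keep the top 5, and return
    popular content for each resulting tag (hot_content is a hypothetical store)."""
    top = np.argsort(f)[::-1][:top_k]        # indices of the top-k interest points
    tags = [labels[i] for i in top]
    return {tag: hot_content.get(tag, []) for tag in tags}
```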
In this embodiment, the evaluation criteria of the model are precision, recall and F1 score.
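A minimal computation of the three evaluation criteria for one recommendation list:

```python
def precision_recall_f1(recommended: set, relevant: set):
    """Precision, recall and F1 for one user's recommendation list."""
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# usage: top-5 predicted interest tags vs. tags the user actually engaged with
print(precision_recall_f1({"ai", "nlp", "travel"}, {"ai", "nlp", "music"}))
```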
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the invention clearly, and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments exhaustively here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (4)
1. A community question-answer content recommendation method based on an end-to-end memory network, characterized by comprising the following steps:
S1: acquiring titles as a data set, preprocessing the data set, and dividing it into a training set, a validation set and a test set;
S2: establishing an end-to-end memory network model from the data set;
S3: optimizing the model using stochastic gradient descent (SGD) with the AdaGrad update rule;
wherein the end-to-end memory model comprises a single-layer model and a multi-layer model, the single-layer model comprising a memory component, an input component and an output component;
the memory component: the title set $D = \{x_1, x_2, \ldots, x_n\}$ of historical behavior is stored; using an embedding matrix $A$ of size $d \times |V|$, each word $w_{ij} \in x_i$ is embedded into a $d$-dimensional memory vector $a_{ij}$ such that $a_{ij} = A w_{ij}$, and the entire sentence set $\{x_i\}$ is converted by the matrix $A$ into $d$-dimensional memory vectors $\{a_i\}$;
the input component: the currently browsed title $q$ is converted into a vector $b$ by the matrix $B$, and the matching degree between $b$ and each memory vector $a_i$ is computed as $p_i = \mathrm{Softmax}(b^{\top} a_i)$, where $\mathrm{Softmax}(z_i) = e^{z_i} / \sum_j e^{z_j}$ and $p$ is the probability vector over the inputs;
the output component: the title set $D = \{x_1, x_2, \ldots, x_n\}$ of historical behavior is converted by a matrix $C$ into $d$-dimensional output vectors $c_i$, and the output $o$ is the weighted sum of the output vectors $c_i$ with the probability vector: $o = \sum_i p_i c_i$;
the final prediction is $f = \mathrm{Softmax}(W(o + b))$;
in the multi-layer model, the input title of each layer is the sum of the previous hop's input $b$ and output $o$, i.e. the input of layer $k+1$ is the output $o^k$ of layer $k$ plus its input $b^k$: $b^{k+1} = o^k + b^k$;
each layer has its own embedding matrices $A^k$, $C^k$ for embedding the input $\{x_i\}$;
the multi-layer model further comprises a sentence representation: for each sentence $x_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}$, every word is embedded and the resulting vectors are summed, and a temporal representation is added; each word vector is a 0-1 vector of length $|V|$, so that $a_i = \sum_j A x_{ij} + T_A(i)$, where $T_A(i)$ is the $i$-th row of a special matrix $T_A$ that encodes temporal information; similarly, a matrix $T_C$ is used for the output embedding, $c_i = \sum_j C x_{ij} + T_C(i)$, and both $T_A$ and $T_C$ are learned during training;
the multi-layer model further comprises word similarity: at the first layer, keywords in memory whose similarity with the keywords of the currently browsed title $q$ exceeds 0.8 are added to $q$, which avoids assigning too low a weight to titles in memory whose keywords are the same as or similar to those of $q$ but expressed with different words;
the keywords of the title being browsed are selected from a corpus consisting of all preprocessed titles, and pairwise similarity is computed between each of these keywords and every remaining keyword.
2. The community question-answer content recommendation method based on an end-to-end memory network according to claim 1, wherein the data set of step S1 is divided evenly into the training set, the validation set and the test set.
3. The method according to claim 1, wherein the titles in step S1 are the content titles of the user's browsing and historical behavior on the community question-and-answer platform.
4. The method according to claim 1, wherein the evaluation criteria of the model are precision, recall and F1 score.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811008620.4A CN109033463B (en) | 2018-08-28 | 2018-08-28 | Community question-answer content recommendation method based on end-to-end memory network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109033463A CN109033463A (en) | 2018-12-18 |
CN109033463B (en) | 2021-11-26
Family
ID=64625982
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811008620.4A Active CN109033463B (en) | 2018-08-28 | 2018-08-28 | Community question-answer content recommendation method based on end-to-end memory network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109033463B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134771B (en) * | 2019-04-09 | 2022-03-04 | 广东工业大学 | Implementation method of multi-attention-machine-based fusion network question-answering system |
CN110188272B (en) * | 2019-05-27 | 2023-04-21 | 南京大学 | Community question-answering website label recommendation method based on user background |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106126596A (en) * | 2016-06-20 | 2016-11-16 | 中国科学院自动化研究所 | A kind of answering method based on stratification memory network |
CN106407316A (en) * | 2016-08-30 | 2017-02-15 | 北京航空航天大学 | Topic model-based software question and answer recommendation method and device |
CN108133038A (en) * | 2018-01-10 | 2018-06-08 | 重庆邮电大学 | A kind of entity level emotional semantic classification system and method based on dynamic memory network |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9063975B2 (en) * | 2013-03-15 | 2015-06-23 | International Business Machines Corporation | Results of question and answer systems |
US20140030688A1 (en) * | 2012-07-25 | 2014-01-30 | Armitage Sheffield, Llc | Systems, methods and program products for collecting and displaying query responses over a data network |
US20180165361A1 (en) * | 2016-12-09 | 2018-06-14 | At&T Intellectual Property I, L.P. | Mapping service and resource abstractions to network inventory graph database nodes and edges |
CN107330130B (en) * | 2017-08-29 | 2020-10-20 | 北京易掌云峰科技有限公司 | Method for realizing conversation robot recommending reply content to manual customer service |
- 2018-08-28: application CN201811008620.4A filed in China; patent CN109033463B granted, status active
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||