Background
Short video is a new type of video with a short duration. Shooting a short video requires neither professional equipment nor professional skills. A user can conveniently shoot a video on a mobile phone and upload it directly to a short video platform, so the number of short videos on such platforms grows very quickly. An effective short video recommendation system is therefore urgently needed: it can improve the user experience and user stickiness, and thereby bring substantial commercial value to the platform.
In recent years, many researchers have proposed personalized recommendation methods for videos. These methods fall into three categories: collaborative filtering, content-based recommendation, and hybrid recommendation. However, short videos differ from ordinary videos: the descriptive text is of low quality, the duration is short, and a user accumulates a long sequence of interactions within a period of time. Short video recommendation is therefore a more challenging task. In addition, the short video recommendation problem involves several kinds of user interaction behavior, including "click", "like", and "favorite", and different behaviors express different degrees of preference. A "click" indicates that the user is willing to watch the short video, but the emotion is not strong. "Like" and "favorite" both express a strong and definite emotion: "like" indicates that the user likes the short video and is willing to watch videos of the same type, and "favorite" indicates that the user not only likes the short video now but also wants to watch it again later. A short video that the user has "liked" or "favorited" has also been "clicked" by that user. "Like" and "favorite" can be generalized into a single type of behavior, namely "positive" behavior. The user's interaction sequence then contains two kinds of interaction behavior: "click" behavior and "positive" behavior. Researchers have proposed several methods for the short video recommendation problem. For example, Chen et al. use a hierarchical attention mechanism to calculate importance at both the item level and the category level to obtain more accurate predictions. Li et al. use a graph-based recurrent neural network to model the user's preferences.
The method of Chen et al. uses only the user's click behavior and does not consider other behavior information. Li et al. apply a sequence recommendation method to the "click" and "positive" behavior sequences separately, and their experiments show that the user interest representation based on the "positive" behavior sequence brings no obvious improvement in recommendation quality. There are two reasons: the time intervals within a user's "positive" behavior sequence are long, so its sequential character is weak; and the "positive" behavior sequence is modeled separately, which ignores the effect of "positive" behavior on subsequent "click" behavior. The present method instead proposes a multi-behavior interaction sequence modeling method in which the "click" and "positive" behavior sequences are placed into one behavior sequence and processed together to generate the user interest vector representation. Here the "click" behavior is treated as sequential, while the "positive" behavior is treated as unordered because its events are widely spaced in time. The method combines a non-local network and a local network: the non-local network adopts an attention mechanism and learns the influence of past "positive" behaviors on "click" behaviors, while the local network adopts a gated recurrent neural network (GRU) to learn the sequential character of the click behavior. The method is thus a recurrent neural network based on a non-local attention mechanism; by modifying the structure of the original recurrent neural network, it enables the network to learn simultaneously the influence of "positive" behavior on "click" behavior and the influence of "click" behavior on subsequent "click" behavior.
Disclosure of Invention
The technical problem to be solved by the invention is to predict the click rate of a user on a target short video according to the user's multi-behavior interaction sequence on short videos. There are several kinds of user interaction behavior, including "click", "like", and "favorite", and different behaviors express different degrees of preference. A "click" indicates that the user is willing to watch the short video, but the emotion is not strong. "Like" and "favorite" both express a strong and definite emotion: "like" means that the user likes the short video and is willing to watch videos of the same type, and "favorite" means that the user not only likes the short video now but also wants to watch it again later. A short video that the user has "liked" or "favorited" has also been "clicked" by that user. "Like" and "favorite" can be generalized into a single type of behavior, namely "positive" behavior. The user's interaction sequence then contains two kinds of interaction behavior: "click" behavior and "positive" behavior. However, the original sequence recommendation methods are all directed to a sequence of a single type of interaction behavior. Therefore, the invention adopts the following technical scheme:
A short video recommendation method based on a non-local network and a local network comprises the following steps:
An attention mechanism is used to obtain the influence of "positive" behavior on each "click" behavior in the user's short video multi-behavior interaction sequence. The user's interaction sequence can be represented as $X = [x_1, \dots, x_l]$, where $x_j \in \mathbb{R}^d$ is the feature vector of the cover picture of the $j$-th short video and $d$ is the feature vector length. The "positive" behavior sequence is denoted $X^*$, and $X^*$ is a subset of $X$. The "click" behavior sequence is $X = [x_1, \dots, x_l]$ itself. The influence of the "positive" behavior sequence on the "click" behavior is obtained with the attention mechanism of the non-local network. Typically, the last-clicked short video in the sequence is used to represent the user's current click interest, so the attention is computed with respect to the last-clicked short video. In this attention computation, the weight matrices are parameters that the model must learn, $x_t$ is the vector representation of the last short video in the click sequence, $x^*_i$ is the vector representation of the $i$-th short video of the "positive" sequence within the current "click" sequence, and $\sigma$ is the sigmoid function. The attention weight of $x^*_i$ expresses its degree of importance, and the attention output is the influence of the "positive" behavior in the "click" behavior sequence ending at $x_t$ on the current click interest.
A user interest characterization is generated using a recurrent neural network based on a non-local attention mechanism. The original gated recurrent neural network (GRU) can only handle single-behavior sequences; its gates are computed as:
$z_t = \sigma(W_{xz} \cdot x_t + W_{hz} \cdot h_{t-1})$
$r_t = \sigma(W_{xr} \cdot x_t + W_{hr} \cdot h_{t-1})$
where $r_t$ is the reset gate and $z_t$ is the update gate; these two gating vectors determine which information is used as the output of the gated recurrent unit. $\tilde{h}_t$ is the current memory content and $x_t$ is the node input of the current layer. $W_{xz}, W_{hz}$ and $W_{xr}, W_{hr}$ are the parameters controlling the update gate $z_t$ and the reset gate $r_t$, respectively, and a further pair of weight matrices controls the candidate memory content $\tilde{h}_t$. $\odot$ is element-wise multiplication and $\sigma$ is the sigmoid function.
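For reference, one common textbook way of writing the complete standard GRU update is sketched below. The names $W_{xh}$ and $W_{hh}$ for the candidate-state weights, the use of $\tanh$, and the placement of $z_t$ in the final interpolation are assumptions; the literature uses more than one convention, so this is not necessarily the exact form used by the method.

```latex
\begin{aligned}
z_t &= \sigma(W_{xz} x_t + W_{hz} h_{t-1}) \\
r_t &= \sigma(W_{xr} x_t + W_{hr} h_{t-1}) \\
\tilde{h}_t &= \tanh\bigl(W_{xh} x_t + W_{hh}\,(r_t \odot h_{t-1})\bigr) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```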
A gated recurrent neural network, however, is not suited to multi-behavior sequences. To handle a multi-behavior sequence, the method modifies the original gated recurrent neural network so that, when selecting information, the gated recurrent unit considers not only the current short video in the sequence and the state of the previous gated recurrent unit but also the influence of "positive" behavior. In the modified unit, $z_t$ is the update gate and $r_t$ is the reset gate; these two gating vectors determine which information is used as the output of the gated recurrent unit. $\tilde{h}_t$ is the current memory content, $x_t$ is the node input of the current layer, and the attention output of the preceding step supplies the influence of "positive" behavior. One group of weight matrices controls the update gate $z_t$ and the reset gate $r_t$, and another group controls the candidate memory content $\tilde{h}_t$. $\odot$ is element-wise multiplication and $\sigma$ is the sigmoid function. The hidden state $h_t$ output by the last layer of the gated recurrent neural network is the user interest representation $v$.
The click rate of the user on the target short video $x_{new}$ is predicted according to the user interest representation. Here $v$ is the user interest representation and $x_{new}$ is the target short video. $\hat{y}$ is the predicted value of the click rate of the user on the target short video. The prediction layer uses two transformation matrices and a bias vector, and $b_2$ is a bias scalar. $\sigma$ is the sigmoid activation function.
A loss function is designed according to the characteristics of the model. From the predicted click rate $\hat{y}$ of the user on the target short video, the error between the predicted value $\hat{y}$ and the true value $y$ is calculated, and this error is used to update the model parameters. We use a cross-entropy loss function to guide the update of the model parameters, where $y \in \{0, 1\}$ is the true value indicating whether the user clicked on the target short video and $\sigma$ is the sigmoid function. We update the model parameters using the Adam optimizer.
The invention has the following beneficial technical effects:
(1) The invention discloses a multi-behavior sequence characterization method. Unlike previous single-behavior sequence characterization methods, it places the "click" and "positive" behavior sequences into one behavior sequence and processes them together to generate the user interest vector representation. The "click" behavior is treated as sequential, while the "positive" behavior is treated as unordered because its events are widely spaced in time.
(2) The invention combines a non-local network and a local network. The non-local network adopts an attention mechanism and learns the influence of all past "positive" behaviors on "click" behaviors; the local network adopts a gated recurrent neural network (GRU) and learns the influence of recent "click" behaviors on the current "click" behavior.
(3) The invention provides a recurrent neural network based on a non-local attention mechanism. By modifying the structure of the original recurrent neural network, it enables the network to learn simultaneously the influence of "positive" behavior on "click" behavior and the influence of "click" behavior on "click" behavior.
Detailed Description
For further understanding of the present invention, the following describes a short video recommendation method based on non-local network and local network in detail with reference to specific embodiments, but the present invention is not limited thereto, and those skilled in the art can make insubstantial improvements and modifications under the core teaching of the present invention, and still fall within the scope of the present invention.
The short video click rate prediction task is to build a model that predicts the probability that a user clicks on a short video. The user's historical interaction sequence of short videos is represented as $X = [x_1, \dots, x_l]$, where $x_j$ represents the $j$-th short video and $l$ is the length of the sequence. There are several kinds of user interaction behavior, including "click", "like", and "favorite", and different behaviors express different degrees of preference. A "click" indicates that the user is willing to watch the short video, but the emotion is not strong. "Like" and "favorite" both express a strong and definite emotion: "like" indicates that the user likes the short video and is willing to watch videos of the same type, and "favorite" indicates that the user not only likes the short video now but also wants to watch it again later. A short video that the user has "liked" or "favorited" has also been "clicked" by that user. Thus, the short video click rate prediction problem can be expressed as follows: given the user's multi-behavior interaction sequence $X$ and a target short video $x_{new}$, predict the click rate of the user on the target short video $x_{new}$.
Therefore, the invention provides a short video recommendation method based on a non-local network and a local network. The method predicts the click rate of the user on the target short video according to the user's multi-behavior interaction sequence on short videos. The behaviors considered here are the user's "click", "like", and "favorite" behaviors. In the method, "like" and "favorite" are generalized into a single type of behavior, namely "positive" behavior. The user's interaction sequence then contains two kinds of interaction behavior: "click" behavior and "positive" behavior. The original sequence recommendation methods are directed to a sequence of a single type of interaction behavior. Li et al. applied a sequence recommendation method to the "click" behavior sequence and the "positive" behavior sequence separately, and their experiments showed that the user interest characterization based on the "positive" behavior sequence brought no obvious improvement in recommendation quality. There are two reasons: the time intervals within a user's "positive" behavior sequence are long, so its sequential character is weak; and the "positive" behavior sequence is modeled separately, which ignores the effect of "positive" behavior on subsequent "click" behavior. The present method proposes a multi-behavior interaction sequence modeling method in which the "click" and "positive" behavior sequences are placed into one behavior sequence and processed together to generate the user interest vector representation. The "click" behavior is treated as sequential, while the "positive" behavior is treated as unordered because its events are widely spaced in time.
The method combines a non-local network and a local network: the non-local network adopts an attention mechanism and learns the influence of past "positive" behaviors on "click" behaviors, while the local network adopts a gated recurrent neural network (GRU) to learn the sequential character of the click behavior. The method is a recurrent neural network based on a non-local attention mechanism; by modifying the structure of the original recurrent neural network, it enables the network to learn simultaneously the influence of "positive" behavior on "click" behavior and the influence of "click" behavior on "click" behavior.
The method consists essentially of three parts, as shown in FIG. 2. The first part uses an attention mechanism to obtain the influence of "positive" behavior on each "click" behavior in the user's short video multi-behavior interaction sequence. The second part generates a user interest characterization using a recurrent neural network based on a non-local attention mechanism. The third part predicts the click rate of the user on the target short video according to the user interest characterization.
As shown in FIG. 1, according to one embodiment of the present invention, the method comprises the following steps:
S100, obtaining, by an attention mechanism, the influence of "positive" behavior on each "click" behavior in the user's short video multi-behavior interaction sequence. The user's interaction sequence can be represented as $X = [x_1, \dots, x_l]$, where $x_j \in \mathbb{R}^d$ is the feature vector of the cover picture of the $j$-th short video and $d$ is the feature vector length. The "positive" behavior sequence is denoted $X^*$, and $X^*$ is a subset of $X$. The "click" behavior sequence is $X = [x_1, \dots, x_l]$ itself. The influence of the "positive" behavior sequence on the "click" behavior is obtained with the attention mechanism of the non-local network. Typically, the last-clicked short video in the sequence is used to represent the user's current click interest, so the attention is computed with respect to the last-clicked short video. In this attention computation, the weight matrices are parameters that the model must learn, $x_t$ is the vector representation of the last short video in the click sequence, $x^*_i$ is the vector representation of the $i$-th short video of the "positive" sequence within the current "click" sequence, and $\sigma$ is the sigmoid function. The attention weight of $x^*_i$ expresses its degree of importance, and the attention output is the influence of the "positive" behavior in the "click" behavior sequence ending at $x_t$ on the current click interest.
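As an illustration of the attention step, the following minimal Python sketch computes an importance weight for each "positive" short video with respect to the last-clicked short video and returns their weighted sum. The additive form with a sigmoid, the parameter names `W1`, `W2`, and `w0`, the absence of weight normalization, and the toy sizes are all assumptions made for this sketch; they are not taken from the method's own formula.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def positive_influence(x_t, positives, W1, W2, w0):
    """Weighted sum of positive-item vectors, conditioned on the last click x_t.

    Assumed form: alpha_i = w0 . sigmoid(W1 x_t + W2 x*_i), c = sum_i alpha_i x*_i.
    """
    c = [0.0] * len(x_t)
    weights = []
    query = matvec(W1, x_t)  # projection of the last click, shared by all items
    for x_star in positives:
        key = matvec(W2, x_star)
        hidden = [sigmoid(q + k) for q, k in zip(query, key)]
        alpha = sum(w * h for w, h in zip(w0, hidden))  # importance of x_star
        weights.append(alpha)
        c = [ci + alpha * xi for ci, xi in zip(c, x_star)]
    return c, weights


def random_vector(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


# Toy usage with random parameters (hypothetical sizes).
random.seed(0)
d = 4
W1 = [random_vector(d) for _ in range(d)]
W2 = [random_vector(d) for _ in range(d)]
w0 = random_vector(d)
x_t = random_vector(d)
positives = [random_vector(d) for _ in range(3)]
c, alphas = positive_influence(x_t, positives, W1, W2, w0)
```

The returned vector `c` has the same length as the item vectors, so it can be passed to the recurrent unit alongside the current click.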
S200, generating a user interest characterization using a recurrent neural network based on a non-local attention mechanism. The original gated recurrent neural network (GRU) can only handle single-behavior sequences; its gates are computed as:
$z_t = \sigma(W_{xz} \cdot x_t + W_{hz} \cdot h_{t-1})$
$r_t = \sigma(W_{xr} \cdot x_t + W_{hr} \cdot h_{t-1})$
where $r_t$ is the reset gate and $z_t$ is the update gate; these two gating vectors determine which information is used as the output of the gated recurrent unit. $\tilde{h}_t$ is the current memory content and $x_t$ is the node input of the current layer. $W_{xz}, W_{hz}$ and $W_{xr}, W_{hr}$ are the parameters controlling the update gate $z_t$ and the reset gate $r_t$, respectively, and a further pair of weight matrices controls the candidate memory content $\tilde{h}_t$. $\odot$ is element-wise multiplication and $\sigma$ is the sigmoid function.
A gated recurrent neural network, however, is not suited to multi-behavior sequences. To handle a multi-behavior sequence, the method modifies the original gated recurrent neural network so that, when selecting information, the gated recurrent unit considers not only the current short video in the sequence and the state of the previous gated recurrent unit but also the influence of "positive" behavior. In the modified unit, $z_t$ is the update gate and $r_t$ is the reset gate; these two gating vectors determine which information is used as the output of the gated recurrent unit. $\tilde{h}_t$ is the current memory content, $x_t$ is the node input of the current layer, and the attention output of step S100 supplies the influence of "positive" behavior. One group of weight matrices controls the update gate $z_t$ and the reset gate $r_t$, and another group controls the candidate memory content $\tilde{h}_t$. $\odot$ is element-wise multiplication and $\sigma$ is the sigmoid function. The hidden state $h_t$ output by the last layer of the gated recurrent neural network is the user interest representation $v$.
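One plausible way to let a gated recurrent unit take the "positive" influence into account is to add a third input term to each gate and to the candidate memory. The sketch below does exactly that; the extra weight matrices `Wcz`, `Wcr`, `Wch`, the names `Wxh` and `Whh`, the `tanh` candidate, and the interpolation convention are assumptions for illustration and may differ from the method's actual formula.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def add(*vectors):
    """Element-wise sum of equally long vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def gru_step_with_positive(x, h_prev, c, p):
    """One step of a GRU whose gates also see the positive-influence vector c."""
    z = [sigmoid(a) for a in add(matvec(p["Wxz"], x),
                                 matvec(p["Whz"], h_prev),
                                 matvec(p["Wcz"], c))]   # update gate
    r = [sigmoid(a) for a in add(matvec(p["Wxr"], x),
                                 matvec(p["Whr"], h_prev),
                                 matvec(p["Wcr"], c))]   # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a) for a in add(matvec(p["Wxh"], x),
                                      matvec(p["Whh"], gated_h),
                                      matvec(p["Wch"], c))]  # candidate memory
    # Interpolate between the previous state and the candidate memory.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Toy usage: input size 3, hidden size 2, all weights zero.
d, k = 3, 2
params = {name: zeros(k, d) for name in ("Wxz", "Wxr", "Wxh", "Wcz", "Wcr", "Wch")}
params.update({name: zeros(k, k) for name in ("Whz", "Whr", "Whh")})
h = gru_step_with_positive([1.0, 2.0, 3.0], [1.0, -2.0], [0.5, 0.5, 0.5], params)
```

With all weights set to zero, both gates equal 0.5 and the candidate memory is zero, so the new state is half of the previous state; this gives a simple sanity check of the update rule. Running the step over the click sequence and keeping the final hidden state yields a vector playing the role of the user interest representation $v$.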
S300, predicting the click rate of the user on the target short video $x_{new}$ according to the user interest representation. Here $v$ is the user interest representation and $x_{new}$ is the target short video. $\hat{y}$ is the predicted value of the click rate of the user on the target short video. The prediction layer uses two transformation matrices and a bias vector, and $b_2$ is a bias scalar. $\sigma$ is the sigmoid activation function.
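The prediction step can be pictured as a small two-layer network with sigmoid activations. In the sketch below, the concatenation of $v$ and $x_{new}$, the hidden-layer size, and the names `W1`, `b1`, `w2` are assumptions (here the second transformation is a single-row matrix written as the vector `w2`); only the bias scalar `b2` and the sigmoid activation are named in the description.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def predict_click(v, x_new, W1, b1, w2, b2):
    """Assumed form: y_hat = sigmoid(w2 . sigmoid(W1 [v; x_new] + b1) + b2)."""
    joint = v + x_new  # list concatenation of interest vector and target video
    hidden = [sigmoid(a + b) for a, b in zip(matvec(W1, joint), b1)]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


# Toy usage: interest vector of size 2, target vector of size 3, hidden size 2.
v = [0.2, -0.1]
x_new = [1.0, 0.0, -1.0]
W1 = [[0.0] * 5 for _ in range(2)]
b1 = [0.0, 0.0]
w2 = [0.0, 0.0]
y_hat_neutral = predict_click(v, x_new, W1, b1, w2, 0.0)
y_hat_biased = predict_click(v, x_new, W1, b1, w2, 2.0)
```

With zero weights the output depends only on `b2`, which makes the role of the bias scalar easy to see: a zero bias gives a neutral prediction and a positive bias pushes the predicted click rate above one half.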
S400, designing a loss function according to the characteristics of the model. From the predicted click rate $\hat{y}$ of the user on the target short video, the error between the predicted value $\hat{y}$ and the true value $y$ is calculated, and this error is used to update the model parameters. We use a cross-entropy loss function to guide the update of the model parameters, where $y \in \{0, 1\}$ is the true value indicating whether the user clicked on the target short video and $\sigma$ is the sigmoid function. We update the model parameters using the Adam optimizer.
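For a single example, the standard binary cross-entropy between a predicted click rate and a 0/1 label can be sketched as follows. The clipping constant `eps` and the function name are assumptions added for numerical safety and illustration; whether the method applies the sigmoid inside or outside the loss is not specified here, so the sketch takes a probability as input.

```python
import math


def binary_cross_entropy(y_hat, y, eps=1e-12):
    """Loss = -(y * log(y_hat) + (1 - y) * log(1 - y_hat)) for one example."""
    p = min(max(y_hat, eps), 1.0 - eps)  # keep log() away from 0
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


# Toy usage: average loss over a few (prediction, label) pairs.
pairs = [(0.9, 1), (0.2, 0), (0.5, 1)]
mean_loss = sum(binary_cross_entropy(p, y) for p, y in pairs) / len(pairs)
```

The loss is small when the prediction agrees with the label and grows as the prediction moves toward the wrong class; in training, an optimizer such as Adam would minimize the mean of this loss over the training examples.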
The foregoing description of the embodiments is provided to facilitate understanding and application of the invention by those skilled in the art. It will be readily apparent to those skilled in the art that various modifications may be made to the above-described embodiments, and that the generic principles defined herein may be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of the disclosure of the present invention shall fall within the protection scope of the present invention.