CN111294619B - Long-short term interest modeling method for IPTV field - Google Patents
- Publication number: CN111294619B (application CN202010129277.XA)
- Authority
- CN
- China
- Prior art keywords: term, long, short, click sequence, attention
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/251—Learning process for intelligent management, e.g. learning user preferences for recommending movies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
- H04N21/4667—Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
- H04N21/4668—Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
Abstract
The invention provides a long-term and short-term interest modeling method for the IPTV field, which distinguishes a long-term click sequence L_{t-1} from a short-term click sequence S_t, applies embedding mapping to L_{t-1} and S_t to obtain the long-term and short-term click-sequence embedded representations, further calculates the family long-term preference U_L and the user short-term interest U_S, then obtains the user hybrid preference U_hybrid from U_L and U_S, and calculates the click-through rate R_u from U_hybrid. Through these operations the invention achieves more accurate prediction and push of interests and preferences.
Description
Technical Field
The invention belongs to the field of deep learning modeling, and particularly relates to a long-short term interest modeling method for the field of IPTV.
Background
In the era of information overload, recommendation systems have become very important for internet services and are also widely used in different fields, such as: e-commerce websites, video websites, and the like. The television, one of the most commonly used household appliances in daily life, has gradually developed towards the internet. Currently, a large amount of video content has been embedded in IPTV. Therefore, the IPTV field also needs to introduce a recommendation system to solve the problem of how to filter the content meeting the preference of the user. In the prior art, a method and system for modeling the interests of a television viewer is disclosed, for example, in patent application No. 201610485614.2.
However, compared with video websites on the internet, IPTV applications mainly present the following two problems:
1) There is abundant implicit feedback in IPTV applications, but explicit feedback is very scarce. This makes it impossible to tell whether the user dislikes or simply did not notice the items he or she never interacted with;
2) More particularly, the user of IPTV is typically an entire household, not an individual. The preferences of each person in the family may differ, which greatly increases the difficulty of the recommendation task.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a long-term and short-term interest modeling method for the IPTV field, which constructs the long-term preference of a family and the short-term interest of a current user according to the historical click records of the family group and the user by using a sequence modeling mode, thereby realizing more accurate recommendation.
The specific implementation content of the invention is as follows:
A long-short term interest modeling method for the IPTV field divides the historical click records into a long-term click sequence L_{t-1} and a short-term click sequence S_t, processes them, and obtains the family long-term preference U_L and the user short-term interest U_S through self-attention calculation.
In order to better implement the invention, the method further specifically comprises the following steps:
S1. Obtain the long-term click sequence L_{t-1} and the short-term click sequence S_t from the historical click records; input L_{t-1} and S_t into the embedding coding layer to be encoded, the code being represented in dense-matrix form; then output the encoded long-term and short-term click-sequence embedded representations to the long-short term modeling layer;
S2. In the long-short term modeling layer, perform self-attention calculation on the long-term and short-term click-sequence embedded representations and map them respectively into the family long-term preference U_L and the user short-term interest U_S; input U_L and U_S into the long-short term preference fusion layer;
S3. In the long-short term preference fusion layer, perform self-attention decoding and mapping on the family long-term preference U_L and the user short-term interest U_S to obtain the user hybrid preference U_hybrid;
S4. Perform linear mapping on the user hybrid preference U_hybrid to obtain the click-through-rate prediction R_u.
To better implement the invention, in step S2 the user short-term interest U_S is obtained through the following specific steps:
Step SA. When calculating the user short-term interest U_S in the self-attention mechanism, the three inputs Query, Key and Value are completely identical: all three are the short-term click-sequence embedded representation. The three inputs are mapped into different spaces through linear transformations, giving the Query, Key and Value representations in the corresponding spaces; the weight matrices W_q, W_k, W_v of the linear transformations have dimension K×C, so the mapped representations have dimension m×C, where m = |S_t| and C is a fixed value;
Each mapped representation is then split into n_h heads, giving a tensor of dimension m×n_h×d, where d = C/n_h; each tensor is then transposed from dimension m×n_h×d to n_h×m×d;
Step SB. Each split head is taken as an input of the scaled dot-product attention, which performs the following operations: the Query and Key are matrix-multiplied to obtain the dot-product attention of dimension m×m; the dot-product attention is then scaled down by the square root of the head dimension; softmax normalization is applied to the scaled attention; finally it is matrix-multiplied with the Value to output the scaled dot-product attention;
Step SC. The scaled dot-product attention outputs of the heads are concatenated and passed through a linear transformation to output the short-term multi-head attention of dimension m×C;
Step SD. The short-term multi-head attention is input into the point-wise feed-forward network to obtain the user short-term interest U_S.
To better implement the present invention, in step SD, residual connection and layer normalization are applied to the multi-head attention before it is output to the point-wise feed-forward network; the short-term multi-head attention is then input into the point-wise feed-forward network, and residual connection and layer normalization are applied to the network's output to obtain the user short-term interest U_S.
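The two residual-plus-layer-normalization sublayers around the point-wise feed-forward network can be sketched as follows (a minimal sketch; `attn_out` stands in for the short-term multi-head attention, and the feed-forward weights are random placeholders):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def point_wise_ffn(x, W1, b1, W2, b2):
    # two fully-connected layers with a ReLU activation in between
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def encoder_sublayers(x, attn_out, W1, b1, W2, b2):
    h = layer_norm(x + attn_out)        # residual + layer norm after the attention
    return layer_norm(h + point_wise_ffn(h, W1, b1, W2, b2))  # ... and after the FFN

rng = np.random.default_rng(1)
m, C, hidden = 5, 16, 32
x = rng.normal(size=(m, C))             # embedded input to the sublayer
attn_out = rng.normal(size=(m, C))      # stand-in for the multi-head attention output
W1, b1 = rng.normal(size=(C, hidden)), np.zeros(hidden)
W2, b2 = rng.normal(size=(hidden, C)), np.zeros(C)
U_S = encoder_sublayers(x, attn_out, W1, b1, W2, b2)   # user short-term interest
```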
To better implement the present invention, further, in step S2 the family long-term preference U_L is obtained through the following specific steps:
Step Sa. For the long-term click sequence L_{t-1}, the encoded embedded representation of each time step, of dimension |S_i|×K, is processed with the same operations used above to calculate the short-term multi-head attention, giving the long-term multi-head attention of dimension |S_i|×C used to calculate the family long-term preference U_L;
Step Sb. The long-term multi-head attention of each step in L_{t-1} is compressed by computing the mean of each column, giving a row vector of dimension 1×C; the row vectors of all time steps of L_{t-1} are then concatenated, finally giving the family long-term preference U_L of dimension (t-1)×C.
To better implement the present invention, further, step S3 specifically comprises: the family long-term preference U_L and the user short-term interest U_S are used to calculate multi-head attention through an ordinary attention mechanism; residual connection and layer normalization are applied to the processed output, which is then input into the point-wise feed-forward network for further processing; residual connection and layer normalization are applied again to the result, and finally, after a linear change and a Sigmoid function, the user hybrid preference U_hybrid is obtained.
To better implement the present invention, further, the specific operation of step S4 is: the user hybrid preference U_hybrid is first reduced in dimension by global average pooling; the click-through rate R_u is then obtained by linear mapping, i.e. matrix multiplication.
To better implement the invention, further, the long-term click sequence L_{t-1} and the short-term click sequence S_t are divided according to the time step t: time step t is the last time segment of the click sequence to be processed; all time steps before t are defined as the long-term click sequence L_{t-1} = S_1 ∪ S_2 ∪ ... ∪ S_{t-1}, and time step t itself is defined as the short-term click sequence S_t = {i_1, i_2, ..., i_m}.
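Dividing a click log into L_{t-1} and S_t by time step can be sketched as follows (the log format and the hour-based bucketing are assumptions for illustration):

```python
from collections import OrderedDict

def split_clicks(clicks, bucket_seconds=3600):
    """clicks: list of (timestamp, item_id) pairs, sorted by time.
    Buckets the clicks into time steps; the last bucket is the short-term
    sequence S_t, and all earlier buckets form the long-term sequence L_{t-1}."""
    buckets = OrderedDict()
    for ts, item in clicks:
        buckets.setdefault(ts // bucket_seconds, []).append(item)
    steps = list(buckets.values())
    return steps[:-1], steps[-1]   # (L_{t-1} as a list of steps, S_t)

log = [(10, "a"), (20, "b"), (3700, "c"), (7300, "d"), (7400, "e")]
long_seq, short_seq = split_clicks(log)   # -> [["a", "b"], ["c"]] and ["d", "e"]
```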
To better implement the invention, further, the long-term click sequence L_{t-1} is encoded as follows: each time step S_i in L_{t-1} is embedded through the embedding mapping, giving a long-term click-sequence embedded representation of dimension |S_i|×K, where |S_i| denotes the number of media assets in the processed time step and K denotes the size of the embedding layer;
The short-term click sequence S_t is encoded as follows: S_t is embedded through the embedding mapping, giving a short-term click-sequence embedded representation of dimension |S_t|×K, where |S_t| denotes the number of media assets in time step t and K denotes the size of the embedding layer.
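The embedding mapping to a dense |S|×K matrix can be sketched as a table lookup (the vocabulary size and K are assumed toy values):

```python
import numpy as np

def embed(item_ids, table):
    """Map a click sequence of item ids to its dense (|S|, K) embedded representation."""
    return table[np.asarray(item_ids)]

rng = np.random.default_rng(3)
n_items, K = 100, 8
embedding_table = rng.normal(size=(n_items, K))   # one K-dim row per media asset
E_St = embed([4, 17, 42], embedding_table)        # short-term step with 3 assets
```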
To better implement the invention, further, the point-wise feed-forward network consists of two fully-connected layers with a ReLU activation function between them.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1) The preferences of the family group and the short-term interests of the current user are considered simultaneously, making the interest recommendation result more accurate;
2) Through the click sequences, it can be better distinguished whether the user dislikes or simply did not notice the non-recommended videos;
3) Self-attention calculation is used, which refines the representation by matching a sequence against itself; unlike ordinary attention, the self-attention mechanism reduces reliance on external information and is better at capturing the internal correlations of a sequence or feature.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a model structure and operation diagram of the present invention;
- FIG. 3 is a flow chart of the calculation of the family long-term preference U_L and the user short-term interest U_S;
- FIG. 4 is a flow chart of the multi-head attention processing during the calculation of the family long-term preference U_L and the user short-term interest U_S;
- FIG. 5 is a flow chart of the scaled dot-product attention processing during the multi-head attention process;
- FIG. 6 is a flow chart of the calculation of the user hybrid preference U_hybrid in the decoding process.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and therefore should not be considered as a limitation to the scope of protection. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example 1:
a method for modeling long-short term interest in IPTV field, as shown in FIG. 1, comprises the following steps:
S1. Obtain the long-term click sequence L_{t-1} and the short-term click sequence S_t from the historical click records. L_{t-1} and S_t are divided according to the time step t: time step t is the last time segment of the click sequence to be processed; the time steps before t are defined as the long-term click sequence L_{t-1} = S_1 ∪ S_2 ∪ ... ∪ S_{t-1}, and time step t is defined as the short-term click sequence S_t = {i_1, i_2, ..., i_m}. The long-term click sequence L_{t-1} is encoded by embedding each time step S_i through the embedding mapping, giving a long-term click-sequence embedded representation of dimension |S_i|×K, where |S_i| denotes the number of media assets in the processed time step and K denotes the size of the embedding layer; the short-term click sequence S_t is encoded likewise, giving a short-term click-sequence embedded representation of dimension |S_t|×K, where |S_t| denotes the number of media assets in time step t. The encoded long-term and short-term click-sequence embedded representations are then output to the long-short term modeling layer;
S2. In the long-short term modeling layer, perform self-attention calculation on the long-term and short-term click-sequence embedded representations and map them respectively into the family long-term preference U_L and the user short-term interest U_S; input U_L and U_S into the long-short term preference fusion layer;
S3. In the long-short term preference fusion layer, perform self-attention decoding and mapping on the family long-term preference U_L and the user short-term interest U_S to obtain the user hybrid preference U_hybrid;
S4. Perform linear mapping on the user hybrid preference U_hybrid to obtain the click-through-rate prediction R_u.
The working principle is as follows: by dividing at time step t, the preferences of the family users before time step t can be separated from the interest of the user currently watching television, which corresponds to time step t itself; processing the two after this division makes the final result more accurate. The specific division of the time step depends on the scenario. Assuming the time step is divided in hours, the clicks of roughly the last hour are regarded as the short-term sequence and earlier behavior as the long-term sequence. The long-term and short-term sequences are then encoded to obtain the family long-term preference U_L and the user short-term interest U_S; U_L is then used to decode U_S, giving the user hybrid preference U_hybrid. Finally, linear mapping is applied to U_hybrid to obtain the click-through-rate prediction R_u. In actual operation, the recommendation system has two phases, a recall phase and a ranking phase, and the model is used in the ranking phase. First, in the recall phase, the recommendation system generates a longer candidate list; each candidate in the list is then scored by the ranking model: assuming the user clicks on the candidate item, the click-through rate R_u is calculated according to the procedure above. Finally the candidates are sorted by R_u to obtain the final recommendation list.
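The ranking phase described above — scoring each recalled candidate with a predicted click-through rate and sorting — can be sketched as follows (the scoring function is a stand-in for the full model, and the item names and scores are invented toy data):

```python
def rank_candidates(candidates, score_fn, top_k=10):
    """candidates: iterable of item ids produced by the recall phase.
    score_fn(item) plays the role of the model's click-rate prediction R_u."""
    scored = [(score_fn(item), item) for item in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)   # sort by predicted R_u
    return [item for _, item in scored[:top_k]]

# toy stand-in scores instead of the real ranking model
scores = {"news": 0.2, "movie": 0.9, "sports": 0.5}
ranked = rank_candidates(scores.keys(), scores.get, top_k=2)
```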
Example 2:
This embodiment is based on embodiment 1 above. As shown in FIG. 2, the long-term click sequence L_{t-1} and the short-term click sequence S_t are input into the embedding coding layer, which encodes them through embedding mapping into the long-term and short-term click-sequence embedded representations. These embedded representations are sent into the long-short term modeling layer, where a Transformer encoder performs self-attention calculation to produce the family long-term preference U_L and the user short-term interest U_S. Note that the short-term embedded representation, after processing by the Transformer encoder, directly yields the user short-term interest U_S, whereas the long-term embedded representation, after processing by the Transformer encoder, additionally passes through global average pooling to yield the family long-term preference U_L. Because L_{t-1} and S_t are divided by time step t, the short-term click sequence S_t is a single time step while the long-term click sequence L_{t-1} comprises the multiple time steps before it; therefore the family long-term preference U_L is obtained after one global average pooling.
The generated family long-term preference U_L and user short-term interest U_S are sent into the long-short term preference fusion layer, where a Transformer decoder performs attention calculation to obtain the user hybrid preference U_hybrid; finally U_hybrid is reduced in dimension by global average pooling, and the click-through rate R_u is obtained by linear mapping, i.e. matrix multiplication.
Other parts of this embodiment are the same as those of embodiment 1, and thus are not described again.
Example 3:
In this embodiment, based on any one of the above embodiments 1-2 and as shown in FIGS. 3, 4 and 5, the ordinary attention mechanism has three inputs, namely Query (request), Key (primary key) and Value (value), and the calculation method is:
Attention(Q, K, V) = softmax(Q·Kᵀ/√d)·V
In the self-attention mechanism, the three inputs for calculating the user short-term interest U_S are identical: all are the short-term click-sequence embedded representation. The three inputs are mapped into different spaces by the linear transformations W_q, W_k, W_v, each of dimension K×C, so the mapped Query, Key and Value representations each have dimension m×C;
Each mapped representation is then split into n_h heads, giving a tensor of dimension m×n_h×d, where d = C/n_h; each tensor is then transposed from m×n_h×d to n_h×m×d.
As shown in FIG. 3, the process of calculating the user short-term interest U_S with self-attention in the long-short term modeling layer is as follows: the input short-term click-sequence embedded representation is first processed into the short-term multi-head attention, which is then input into the point-wise feed-forward network; a residual connection with layer normalization is placed both after the short-term multi-head attention and after the output of the point-wise feed-forward network, and after the point-wise feed-forward network, residual connection and layer normalization, the user short-term interest U_S is obtained.
Regarding the short-term multi-head attention of FIG. 3, its calculation method, as shown in FIG. 4, is to apply a linear change to the input multi-head data, process it into scaled dot-product attention, concatenate the scaled dot-product attention of the heads, and finally apply a linear transformation to output the short-term multi-head attention.
As for the scaled dot-product attention processing of FIG. 4, as shown in FIG. 5, the tensor split into multiple heads is taken as the input of the scaled dot-product attention, which performs the following operations: the Query and Key are matrix-multiplied to obtain a dot-product attention of dimension m×m; the dot-product attention is then scaled down by the square root of the head dimension; softmax normalization is applied to the scaled attention; finally it is matrix-multiplied with the Value to output the scaled dot-product attention.
Likewise, the calculation of the family long-term preference U_L with self-attention in the long-short term modeling layer is approximately the same as the above process for calculating the user short-term interest U_S: the embedded representation of each time step of the long-term click sequence is processed with the same operations applied to the short-term click-sequence embedded representation, giving the long-term multi-head attention of each step. The difference is that a global average pooling step is required, specifically: the long-term multi-head attention obtained at each time step is compressed by computing the mean of each column, giving a row vector; the row vectors of all time steps of the long-term click sequence L_{t-1} are then concatenated, finally giving the family long-term preference U_L. The formula of the global average pooling is:
U_L = Concat(GAP(H_1), GAP(H_2), ..., GAP(H_{t-1}))
where GAP denotes Global Average Pooling, U_L is the family long-term preference, and H_i is the attention output of step i.
Other parts of this embodiment are the same as any of embodiments 1-2 described above, and thus are not described again.
Example 4:
This embodiment, based on any one of the above embodiments 1-3 and as shown in FIG. 6, differs from the Transformer encoder in that the Transformer decoder uses an ordinary attention mechanism rather than a self-attention mechanism. Through calculation steps similar to those of the Transformer encoder, including the multi-head attention mechanism, residual connection with regularization, and the point-wise feed-forward network, the user short-term interest U_S and the family long-term preference U_L are input to obtain the attention output of the Transformer decoder. The specific calculation formula of the user hybrid preference U_hybrid is:
U_hybrid = Attention(U_L, U_S, U_S)
After the user hybrid preference U_hybrid is obtained, it is reduced in dimension by global average pooling, and the click-through rate R_u is then obtained by linear mapping, i.e. matrix multiplication. The specific calculation formula is:
R_u = Linear(GAP(U_hybrid)).
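The two formulas U_hybrid = Attention(U_L, U_S, U_S) and R_u = Linear(GAP(U_hybrid)) can be sketched as follows (single-head attention for brevity; the output weight is a random placeholder, and the final sigmoid is added here only so the toy score lies in (0, 1)):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # ordinary (cross) attention: Query from one source, Key/Value from another
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

rng = np.random.default_rng(4)
t_minus_1, m, C = 3, 5, 16
U_L = rng.normal(size=(t_minus_1, C))    # family long-term preference, (t-1, C)
U_S = rng.normal(size=(m, C))            # user short-term interest, (m, C)
U_hybrid = attention(U_L, U_S, U_S)      # Query = U_L, Key = Value = U_S
w_out = rng.normal(size=C)
# global average pooling over rows, then a linear map squashed by a sigmoid
R_u = float(1 / (1 + np.exp(-(U_hybrid.mean(axis=0) @ w_out))))
```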
other parts of this embodiment are the same as any of embodiments 1 to 3, and thus are not described again.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications and equivalent variations of the above embodiments according to the technical spirit of the present invention are included in the scope of the present invention.
Claims (4)
1. A long-term and short-term interest modeling method for the IPTV field is characterized in that a long-term click sequence L is extracted from a historical click recordt-1And short-term click sequence StProcessing the data and obtaining the long-term preference U of the family through self-attention calculationLUser short-term interest US(ii) a The method specifically comprises the following steps:
s1, obtaining a long-term click sequence L according to historical click recordst-1Short term click sequence St(ii) a Will long term click sequence Lt-1Short term click sequence StInputting the code into an embedded code layer to be coded, wherein the code is represented in a dense matrix form; the long-term click sequence obtained after coding is then embedded into the representationShort term click sequence embedded representationOutputting to a long-term and short-term modeling layer;
s2, embedding and representing long-term click sequences in long-term and short-term modeling layersShort term click sequence embedded representationSelf-attention calculation is carried out, and family long-term preference U is obtained through mapping respectivelyLUser short-term interest USAnd make the family prefer U for a long timeLUser short-term interest USInputting the long-short term preference fusion layer;
S3. In the long- and short-term preference fusion layer, perform self-attention decoding and mapping on the household long-term preference U_L and the user short-term interest U_S to obtain the user mixed preference U_hybrid;
S4. Perform a linear mapping on the user mixed preference U_hybrid to obtain the click rate R_u used for prediction;
In step S2, the specific steps for obtaining the user short-term interest U_S are as follows:
Step SA. When calculating the user short-term interest U_S in the self-attention mechanism, the three inputs Query, Key and Value are completely identical, all being the short-term click sequence embedded representation. The three input copies of the short-term click sequence embedded representation are mapped into different spaces through linear transformations with weight matrices W_q, W_k, W_v, yielding the embedded representations in the corresponding spaces; the projected representations all have feature dimension C, where C is a fixed value;
The projected representations are then split into n_h heads, yielding tensors with per-head dimension d = C/n_h; each tensor is then transposed so that the head axis comes first, giving one d-dimensional slice per head;
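The head split and transpose can be sketched with a reshape in NumPy (the sequence length n, feature size C, and head count n_h below are hypothetical):

```python
import numpy as np

n, C, n_h = 6, 8, 2                   # hypothetical sequence length, feature size, head count
d = C // n_h                          # per-head dimension, d = C / n_h

Q = np.arange(n * C, dtype=float).reshape(n, C)  # stand-in for one projected representation

heads = Q.reshape(n, n_h, d)          # split the feature axis into n_h heads of size d
heads = heads.transpose(1, 0, 2)      # move the head axis first -> (n_h, n, d)
```

After the transpose, `heads[i]` is the (n, d) matrix that head i feeds into scaled dot-product attention.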
Step SB. Use each per-head matrix obtained from the split as an input to scaled dot-product attention, and perform the following operations: matrix-multiply the Query and Key projections to obtain the dot-product attention; scale the dot-product attention down by the square root of the dimension; then apply softmax normalization to the scaled attention; finally, matrix-multiply the normalized attention with the Value projection to output the scaled dot-product attention;
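Step SB is standard scaled dot-product attention; a minimal single-head sketch in NumPy, with hypothetical sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 6, 4                           # hypothetical sequence length and per-head dimension
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=axis, keepdims=True)

scores = Q @ K.T / np.sqrt(d)         # dot-product attention, scaled by sqrt(d)
weights = softmax(scores, axis=-1)    # softmax normalization, rows sum to 1
out = weights @ V                     # attend over Value -> (n, d)
```

The sqrt(d) scaling keeps the logits from growing with the head dimension, which would otherwise saturate the softmax.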
Step SC. Concatenate the scaled dot-product attention outputs of all heads, and then apply a linear transformation to output the short-term multi-head attention;
Step SD. Input the short-term multi-head attention into a point-wise feed-forward network to obtain the user short-term interest U_S;
In step SD, before the short-term multi-head attention is input into the point-wise feed-forward network, residual connection and layer normalization are first performed; residual connection and layer normalization are also performed on the output of the point-wise feed-forward network, finally yielding the user short-term interest U_S;
The specific steps for obtaining the household long-term preference U_L in step S2 are as follows:
Step Sa. For the encoded long-term click sequence embedded representation of each time step in the long-term click sequence L_{t-1}, process each such embedded representation in the same way as the short-term multi-head attention described above, obtaining the long-term multi-head attention used to calculate the household long-term preference U_L;
Step Sb. Compress the long-term multi-head attention of each time step in the long-term click sequence L_{t-1} by taking the average value of each column, obtaining one row vector per time step; then concatenate the row vectors of all time steps of the long-term click sequence L_{t-1} to finally obtain the household long-term preference U_L;
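The per-step pooling and concatenation of step Sb can be sketched in NumPy (the number of time steps T, items per step n, and feature size C are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
T, n, C = 4, 6, 8                     # hypothetical time steps, items per step, feature size

# stand-ins for the long-term multi-head attention output of each time step
per_step_attn = [rng.normal(size=(n, C)) for _ in range(T)]

# average of each column compresses one (n, C) matrix into a single (C,) row vector
rows = [a.mean(axis=0) for a in per_step_attn]

# concatenating the row vectors of all T time steps yields the long-term preference
U_L = np.stack(rows)                  # shape (T, C)
```

The pooling makes U_L's size depend only on the number of time steps, not on how many clicks each step contains.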
Step S3 specifically comprises the following operations: calculate multi-head attention on the household long-term preference U_L and the user short-term interest U_S through an ordinary attention mechanism, perform residual connection and layer normalization on the output, input the result into a point-wise feed-forward network for further processing, perform residual connection and layer normalization on the further-processed result, and finally obtain the user mixed preference U_hybrid after a linear transformation and a Sigmoid function;
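A compact sketch of the fusion in step S3, under the assumption (consistent with "ordinary attention mechanism") that U_L supplies the Query and U_S supplies Key and Value; all sizes and weights are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
T, m, C = 4, 6, 8                     # hypothetical long-term steps, short-term length, feature size
U_L = rng.normal(size=(T, C))         # stand-in household long-term preference
U_S = rng.normal(size=(m, C))         # stand-in user short-term interest

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(z, eps=1e-6):
    return (z - z.mean(axis=-1, keepdims=True)) / (z.std(axis=-1, keepdims=True) + eps)

# cross-attention: Query = U_L, Key = Value = U_S
attn = softmax(U_L @ U_S.T / np.sqrt(C)) @ U_S

h = layer_norm(U_L + attn)            # residual connection + layer normalization

W1 = rng.normal(size=(C, C))
W2 = rng.normal(size=(C, C))
W3 = rng.normal(size=(C, C))
h = layer_norm(h + np.maximum(h @ W1, 0.0) @ W2)  # point-wise FFN + residual + layer norm

U_hybrid = 1.0 / (1.0 + np.exp(-(h @ W3)))        # linear transformation + Sigmoid
```

The Sigmoid bounds every entry of U_hybrid to (0, 1), which suits the subsequent click-rate mapping.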
2. The long- and short-term interest modeling method for the IPTV field according to claim 1, characterized in that the long-term click sequence L_{t-1} and the short-term click sequence S_t are divided according to time step t: time step t is the last time segment in the acquired click sequence to be processed, all time steps before time step t in that sequence are defined as the long-term click sequence L_{t-1} = S_1 ∪ S_2 ∪ ... ∪ S_{t-1}, and time step t itself is defined as the short-term click sequence S_t = {i_1, i_2, ..., i_m}.
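The split in claim 2 is a simple partition of the click history by time step; a sketch with hypothetical asset names:

```python
# click sequence grouped by time step; the category names are made-up examples
steps = [
    {"news", "sports"},               # S_1
    {"sports", "movie"},              # S_2
    {"movie", "kids"},                # S_3 = S_t, the last time step
]

S_t = steps[-1]                       # short-term click sequence: the last time step
L_prev = set().union(*steps[:-1])     # long-term click sequence: L_{t-1} = S_1 ∪ ... ∪ S_{t-1}
```

Because L_{t-1} is a union, an asset clicked in several earlier time steps appears in it only once.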
3. The long- and short-term interest modeling method for the IPTV field according to claim 2, characterized in that encoding the long-term click sequence L_{t-1} specifically comprises: applying an embedding mapping to each time step S_i in the long-term click sequence L_{t-1} to obtain the long-term click sequence embedded representation, whose dimension is |S_i| × K, where |S_i| represents the number of media assets in the processed time step and K represents the size of the embedding layer;
encoding the short-term click sequence S_t specifically comprises: applying an embedding mapping to the short-term click sequence S_t to obtain the short-term click sequence embedded representation, whose dimension is |S_t| × K, where |S_t| represents the number of assets in time step t and K represents the size of the embedding layer.
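The embedding mapping of claim 3 is a table lookup; a sketch with a hypothetical catalogue size and embedding width K:

```python
import numpy as np

rng = np.random.default_rng(5)
vocab, K = 10, 4                      # hypothetical asset catalogue size and embedding size
E = rng.normal(size=(vocab, K))       # shared embedding table, one K-dim row per asset

S_t = [2, 7, 5]                       # asset ids clicked in time step t (made-up)
emb_S_t = E[S_t]                      # embedded representation, dimension (|S_t|, K)
```

Each time step thus maps to a dense matrix whose row count equals the number of assets clicked in that step.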
4. The long- and short-term interest modeling method for the IPTV field according to claim 1, characterized in that the point-wise feed-forward network comprises two fully-connected layers, with a ReLU activation function between the two fully-connected layers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010129277.XA CN111294619B (en) | 2020-02-28 | 2020-02-28 | Long-short term interest modeling method for IPTV field |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111294619A CN111294619A (en) | 2020-06-16 |
CN111294619B true CN111294619B (en) | 2021-09-10 |
Family
ID=71029259
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010129277.XA Expired - Fee Related CN111294619B (en) | 2020-02-28 | 2020-02-28 | Long-short term interest modeling method for IPTV field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111294619B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114422859B (en) * | 2020-10-28 | 2024-01-30 | 贵州省广播电视信息网络股份有限公司 | Deep learning-based ordering recommendation system and method for cable television operators |
CN113344662A (en) * | 2021-05-31 | 2021-09-03 | 联想(北京)有限公司 | Product recommendation method, device and equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110008409A (en) * | 2019-04-12 | 2019-07-12 | 苏州市职业大学 | Based on the sequence of recommendation method, device and equipment from attention mechanism |
CN110489639A (en) * | 2019-07-15 | 2019-11-22 | 北京奇艺世纪科技有限公司 | A kind of content recommendation method and device |
CN110599280A (en) * | 2018-06-12 | 2019-12-20 | 阿里巴巴集团控股有限公司 | Commodity information preference model training and predicting method and device and electronic equipment |
CN110610168A (en) * | 2019-09-20 | 2019-12-24 | 合肥工业大学 | Electroencephalogram emotion recognition method based on attention mechanism |
CN110807156A (en) * | 2019-10-23 | 2020-02-18 | 山东师范大学 | Interest recommendation method and system based on user sequence click behaviors |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107995535B (en) * | 2017-11-28 | 2019-11-26 | 百度在线网络技术(北京)有限公司 | A kind of method, apparatus, equipment and computer storage medium showing video |
CN110162553A (en) * | 2019-05-21 | 2019-08-23 | 南京邮电大学 | Users' Interests Mining method based on attention-RNN |
CN110446112A (en) * | 2019-07-01 | 2019-11-12 | 南京邮电大学 | IPTV user experience prediction technique based on two-way LSTM-Attention |
CN110796313B (en) * | 2019-11-01 | 2022-05-31 | 北京理工大学 | Session recommendation method based on weighted graph volume and item attraction model |
Non-Patent Citations (2)
Title |
---|
Adaptive User Modeling with Long and Short-Term Preferences for Personalized Recommendation; Zeping Yu; Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19); 2019-12-31; full text *
Design and Implementation of an IPTV User Experience Prediction System Based on Neural Networks; Mao Jiali; China Master's Theses Full-text Database, Information Science and Technology; 2020-02-15; full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6523498B1 (en) | Learning device, learning method and learning program | |
Pfaff et al. | Neural network based intra prediction for video coding | |
CN111294619B (en) | Long-short term interest modeling method for IPTV field | |
CN111310045A (en) | Network-embedded movie recommendation method based on meta-path | |
CN107562787B (en) | POI (point of interest) encoding method and device, POI recommendation method and electronic equipment | |
Chang et al. | On the design fundamentals of diffusion models: A survey | |
CN113094587B (en) | Implicit recommendation method based on knowledge graph path | |
CN113255908A (en) | Method, neural network model and device for service prediction based on event sequence | |
CN114386513A (en) | Interactive grading prediction method and system integrating comment and grading | |
CN107506479B (en) | A kind of object recommendation method and apparatus | |
CN113033090A (en) | Push model training method, data push device and storage medium | |
CN114357201A (en) | Audio-visual recommendation method and system based on information perception | |
Sharifi et al. | A new algorithm for solving data sparsity problem based-on Non negative matrix factorization in recommender systems | |
He et al. | A time-context-based collaborative filtering algorithm | |
CN113918764A (en) | Film recommendation system based on cross modal fusion | |
CN117171440A (en) | News recommendation method and system based on news event and news style joint modeling | |
CN116680456A (en) | User preference prediction method based on graph neural network session recommendation system | |
CN117251622A (en) | Method, device, computer equipment and storage medium for recommending objects | |
CN111046280A (en) | Cross-domain recommendation method for application FM | |
CN114782995A (en) | Human interaction behavior detection method based on self-attention mechanism | |
CN111222722B (en) | Method, neural network model and device for business prediction for business object | |
CN111460302B (en) | Data processing method, device, electronic equipment and computer readable storage medium | |
CN112132345A (en) | Method and device for predicting user information of electric power company, electronic equipment and storage medium | |
Kumari et al. | Movie Recommendation System for Cold-Start Problem Using User's Demographic Data | |
Ahirwadkar et al. | Deepautoencf: A Denoising Autoencoder For Recommender Systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20210910 |