CN113688600A

CN113688600A - Information propagation prediction method based on topic perception attention network

Info

Publication number: CN113688600A
Application number: CN202111049168.8A
Authority: CN
Inventors: 杨成; 石川; 王浩
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2021-09-08
Filing date: 2021-09-08
Publication date: 2021-11-23
Anticipated expiration: 2041-09-08
Also published as: CN113688600B

Abstract

The invention discloses an information propagation prediction method based on a theme perception attention network, which integrates a theme context and a propagation history context into a user representation for prediction. The topic context supports model propagation patterns for a particular topic, while the propagation history context can be further decomposed into user-dependent and location-dependent modeling. We can then use the encoded user context to construct a user representation under multiple topics. We then further integrate the user representation through a time-decay aggregation module to obtain a cascaded representation. Wherein all of these modules are driven by information dissemination features. Therefore, the information propagation prediction method based on the topic perception attention network can better fit the diffusion data of the real world and predict more accurately. In addition, predefined good theme distribution is required to be used in the traditional theme perception model, and the theme can be automatically learned by adopting the method.

Description

Information propagation prediction method based on topic perception attention network

Technical Field

The invention relates to the technical field of networks, in particular to an information propagation prediction method based on a theme perception attention network.

Background

Social networking platforms such as twitter and new wave microblog attract millions of users, and a large amount of information is spread among the users every day. The process of information dissemination, also known as cascading, models the dissemination pattern and the user behavior is widely used in many fields, such as popularity prediction, epidemiology and personalized recommendations. Next user prediction has been extensively studied in recent years as a popular micro-cascading prediction task. This problem is defined as the sequence of user infections, ordered over time for a given information item, predicting the next infected user (by convention, researchers will use "infection", "activation" or "influence" to describe that there is an interaction of a user with an information item).

Conventional micro-cascade prediction methods include independent cascade model (IC) based methods and embedding based methods. The independent cascade model allocates an independent diffusion probability between each user pair, and many cascade diffusion models are established on the basic assumption of the model, and are expanded by additionally considering more information, such as continuous time stamps and user attributes. There are also some studies that explored the impact of topic information on cascade modeling. TIC has for the first time studied the information dissemination prediction task from the perspective of topic perception, the main idea being by setting a specific topic probability for each user pair.

As research advances, researchers have proposed an embedding-based approach to cascade prediction, which embeds users into a continuous latent space by improving the expressive power of the model using representation learning techniques, and calculates the propagation probability between each pair of users by using a user embedding function, rather than directly estimating a real-valued parameter. However, neither the IC-based approach nor the embedding-based approach takes into account the modeling of the concatenation history sequence information. Recent work has shown that these models are not as efficient as deep learning models based on considering cascade sequences.

With the success of deep learning, Recurrent Neural Networks (RNNs) have shown a strong ability to model information dissemination. TopolSTM extends the standard LSTM model, building hidden states from Directed Acyclic Graphs (DAGs) extracted from social graphs. CYAN-RNN and DeepDiffuse combine a recurrent neural network with an attention mechanism to account for the propagation structure. RecCTIC proposes a Bayesian topological RNN model for capture tree dependence. The Diffusion-LSTM uses image information to assist in prediction, and builds a Tree-LSTM model to infer propagation paths. The FOREST expands the GRU model, and designs an additional structural context extraction strategy to utilize the underlying social graph information.

Recently, some attention networks have been proposed to better capture the propagation dependence in the cascade sequence. The HiDAN builds a hierarchical attention network, captures a non-sequence structure in the cascade by adopting an attention mechanism, excavates a real dependency relationship from the cascade, designs a time attenuation module by combining timestamp information of a user, and jointly models user dependency and time attenuation, thereby greatly improving the expression capability and interpretability of the model.

With the development of deep learning technology, some documents model information propagation cascade as infection sequences, and a circulating neural network is adopted to obtain a good effect. Although concatenation is usually represented as a sequence of users, ordered by infection timestamp, the real propagation process is usually not strictly ordered, relying on unobserved user connection graphs. Therefore, other studies have employed an attention mechanism to capture non-sequential long-term propagation dependencies.

However, existing neural network-based approaches assume that the propagation behavior and pattern of all information items are homogenous. This assumption may not hold in the real world. In a real information dissemination scenario, a user may have different behavior patterns for information items of different topics. Intuitively, the interests of the user are often diverse, and the propagation behavior of the user may also be diverse according to the topic of the information item. For example, a user may focus on different people and then forward different information under different topics, respectively, and thus have topic-specific dependencies. The existing method based on the neural network rarely utilizes information texts, does not consider the propagation mode and user behavior perceived by a modeling theme, cannot model the propagation mode and dependency relationship under a specific theme, and limits the expression capability of the model. Whereas traditional non-neural approaches have demonstrated the impact of the subject on the user.

Next user prediction has been extensively studied in recent years as a popular micro-cascading prediction task. Traditional modeling typically ignores the textual content of the propagating information item, resulting in learning mixed dependencies from different topics. In contrast, topic-aware modeling aims to explicitly decouple propagation dependencies under a particular topic, thereby enabling more accurate predictions. In fact, the traditional non-neural network method based on independent cascade model has proved the advantage of topic-aware modeling, which can model the behavior of information items from different topics separately. But these early methods were built on strong independent assumptions, and this strategy limited the generalization performance of the model and has been shown to be suboptimal by recent deep learning-based methods. To our knowledge, no previous research has proposed a topic perception model based on neural networks to mine propagation dependencies under different topics.

Disclosure of Invention

In view of the above, the present invention aims to provide an information propagation prediction method based on a topic-aware attention network, which starts from a formalized propagation prediction problem and introduces our embedding strategy to encode user/location/text information into a vector. We will then come up with a topic-aware attention tier aimed at capturing the historical propagation dependence and time-decay effects of different topics. Finally, our model will take a multi-topic cascade representation through a given topic-aware attention layer and then predict the next infected user.

In order to achieve the above purpose, the invention provides the following technical scheme:

the invention provides an information propagation prediction method based on a topic perception attention network, and S1, a topic context and a propagation history context are integrated into a user representation for prediction;

s2, the topic context supports the model building of the propagation mode aiming at the specific topic, and the propagation history context is further decomposed into user dependence model building and position dependence model building;

s3, constructing a user representation under multiple topics by using the user context obtained by encoding;

s4, further integrating the user representation through a time attenuation aggregation module, thereby obtaining a multi-topic cascade representation and then predicting the next infected user;

wherein each module is driven by information dissemination features.

Further, the specific method of step S1 is:

given a set of users U, a concatenation set V and a propagation information set M, the propagation sequence of the ith information item in M is defined as a concatenation

Wherein the tuple

Representing a user

In that

The moments are forwarded and the sequence is ordered by the time of infection, the propagation prediction task being defined as a given cascade c_iAnd previous infected user sequence

Predict next infected user as

Wherein n is 1,2, …, | c_i|-1。

Further, in step S2, the propagation pattern modeling is to encode semantic information of the propagation information text by using the pre-trained language model BERT.

Further, the propagation pattern modeling in step S2 is to embed the text encoded by BERT into the text by a full link layer

Conversion to propagating text embedding

y_i＝W_xx_i+b_x (1)

Wherein W_xAnd b_xRespectively a weight matrix and an offset vector.

Further, the user dependent modeling uses the embedded matrix in step S2

And encoding the users, wherein | U | represents the number of users, and K and d represent the number of topics and the embedding dimension respectively.

Further, for the cascade sequence

Each user in

User is embedded into

Wherein

Is the user embedding of the user under the kth topic.

Further, the position-dependent modeling in step S2 is to set a learnable position-embedded pos for each position_jWherein pos_jShared among all cascades.

Further, the encoding method in step S3 is:

topic context:

computing user embedding for each topic k

And passBroadcast text embedding y_iCosine similarity between them, and normalized by softmax function:

wherein K is 1,2, …, K, and

representing a user

A weight under the kth topic; user-embedded representation of the aggregated topic context as

Propagation history context:

in a cascade sequence

With previous users

Is calculated by the following formula:

wherein

Respectively for subject-specific linear mapping of target users and previous users;

user' s

And the user

BetweenFull attention score

And weight

Is used to describe the propagation history context and is calculated by the following formula:

wherein

Is a position dependent score from position m to position j;

full context-aware multi-topic user representation:

users in the k-th theme

Expressed as a weighted sum of previously infected users:

weighting of topic contexts

And location dependent score

Shared between different layers.

Further, the modeling method of the time decay aggregation module in step S4 is as follows:

converting the continuous-time decay into discrete time intervals:

wherein t is_lBy dividing the time range [0, T_max]Is divided into L sub-intervals { [0, t ]₁),..,[t_L-1,T_max) Where T is_maxIs the maximum timestamp in the data set, with a corresponding learnable weight for each time interval for each topic

Further, the obtaining method of the multi-topic cascade representation in step S4 is as follows:

the complete aggregation weight is calculated according to equation (8):

then the

Normalizing j to 1,2, …, n by a softmax function;

for each topic k, calculating

Being weighted

And a feedforward neural network with a ReLU activation function is adopted to endow the model with nonlinearity, and the output of the theme perception attention layer is represented in cascade as

Further, the method for predicting the next infected user in step S4 is:

given a cascading sequence

By measuring user embedding

And cascade embedding

To parameterize the next infected user

Probability, cascade and user of

The interaction probability of (a) is expressed as:

wherein Θ represents all parameters that need to be learned;

the training objective for predicting an infected user is defined by equation (10):

setting K topic prototype embedding

And encourages user embedding under k themes

Corresponding subject prototype m_kSimilarly, the goal is to maximize:

this term is taken as an additional training target and summed over all users:

the complete training objective function is

Where η is the equilibrium coefficient.

Compared with the prior art, the invention has the beneficial effects that:

the information propagation prediction method based on the topic perception attention network, provided by the invention, combines the advantages of topic-specific propagation modeling and deep learning technology, designs a novel and effective topic perception attention mechanism, and integrates a topic context and a propagation history context into a user representation for prediction. The topic context supports model propagation patterns for a particular topic, while the propagation history context can be further decomposed into user-dependent and location-dependent modeling. We can then use the encoded user context to construct a user representation under multiple topics. We then further integrate the user representation through a time-decay aggregation module to obtain a cascaded representation. Wherein all of these modules are driven by information dissemination features. Therefore, the information propagation prediction method based on the topic perception attention network can better fit the diffusion data of the real world and predict more accurately. In addition, predefined good theme distribution is required to be used in the traditional theme perception model, and the theme can be automatically learned by adopting the method.

Drawings

In order to more clearly illustrate the embodiments of the present application or technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings can be obtained by those skilled in the art according to the drawings.

Fig. 1 is an architecture diagram of a subject-aware attention network according to an embodiment of the present invention.

Detailed Description

wherein each module is driven by information dissemination features.

For a better understanding of the present solution, the method of the present invention is described in detail below with reference to the accompanying drawings.

In this section we will start with a formalized propagation prediction problem and introduce our embedding strategy to encode user/location/text information into a vector. We will then come up with a topic-aware attention tier aimed at capturing the historical propagation dependence and time-decay effects of different topics. Finally, our model will take a multi-topic cascade representation through a given topic-aware attention layer and then predict the next infected user. The complete structure of our proposed TAN is shown in figure 1.

1. Problem definition

Given a set of users U, a concatenation set V and a propagation information set M, the propagation sequence of the ith information item in M can be defined as a concatenation

Wherein the tuple

Representing a user

In that

The time of day is forwarded and the sequence is ordered by time of infection. Following the previous task setup, the propagation prediction task is defined as a given cascade c_iAnd previous infected user sequence

Predicting next infected user

Wherein n is 1,2, …, | c_i|-1。

2. Embedding layer

1) User embedding:

to capture user interests and dependencies on different topics, we use an embedded matrix

And encoding the users, wherein | U | represents the number of users, and K and d represent the number of topics and the embedding dimension respectively. For cascade sequences

Each user in

His user is embedded as

Wherein

Is the user embedding of the user under the kth topic.

2) Position embedding:

to exploit the cascade infection order information, we are for eachPosition setting a learnable position embedding pos_jWherein pos_jShared among all cascades.

3) Text embedding:

the semantic information of the transmission information text is coded by using a pre-training language model BERT. To measure topic similarity between user embedding and text embedding for a particular topic, we embed text encoded by BERT through a fully connected layer

Is converted into

y_i＝W_xx_i+b_x (1)

Wherein W_xAnd b_xRespectively a weight matrix and an offset vector.

3. Topic perception attention layer

In this section, we will further encode various context information into user representations, which are then aggregated in conjunction with time-decaying weights to generate a cascaded representation for each topic.

3.1 user representation enhancement

We incorporate the topic context and the propagation history context into the multi-topic user representation, respectively. The propagation history context can be further decomposed into user dependencies and location dependencies. Inspired by the multi-head attention mechanism, we treat a topic as a specific head (head) and perform the attention mechanism separately in each topic to extract user and location dependencies.

1) Topic context

Based on propagating text y_iWe propose to strengthen the user's embedding if there is a higher similarity to the text embedding under the kth topic

Is embedded. Specifically, we compute for each topic k

And y_iCosine similarity between them, and normalized by a softmax function:

wherein K is 1,2, …, K, and

representing a user

Weight under the k topic. User embedding of the aggregate topic context can then be represented as

We can find out that when the k-th topic corresponds to the user embedding

And y_iThe larger the cosine similarity of (a), the larger the assigned weight, and the more strengthened the user embedding under the theme.

2) Propagating historical context

Intuitively, a user is infected typically due to the spreading of text, while there are only a few users in the spreading sequence that were previously infected. Thus, the goal of propagating historical context is to extract and characterize and user

Infecting the associated user. In particular, we employ an attention mechanism to model user dependencies and give more attention weight to those users who may affect an infection. Formally, in a cascade sequence

With previous users

The dependent attention weight of (a) can be calculated by the following formula:

wherein

For subject-specific linear mapping of target users and previous users, respectively.

Intuitively, we should also focus on the source user and the recently infected users. Note that such dependencies are independent of the particular user, so we propose to model location dependencies under each topic. Instead of directly adding the predefined location embedding and the user embedding in the past, we compute the location-dependent score using a similar method to user-dependent modeling. In this way, our approach can better capture user-independent location dependencies for better prediction performance.

User' s

And the user

Complete attention score between

And weight

Can be used to describe the propagation history context and is calculated by the following formula:

wherein

Is a position dependent score from position m to position j.

3) Complete context-aware multi-topic user representation

To fully exploit the topic and propagation history context, we will refer to the users in the kth topic

Represented as a weighted sum of previously infected users.

Note that we can also add multiple tiers of the above operations to get a more accurate representation. In this case, the weight of the subject context

And location dependent score

Shared between different layers.

3.2 obtaining a cascading representation based on time-decaying aggregation

After extracting the multi-user representation under multiple topics, we need to aggregate them to obtain a cascading representation under multiple topics. We assume that the user's influence will decay with time and jointly consider the time decay and the propagation dependent weights in equation 4.

1) Time-decay impact modeling

Specifically, inspired by Deephawkes, we employed a non-parametric time-decay modeling strategy for each topic. Formally, a cascade of sequences giving historical infection

The continuous-time decay is first converted into discrete time intervals:

wherein t is_lBy dividing the time range [0, T_max]Is divided into L sub-intervals { [0, t ]₁),..,[t_L-1,T_max) Where T is_maxIs the maximum timestamp in the data set. For each topic, each time interval has a corresponding learnable weight

2) Computing cascading representations under multiple topics

The complete aggregate weight is the addition of an additional term to equation 4:

then the

J will be normalized by the softmax function to 1,2, …, n. Finally, for each topic k, we calculate

Being weighted

And a feed-forward neural network with a ReLU activation function is used to impart model non-linearity. The output of the topic perception attention layer is a cascade representation and can be represented as

3.3 training goals and model details

Given a cascading sequence

By measuring user embedding

And cascade embedding

To parameterize the next infected user

The probability of (c). Cascading with Users as follows

The interaction probability of (c) can be expressed as:

where Θ represents all parameters that need to be learned.

We then predict the training goals of the infected user as defined by the following equation:

in addition to this, we want each topic subspace to reflect different semantics, and the embedding of different users under the same topic should be as similar as possible. Therefore, we set K topic prototype embedding

And encourages user embedding under k themes

Corresponding subject prototype m_kSimilarly. Formalization, we aim at maximumAnd (3) conversion:

we therefore take this term as an additional training target and sum all users:

the complete training objective function is

Where η is the equilibrium coefficient. We optimized the parameters using a gradient descent method and Adam optimizer. To avoid the training process from being unstable, we also apply layer normalization and dropout techniques to user embedding.

In the present invention we propose a model TAN to model the subject-specific propagation dependence. Specifically, we jointly model the textual content information of the propagated item and the user propagation history sequence, and then propose a topic-aware attention mechanism for capturing historical propagation dependencies and time-decay effects under different topics. TAN can automatically learn topics and benefit from deep learning, as compared to traditional topic perception models. Compared with the current neural network-based model, TAN can not only effectively model subject-specific propagation patterns, but also better capture user dependence and location dependence. Meanwhile, the interpretability of the model is improved by extracting the theme information, so that the model makes a prediction and gives a prediction reason, namely, the information item of a specific theme is more likely to be forwarded by a user, and the forwarding operation can be carried out according to the attention weight, wherein the user is influenced by the user.

The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: it is to be understood that modifications may be made to the technical solutions described in the foregoing embodiments, or equivalents may be substituted for some of the technical features thereof, but such modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. An information propagation prediction method based on a topic awareness attention network is characterized by comprising the following steps:

s1, integrating the topic context and the propagation history context into the user representation for prediction;

wherein each module is driven by information dissemination features.

2. The information propagation prediction method based on the topic awareness network according to claim 1, wherein the specific method in step S1 is:

Wherein the tuple

Representing a user

In that

Time quiltForwarding, and sequencing by time of infection, the propagation prediction task being defined as a given cascade c_iAnd previous infected user sequence

Predict next infected user as

Wherein n is 1,2_i|-1。

3. The information propagation prediction method based on the topic awareness attention network as claimed in claim 2, wherein the propagation mode modeling in step S2 is to encode the semantic information of the propagation information text by using the pre-trained language model BERT, specifically, to embed the text encoded by BERT into the propagation information text through a full connection layer

Conversion to propagating text embedding

y_i＝W_xx_i+b_x (1)

Wherein W_xAnd b_xRespectively a weight matrix and an offset vector.

4. The information propagation prediction method based on topic awareness attention network according to claim 3 wherein the user dependent modeling uses embedded matrix in step S2

5. The topic aware attention network based information dissemination module of claim 4The method is characterized in that for the cascade sequence

Each user in

User is embedded into

Wherein

Is the user embedding of the user under the kth topic.

6. The information propagation prediction method based on topic awareness network according to claim 5, wherein the position-dependent modeling in step S2 is to set a learnable position-embedded pos for each position_jWherein pos_jShared among all cascades.

7. The information propagation prediction method based on the topic awareness network according to claim 6, wherein the encoding method in step S3 is:

topic context:

computing user embedding for each topic k

And propagating text embedding y_iCosine similarity between them, and normalized by softmax function:

wherein K is 1,2,.., K, and

representing a user

Propagation history context:

in a cascade sequence

With previous users

Is calculated by the following formula:

wherein

For topic-specific linear mapping of target users and previous users, respectively:

user' s

And the user

Complete attention score between

And weight

Used for describing propagation historyAnd calculated by the following formula:

wherein

Is a position dependent score from position m to position j;

full context-aware multi-topic user representation:

users in the k-th theme

Expressed as a weighted sum of previously infected users:

weighting of topic contexts

And location dependent score

Shared between different layers.

8. The information propagation prediction method based on the topic awareness attention network according to claim 1, wherein the modeling method of the time attenuation aggregation module in step S4 is:

converting the continuous-time decay into discrete time intervals:

wherein t is₁By dividing the time range [0, T_max]Is divided into L sub-intervals { [0, t ]₁)，..，[t_L-1，T_max) Where T is_maxIs the maximum timestamp in the data set, with a corresponding learnable weight for each time interval for each topic

9. The information propagation prediction method based on the topic awareness attention network according to claim 8, wherein the obtaining method of the multi-topic cascade representation in step S4 is:

the complete aggregation weight is calculated according to equation (8):

then the

Normalizing j 1, 2.. and n by a softmax function;

for each topic k, calculating

Being weighted

10. The information dissemination prediction method based on the topic awareness network as claimed in claim 1, wherein the method for predicting the next infected user in step S4 is:

given a cascading sequence