CN110336700B - Microblog popularity prediction method based on time and user forwarding sequence - Google Patents

Info

Publication number
CN110336700B
Authority
CN
China
Prior art keywords
microblog
time
user
popularity
forwarding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910621977.8A
Other languages
Chinese (zh)
Other versions
CN110336700A (en)
Inventor
黄宏宇
刘海燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Application filed by Chongqing University
Priority to CN201910621977.8A
Publication of CN110336700A
Application granted
Publication of CN110336700B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/52 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a microblog popularity prediction model based on time and the user forwarding sequence, in the field of message popularity prediction in social networks. It comprises the following steps. S1: model the microblog forwarding sequence with a recurrent neural network to capture long-range dependencies in the message propagation process. S2: apply a nonlinear transformation to the hidden-layer output to learn the propagation rate at each time step. S3: predict the future popularity of the microblog from the early trend acceleration derived from that rate and the early popularity, both adjusted for user activity. The method predicts the future popularity trend of a message more accurately at an early stage of propagation; the model both uses historical propagation information and describes the microblog propagation process well.

Description

Microblog popularity prediction method based on time and user forwarding sequence
Technical Field
The invention belongs to the field of message popularity prediction in social networks and relates to a microblog popularity prediction model based on time and the user forwarding sequence.
Background
The popularity and low cost of Web 2.0 services have changed the way content is generated and consumed online. Internet technology has developed rapidly in recent years, and with the rise and spread of the internet, daily life has become inseparable from the network. Through the network, content producers can reach audiences that would be unimaginable through traditional channels, and services for video, photo, and music sharing, weblogs, social bookmarking, collaboration portals, and news aggregators that let users submit, browse, rate, and discuss content are available worldwide. Social networking services such as Facebook, Twitter, microblogs, and WeChat play an important role in spreading hot-spot events, and users rely on them for updates on personal and global news.
As social networks have grown, people increasingly publish their own opinions and comment on events online. Social networks such as microblogs make it very convenient to acquire and share information. However, the same networks also cause harm: false messages and defamation are spread online, and if they spread rapidly they distort people's judgment and can cause unpredictable losses. If the popularity trend of an event can be predicted early, government departments can manage public opinion more effectively and companies can prepare for emergencies in advance. Popularity prediction is also valuable for anticipating bursts of traffic around hot topics that can bring servers down. It matters for network dimensioning (e.g., caching and replication), online marketing (e.g., recommendation systems and media advertising), real-world outcome prediction (e.g., economic trends), and emergency management, but it is a very difficult problem because of the structure of social networks and their large number of users.
Currently, popularity prediction is generally approached in three ways. The first is feature-based machine learning, which builds a classification or regression model, so the key problem becomes feature extraction. The second is based on stochastic point processes, which model the message propagation process and learn the arrival process of messages, and so describe propagation better. The third is based on epidemic models, which express the rules of message propagation with dynamical equations. Classification or regression models depend on feature extraction and do not characterize the propagation process. Point-process methods fall short in performance, cannot adapt to every social network because social networks are diverse, and do not exploit supervision from historical messages. Based on this analysis, a microblog popularity prediction model based on time and the user forwarding sequence is proposed.
Disclosure of Invention
In view of the above, the present invention provides a microblog popularity prediction model based on time and the user forwarding sequence. It models the microblog forwarding sequence with a recurrent neural network to capture long-range dependencies in the message propagation process, then applies a nonlinear transformation to the hidden-layer output to learn the propagation rate at each time step, and finally predicts the future popularity of the microblog from the early trend acceleration derived from that rate and the early popularity, both adjusted for user activity.
To achieve this purpose, the invention provides the following technical solution:
A microblog popularity prediction model based on time and the user forwarding sequence comprises:
S1: modeling the microblog forwarding sequence with a recurrent neural network to capture long-range dependencies in the message propagation process;
S2: applying a nonlinear transformation to the hidden-layer output to learn the propagation rate at each time step;
S3: predicting the future popularity of the microblog from the early trend acceleration derived from the rate and the early popularity, both adjusted for user activity.
Further, step S1 includes the following steps:
S11: map time to a vector: for each time unit, take as its length the number of such units in the next-larger unit, and reserve that many positions for it in the vector. Vectorize user information: collect the historical microblog text of each user and aggregate it into one document representing that user; aggregate all user documents into a document set; randomly initialize the topic-word distribution of each topic and the document-topic distribution of each user document; generate the words in all documents according to the document-topic and topic-word distributions; train the model iteratively by Gibbs sampling of the LDA topic model; and finally take the resulting topic distribution of each user document as that user's interest vector;
S12: concatenate the time vector and the user vector into a single input and embed it according to a given rule;
S13: feed the result of step S12 through an embedding layer into the underlying RNN for training; an LSTM is used as the recurrent neural network to address the vanishing-gradient problem of standard recurrent neural networks, and the hidden-layer output of each time step is obtained through the forget gate, the input gate, and the output gate;
the forget gate is:
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$,
where $x_t$ is the input at step $t$, $h_t$ is the hidden-layer information of the current time step, $h_{t-1}$ is the hidden-layer information of the previous time step, "$\cdot$" denotes multiplication, square brackets denote the concatenation of two vectors, $\sigma(\cdot)$ is the sigmoid activation function, $W_f$ is a weight matrix, and $b_f$ is a bias vector.
The input gate and the network state update are:
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$,
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$, $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$,
where $W_C$ and $b_C$ are a weight matrix and a bias vector, respectively, and $\tanh$ is the hyperbolic tangent function;
the output gate is:
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, $h_t = o_t * \tanh(C_t)$,
where $W_o$ and $b_o$ are the weight matrix and the bias parameter of the output gate, respectively.
Further, in step S2, the hidden-layer output of the recurrent neural network is obtained and a nonlinear transformation is applied to it to obtain the propagation rate of the microblog at each forwarding time. The forwarding process of the message is modeled as a stochastic point process, with the rate computed as:
$v_t = \exp(W_m h_t + b_m)$,
where $W_m$ is a weight matrix and $b_m$ is a bias parameter. The history $H_t$ enters through the term $W_m h_t$, since $h_t$ is the hidden-layer information of the recurrent neural network and also represents the historical information in the sequence data.
Further, step S3 includes the following steps:
S31: use the obtained rate function to compute the propagation trend acceleration of the microblog within the observation time. Different types of microblogs follow very different propagation trends, and these differences lead to different future popularity, so a feature that indicates changes in the popularity trend is needed and is incorporated into the model to predict the future popularity of the microblog more accurately. The calculation formula is:
[Formula image BDA0002125794280000032 (trend acceleration) is not reproduced in this text.]
where $T_{obs}$ is the observation time, $n$ is the number of elements in the forwarding sequence, and $v_i$ is the rate function at each forwarding time;
S32: quantify user activity to obtain the user activity of each time period on the microblog platform. The quantization formula is:
[Formula image BDA0002125794280000033 (user activity) is not reproduced in this text.]
where $N(t)$ is the average number of microblogs posted by users from the start of a day to the current time $t$, and $\eta$ is the average number of microblogs posted by users per unit time on the microblog platform; the unit time may be hours, minutes, or seconds.
S33: divide the trend acceleration of step S31 and the popularity of the early message each by the user activity of step S32 to obtain the relative trend acceleration and the relative popularity:
[Formula image BDA0002125794280000041 (relative trend acceleration and relative popularity) is not reproduced in this text.]
The two are then combined in a linear regression model:
[Formula image BDA0002125794280000042 (linear regression model) is not reproduced in this text.]
where $\beta_0$, $\beta_1$, and $\beta_2$ are the model parameters.
The beneficial effects of the invention are as follows: the future popularity trend of a message is predicted more accurately at an early stage of propagation, and the model both uses historical propagation information and describes the microblog propagation process well.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a system diagram of a microblog popularity prediction model based on time and a user's forwarding sequence;
FIG. 2 shows the generation process of the user vector in a forwarding sequence;
FIG. 3 is a schematic diagram of how the LSTM operates on an input vector.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for illustration only; they are schematic rather than physical representations and are not intended to limit the invention. To better illustrate the embodiments, some parts of the drawings may be omitted, enlarged, or reduced and do not represent the size of an actual product. Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, terms indicating orientation or position, such as "upper", "lower", "left", "right", "front", and "rear", are based on the orientations shown in the drawings and are used only for convenience and simplicity of description. They do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore illustrative only and are not to be construed as limiting the invention; those skilled in the art can understand their specific meaning according to the situation.
Before the solution is described, seven concepts needed to understand the invention are introduced.
Concept 1: message popularity prediction. A message is a piece of information generated in a social network, such as a microblog on Sina Weibo. Popularity is the final outcome of a message's future propagation and can be measured by the number of times the microblog is forwarded. Message popularity prediction means predicting, early after publication, the specific number of times a message will be forwarded in the future.
Concept 2: recurrent neural network. A recurrent neural network is a neural network for processing sequence data, such as time-series data, which is collected at different points in time and reflects how an object or phenomenon changes over time. The invention uses an LSTM network, whose central idea is the use of three gates. The forget gate controls how much of the long-term cell state continues to be kept; the input gate controls how much of the network's current input enters the long-term cell state; and the output gate controls whether the long-term cell state is used as the current LSTM output.
Concept 3: topic model. A topic model is a method for modeling text and learning the latent topic distribution in it. It overcomes shortcomings of the document-similarity methods used in traditional information retrieval and can automatically discover semantic topics among words in massive internet data.
Concept 4: linear regression model. Linear regression learns a linear model that predicts the output for a given input as accurately as possible. The dependent variable is continuous, and the independent variables may be continuous or discrete. If the analysis involves one independent variable and one dependent variable whose relationship can be approximated by a straight line, it is called simple (unary) linear regression. If it involves two or more independent variables that are linearly related to the dependent variable, it is called multiple linear regression.
Concept 5: stochastic point process. The forwarding times in a microblog forwarding sequence are treated as non-negative random variables generated in time order; such a sequence is called a point process on the positive real line, defined as:
[Formula image BDA0002125794280000051 (point process definition) is not reproduced in this text.]
where $H_t$ denotes the historical propagation process up to forwarding time $t$. The formula expresses how the rate changes over time during microblog propagation; $H_t$ is included because the current forwarding action is considered to be influenced by the propagation history.
Concept 6: observation time. The time that elapses, starting from the publication of a message, while it propagates before prediction begins.
Concept 7: the time at which the popularity of a message stabilizes and no longer grows.
The invention provides a microblog popularity prediction model based on time and the user forwarding sequence. It takes source microblogs from Sina Weibo and their subsequent forwarded microblogs as the training set, and after training it predicts the future popularity of a microblog more accurately. The model is built on the forwarding sequence of the microblog and is divided into three parts, as shown in FIG. 1. The first part models the forwarding sequence with a recurrent neural network to capture long-range dependencies in the message propagation process. The second part applies a nonlinear transformation to the hidden-layer output to learn the propagation rate at each time step. The third part predicts the future popularity of the microblog from the early trend acceleration derived from that rate and the early popularity, both adjusted for user activity.
1. The first part comprises the following three steps:
Step 1: map time to a vector. For each time unit, take as its length the number of such units in the next-larger unit, and reserve that many positions for it in the vector. For example, the next-larger unit for minutes is the hour, and one hour has 60 minutes, so the minute segment of the vector has length 60. Given any time, take its minute value modulo the unit length to obtain a number m, then set the m-th position of the minute segment to 1 and all other positions to 0; the vector then represents the minute value. Vectorize user information: collect the historical microblog text of each user and aggregate it into one document representing that user; aggregate all user documents into a document set; randomly initialize the topic-word distribution of each topic and the document-topic distribution of each user document; generate the words in all documents according to the document-topic and topic-word distributions; train the model iteratively by Gibbs sampling of the LDA topic model; and finally take the resulting topic distribution of each user document as that user's interest vector. The process is shown in FIG. 2.
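The user-vectorization half of Step 1 can be illustrated with a small collapsed Gibbs sampler for LDA. This is only a minimal sketch under stated assumptions, not the patent's implementation: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all hypothetical, and each "document" is assumed to be one user's historical microblog text already split into words.

```python
import random

def lda_gibbs(docs, num_topics, iters=50, alpha=0.1, beta=0.01, seed=0):
    """Return one topic distribution (interest vector) per user document."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]       # topic counts per document
    topic_word = [dict() for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                     # total words per topic
    assign = []                                        # topic of every word token

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] = topic_word[z].get(w, 0) + 1
            topic_total[z] += 1
        assign.append(zs)

    # Gibbs sweeps: resample each token's topic given all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] = topic_word[z].get(w, 0) + 1
                topic_total[z] += 1

    # Smoothed document-topic distribution, used as the user's interest vector.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]

# Toy user documents (made up for illustration only).
user_docs = [
    ["football", "match", "goal", "match"],
    ["stock", "market", "fund"],
    ["goal", "football", "stock"],
]
interest_vectors = lda_gibbs(user_docs, num_topics=2)
```

Each entry of `interest_vectors` sums to 1 and has one component per topic, so it can be used directly as the fixed-length user vector that is combined with the time vector in the next step.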
Step 2: concatenate the time vector and the user vector into a single input and embed it according to a given rule.
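The time encoding of Step 1 and the concatenation of Step 2 can be sketched as follows. Only the minute segment (length 60) is given in the text; the hour and second segments, the zero-based position index, and the names `UNIT_LENGTHS`, `time_vector`, and `build_input` are assumptions made for illustration. The embedding rule itself is not specified in the text and is not shown.

```python
from datetime import datetime

# (unit, length) pairs: how many of each unit fit in the next-larger unit.
# Only the minute example (60) comes from the text; the others are assumed.
UNIT_LENGTHS = [("hour", 24), ("minute", 60), ("second", 60)]

def one_hot(value, length):
    """One-hot segment with a 1 at position (value mod length), zero-based."""
    segment = [0] * length
    segment[value % length] = 1
    return segment

def time_vector(ts):
    """Concatenate one one-hot segment per time unit."""
    parts = {"hour": ts.hour, "minute": ts.minute, "second": ts.second}
    vec = []
    for name, length in UNIT_LENGTHS:
        vec.extend(one_hot(parts[name], length))
    return vec

def build_input(ts, interest_vector):
    """Splice the time vector and the user interest vector into one input."""
    return time_vector(ts) + list(interest_vector)

# Hypothetical forwarding event: timestamp plus a 3-topic interest vector.
forward_time = datetime(2019, 7, 10, 8, 30, 15)
x = build_input(forward_time, [0.7, 0.2, 0.1])
```

With these assumed segments the time part has 24 + 60 + 60 = 144 positions, exactly three of which are set, followed by the user's topic distribution.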
Step 3: feed the result of Step 2 through an embedding layer into the underlying RNN for training. A standard recurrent neural network suffers from the vanishing-gradient problem, so a gate-based LSTM is adopted. In an LSTM, the hidden-layer output depends not only on the current input but also on the output of the previous step, and it is obtained through the forget gate, the input gate, and the output gate, as shown in FIG. 3. The forget gate is $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$. The input gate and the network state update are $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$,
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$, $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$.
The output gate is $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, $h_t = o_t * \tanh(C_t)$.
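A single forward step through the three gates can be sketched in plain Python. The forget, input, and output gate expressions follow the formulas in the text; the candidate-state and cell-state updates are assumed to follow the standard LSTM formulation. Parameter shapes, the random initialization in `init_params`, and the toy input sequence are hypothetical, and no training is shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; returns the new hidden state and cell state."""
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(p["W_f"], concat, p["b_f"])]    # forget gate
    i_t = [sigmoid(z) for z in affine(p["W_i"], concat, p["b_i"])]    # input gate
    g_t = [math.tanh(z) for z in affine(p["W_C"], concat, p["b_C"])]  # candidate state
    o_t = [sigmoid(z) for z in affine(p["W_o"], concat, p["b_o"])]    # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (illustrative initialization)."""
    rng = random.Random(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params["W_" + gate] = [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden_size + input_size)]
            for _ in range(hidden_size)
        ]
        params["b_" + gate] = [0.0] * hidden_size
    return params

# Run a toy two-step forwarding sequence through the cell.
params = init_params(input_size=3, hidden_size=4)
h, c = [0.0] * 4, [0.0] * 4
hidden_states = []
for x_t in [[1.0, 0.0, 0.5], [0.0, 1.0, 0.2]]:
    h, c = lstm_step(x_t, h, c, params)
    hidden_states.append(h)
```

`hidden_states` holds one hidden-layer output per forwarding event, which is the quantity the second part of the model transforms into a propagation rate.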
2. The second part comprises one step:
Step 1: obtain the hidden-layer output of the recurrent neural network and apply a nonlinear transformation to it to obtain the propagation rate of the microblog at each forwarding time. The forwarding process of the message is modeled as a stochastic point process, with the rate computed as:
$v_t = \exp(W_m h_t + b_m)$,
where $W_m$ is a weight matrix and $b_m$ is a bias parameter. The history $H_t$ enters through the term $W_m h_t$, since $h_t$ is the hidden-layer information of the recurrent neural network and also represents the historical information in the sequence data.
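The rate formula above is small enough to show directly. In this sketch the rate is assumed to be a scalar, so $W_m$ is treated as a single row of weights; the function name `propagation_rate` and the example values are hypothetical.

```python
import math

def propagation_rate(h_t, w_m, b_m):
    """v_t = exp(W_m h_t + b_m), with W_m assumed to be one row of weights."""
    return math.exp(sum(w * h for w, h in zip(w_m, h_t)) + b_m)

# Hypothetical hidden states for three forwarding events.
hidden_states = [[0.0, 0.0], [0.2, -0.1], [0.5, 0.3]]
w_m, b_m = [1.0, 2.0], 0.0
rates = [propagation_rate(h, w_m, b_m) for h in hidden_states]
```

The exponential keeps every rate strictly positive regardless of the sign of $W_m h_t + b_m$, which is what a point-process rate requires.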
3. The third part comprises the following three steps:
Step 1: use the obtained rate function to compute the propagation trend acceleration of the microblog within the observation time. Different types of microblogs follow very different propagation trends, and these differences lead to different future popularity, so a feature that indicates changes in the popularity trend is needed and is incorporated into the model to predict the future popularity of the microblog accurately. The calculation formula is:
[Formula image BDA0002125794280000071 (trend acceleration) is not reproduced in this text.]
where $T_{obs}$ is the observation time, $n$ is the number of elements in the forwarding sequence, and $v_i$ is the rate function at each forwarding time.
Step 2: quantify user activity to obtain the user activity of each time period on the microblog platform. The quantization formula is:
[Formula image BDA0002125794280000072 (user activity) is not reproduced in this text.]
where $N(t)$ is the average number of microblogs posted by users from the start of a day to the current time $t$, and $\eta$ is the average number of microblogs posted by users per unit time on the microblog platform; the unit time may be hours, minutes, or seconds.
Step 3: divide the trend acceleration of Step 1 and the popularity of the early message each by the user activity of Step 2 to obtain the relative trend acceleration and the relative popularity:
[Formula image BDA0002125794280000073 (relative trend acceleration and relative popularity) is not reproduced in this text.]
The two are then combined in a linear regression model:
[Formula image BDA0002125794280000074 (linear regression model) is not reproduced in this text.]
where $\beta_0$, $\beta_1$, and $\beta_2$ are the model parameters.
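The normalization and regression of the third part can be sketched as follows. Because the formulas for trend acceleration, user activity, and the regression are in images that are not reproduced here, this sketch takes acceleration, early popularity, and activity as given numbers and assumes the plain linear form "prediction = β0 + β1 · relative acceleration + β2 · relative popularity"; the original may apply a transformation not shown. All sample values are synthetic, and `fit_linear` and `solve` are hypothetical helper names.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_linear(features, targets):
    """Ordinary least squares via the normal equations; returns [b0, b1, b2]."""
    X = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * y for row, y in zip(X, targets)) for i in range(k)]
    return solve(XtX, Xty)

# Synthetic samples: (trend acceleration, early popularity, user activity, final popularity).
samples = [
    (2.0, 10.0, 2.0, 18.0),
    (3.0, 6.0, 1.0, 25.0),
    (1.0, 8.0, 4.0, 7.5),
    (4.0, 4.0, 2.0, 11.0),
]

# Relative features: each quantity divided by the user activity.
features = [(acc / act, pop / act) for acc, pop, act, _ in samples]
targets = [final for _, _, _, final in samples]

beta0, beta1, beta2 = fit_linear(features, targets)

def predict(acceleration, early_popularity, activity):
    """Predicted future popularity under the assumed linear form."""
    return beta0 + beta1 * acceleration / activity + beta2 * early_popularity / activity
```

Dividing by activity before fitting means two messages with the same raw early counts are treated differently if one was posted during a quiet period, which is the stated purpose of the normalization.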
Finally, the above embodiments are intended only to illustrate the technical solutions of the invention, not to limit it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that modifications or equivalent substitutions may be made to the technical solutions without departing from their spirit and scope, and all such modifications should be covered by the claims of the invention.

Claims (3)

1. A microblog popularity prediction method based on time and a user forwarding sequence, characterized in that the method comprises the following steps:
S1: modeling a microblog forwarding sequence with a recurrent neural network to capture long-range dependencies in the message propagation process;
S2: acquiring the hidden-layer output of the recurrent neural network and then applying a nonlinear transformation to obtain the propagation rate of the microblog at each forwarding time;
S3: predicting the future popularity of the microblog from the early trend acceleration obtained from the rate and the early popularity, adjusted for user activity, comprising the following steps:
S31: computing the propagation trend acceleration of the microblog within the observation time from the obtained rate function, wherein the calculation formula is:
[Formula image FDA0003112459230000011 (trend acceleration) is not reproduced in this text.]
where $T_{obs}$ is the observation time, $n$ is the number of elements in the forwarding sequence, and $v_i$ is the rate function at each forwarding time;
S32: quantifying user activity to obtain the user activity of each time period on the microblog platform, wherein the quantization formula is:
[Formula image FDA0003112459230000012 (user activity) is not reproduced in this text.]
where $N(t)$ is the average number of microblogs posted by users from the start of a day to the current time $t$, and $\eta$ is the average number of microblogs posted by users per unit time on the microblog platform;
S33: dividing the trend acceleration of step S31 and the popularity of the early message each by the user activity of step S32 to obtain the relative trend acceleration and the relative popularity:
[Formula image FDA0003112459230000013 (relative trend acceleration and relative popularity) is not reproduced in this text.]
then combining the two to establish a linear regression model, wherein the calculation formula is:
[Formula image FDA0003112459230000014 (linear regression model) is not reproduced in this text.]
where $\beta_0$, $\beta_1$, and $\beta_2$ are the model parameters.
2. The microblog popularity prediction method based on time and a user forwarding sequence according to claim 1, wherein: step S1 includes the following steps:
S11: mapping the time vector: converting each time component into a vector sized according to the next higher unit, and then setting that component's value in the vector; vectorizing the user information: collecting the historical microblog text of each user and aggregating it into a document representing that user, aggregating all user documents into a document set, randomly initializing the topic-word distribution of each topic and the document-topic distribution of each user document, generating the words in all documents according to the document-topic distribution and the topic-word distribution, iteratively training the model by Gibbs sampling under the LDA topic model, and finally obtaining the topic distribution of each user document, which is used as the interest vector of the user;
S12: concatenating the time vector and the user vector into a single input, and performing an embedding operation according to a predetermined rule;
S13: feeding the result of step S12 through an embedding layer into the bottom-layer recurrent neural network (RNN) for propagation training, wherein an LSTM is adopted as the recurrent neural network to address the vanishing-gradient problem of a standard recurrent neural network, and the hidden layer output of each time step is finally obtained through a forget gate, an input gate and an output gate;
the forget gate is:
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$,
wherein $x_t$ is the input at the t-th step, $h_t$ represents the hidden layer information at the current time step, $h_{t-1}$ represents the hidden layer information at the previous time step, "$\cdot$" denotes multiplication of vectors, the square brackets denote that the two vectors are concatenated, $\sigma(\cdot)$ is the sigmoid activation function, $W_f$ is a weight matrix, and $b_f$ is a bias vector;
the input gate and the network state update are:
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$,
Figure FDA0003112459230000021
wherein $W_C$ and $b_C$ represent a weight matrix and a bias vector, respectively, and tanh is the hyperbolic tangent function;
the output gate is:
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, $h_t = o_t * \tanh(C_t)$
wherein $W_o$ and $b_o$ are a weight matrix and a bias parameter, respectively.
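The gate equations of step S13 can be sketched as a single LSTM step in NumPy. The forget, input and output gates follow the formulas given in the claim. The candidate state and the cell-state update appear only as a formula image in this text, so the sketch assumes the standard LSTM form (candidate = tanh(W_C · [h_{t-1}, x_t] + b_C), C_t = f_t * C_{t-1} + i_t * candidate). The parameter dictionary, the dimensions and the random initialization are hypothetical, and no training is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state h_t and cell state C_t."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate
    # Assumed standard candidate state and cell update (formula image in the text).
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])
    c_t = f_t * c_prev + i_t * c_tilde
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)                               # hidden layer output
    return h_t, c_t

def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical random initialization of the four weight matrices and biases."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, size=(hidden_dim, hidden_dim + input_dim))
        params[f"b_{gate}"] = np.zeros(hidden_dim)
    return params

def run_lstm(inputs, params, hidden_dim):
    """Run the cell over a sequence and collect the hidden output of every step."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    outputs = []
    for x_t in inputs:
        h, c = lstm_step(x_t, h, c, params)
        outputs.append(h)
    return np.stack(outputs)

# Hypothetical forwarding sequence: 5 steps, each a 6-dimensional embedded input
# (standing in for the embedded time and user vectors of step S12).
params = init_params(input_dim=6, hidden_dim=4)
sequence = np.random.default_rng(1).normal(size=(5, 6))
hidden_states = run_lstm(sequence, params, hidden_dim=4)
```

Here `@` plays the role of the "·" product in the claim and `*` is the element-wise product used in $h_t = o_t * \tanh(C_t)$. In practice a framework implementation of the LSTM would replace this hand-written cell.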
3. The microblog popularity prediction method based on time and a user forwarding sequence according to claim 1, wherein: in step S2, the hidden layer output of the recurrent neural network is obtained and then passed through a nonlinear transformation to obtain the propagation rate of the microblog at each forwarding time, the forwarding process of the message being modeled as a random point process, and the calculation formula is as follows:
$v_t = \exp(W_m h_t + b_m)$
wherein $W_m$ is a weight matrix, $b_m$ is a bias parameter, the history $H_t$ is reflected in the term $W_m h_t$, and $h_t$ is the hidden layer information of the recurrent neural network, which also represents the historical information in the sequence data.
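The rate formula of claim 3 can be sketched in a few lines. The sketch assumes a scalar rate per time step, so that $W_m$ reduces to a weight vector applied to each hidden state; the function name and the numeric values are hypothetical.

```python
import numpy as np

def propagation_rate(hidden_states, W_m, b_m):
    """Compute v_t = exp(W_m h_t + b_m) for every time step.

    hidden_states has shape (T, hidden_dim); W_m has shape (hidden_dim,).
    """
    return np.exp(hidden_states @ W_m + b_m)

# Hypothetical hidden layer outputs for two forwarding instants.
hidden_states = np.array([[0.1, -0.2, 0.3],
                          [0.0, 0.5, -0.1]])
W_m = np.array([0.4, -0.3, 0.2])
b_m = 0.1
rates = propagation_rate(hidden_states, W_m, b_m)
```

The exponential keeps every rate strictly positive, which is what a point-process intensity requires regardless of the sign of $W_m h_t + b_m$.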
CN201910621977.8A 2019-07-10 2019-07-10 Microblog popularity prediction method based on time and user forwarding sequence Active CN110336700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910621977.8A CN110336700B (en) 2019-07-10 2019-07-10 Microblog popularity prediction method based on time and user forwarding sequence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910621977.8A CN110336700B (en) 2019-07-10 2019-07-10 Microblog popularity prediction method based on time and user forwarding sequence

Publications (2)

Publication Number Publication Date
CN110336700A CN110336700A (en) 2019-10-15
CN110336700B CN110336700B (en) 2021-09-14

Family

ID=68146339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910621977.8A Active CN110336700B (en) 2019-07-10 2019-07-10 Microblog popularity prediction method based on time and user forwarding sequence

Country Status (1)

Country Link
CN (1) CN110336700B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241392B (en) * 2020-01-07 2024-01-26 腾讯科技(深圳)有限公司 Method, apparatus, device and readable storage medium for determining popularity of article
CN112580879B (en) * 2020-12-23 2023-10-10 河南广播电视台 Information popularity prediction method based on graph neural network
CN112580878B (en) * 2020-12-23 2023-10-10 河南广播电视台 Information popularity prediction method based on graph neural network
CN113190733B (en) * 2021-04-27 2023-09-12 中国科学院计算技术研究所 Network event popularity prediction method and system based on multiple platforms
CN113536144B (en) * 2021-06-17 2022-04-19 中国人民解放军国防科技大学 Social network information propagation scale prediction method and device
CN114912941B (en) * 2022-04-11 2023-08-11 四川大学 Shoe fashion trend prediction system and method based on big data
CN114997464B (en) * 2022-04-26 2024-08-06 北京交通大学 Popularity prediction method based on graph time sequence information learning
CN115470994B (en) * 2022-09-15 2023-07-11 苏州大学 Information popularity prediction method and system based on explicit time and cascade attention
CN117134997B (en) * 2023-10-26 2024-03-01 中电科大数据研究院有限公司 Edge sensor energy consumption attack detection method, device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105975504A (en) * 2016-04-28 2016-09-28 中国科学院计算技术研究所 Recurrent neural network-based social network message burst detection method and system
CN109063827A (en) * 2018-10-25 2018-12-21 电子科技大学 Method, system, storage medium and terminal for automatically taking specific luggage in a confined space

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An RNN-based social message burst prediction model; Gou Chengcheng (笱程成) et al.; Journal of Software (《软件学报》); 20171130; Abstract, Section 1.2, Sections 3.1-3.2 *

Also Published As

Publication number Publication date
CN110336700A (en) 2019-10-15

Similar Documents

Publication Publication Date Title
CN110336700B (en) Microblog popularity prediction method based on time and user forwarding sequence
CN106682770B (en) Dynamic microblog forwarding behavior prediction system and method based on friend circle
CN110232109A (en) A kind of Internet public opinion analysis method and system
CN111885399B (en) Content distribution method, device, electronic equipment and storage medium
CN109325112A (en) A kind of across language sentiment analysis method and apparatus based on emoji
CN111339404A (en) Content popularity prediction method and device based on artificial intelligence and computer equipment
CN107341145A (en) A kind of user feeling analysis method based on deep learning
Velampalli et al. Performance evaluation of sentiment analysis on text and emoji data using end-to-end, transfer learning, distributed and explainable ai models
CN113590928A (en) Content recommendation method and device and computer-readable storage medium
CN111723295B (en) Content distribution method, device and storage medium
KR101575779B1 (en) Program rating prediction method and apparatus, and system based on sentiment analysis of viewers comments
Xiao et al. User behavior prediction of social hotspots based on multimessage interaction and neural network
CN107784387A (en) The continuous dynamic prediction method that a kind of microblogging event information is propagated
CN115470991A (en) Network rumor propagation prediction method based on user short-time emotion and evolutionary game
Guo et al. Who is answering whom? Finding “Reply-To” relations in group chats with deep bidirectional LSTM networks
CN116308854A (en) Information cascading popularity prediction method and system based on probability diffusion
Tong et al. Multimedia network public opinion supervision prediction algorithm based on big data
Zarzour et al. Sentiment analysis based on deep learning methods for explainable recommendations with reviews
CN114218457A (en) False news detection method based on forward social media user representation
KR20200106231A (en) Qualitative system for determining fake news, qualitative method for determining fake news, and computer-readable medium having a program recorded therein for executing the same
Drakopoulos et al. Discovering sentiment potential in Twitter conversations with Hilbert–Huang spectrum
WO2023087933A1 (en) Content recommendation method and apparatus, device, storage medium, and program product
Manasa et al. Detection of twitter spam using GLoVe vocabulary features, bidirectional LSTM and convolution neural network
CN114048395B (en) User forwarding prediction method and system based on time perception and key information extraction
CN115495671A (en) Cross-domain rumor propagation control method based on graph structure migration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant