CN114357160A - Early rumor detection method and device based on generation propagation structure characteristics - Google Patents
- Publication number
- CN114357160A (application CN202111574716.9A)
- Authority
- CN
- China
- Prior art keywords
- event
- propagation
- source text
- propagation structure
- detected
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an early rumor detection method and device based on generated propagation structure features. The method comprises: acquiring a first source text of an event to be detected; inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder, wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event, the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event with a graph convolutional neural network model, and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text; and inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected output by the classification model. With the method and device, a rumor detection result for the event to be detected can be obtained quickly and accurately once its source text is available.
Description
Technical Field
The invention relates to the technical field of data processing, and in particular to an early rumor detection method and device based on generated propagation structure features.
Background
With the rise and development of the Internet era, information flows and spreads faster and faster. People can freely publish information on social media, which also allows rumors to spread rapidly. A rumor is a statement that contradicts or fabricates the facts of a matter; it is deceptive to some degree, disturbs social order, and can even cause serious panic and economic loss to society. How to distinguish rumors from the vast amount of information on social media is therefore a serious challenge.
Rumor detection is the process of modeling and identifying rumors using computing techniques. Existing rumor detection techniques have the following problems: (1) existing research is limited to detecting rumors that have already propagated for some time, i.e. a rumor can be identified only after extensive exposure, so early-stage detection is lacking and timeliness is poor; (2) existing research on early detection mostly relies on the text information of the rumor, but because information structures are diverse, it is difficult to obtain accurate rumor detection results from text information alone.
Disclosure of Invention
The invention provides an early rumor detection method and device based on generated propagation structure features, which address the poor timeliness and low accuracy of prior-art rumor detection methods and enable timely and accurate rumor detection as soon as the source text of an event is obtained.
The invention provides an early rumor detection method based on generated propagation structure features, comprising:
acquiring a first source text of an event to be detected;
inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder; wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event; the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event with a graph convolutional neural network model; and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text;
inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected output by the classification model; wherein the classification model is trained on the second source text, the second propagation structure feature and the rumor detection label of the sample event.
According to the early rumor detection method based on generated propagation structure features provided by the invention, inputting the first source text and the first propagation structure feature into a classification model to obtain the rumor detection result of the event to be detected output by the classification model comprises:
inputting the first source text into a feature extraction layer of the classification model to obtain a text feature of the first source text;
concatenating the text feature with the first propagation structure feature, and inputting the result into a classification layer of the classification model to obtain the rumor detection result of the event to be detected.
According to the early rumor detection method based on generated propagation structure features provided by the invention, the feature extraction layer of the classification model is constructed based on a recurrent neural network model;
and the classification layer of the classification model is constructed based on a softmax function.
According to the early rumor detection method based on generated propagation structure features provided by the invention, the propagation structure graph is constructed as follows:
taking the second source text and each response tweet of the sample event as nodes, and constructing the propagation structure graph of the sample event according to the response relation between any two nodes.
According to the early rumor detection method based on generated propagation structure features, the response relation comprises a forwarding relation, a time-dimension relation and a comment relation.
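The graph construction described above can be illustrated with a minimal sketch. The input format (pairs of a response identifier and the identifier of the post it comments on or forwards) and the function name `build_propagation_graph` are assumptions made for illustration only; the time-dimension relation is not modeled here.

```python
def build_propagation_graph(source_id, responses):
    """Build node list and adjacency matrix for one event.

    responses: list of (response_id, parent_id) pairs, where parent_id is the
    post that the response comments on or forwards (hypothetical format).
    """
    # Node 0 is the source text; the remaining nodes are response tweets.
    nodes = [source_id] + [rid for rid, _ in responses]
    index = {pid: k for k, pid in enumerate(nodes)}
    n = len(nodes)
    adjacency = [[0] * n for _ in range(n)]
    for rid, parent in responses:
        if parent not in index:
            continue  # parent is outside this event; skipped (illustrative choice)
        adjacency[index[parent]][index[rid]] = 1  # edge: parent -> response
    return nodes, adjacency


nodes, adj = build_propagation_graph(
    "s0", [("r1", "s0"), ("r2", "s0"), ("r3", "r1")]
)
```

Here `r1` and `r2` respond directly to the source `s0`, and `r3` responds to `r1`, so the resulting matrix has exactly three edges.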
According to the early rumor detection method based on generated propagation structure features provided by the invention, the second propagation structure feature is obtained as follows:
inputting the propagation structure graph of the sample event into the graph convolutional neural network model to obtain the second propagation structure feature of the sample event output by a feature extraction layer of the graph convolutional neural network model;
wherein the graph convolutional neural network model is trained on the propagation structure graphs of the sample events and the rumor detection labels of the sample events.
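As a rough illustration of graph-convolutional feature extraction, the sketch below applies one layer of the widely used propagation rule ReLU(Â X W), where Â is the symmetrically normalized adjacency matrix with self-loops, followed by mean pooling over nodes. The source does not specify the layer form, the pooling, or whether edges are treated as directed; all of these, and the helper names, are assumptions.

```python
import math


def matmul(a, b):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    # Treat the graph as undirected, add self-loops, then scale each entry
    # by 1 / sqrt(deg_i * deg_j).
    n = len(adj)
    a = [[1.0 if (i == j or adj[i][j] or adj[j][i]) else 0.0 for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, x, w):
    # One graph convolution: ReLU(A_hat @ X @ W).
    h = matmul(matmul(normalize_adjacency(adj), x), w)
    return [[max(0.0, v) for v in row] for row in h]


def mean_pool(h):
    # Average node representations into one graph-level feature vector.
    return [sum(col) / len(h) for col in zip(*h)]


adj = [[0, 1], [0, 0]]                 # source -> one response
x = [[1.0, 0.0], [0.0, 1.0]]           # toy node features
w = [[1.0, 0.0], [0.0, 1.0]]           # identity weights for readability
graph_feature = mean_pool(gcn_layer(adj, x, w))
```

In a trained model the weight matrix would be learned from the rumor detection labels, and the pooled vector would play the role of the propagation structure feature.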
According to the early rumor detection method based on generated propagation structure features provided by the invention, inputting the first source text into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected generated by the conditional variational autoencoder comprises:
preprocessing the first source text;
wherein the preprocessing comprises removing garbled data and text vectorization;
and inputting the preprocessed first source text into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected.
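The two preprocessing operations named above can be sketched as follows. The choice of which characters count as garbled (the Unicode replacement character and control characters), the whitespace tokenization, and the index-based vectorization are all illustrative assumptions; Chinese text would typically need a word segmenter and an embedding model instead.

```python
import re


def clean_text(text):
    # Remove the Unicode replacement character and ASCII control characters,
    # then collapse runs of whitespace.
    text = text.replace("\ufffd", " ")
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_vocab(texts):
    # Map each token to an integer id; id 0 is reserved for unknown tokens.
    vocab = {"<unk>": 0}
    for t in texts:
        for tok in clean_text(t).lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(text, vocab):
    # Turn a text into a sequence of token ids.
    return [vocab.get(tok, 0) for tok in clean_text(text).lower().split()]


vocab = build_vocab(["Breaking\x00 news \ufffd today"])
ids = vectorize("news tomorrow", vocab)
```

Tokens not seen when the vocabulary was built, such as "tomorrow" here, fall back to the unknown id.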
The invention also provides an early rumor detection device based on generated propagation structure features, comprising:
a text acquisition module, used for acquiring a first source text of an event to be detected;
a propagation structure feature generation module, used for inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder; wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event; the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event with a graph convolutional neural network model; and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text;
a rumor detection module, used for inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected output by the classification model; wherein the classification model is trained on the second source text, the second propagation structure feature and the rumor detection label of the sample event.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the steps of the method for detecting early rumors based on the generated propagation structure characteristics as described in any one of the above.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for early rumor detection based on generation of propagation structural features as described in any of the above.
The present invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method for early rumor detection based on generation of propagation structural features as described in any of the above.
According to the early rumor detection method and device based on generated propagation structure features provided by the invention, a propagation structure graph is constructed from the second source text and response tweets of a sample event; feature extraction is performed on the graph with a graph convolutional neural network model to obtain the second propagation structure feature of the sample event; and iterative training on the second source text and second propagation structure feature of the sample event yields a conditional variational autoencoder that can generate simulated propagation structure features from the source text of an event. When the first source text of an event to be detected is obtained, the conditional variational autoencoder automatically generates a simulated first propagation structure feature for the event, and rumor detection is then performed on the event using the first propagation structure feature together with the first source text. On the one hand, throughout the rumor detection process the corresponding propagation structure feature can be generated quickly from the first source text alone, which reduces the delay of collecting a large number of response tweets before propagation structure features can be obtained, allows a rumor to be detected at the point of its release, and effectively improves the timeliness of rumor detection. On the other hand, combining the source text information of the event with the propagation structure feature enriches the features relevant to rumor detection and makes the rumor detection result more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a first schematic flow chart of the early rumor detection method based on generated propagation structure features provided by the present invention;
Fig. 2 is a schematic structural diagram of the conditional variational autoencoder, the graph convolutional neural network model and the classification model in the early rumor detection method based on generated propagation structure features provided by the present invention;
Fig. 3 is a second schematic flow chart of the early rumor detection method based on generated propagation structure features provided by the present invention;
Fig. 4 is a third schematic flow chart of the early rumor detection method based on generated propagation structure features provided by the present invention;
Fig. 5 is a schematic structural diagram of the early rumor detection device based on generated propagation structure features provided by the present invention;
Fig. 6 is a schematic structural diagram of the electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present embodiment, it is to be understood that the terms "first" and "second" are used merely to distinguish between different descriptions and are not to be construed as indicating or implying a relative importance or hierarchical order.
Conventional rumor detection methods cannot detect rumor events promptly and accurately at the point where a rumor is released, so their real-time performance is poor. In addition, little data is available for early detection, so rumors can only be detected from the text information of an event, and considering text information alone, at a single granularity, does not yield accurate rumor detection results.
To address the above problems, the present embodiment proposes an early rumor detection method based on generated propagation structure features. A conditional variational autoencoder is built from a large amount of rumor propagation data of historical events and learns the structural features of rumor propagation. When only the source text of an event to be detected is available at an early stage, the trained conditional variational autoencoder can predict the propagation structure feature of the event to be detected, and this feature is combined with the source text for rumor detection. In this way, once the source text of the event to be detected is obtained, a rumor detection result can be obtained quickly and accurately, determining whether the event to be detected is a rumor event.
It should be noted that the method may be executed by an electronic device, a component in an electronic device, an integrated circuit, or a chip. The electronic device may be mobile or non-mobile. By way of example, the mobile electronic device may be a mobile phone, tablet computer, notebook computer, palmtop computer, vehicle-mounted electronic device, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA); the non-mobile electronic device may be a server, network attached storage (NAS), personal computer (PC), smart television (TV), teller machine, or self-service machine. The present invention is not particularly limited in this respect.
The early rumor detection method based on generated propagation structure features according to the present invention is described below with reference to Fig. 1, and comprises the following steps.
Step 101: acquiring a first source text of an event to be detected.
Optionally, the event to be detected is an event requiring rumor detection. The event to be detected may be crawled from the Internet in real time with a crawler tool, or may be obtained by Internet transmission or downloading, which is not specifically limited in this embodiment.
The sample event and the event to be detected may be events published on various social network sites, such as a post on a microblog or a status update on WeChat, which is not specifically limited in this embodiment.
The rumor detection method in this embodiment is described below taking the case where the sample event and the event to be detected are posts on a microblog as an example.
The first source text of the event to be detected is the text information contained in the source post published for the event to be detected.
Step 102: inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder.
Optionally, the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event. The trained conditional variational autoencoder can generate the corresponding propagation structure feature from a source text, so that subsequent rumor detection can combine the generated propagation structure feature with the source text and perform early rumor detection quickly and accurately.
A response tweet is a text that comments on or forwards the source text, i.e. the text information of a comment post that comments on the source post or of a forwarded post that forwards the source post. The present embodiment collectively refers to comment posts and forwarded posts as response posts.
A Conditional Variational Autoencoder (CVAE) is a deep latent-variable generative model consisting of an encoder and a decoder. The encoder and the decoder are each formed by a neural network model whose type and structure may be set according to actual requirements, such as a convolutional neural network or a recurrent neural network, which is not specifically limited in this embodiment.
Optionally, before step 102 is performed, a conditional variational autoencoder capable of generating propagation structure features is obtained by training on the second source text and the second propagation structure feature of the sample events. The training may include the following steps.
First, samples are collected and organized into a data set.
Optionally, a large number of sample posts are collected, and the source post and the response posts of the same event are aggregated, event by event, to form a sample event data set $C = \{c_1, c_2, \ldots, c_i, \ldots, c_n\}$, where $c_i$ represents the i-th sample event and $n$ represents the number of sample events contained in the data set; the specific number may be set according to actual requirements, which is not specifically limited in this embodiment.
Each sample event $c_i$ consists of the source text of the i-th sample event and its response tweets, where the j-th response tweet is a comment post or forwarded post related to the i-th sample event ($j = 1, \ldots, m_i$) and $m_i$ represents the total number of response tweets contained in the i-th event.
Then, each text of the sample event can be preprocessed to obtain a text vector for each text in the sample event, where $c_i$ here denotes the source text vector of the i-th sample event and each of its $m_i$ response tweets has a corresponding text vector.
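The aggregation of collected posts into per-event samples could look roughly like the sketch below. The post record fields (`event_id`, `is_source`, `text`) are hypothetical and stand in for whatever the crawler actually returns.

```python
from collections import defaultdict


def group_by_event(posts):
    """Aggregate posts into {event_id: {"source": str, "responses": [str]}}."""
    events = defaultdict(lambda: {"source": None, "responses": []})
    for post in posts:
        event = events[post["event_id"]]
        if post["is_source"]:
            event["source"] = post["text"]           # the source post of the event
        else:
            event["responses"].append(post["text"])  # comment or forwarded post
    return dict(events)


posts = [
    {"event_id": "e1", "is_source": True, "text": "source post"},
    {"event_id": "e1", "is_source": False, "text": "a comment"},
    {"event_id": "e1", "is_source": False, "text": "a forward"},
    {"event_id": "e2", "is_source": True, "text": "another source"},
]
dataset = group_by_event(posts)
```

Each entry of `dataset` then corresponds to one sample event, and the length of its `responses` list corresponds to the number of response tweets of that event.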
Then, a propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text, and feature extraction is performed on the propagation structure graph of the sample event with the graph convolutional neural network model to obtain the second propagation structure feature of the sample event.
Then, the source text vectors $c = \{c_1, c_2, \ldots, c_i, \ldots, c_n\}$ in the sample event data set are used as the condition information of the conditional variational autoencoder, where $c_i$ is the source text vector of the i-th sample event and $n_i$ is the number of words contained in $c_i$. The propagation structure features of all sample events, $H = \{H_1, \ldots, H_i, \ldots, H_n\}$, are used as the real sample data of the conditional variational autoencoder, where $H_i$ is the propagation structure feature of the i-th sample event.
Optionally, the source text vector and the propagation structure feature of each sample event are taken as one group of samples, the loss function of the conditional variational autoencoder is computed over the groups of samples, and the conditional variational autoencoder is trained iteratively based on the loss function.
The learning principle of the conditional variational autoencoder is to encode the original data into a latent variable $z$ that follows a specific distribution, here set to the standard normal distribution, i.e. $p(z) = N(0, 1)$, and then to decode the latent variable $z$ to generate new data.
For the encoder $E(\cdot)$, the condition information $c_i$ and the real sample data $H_i$ are input together into $E(\cdot)$, which computes the mean $\mu_i$ and variance $\delta_i^2$ of $p(z \mid H_i, c_i)$.
For the decoder $D(\cdot)$, the latent variable $z$ is obtained by sampling with the reparameterization trick, and the decoder reconstructs the sample data $H_i$ from $z$.
The latent variable $z$ is sampled with the reparameterization trick as:
$z = \mu_i + \varepsilon \odot \delta_i$;
where $\varepsilon$ follows the standard normal distribution $N(0, 1)$, $\odot$ is the element-wise product, and $\mu_i$ and $\delta_i^2$ are the mean and variance of the normal distribution;
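The sampling formula above can be written as a short sketch. The function name `reparameterize` and the choice to pass the variance (rather than the standard deviation or its logarithm) are assumptions made for illustration.

```python
import math
import random


def reparameterize(mu, var, rng=random):
    """Sample z = mu + eps * delta element-wise, with eps ~ N(0, 1).

    mu, var: lists holding the mean and variance produced by the encoder.
    """
    return [m + rng.gauss(0.0, 1.0) * math.sqrt(v) for m, v in zip(mu, var)]


rng = random.Random(0)                       # fixed seed for repeatability
z = reparameterize([0.5, -1.0], [1.0, 0.25], rng)
z_deterministic = reparameterize([0.5, -1.0], [0.0, 0.0], rng)
```

Because the randomness enters only through `eps`, the sample remains a differentiable function of the mean and standard deviation, which is what allows the encoder to be trained by gradient descent. With zero variance the sample collapses to the mean.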
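the loss function of the CVAE model is:
$L(\theta, \varphi) = -E_{q_\varphi(z \mid H_i, c_i)}[\log p_\theta(H_i \mid z, c_i)] + KL(q_\varphi(z \mid H_i, c_i) \,\|\, p_\theta(z))$
where the first term, $-E_{q_\varphi(z \mid H_i, c_i)}[\log p_\theta(H_i \mid z, c_i)]$, is the reconstruction loss and $E(\cdot)$ is the mathematical expectation; the second term is the KL divergence between the encoder distribution $q_\varphi(z \mid H_i, c_i)$ and the prior distribution $p_\theta(z)$; $q_\varphi(z \mid H_i, c_i)$ is the posterior probability of $z$; $p_\theta(H_i \mid z, c_i)$ is the conditional probability of $H_i$; and $\varphi$ and $\theta$ are the network parameters of the encoder and the decoder, respectively.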
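For a Gaussian encoder and a standard normal prior, the KL term has the well-known closed form $0.5 \sum (\mu^2 + \delta^2 - \log \delta^2 - 1)$. The sketch below combines it with a squared-error reconstruction term; the squared error corresponds to assuming a Gaussian decoder with fixed variance, which the source does not state, and the function names are hypothetical.

```python
import math


def kl_to_standard_normal(mu, var):
    # Closed-form KL( N(mu, var) || N(0, 1) ), summed over latent dimensions.
    return 0.5 * sum(m * m + v - math.log(v) - 1.0 for m, v in zip(mu, var))


def cvae_loss(h_true, h_recon, mu, var):
    # Reconstruction term (squared error, an assumed stand-in for the negative
    # log-likelihood) plus the KL regularizer.
    recon = sum((a - b) ** 2 for a, b in zip(h_true, h_recon))
    return recon + kl_to_standard_normal(mu, var)


perfect = cvae_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], [1.0, 1.0])
shifted = kl_to_standard_normal([1.0], [1.0])
```

The loss is zero only when the reconstruction is exact and the encoder distribution coincides with the prior; shifting the mean away from zero increases the KL term.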
The network parameters of the CVAE model are updated iteratively based on the loss function of the CVAE model, so that the decoder of the CVAE model can reconstruct and generate simulated propagation structure features that approximate the real sample data.
After the CVAE model has been trained, the first source text of the event to be detected is input into the trained CVAE model to generate the first propagation structure feature corresponding to the first source text.
By learning propagation structure features from the forwarding information, comment information and source texts of a large number of historical events, the propagation structure feature corresponding to a source text can be generated by simulation when only the source text is available at an early stage, without waiting for the source text to spread widely, which effectively improves detection timeliness. Moreover, by combining the source text with its corresponding propagation structure feature, rumors can be detected promptly and accurately when the source text is first released, avoiding the adverse effects caused by the spread of rumor events.
Step 103: inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected output by the classification model, wherein the classification model is trained on the second source text, the second propagation structure feature and the rumor detection label of the sample event.
The network structure of the classification model, such as the number of layers and the number of nodes in each layer, can be set according to actual requirements. The classification model may be built from various artificial intelligence models with a classification function, including but not limited to a convolutional neural network, a recurrent neural network, a VGG network and a residual network, which is not specifically limited in this embodiment.
The rumor detection label is the true rumor detection category of a sample event, which is either rumor or non-rumor.
Optionally, before step 103 is executed, the classification model may be trained in advance by iterative training. The training may include: first, obtaining the second source text and the second propagation structure feature of the sample events; then, taking the second source text and the second propagation structure feature of the sample events as input and the rumor detection labels of the sample events as ground-truth labels, and training the classification model iteratively to obtain a classification model that accurately distinguishes rumor events from non-rumor events.
After the classification model has been trained, the first source text and the first propagation structure feature of the event to be detected are input into the trained classification model, which determines whether the event to be detected is a rumor event or a non-rumor event.
In this embodiment, a propagation structure graph is constructed from the second source text and response tweets of a sample event; feature extraction is performed on the graph with a graph convolutional neural network model to obtain the second propagation structure feature of the sample event; and iterative training on the second source text and second propagation structure feature of the sample event yields a conditional variational autoencoder that can generate simulated propagation structure features from the source text of an event. When the first source text of an event to be detected is obtained, the conditional variational autoencoder automatically generates a simulated first propagation structure feature for the event, and rumor detection is then performed on the event using the first propagation structure feature together with the first source text. On the one hand, throughout the rumor detection process the corresponding propagation structure feature can be generated quickly from the first source text alone, which reduces the delay of collecting a large number of response tweets before propagation structure features can be obtained, allows a rumor to be detected at the point of its release, and effectively improves the timeliness of rumor detection. On the other hand, combining the source text information of the event with the propagation structure feature enriches the features relevant to rumor detection and makes the rumor detection result more accurate.
On the basis of the foregoing embodiment, in this embodiment, the inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected output by the classification model includes: inputting the first source text into a feature extraction layer of the classification model to obtain text features of the first source text; and splicing the text characteristics and the first transmission structure characteristics, and inputting the text characteristics and the first transmission structure characteristics into a classification layer of the classification model to obtain a rumor detection result of the event to be detected.
Optionally, the classification model in this embodiment includes a feature extraction layer and a classification layer;
the feature extraction layer may be constructed and generated based on various neural network models, such as a recurrent neural network and a convolutional neural network, which is not specifically limited in this embodiment.
Optionally, performing rumor detection on the event to be detected according to the first source text and the first propagation structure feature includes: inputting the first source text into the feature extraction layer of the classification model to obtain the text feature of the first source text;
then splicing the text feature of the first source text with the first propagation structure feature to obtain a fusion feature; and inputting the fusion feature into the classification layer to obtain the probability P(rumor) that the event to be detected is a rumor, and determining from this probability whether the event to be detected is a rumor.
Specifically, the probability that the event to be detected is a rumor may be compared with a preset value: if the probability is greater than the preset value, the event to be detected is a rumor; otherwise, the event is not a rumor.
In this embodiment, features are extracted from the first source text by the feature extraction layer of the classification model, so that deep features that influence rumor detection can be extracted and the rumor detection result is more accurate; and rumor detection is performed on the event to be detected using the fusion feature, which comprises the text feature and the first propagation structure feature, so that the accuracy of rumor detection is further improved.
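As an illustrative, non-limiting sketch of the splicing and thresholding described above, the following Python fragment concatenates a text feature with a propagation structure feature, applies a single linear layer followed by softmax, and compares P(rumor) with a preset value. The function names, the two-class layout (non-rumor, rumor), the toy weights and the default threshold of 0.5 are all assumptions made for illustration and are not taken from this embodiment.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(text_feature, propagation_feature, weights, bias, threshold=0.5):
    """Splice the two features, apply one linear layer plus softmax,
    and compare P(rumor) with a preset value (hypothetical default 0.5)."""
    fused = list(text_feature) + list(propagation_feature)  # fusion feature
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    p_non_rumor, p_rumor = softmax(logits)  # assumes exactly two classes
    return p_rumor, p_rumor > threshold

# Hand-picked toy parameters (hypothetical): 2 classes, 4 fused dimensions.
W = [[0.1, -0.2, 0.0, 0.3],   # row producing the non-rumor logit
     [0.4, 0.5, -0.1, 0.2]]   # row producing the rumor logit
b = [0.0, 0.0]
p, is_rumor = classify([1.0, 0.5], [0.2, 0.1], W, b)
```

Raising the preset value trades recall for precision: with the same inputs, a threshold of 0.7 would classify this toy event as a non-rumor.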
On the basis of the above embodiment, the feature extraction layer of the classification model in this embodiment is constructed based on a recurrent neural network model, and the classification layer of the classification model is constructed based on a softmax function.
Optionally, the classification model is constructed from a recurrent neural network and a classification layer;
the recurrent neural network may be a standard recurrent neural network (RNN) or one of its variants, such as a long short-term memory (LSTM) network, which is not specifically limited in this embodiment.
As shown in FIG. 2, the classification model is constructed from an LSTM network and a softmax function.
The rumor detection method in this embodiment is described below by taking a classification model constructed by the LSTM network and the softmax function as an example.
Before rumor detection is performed on an event to be detected based on the classification model, the classification model after training can be obtained through pre-training, and the specific training step can include:
First, the second source text of a sample event is vectorized to obtain the second source text vector $c_i$ of the sample event, where $n_i$ is the number of words contained in the source text vector $c_i$ and $x_t$ is the word vector of the $t$-th word in $c_i$.
Then, $c_i$ is input into an LSTM network to extract the higher-order feature representation $a_i$ of the source text vector.
The LSTM network comprises three gates: a forget gate, an input gate and an output gate. Each gate unit has three inputs: the input value $x_t$ of the LSTM network at the current time step, the output value $a_{t-1}$ of the LSTM network at the previous time step, and the cell state value $c_{t-1}$ at the previous time step.
The forget gate in the LSTM network is calculated as:
$F_f = \sigma(W_f[a_{t-1}, x_t] + b_f)$;
$F_f$ is the output of the forget gate, which reads the current network input $x_t$ and the output value $a_{t-1}$ of the LSTM unit at the previous time step; $\sigma(\cdot)$ is an activation function that constrains the output to between 0 and 1; $W_f$ is a weight coefficient and $b_f$ is a bias term. $F_f$ determines how much of the cell state value from the previous time step is discarded, and takes a value between 0 and 1: when $F_f$ is 0, all information in the previous cell state value $c_{t-1}$ is forgotten; when $F_f$ is 1, all information in $c_{t-1}$ is retained.
The input gate in the LSTM controls how much information from the input data $x_t$ is written into the cell, and is calculated as:
$F_i = \sigma(W_i[a_{t-1}, x_t] + b_i)$;
$F_i$ is the output of the input gate, which controls how much input information can enter the network cell; $\tilde{c}_t$ is the intermediate transition value of the LSTM cell state, and $c_t$ is the updated state value of the current cell; $W_i$ and $W_c$ are weight coefficients, and $b_i$ and $b_c$ are bias terms.
The output gate of the LSTM controls how much information of the current cell state value $c_t$ is output, and is calculated as:
$F_o = \sigma(W_o[a_{t-1}, x_t] + b_o)$;
$a_t = F_o * c_t$;
$F_o$ is the output control gate, which controls the proportion of the current cell state value $c_t$ that is output. $W_o$ and $b_o$ are the weight coefficient and the bias term. $a_t$ is the high-order feature vector of the source text that is finally output by the LSTM network.
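The three gates can be sketched as a single LSTM time step in plain Python. This is a minimal illustration under stated assumptions rather than the implementation of this embodiment: the candidate state is assumed to be $\tanh(W_c[a_{t-1}, x_t] + b_c)$ and the cell update is assumed to be $c_t = F_f * c_{t-1} + F_i * \tilde{c}_t$ (the standard LSTM form, which the text above does not spell out), while the output $a_t = F_o * c_t$ follows the formula given above. The parameter dictionary layout and the function names are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, vec):
    """Compute W @ vec + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, vec)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, a_prev, c_prev, p):
    """One LSTM step; p maps 'f', 'i', 'o', 'c' to (W, b) pairs."""
    concat = list(a_prev) + list(x_t)                     # [a_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(*p["f"], concat)]     # forget gate F_f
    i = [sigmoid(v) for v in affine(*p["i"], concat)]     # input gate F_i
    o = [sigmoid(v) for v in affine(*p["o"], concat)]     # output gate F_o
    # Assumption: tanh candidate state and the standard cell update.
    c_tilde = [math.tanh(v) for v in affine(*p["c"], concat)]
    c_t = [fj * cp + ij * ct for fj, cp, ij, ct in zip(f, c_prev, i, c_tilde)]
    a_t = [oj * cj for oj, cj in zip(o, c_t)]             # a_t = F_o * c_t
    return a_t, c_t

def encode(word_vectors, hidden_size, p):
    """Run the LSTM over all word vectors and return the final output."""
    a, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in word_vectors:
        a, c = lstm_step(x, a, c, p)
    return a

# Toy parameters (hypothetical): hidden size 2, input size 3, all zeros,
# so every gate evaluates to 0.5 and the candidate state to 0.
zero_params = {k: ([[0.0] * 5 for _ in range(2)], [0.0, 0.0]) for k in "fioc"}
a_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], zero_params)
```

With all-zero parameters the forget gate keeps exactly half of the previous cell state, which makes the role of $F_f$ easy to check by hand.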
Then, the second propagation structure feature of the sample event generated by the conditional variational autoencoder is extracted and spliced with the corresponding source text feature vector $a_i$, and classification is performed through a fully connected layer with a softmax function to obtain the probability of the rumor label, wherein the calculation formula is as follows:
wherein $\hat{y}_i$ is the rumor detection result output by the classification model for the $i$-th sample event $c_i$; $a_i$ is the high-order feature representation of the source text of sample event $c_i$; and $H_i$ is the propagation structure feature of sample event $c_i$.
Because the expectation with respect to $H_i$ cannot be computed directly, the propagation structure features generated by the conditional variational autoencoder are sampled $M$ times by the Monte Carlo method and averaged to approximate the expectation. The final predicted label probability can then be converted to the following expression:
wherein $\hat{y}_i$ is the predicted rumor detection label, $a_i$ is the high-order feature of the source text, and $z$ is the hidden variable. The negative log-likelihood function can be used as the loss function of the classification model, and is calculated as:
wherein $n$ is the number of sample events; $a_i$ is the high-order feature representation of the source text of sample event $c_i$; and $\hat{y}_i$ is the predicted rumor detection label of the $i$-th sample event.
Finally, the classification model is trained iteratively according to its loss function until the training termination condition of the classification model is met, yielding a classification model capable of accurately distinguishing rumor events from non-rumor events.
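The Monte Carlo averaging and the negative log-likelihood loss can be illustrated with the following Python sketch. It is a simplified example under assumptions: the hidden variable is assumed to be drawn from a standard normal prior, `toy_decode` is a hypothetical stand-in for a trained decoder, and the loss is averaged over the sample events (the text does not state whether a sum or a mean is used).

```python
import math
import random

def mc_mean_feature(decode, condition, latent_dim, num_samples, rng):
    """Average num_samples decoder outputs, each decoded from z ~ N(0, I)."""
    total = None
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        h = decode(z, condition)
        total = list(h) if total is None else [t + v for t, v in zip(total, h)]
    return [t / num_samples for t in total]

def negative_log_likelihood(rumor_probs, labels):
    """Mean negative log-likelihood for binary labels (1 = rumor, 0 = non-rumor)."""
    eps = 1e-12  # guards against log(0)
    losses = [-math.log((p if y == 1 else 1.0 - p) + eps)
              for p, y in zip(rumor_probs, labels)]
    return sum(losses) / len(losses)

def toy_decode(z, condition):
    # Hypothetical stand-in for a trained decoder D(z, c_i).
    return [c + 0.1 * zi for c, zi in zip(condition, z)]

rng = random.Random(0)
h_mean = mc_mean_feature(toy_decode, [0.5, -0.5], latent_dim=2,
                         num_samples=2000, rng=rng)
loss = negative_log_likelihood([0.9, 0.2], [1, 0])
```

Averaging over $M$ samples reduces the variance of the generated feature, so a larger $M$ gives a steadier input to the classification layer at the cost of more decoder calls.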
In this embodiment, the classification model based on the recurrent neural network better captures the sequential textual features that distinguish rumor events from non-rumor events, so that the detection result is more accurate; in addition, the propagation structure features generated by the conditional variational autoencoder are sampled multiple times by the Monte Carlo method and averaged, and the rumor detection probability function is converted accordingly, so that an optimized classification model can be obtained quickly and effectively.
On the basis of the foregoing embodiments, the propagation structure diagram in this embodiment is constructed by the following step: taking the second source text and each response tweet in the sample event as nodes, and constructing the propagation structure diagram of the sample event according to the response relationship between any two nodes.
The propagation structure diagram represents the interaction process and information among the posts in an event, namely the response relationships among the posts.
Optionally, a propagation structure diagram is constructed for each sample event, taking the event as the unit. For the $i$-th event, its propagation structure diagram is $G_i = (V_i, E_i)$, where $V_i$ is the node set of the propagation structure diagram, each node representing a post; $E_i$ is the edge set of the propagation structure diagram of the $i$-th event, each edge representing a response relationship between two post nodes; and $A_i$ is the adjacency matrix of the propagation structure diagram of the $i$-th event.
The adjacency matrix can be constructed from the response relationships between any two post nodes.
The response relation is a response relation between any two posts, and can be specifically set according to actual requirements.
By constructing the propagation structure diagram, the embodiment can mine the propagation structure information among posts in the event, capture richer context, and be more beneficial to rumor detection.
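As a simple illustration of how an adjacency matrix may be built from response relationships, the following Python sketch marks an edge between every pair of posts linked by a response. The function name, the pair-based input format and the choice of a symmetric (undirected) matrix are assumptions for illustration; the embodiment does not prescribe them.

```python
def build_adjacency(num_posts, responses, symmetric=True):
    """Build an adjacency matrix from (responded_to, responder) index pairs."""
    A = [[0] * num_posts for _ in range(num_posts)]
    for parent, child in responses:
        A[parent][child] = 1
        if symmetric:
            A[child][parent] = 1  # treat the response relationship as undirected
    return A

# Node 0 is the source post; posts 1 and 2 respond to it, post 3 responds to post 1.
edges = [(0, 1), (0, 2), (1, 3)]
A = build_adjacency(4, edges)
```

Passing `symmetric=False` keeps the direction of each response, which may be preferable if the direction of propagation is to be preserved.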
On the basis of the above embodiment, the response relationship in this embodiment includes a forwarding relationship, a time dimension relationship, and a comment relationship.
Optionally, the forwarding relationship represents the forwarding hierarchy between two posts; a forwarding relationship may or may not exist between any two posts. For example, a forwarding relationship exists between a source post and a post that forwards it, whereas no forwarding relationship exists between two posts that each forward the same source post.
The comment relationship represents the comment hierarchy between two posts; a comment relationship may or may not exist between any two posts. For example, a comment relationship exists between a source post and a post that comments on it, whereas no comment relationship exists between two posts that each comment on the same source post.
The time dimension relationship is the temporal relationship between any two posts that forward or comment on the same post, or the temporal relationship between the publication times of different posts.
Because the propagation structure diagram is constructed from response relationships that combine the forwarding relationship, the time dimension relationship and the comment relationship, it contains rich relational features among posts, which makes the rumor detection result more accurate.
On the basis of the foregoing embodiments, the second propagation structure characteristic in this embodiment is obtained based on the following steps: inputting the propagation structure diagram of the sample event into the graph convolution neural network model to obtain a second propagation structure characteristic of the sample event output by a characteristic extraction layer of the graph convolution neural network model; and the graph convolution neural network model is trained and obtained according to the propagation structure diagram of the sample events and the rumor detection labels of the sample events.
It should be noted that, before the conditional variational autoencoder is trained, the second propagation structure feature of the sample event needs to be obtained in advance. The second propagation structure feature is obtained as follows: first, the propagation structure diagram of the sample event is obtained; then, the nodes and the adjacency matrix of the propagation structure diagram are taken as the input of the graph convolutional neural network model, and each node in the propagation structure diagram is updated to obtain the corresponding propagation structure feature. Each node is updated by aggregating the information of its adjacent nodes with its own information.
Optionally, the event propagation structure features are constructed using a graph convolutional network (GCN). The node update formula from one GCN layer to the next is as follows:
wherein $H_i^{(l)}$ is the feature matrix of all nodes of the propagation structure diagram of the $i$-th event after $l$ layers of convolution updating, and $h_{ij}^{(l)}$ is the feature vector of the $j$-th node of the event after being updated by the GCN; $\hat{A}_i$ is the normalized symmetric adjacency matrix, and $D_i$ is the degree matrix of the adjacency matrix $A_i$; $W^{(l)}$ is a weight matrix; $B^{(l)}$ is a bias term; and $\sigma(\cdot)$ is the activation function. When $l = 0$, $H_i^{(0)} = X_i$, where $X_i$ is the text vector of the $i$-th event. $X_i$ is input into an $l$-layer GCN, and the propagation graph feature matrix $H_i^{(l)}$ is obtained after $l$ layers of convolution updating; the propagation graph feature matrices of all events are then $H^{(l)} = \{H_1^{(l)}, H_2^{(l)}, \ldots, H_i^{(l)}, \ldots, H_n^{(l)}\}$.
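One GCN layer consistent with the symbols defined above can be sketched in plain Python as follows. The sketch assumes the commonly used update form $H^{(l+1)} = \sigma(\hat{A} H^{(l)} W^{(l)} + B^{(l)})$ with $\hat{A} = D^{-1/2} A D^{-1/2}$, uses ReLU as the activation function, treats the bias as one value per output column, and optionally adds self-loops so that each node also aggregates its own information; these choices and all function names are assumptions made for illustration.

```python
import math

def normalize_adjacency(A, add_self_loops=True):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, optionally with self-loops."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if add_self_loops and i == j else 0.0)
              for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
             for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def gcn_layer(A_norm, H, W, b):
    """One layer: ReLU(A_norm @ H @ W + b), with b broadcast over the rows."""
    Z = matmul(matmul(A_norm, H), W)
    return [[max(0.0, v + bj) for v, bj in zip(row, b)] for row in Z]

# Toy example (hypothetical): two mutually connected posts, identity features.
A = [[0, 1], [1, 0]]
X = [[1.0, 0.0], [0.0, 1.0]]
W0 = [[1.0, 0.0], [0.0, 1.0]]
H1 = gcn_layer(normalize_adjacency(A), X, W0, [0.0, 0.0])
```

In this toy case each node ends up with the average of its own feature and its neighbour's feature, which is the aggregation behaviour described above.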
A pooling operation is performed on the obtained propagation graph feature $H_i^{(l)}$ to obtain $H_i$, and a softmax classifier is used to predict the rumor label of the event, wherein the formula is as follows:
Then, a loss function is obtained from the predicted rumor label and the true rumor label, and the graph convolutional neural network model is trained based on this loss function. The cross-entropy function is used as the loss function for model optimization, with the following formula:
wherein $\hat{y}_{ic}$ and $y_{ic}$ are the predicted rumor label and the true rumor label of the event, respectively; $N$ is the number of sample events; and $C$ is the number of label categories, which in this example is 2, namely rumor events and non-rumor events.
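The pooling, softmax prediction and cross-entropy loss can be sketched as follows. The sketch assumes mean pooling over the nodes and a loss averaged over the $N$ sample events; the embodiment does not specify the pooling type or the normalization, so both are illustrative choices, as are the function names.

```python
import math

def mean_pool(H):
    """Average the node feature vectors of one event into a single vector."""
    n = len(H)
    return [sum(col) / n for col in zip(*H)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(pred_probs, one_hot_labels):
    """Mean over events of -sum_c y_ic * log(y_hat_ic)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, labels in zip(pred_probs, one_hot_labels):
        total -= sum(y * math.log(p + eps) for p, y in zip(probs, labels))
    return total / len(pred_probs)

# Toy event with two nodes and C = 2 classes (non-rumor, rumor).
H_nodes = [[1.0, 3.0], [3.0, 1.0]]
pooled = mean_pool(H_nodes)     # used directly as logits in this toy example
probs = softmax(pooled)
ce = cross_entropy([probs], [[0, 1]])
```

In practice a learned fully connected layer would typically sit between the pooled feature and the softmax; it is omitted here to keep the example short.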
It should be noted that rumor detection here serves only as an auxiliary training task; after training is completed, the propagation structure feature $H_i$ of the sample event is obtained and used as training data for the conditional variational autoencoder.
In this embodiment, the second propagation structure feature of the sample event can be obtained quickly and accurately through the graph convolutional neural network model, and is used as training data to train the conditional variational autoencoder that generates simulated propagation structure features from the source text.
On the basis of the foregoing embodiments, in this embodiment, inputting the first source text into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected generated by the conditional variational autoencoder includes: preprocessing the first source text, wherein the preprocessing includes removing garbled data and text vectorization; and inputting the preprocessed first source text into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected.
Optionally, before step 102 is performed, the first source text may be preprocessed. The preprocessing includes: first, removing garbled data and/or blank-line data from the first source text information to obtain valid first source text information free of such data; then, encoding the first source text into a text vector using a Doc2vec model to obtain the text vector of the first source text information.
In this embodiment, each word in the text can be represented as a low-dimensional vector with a value in a real number domain by converting the first source text into a text vector, and semantic features of each word can be encoded into the text vector.
After the first source text information is preprocessed, its text vector is obtained; this text vector can then be input into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected.
It should be noted that, in the training process, the second source text and the response tweets of the sample event may also be preprocessed in the above manner, and the preprocessed second source text and response tweets are used as training samples to train the conditional variational autoencoder, the graph convolutional neural network model and the classification model.
In the embodiment, through the preprocessing of the first source text, a word vector containing the semantic features of each word can be generated, and meanwhile, a foundation is laid for the subsequent rumor detection.
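The cleaning part of the preprocessing can be sketched with the Python standard library alone. What counts as garbled data is not defined in this embodiment; the sketch assumes it means the Unicode replacement character and characters in the Unicode "Other" categories (control, format, unassigned, private use, surrogate). The function name is hypothetical, and the subsequent Doc2vec vectorization is not shown.

```python
import unicodedata

def clean_source_text(raw):
    """Drop garbled characters and blank lines from a source text."""
    kept_lines = []
    for line in raw.splitlines():
        chars = [ch for ch in line
                 if ch != "\ufffd"                           # replacement character
                 and unicodedata.category(ch)[0] != "C"]     # control/format/etc.
        cleaned = "".join(chars).strip()
        if cleaned:                                          # skip blank lines
            kept_lines.append(cleaned)
    return "\n".join(kept_lines)

sample = "Breaking\ufffd news\n\n  \n\x00is spreading"
cleaned = clean_source_text(sample)
```

Note that this filter also removes tab characters and zero-width joiners, which belong to the "Other" categories; a production filter may need a narrower definition.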
Referring to FIG. 3 and FIG. 4, the early rumor detection method based on generated propagation structure features proposed in this embodiment mainly includes the following steps: collecting samples; extracting data and vectorizing the text; constructing the propagation structure diagram of each sample event and constructing propagation structure features with a graph convolutional network; training a conditional variational autoencoder on the source posts and propagation structure features of the sample events to obtain, once training is complete, a generator capable of accurately generating propagation structure features; training the classification model on the source posts, propagation structure features and true rumor labels of the sample events to obtain a classification model capable of accurately classifying events; taking the source post of an event to be detected as the condition and generating the corresponding simulated propagation structure feature with the generator; and combining the source post of the event to be detected with the corresponding simulated propagation structure feature and performing rumor detection through a softmax function.
The specific logic flow of these steps is detailed as follows:
step 2, vectorizing the text; the method specifically comprises the following steps:
step 2.1, processing the sample data set, and aggregating the source post and the forwarded-post information of the same event, taking the event as the unit;
step 2.2, performing text vectorization on all posts of the sample event by using Doc2vec to obtain text vector representation of each post;
step 3, training a graph convolution network GCN model, and constructing the propagation structure characteristics of the event; the method specifically comprises the following steps:
step 3.1, constructing the propagation structure diagram G = (V, E) of each sample event and obtaining the adjacency matrix A, where V is the node set, E is the edge set, each node represents a post, and each edge represents a forwarding relationship, a time dimension relationship or a comment relationship between two post nodes;
step 3.2, constructing the propagation structure feature of each sample event by using the graph convolutional network (GCN) model;
step 3.3, inputting the obtained propagation structure features into a softmax function to obtain the predicted rumor label of the sample event, and training the graph convolutional network model according to the loss function computed from the predicted rumor label and the true rumor label until the training termination condition of the model is met;
inputting the propagation structure diagram of the sample event into the trained graph convolutional network model to obtain the propagation structure feature $H_i$ of the sample event, which serves as the training data set for the subsequent conditional variational autoencoder (CVAE) model;
step 4, training the CVAE model according to the source posts and the propagation structure features of the sample events; the method specifically comprises the following steps:
step 4.1, extracting the text vector of the source post of a sample event as the input condition of the CVAE model;
step 4.2, taking the propagation structure feature $H_i$ output by the graph convolutional network model in step 3 as the real sample data of the CVAE model;
step 4.3, training the CVAE model, where the CVAE consists of an encoder E(·) and a decoder D(·); the input of E(·) is the source post $c_i$ (as the condition) and the real sample data $H_i$, and its output is the mean $\mu$ and variance $\delta^2$ of the hidden variable $z$; the input of D(·) is the source post $c_i$ together with the hidden variable $z$, and its output is the reconstruction of the real sample data $H_i$;
step 5, training the classification model according to the source posts, the propagation structure features and the true rumor labels of the sample events; the method comprises the following steps:
step 5.1, acquiring the text vector of the source post and the propagation structure feature of a sample event;
step 5.2, training the classification model by taking the text vector of the source post and the propagation structure feature of the sample event as input information and the rumor detection label of the sample event as the true label;
step 6, generating the corresponding simulated propagation structure feature according to the source post of the event to be detected; the method comprises the following steps:
step 6.1, extracting the text vector of the source post of the event to be detected;
step 6.2, inputting the text vector of the source post into the conditional variational autoencoder (CVAE), and generating the corresponding simulated propagation structure feature through the CVAE;
step 7, performing early rumor detection according to the source post of the event to be detected and the generated simulated propagation structure feature; the method comprises the following steps:
step 7.1, extracting the text information of the source post of the event to be detected and the simulated propagation structure feature;
step 7.2, vectorizing the source text with word2vec and inputting it into the long short-term memory (LSTM) network to obtain the high-order text feature representation $a_i$ of the source text;
step 7.3, combining the text feature $a_i$ with the simulated propagation structure feature, and inputting the combined feature into a softmax function for rumor detection.
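The sampling used when training the CVAE (step 4.3) and when generating a simulated propagation structure feature (step 6.2) can be sketched as follows. The sketch assumes the usual reparameterization $z = \mu + \delta \cdot \epsilon$ with $\epsilon \sim N(0, I)$, a standard normal prior at generation time, and a KL regularization term in the training loss; these are common CVAE choices that the steps above do not state explicitly. The encoder and decoder networks themselves are not shown, and the `decode` callable is a hypothetical placeholder.

```python
import math
import random

def reparameterize(mu, variance, rng):
    """Draw z = mu + sqrt(variance) * eps with eps ~ N(0, 1) per dimension."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, variance)]

def kl_to_standard_normal(mu, variance):
    """KL divergence from N(mu, diag(variance)) to N(0, I)."""
    return 0.5 * sum(v + m * m - 1.0 - math.log(v) for m, v in zip(mu, variance))

def generate_feature(decode, source_vector, latent_dim, rng):
    """Generation: sample z from the prior and decode it under the condition."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return decode(z, source_vector)

rng = random.Random(0)
z = reparameterize([1.0, 2.0], [0.0, 0.0], rng)   # zero variance returns the mean
kl = kl_to_standard_normal([0.0, 0.0], [1.0, 1.0])
```

During training, the reconstruction error between the decoder output and $H_i$ would be added to the KL term; at detection time only `generate_feature` is needed, since no real propagation data exists yet.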
In summary, in the rumor detection method of this embodiment, an event propagation structure diagram is first constructed from the source post and the forwarded posts of each sample event, and propagation structure features are then extracted from the propagation structure diagram of the sample event by the graph convolutional neural network. The source post of the sample event and the extracted propagation structure features are input into the conditional variational autoencoder to train it, finally yielding a generator capable of generating simulated propagation structure features similar to the real ones. The source post of the event to be detected is input into the trained generator to generate the simulated propagation structure feature corresponding to that source post. Finally, the source post and the generated simulated propagation structure feature are combined to perform early rumor detection, so that a rumor event can be detected quickly and accurately as soon as the source post is released.
In summary, this embodiment uses the graph convolutional network to construct the propagation structure features, so that the close relationship between the source post and the forwarded posts is better captured and the information hidden in the propagation structure is better learned. Based on the conditional variational autoencoder, a well-performing generator can be trained that automatically generates propagation structure features given only the text of the source post; that is, the propagation structure of a rumor can be predicted in the early stage, when only the source post is available, thereby enabling early rumor detection. Furthermore, the model that generates propagation structure features augments the data available for early rumor detection when propagation information is still lacking.
The early rumor detection device based on generated propagation structure features provided by the present invention is described below; the device described below and the early rumor detection method based on generated propagation structure features described above may be referred to in correspondence with each other.
As shown in fig. 5, the present embodiment provides an early rumor detection apparatus based on generation of propagation structural features, which includes: a text acquisition module 501, a propagation structure feature generation module 502, and a rumor detection module 503, wherein:
the text acquisition module 501 is configured to acquire a first source text of an event to be detected;
the propagation structure feature generation module 502 is configured to input the first source text into a conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected generated by the conditional variational autoencoder; the conditional variational autoencoder is trained on the second source text and the second propagation structure feature of a sample event; the second propagation structure feature is obtained by extracting features from the propagation structure diagram of the sample event with a graph convolutional neural network model; the propagation structure diagram is constructed from the second source text of the sample event and the response tweets corresponding to the second source text;
the rumor detection module 503 is configured to input the first source text and the first propagation structure feature into a classification model to obtain the rumor detection result of the event to be detected output by the classification model; the classification model is trained on the second source text information, the second propagation structure feature and the rumor detection label of the sample event.
In this embodiment, a propagation structure diagram is constructed from the second source text and the response tweets of a sample event; features are extracted from the propagation structure diagram by a graph convolutional neural network model to obtain the second propagation structure feature of the sample event; and a conditional variational autoencoder is trained iteratively on the second source text and the second propagation structure feature of the sample event, so that it can generate simulated propagation structure features from the source text of an event. When the first source text of an event to be detected is obtained, the first propagation structure feature of the event is generated automatically by the conditional variational autoencoder, and rumor detection is then performed on the event by combining the first propagation structure feature with the first source text. On the one hand, throughout the rumor detection process the corresponding propagation structure feature can be generated quickly from the first source text alone, which reduces the delay caused by having to collect a large number of response tweets before propagation structure features can be obtained, allows a rumor to be detected at its point of release, and effectively improves the timeliness of rumor detection. On the other hand, combining the source text information of the event with the propagation structure feature enriches the features relevant to rumor detection and makes the detection result more accurate.
On the basis of the above embodiments, the rumor detection module in this embodiment is specifically configured to: input the first source text into a feature extraction layer of the classification model to obtain the text feature of the first source text; and splice the text feature with the first propagation structure feature and input the spliced feature into a classification layer of the classification model to obtain the rumor detection result of the event to be detected.
On the basis of the above embodiment, the feature extraction layer of the classification model in this embodiment is constructed based on a recurrent neural network model, and the classification layer of the classification model is constructed based on a softmax function.
On the basis of the foregoing embodiments, the propagation structure diagram in this embodiment is constructed by the following step: taking the second source text and each response tweet in the sample event as nodes, and constructing the propagation structure diagram of the sample event according to the response relationship between any two nodes.
On the basis of the above embodiment, the response relationship in this embodiment includes a forwarding relationship, a time dimension relationship, and a comment relationship.
On the basis of the foregoing embodiments, the second propagation structure characteristic in this embodiment is obtained based on the following steps: inputting the propagation structure diagram of the sample event into the graph convolution neural network model to obtain a second propagation structure characteristic of the sample event output by a characteristic extraction layer of the graph convolution neural network model; and the graph convolution neural network model is trained and obtained according to the propagation structure diagram of the sample events and the rumor detection labels of the sample events.
On the basis of the foregoing embodiments, the propagation structure feature generation module in this embodiment is specifically configured to: preprocess the first source text, wherein the preprocessing includes removing garbled data and text vectorization; and input the preprocessed first source text into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected.
FIG. 6 illustrates the physical structure of an electronic device. As shown in FIG. 6, the electronic device may include a processor 601, a communications interface 602, a memory 603 and a communication bus 604, wherein the processor 601, the communications interface 602 and the memory 603 communicate with one another through the communication bus 604. The processor 601 may invoke logic instructions in the memory 603 to perform the early rumor detection method based on generated propagation structure features, the method comprising: acquiring a first source text of an event to be detected; inputting the first source text into a conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected generated by the conditional variational autoencoder, wherein the conditional variational autoencoder is trained on the second source text and the second propagation structure feature of a sample event, the second propagation structure feature is obtained by extracting features from the propagation structure diagram of the sample event with a graph convolutional neural network model, and the propagation structure diagram is constructed from the second source text of the sample event and the response tweets corresponding to the second source text; and inputting the first source text and the first propagation structure feature into a classification model to obtain the rumor detection result of the event to be detected output by the classification model, wherein the classification model is trained on the second source text information, the second propagation structure feature and the rumor detection label of the sample event.
In addition, the logic instructions in the memory 603 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention further provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, wherein, when the computer program is executed by a processor, the computer is capable of executing the early rumor detection method based on generated propagation structure features provided by the above method embodiments, the method comprising: acquiring a first source text of an event to be detected; inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder, wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event, the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event using a graph convolutional neural network model, and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text; and inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected, output by the classification model, wherein the classification model is trained on the second source text, the second propagation structure feature, and the rumor detection label of the sample event.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the early rumor detection method based on generated propagation structure features provided by the above method embodiments, the method comprising: acquiring a first source text of an event to be detected; inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder, wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event, the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event using a graph convolutional neural network model, and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text; and inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected, output by the classification model, wherein the classification model is trained on the second source text, the second propagation structure feature, and the rumor detection label of the sample event.
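The detection steps recited above can be illustrated with a minimal sketch of the inference path: a decoder conditioned on the source text generates a propagation structure feature, which is concatenated with the text features and passed to a softmax classification layer. Everything concrete here is an assumption for illustration: the dimensions, the single linear decoder layer with tanh, the sampling of the latent variable from a standard normal prior, and the randomly initialized (untrained) weights. The vectorized source text also stands in for the output of the recurrent feature extraction layer.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Untrained placeholder weights; a real system would load trained ones."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(logits: Vector) -> Vector:
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate_propagation_feature(source_vec: Vector, decoder_w: Matrix,
                                 latent_dim: int, rng: random.Random) -> Vector:
    """Decoder step: sample z from N(0, I) and condition on the source text."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return [math.tanh(h) for h in matvec(decoder_w, z + source_vec)]


def classify(text_features: Vector, propagation_feature: Vector,
             classifier_w: Matrix) -> Vector:
    """Concatenate both feature vectors and apply a softmax layer."""
    return softmax(matvec(classifier_w, text_features + propagation_feature))


rng = random.Random(0)
TEXT_DIM, LATENT_DIM, PROP_DIM, NUM_CLASSES = 16, 4, 8, 2  # hypothetical sizes
decoder_w = random_matrix(PROP_DIM, LATENT_DIM + TEXT_DIM, rng)
classifier_w = random_matrix(NUM_CLASSES, TEXT_DIM + PROP_DIM, rng)

source_vec = [1.0 / TEXT_DIM] * TEXT_DIM  # stand-in for a vectorized source text
prop_feature = generate_propagation_feature(source_vec, decoder_w, LATENT_DIM, rng)
probabilities = classify(source_vec, prop_feature, classifier_w)
```

The point of the design is that no response tweets are needed at detection time: the propagation structure feature is generated from the source text alone, which is what makes early detection possible.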
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. An early rumor detection method based on generation of propagation structural features, comprising:
acquiring a first source text of an event to be detected;
inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder; wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event; the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event using a graph convolutional neural network model; and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text;
inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected, output by the classification model; wherein the classification model is trained on the second source text, the second propagation structure feature, and the rumor detection label of the sample event.
2. The method of claim 1, wherein the inputting the first source text and the first propagation structure feature into a classification model to obtain the rumor detection result of the event to be detected output by the classification model comprises:
inputting the first source text into a feature extraction layer of the classification model to obtain text features of the first source text;
and concatenating the text features and the first propagation structure feature, and inputting the concatenated result into a classification layer of the classification model to obtain the rumor detection result of the event to be detected.
3. The method for early rumor detection based on generation of propagation structural features of claim 1, wherein the feature extraction layer of the classification model is constructed based on a recurrent neural network model;
and the classification layer of the classification model is constructed based on a softmax function.
4. The method for detecting early rumors based on the generation of propagation structural features of any one of claims 1-3, wherein said propagation structure graph is constructed based on the following steps:
taking the second source text and each response tweet of the sample event as nodes, and constructing the propagation structure graph of the sample event according to the response relation between any two nodes.
5. The method of claim 4, wherein the response relationship comprises a forwarding relationship, a time dimension relationship, and a comment relationship.
6. The method for detecting early rumors based on generation of propagation structural features according to any one of claims 1 to 3, wherein said second propagation structural feature is obtained based on the following steps:
inputting the propagation structure graph of the sample event into the graph convolutional neural network model to obtain the second propagation structure feature of the sample event, output by a feature extraction layer of the graph convolutional neural network model;
wherein the graph convolutional neural network model is trained on the propagation structure graph of the sample event and the rumor detection label of the sample event.
7. The method for detecting early rumors based on generation of propagation structural features according to any one of claims 1 to 3, wherein said inputting said first source text into a conditional variational autoencoder to obtain a first propagation structure feature of said event to be detected, generated by said conditional variational autoencoder, comprises:
preprocessing the first source text;
wherein the preprocessing comprises removing garbled data and text vectorization;
and inputting the preprocessed first source text into the conditional variational autoencoder to obtain the first propagation structure feature of the event to be detected.
8. An early rumor detection device based on generation of propagation structural features, comprising:
the text acquisition module is used for acquiring a first source text of the event to be detected;
the propagation structure feature generation module is used for inputting the first source text into a conditional variational autoencoder to obtain a first propagation structure feature of the event to be detected, generated by the conditional variational autoencoder; wherein the conditional variational autoencoder is trained on a second source text and a second propagation structure feature of a sample event; the second propagation structure feature is obtained by performing feature extraction on a propagation structure graph of the sample event using a graph convolutional neural network model; and the propagation structure graph is constructed from the second source text of the sample event and the response tweets corresponding to the second source text;
the rumor detection module is used for inputting the first source text and the first propagation structure feature into a classification model to obtain a rumor detection result of the event to be detected, output by the classification model; wherein the classification model is trained on the second source text, the second propagation structure feature, and the rumor detection label of the sample event.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the method for early rumor detection based on generation of propagation structure features according to any one of claims 1 to 7.
10. A non-transitory computer readable storage medium, having a computer program stored thereon, wherein the computer program, when being executed by a processor, implements the steps of the method for early rumor detection based on generation of propagation structure features according to any one of claims 1 to 7.
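The graph construction and graph-convolution feature extraction recited in the claims above can be illustrated with a minimal sketch. The claims do not specify the normalization, layer count, or pooling, so the following are assumptions: all response relations are treated as undirected edges, the adjacency matrix gets self-loops and symmetric degree normalization, a single graph convolution layer with ReLU is applied, and node features are mean-pooled into one propagation structure feature. The node features and weights are hypothetical values.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def build_adjacency(num_nodes: int, edges: List[Tuple[int, int]]) -> Matrix:
    """Node 0 is the source text; other nodes are response tweets.

    Each (child, parent) pair is one response relation. Self-loops are
    added so that every node keeps its own features during propagation.
    """
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0
    for child, parent in edges:
        adj[child][parent] = 1.0
        adj[parent][child] = 1.0
    return adj


def normalize(adj: Matrix) -> Matrix:
    """Symmetric normalization: entry (i, j) divided by sqrt(deg_i * deg_j)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(norm_adj: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One graph convolution: ReLU(norm_adj @ features @ weights)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, x) for x in row] for row in hidden]


def mean_pool(node_features: Matrix) -> List[float]:
    """Average node features into a single graph-level feature vector."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]


edges = [(1, 0), (2, 0), (3, 1)]  # tweets 1 and 2 respond to the source, 3 to tweet 1
adj = build_adjacency(4, edges)
norm_adj = normalize(adj)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]  # hypothetical node vectors
weights = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]  # hypothetical layer weights
graph_feature = mean_pool(gcn_layer(norm_adj, features, weights))
```

Distinguishing forwarding, time dimension, and comment relations would need typed or weighted edges, which this sketch omits.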
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111574716.9A CN114357160B (en) | 2021-12-21 | 2021-12-21 | Early rumor detection method and device based on generated propagation structural features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111574716.9A CN114357160B (en) | 2021-12-21 | 2021-12-21 | Early rumor detection method and device based on generated propagation structural features |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114357160A (en) | 2022-04-15 |
CN114357160B (en) | 2024-07-23 |
Family
ID=81100561
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111574716.9A Active CN114357160B (en) | 2021-12-21 | 2021-12-21 | Early rumor detection method and device based on generated propagation structural features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114357160B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117573865A (en) * | 2023-10-19 | 2024-02-20 | 南昌大学 | Rumor fuzzy detection method based on interpretable adaptive learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021042503A1 (en) * | 2019-09-06 | 2021-03-11 | 平安科技(深圳)有限公司 | Information classification extraction method, apparatus, computer device and storage medium |
CN113742604A (en) * | 2021-08-24 | 2021-12-03 | 三峡大学 | Rumor detection method and device, electronic equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
琚心怡: "Early rumor detection based on a deep bidirectional Transformer encoder", 信息通信 (Information & Communications), no. 05, 15 May 2020 (2020-05-15) * |
Also Published As
Publication number | Publication date |
---|---|
CN114357160B (en) | 2024-07-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111444340B (en) | Text classification method, device, equipment and storage medium | |
CN111737476B (en) | Text processing method and device, computer readable storage medium and electronic equipment | |
CN109582956B (en) | Text representation method and device applied to sentence embedding | |
CN111914185B (en) | Text emotion analysis method in social network based on graph attention network | |
CN112699998B (en) | Time series prediction method and device, electronic equipment and readable storage medium | |
CN111966812A (en) | Automatic question answering method based on dynamic word vector and storage medium | |
CN114780831A (en) | Sequence recommendation method and system based on Transformer | |
CN112463989B (en) | Knowledge graph-based information acquisition method and system | |
CN115565038A (en) | Content audit, content audit model training method and related device | |
CN115759254A (en) | Question-answering method, system and medium based on knowledge-enhanced generative language model | |
CN114722833B (en) | Semantic classification method and device | |
CN116383517A (en) | Dynamic propagation feature enhanced multi-modal rumor detection method and system | |
CN115952343A (en) | Social robot detection method based on multi-relation graph convolutional network | |
CN111161238A (en) | Image quality evaluation method and device, electronic device, and storage medium | |
CN112348001B (en) | Training method, recognition method, device, equipment and medium for expression recognition model | |
CN114118058A (en) | Emotion analysis system and method based on fusion of syntactic characteristics and attention mechanism | |
CN114357160A (en) | Early rumor detection method and device based on generation propagation structure characteristics | |
CN111767720B (en) | Title generation method, computer and readable storage medium | |
CN113705207A (en) | Grammar error recognition method and device | |
CN117669723A (en) | Knowledge-graph-based space human factor information learning and reasoning method and device | |
CN111241843B (en) | Semantic relation inference system and method based on composite neural network | |
Ermatita et al. | Sentiment Analysis of COVID-19 using Multimodal Fusion Neural Networks. | |
CN114792097B (en) | Method and device for determining prompt vector of pre-training model and electronic equipment | |
CN116484851A (en) | Pre-training model training method and device based on variant character detection | |
CN116401372A (en) | Knowledge graph representation learning method and device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||