CN105354305A - Online-rumor identification method and apparatus - Google Patents

Info

Publication number
CN105354305A
Authority
CN
China
Prior art keywords
comment
feeling polarities
information
network
review information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510750244.6A
Other languages
Chinese (zh)
Inventor
牛凯
杨也康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201510750244.6A
Publication of CN105354305A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Abstract

Embodiments of the present invention provide a method and apparatus for identifying online rumors. The method comprises: collecting, according to network data information input by a user, original network information and the comment information in it that contains emoticons; dividing the comment information into positive and negative comment training data according to the sentiment polarity of the emoticons; filtering out stop words; determining a first sentiment polarity of each piece of comment information; determining a second sentiment polarity of each piece of comment information according to the parameters obtained, the comment influence, and the comment posting time interval; clustering all genuine commenting users according to the network features of their friend relationships; normalizing the second sentiment polarity of each class to obtain a sentiment polarity weight for each class; and determining whether the network data information is a rumor according to the sentiment polarity weights. With the method and apparatus of these embodiments, an online rumor can be identified using the topology of a complex network, which simplifies operation.

Description

A method and apparatus for identifying online rumors
Technical Field
The present invention relates to the field of computer network applications, and in particular to a method and apparatus for identifying online rumors.
Background Art
A rumor is an unverified statement or interpretation, spread through public or non-public channels, about a thing, event, or issue of public interest. Traditionally, rumors spread by diffusion, passing privately from person to person. The rapid development of network technology has opened a new arena for rumor propagation. At the same time, the anonymity of the network and the ease of broadcasting to the public have amplified the spread and influence of rumors, which demands attention.
In the prior art, online rumors are identified by classification learning. This approach treats the credibility of online information as a classification problem: it uses features of the online information to train a classifier with a particular classification algorithm, determines whether the information is credible, and thereby predicts whether it is a rumor.
Feature selection is the key step in rumor identification and largely determines the accuracy of the classification. Current mainstream research holds that the factors affecting information credibility come mainly from the text content, the user attributes, and the propagation depth. When identifying rumors, shallow statistical features of the text serve as text features, for example whether the text contains a URL, the type of client used to publish the information (Web or mobile), the posting location, and the sentiment polarity; these text features directly reflect the authenticity of the online information. User features are extracted on the premise that information published by a more authentic user is more credible; the factors considered include the user's follower count, following count, account age, number of published posts, and whether the user is verified. Propagation features mainly cover whether the information has been reposted, the repost count, and the comment count.
However, this approach cannot make good use of the topology of a complex network; it relies on too many training features, is complex to operate, and generalizes poorly.
Summary of the Invention
The object of the embodiments of the present invention is to provide a method and apparatus for identifying online rumors that use the topology of a complex network to identify online rumors and simplify operation.
To achieve the above object, an embodiment of the present invention discloses a method for identifying online rumors, comprising the following steps:
receiving network data information input by a user;
collecting, according to the network data information, original network information and the comment information in the original network information that contains emoticons, the comment information comprising: comment text content, commenting-user information, comment posting time, and comment like count;
dividing the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons, segmenting the comment text content into words, deleting the stop words, and determining a first sentiment polarity of each piece of comment information with reference to a sentiment dictionary;
determining a second sentiment polarity of each piece of comment information according to its first sentiment polarity, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence, and the comment posting time interval, the comment influence being obtained from the comment like count;
obtaining genuine commenting users according to each commenting user's registration time, follower-to-friend ratio, and original-post ratio;
clustering all genuine commenting users according to the network features of their friend relationships;
normalizing the second sentiment polarity of each class according to the second sentiment polarities corresponding to the genuine commenting users in that class, to obtain a sentiment polarity weight for each class;
determining, according to the sentiment polarity weights of all classes, whether the network data information is a rumor.
In one implementation of the present invention, collecting the original network information according to the network data information comprises:
building a keyword syntax with regular expressions according to the network data information, and collecting a preset number of pieces of original network information from the network;
if the number of pieces of original network information collected does not reach the preset number, collecting reposts of the original network information from the network according to the repost relationship and treating them as original network information, until the number of pieces of original network information reaches the preset number.
In one implementation of the present invention, segmenting the comment text content, deleting the stop words, and determining the first sentiment polarity of each piece of comment information with reference to a sentiment dictionary comprises:
segmenting the comment text content into words, and deleting the modal particles, conjunctions, and prepositions in the comment information;
determining, according to the sentiment dictionary, the sentiment value $k(w_n)$ of each word $w_n$ that was not deleted, the sentiment value lying in the range [-1, 1];
determining the first sentiment polarity $score(e)$ of the comment information according to the sentiment value $k(w_n)$ and the distance $dis(w_n, e)$ between each word and the subject $e$ of the comment text content, where $dis(w_n, e)$ is the number of characters separating the n-th word $w_n$ from the subject $e$, and the first sentiment polarity $score(e)$ is:
$$score(e) = \sum_{w_n} \frac{k(w_n)}{dis(w_n, e)}.$$
In one implementation of the present invention, determining the second sentiment polarity of each piece of comment information according to its first sentiment polarity, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence, and the comment posting time interval comprises:
determining the second sentiment polarity of each piece of comment information by the formula
$$Polar(c) = \sum_{i=1}^{n} \frac{P(\theta_i^{+})}{P(\theta_i^{-})} \cdot \frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})}$$
where $Polar(c)$ is the second sentiment polarity of the comment information, $\theta_i^{+}$ is the positive comment training data corresponding to the i-th first sentiment polarity, and $\theta_i^{-}$ is the negative comment training data corresponding to the i-th first sentiment polarity;
the above formula is decomposed as:
$$\frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} = L \cdot d(t) \cdot \sum_{j=1}^{n} \log \frac{P(w_j \mid \theta_i^{+})}{P(w_j \mid \theta_i^{-})}$$
where $L$ is the comment influence of the comment information, $d(t)$ is its comment posting time interval, $(w_1, w_2, \ldots, w_n)$ is the word vector formed by the words not deleted from its comment text content, and $w_j$ is the j-th such word in the word vector.
In one implementation of the present invention, clustering all genuine commenting users according to the network features of their friend relationships comprises:
constructing an adjacency matrix $A = [a_{kq}]_{N \times N}$ according to the friend relationships of the genuine commenting users, where the main diagonal is all 0 and every other entry is 1 if a follow relationship exists between the two commenting users and 0 otherwise;
constructing a degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, where $|D_k|$ is the degree of commenting user k, that is, the number of commenting users that have a follow relationship with commenting user k;
constructing a Laplacian matrix $L$ from the adjacency matrix $A$ and the degree matrix $D$, the Laplacian matrix being $L = D - A$;
solving the Laplacian matrix $L$ according to a preset number of clusters K, to obtain its K smallest non-zero eigenvalues and the K corresponding eigenvectors;
constructing an N × K matrix from the K eigenvectors;
clustering with the K-means algorithm.
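The first three clustering steps can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the function name `build_laplacian`, the integer user indices, and the treatment of every follow relationship as symmetric are assumptions. The eigen-decomposition and K-means steps would normally be delegated to a numerical library and are not shown.

```python
def build_laplacian(n_users, follow_pairs):
    """Return (A, D, L) as nested lists for a follow graph over n_users users.

    follow_pairs is an iterable of (k, q) index pairs; each pair is treated
    as a symmetric follow relationship (an assumption of this sketch).
    """
    # Adjacency matrix A: 1 where a follow relationship exists, 0 elsewhere.
    A = [[0] * n_users for _ in range(n_users)]
    for k, q in follow_pairs:
        if k != q:  # the main diagonal stays 0
            A[k][q] = 1
            A[q][k] = 1
    # Degree of user k: number of users with a follow relationship to k.
    degrees = [sum(row) for row in A]
    # Degree matrix D = diag(|D_1|, ..., |D_N|).
    D = [[degrees[k] if k == q else 0 for q in range(n_users)]
         for k in range(n_users)]
    # Laplacian matrix L = D - A.
    L = [[D[k][q] - A[k][q] for q in range(n_users)] for k in range(n_users)]
    return A, D, L
```

For a chain of three users (0 follows 1, 1 follows 2), `build_laplacian(3, [(0, 1), (1, 2)])` yields a Laplacian whose rows each sum to zero, which is a quick sanity check on the construction.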
In one implementation of the present invention, determining whether the network data information is a rumor according to the sentiment polarity weights of all classes comprises:
summing the sentiment polarity weights of all classes;
if the sum of the sentiment polarity weights is greater than 0, determining that the network data information is not a rumor; otherwise, determining that the network data information is a rumor.
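The decision rule can be sketched as follows. The text does not spell out how the per-class second sentiment polarities are normalized, so the normalization used here (each class's summed polarity divided by the total absolute polarity over all classes) is purely an assumption for illustration, as are the names `class_weights` and `is_rumor`.

```python
def class_weights(polarities_by_class):
    """Map each class to a normalized sentiment polarity weight.

    polarities_by_class: dict of class id -> list of second sentiment
    polarities of the genuine commenting users in that class.
    Normalization scheme is an assumption of this sketch.
    """
    total = sum(abs(p) for ps in polarities_by_class.values() for p in ps)
    if total == 0:
        return {c: 0.0 for c in polarities_by_class}
    return {c: sum(ps) / total for c, ps in polarities_by_class.items()}


def is_rumor(polarities_by_class):
    """Sum the class weights; a sum greater than 0 means 'not a rumor'."""
    return sum(class_weights(polarities_by_class).values()) <= 0
```

With two classes `{0: [0.5, 0.2], 1: [-0.1]}` the weights sum to a positive value, so the information is judged not to be a rumor; with `{0: [-0.6], 1: [0.2]}` the sum is negative and it is judged a rumor.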
To achieve the above object, an embodiment of the present invention further discloses an apparatus for identifying online rumors, characterized in that the apparatus comprises: a receiving unit, a collecting unit, a first sentiment polarity determining unit, a second sentiment polarity determining unit, a genuine-commenting-user obtaining unit, a clustering unit, a polarity weight obtaining unit, and a rumor determining unit;
the receiving unit is configured to receive network data information input by a user;
the collecting unit is configured to collect, according to the network data information, original network information and the comment information in the original network information that contains emoticons, the comment information comprising: comment text content, commenting-user information, comment posting time, and comment like count;
the first sentiment polarity determining unit is configured to divide the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons, segment the comment text content into words, delete the stop words, and determine a first sentiment polarity of each piece of comment information with reference to a sentiment dictionary;
the second sentiment polarity determining unit is configured to determine a second sentiment polarity of each piece of comment information according to its first sentiment polarity, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence, and the comment posting time interval, the comment influence being obtained from the comment like count;
the genuine-commenting-user obtaining unit is configured to obtain genuine commenting users according to each commenting user's registration time, follower-to-friend ratio, and original-post ratio;
the clustering unit is configured to cluster all genuine commenting users according to the network features of their friend relationships;
the polarity weight obtaining unit is configured to normalize the second sentiment polarity of each class according to the second sentiment polarities corresponding to the genuine commenting users in that class, to obtain a sentiment polarity weight for each class;
the rumor determining unit is configured to determine, according to the sentiment polarity weights of all classes, whether the network data information is a rumor.
In one implementation of the present invention, the first sentiment polarity determining unit comprises: a comment information dividing subunit, a deleting subunit, a sentiment value determining subunit, and a first sentiment polarity determining subunit;
the comment information dividing subunit is configured to divide the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons;
the deleting subunit is configured to segment the comment text content into words and delete the modal particles, conjunctions, and prepositions in the comment information;
the sentiment value determining subunit is configured to determine, according to the sentiment dictionary, the sentiment value $k(w_n)$ of each word $w_n$ that was not deleted, the sentiment value lying in the range [-1, 1];
the first sentiment polarity determining subunit is configured to determine the first sentiment polarity $score(e)$ of the comment information according to the sentiment value $k(w_n)$ and the distance $dis(w_n, e)$ between each word and the subject $e$ of the comment text content, where $dis(w_n, e)$ is the number of characters separating the n-th word $w_n$ from the subject $e$, and the first sentiment polarity $score(e)$ is:
$$score(e) = \sum_{w_n} \frac{k(w_n)}{dis(w_n, e)}.$$
In one implementation of the present invention, the second sentiment polarity determining unit comprises: a second sentiment polarity determining subunit and a formula decomposing subunit;
the second sentiment polarity determining subunit is configured to determine the second sentiment polarity of each piece of comment information by the formula
$$Polar(c) = \sum_{i=1}^{n} \frac{P(\theta_i^{+})}{P(\theta_i^{-})} \cdot \frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})}$$
where $Polar(c)$ is the second sentiment polarity of the comment information, $\theta_i^{+}$ is the positive comment training data corresponding to the i-th first sentiment polarity, and $\theta_i^{-}$ is the negative comment training data corresponding to the i-th first sentiment polarity;
the formula decomposing subunit is configured to decompose the above formula as:
$$\frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} = L \cdot d(t) \cdot \sum_{j=1}^{n} \log \frac{P(w_j \mid \theta_i^{+})}{P(w_j \mid \theta_i^{-})}$$
where $L$ is the comment influence of the comment information, $d(t)$ is its comment posting time interval, $(w_1, w_2, \ldots, w_n)$ is the word vector formed by the words not deleted from its comment text content, and $w_j$ is the j-th such word in the word vector.
In one implementation of the present invention, the clustering unit comprises: an adjacency matrix constructing subunit, a degree matrix constructing subunit, a Laplacian matrix constructing subunit, a solving subunit, a matrix constructing subunit, and a clustering subunit;
the adjacency matrix constructing subunit is configured to construct an adjacency matrix $A = [a_{kq}]_{N \times N}$ according to the friend relationships of the genuine commenting users, where the main diagonal is all 0 and every other entry is 1 if a follow relationship exists between the two commenting users and 0 otherwise;
the degree matrix constructing subunit is configured to construct a degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, where $|D_k|$ is the degree of commenting user k, that is, the number of commenting users that have a follow relationship with commenting user k;
the Laplacian matrix constructing subunit is configured to construct a Laplacian matrix $L$ from the adjacency matrix $A$ and the degree matrix $D$, the Laplacian matrix being $L = D - A$;
the solving subunit is configured to solve the Laplacian matrix $L$ according to a preset number of clusters K, to obtain its K smallest non-zero eigenvalues and the K corresponding eigenvectors;
the matrix constructing subunit is configured to construct an N × K matrix from the K eigenvectors;
the clustering subunit is configured to cluster with the K-means algorithm.
As can be seen, the embodiments of the present invention select as text features the sentiment polarity, the comment time series, and the comment influence, and as user features the commenting user's registration time, follower-to-friend ratio, and original-post ratio. They build a network topology from the friend relationships of the commenting users, classify the users by spectral clustering, and identify whether the network information is a rumor according to the sum of the sentiment polarity weights of the classes. This makes good use of the topology of a complex network, reduces the number of training features, simplifies operation, and improves generality. Of course, a product or method implementing the present invention need not achieve all of the above advantages at the same time.
Brief Description of the Drawings
To describe the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for identifying online rumors provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a clustering result provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for identifying online rumors provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
The embodiments of the present invention disclose a method and apparatus for identifying online rumors. In this solution, original network information and the comment information in it that contains emoticons are collected according to network data information input by a user. The comment information is divided into positive and negative comment training data according to the sentiment polarity of the emoticons, stop words are filtered out, and a first sentiment polarity of each piece of comment information is determined with reference to a sentiment dictionary. A second sentiment polarity of each piece of comment information is then determined according to the parameters obtained, the comment influence, and the comment posting time interval. All genuine commenting users are clustered according to the network features of their friend relationships, the second sentiment polarity of each class is normalized to obtain a sentiment polarity weight for each class, and whether the network data information is a rumor is determined according to the sentiment polarity weights of all classes.
The present invention is described in detail below by way of specific embodiments.
Referring to Fig. 1, which is a schematic flowchart of a method for identifying online rumors provided by an embodiment of the present invention, the method may comprise the following steps:
S101: receiving network data information input by a user;
The network data information may be a comment, or one or more keywords.
S102: collecting, according to the network data information, original network information and the comment information in the original network information that contains emoticons;
Here, the comment information comprises: comment text content, commenting-user information, comment posting time, and comment like count.
Collecting the original network information according to the network data information may comprise:
building a keyword syntax with regular expressions according to the network data information, and collecting a preset number of pieces of original network information from the network;
if the number of pieces of original network information collected does not reach the preset number, collecting reposts of the original network information from the network according to the repost relationship and treating them as original network information, until the number of pieces of original network information reaches the preset number.
In practice, crawler tools and the application programming interfaces of open network platforms may be used to collect the original network information. The collection methods are topic-tag collection, regular-expression collection, and repost collection.
Topic-tag collection marks and crawls the keywords (the lead topic or the feature subject) in the network data information.
Regular-expression collection builds a keyword syntax with regular expressions and extracts the network text that matches this syntax, yielding a large volume of rumor-related data.
This method takes the keywords gathered by topic-tag collection, builds a keyword syntax with regular expressions, and collects a preset number of pieces of original network information from the network. Suppose the preset number is 10000. If regular-expression collection yields only 9000 pieces of original network information, repost collection is needed: reposts of the original network information are collected from the network according to the repost relationship and treated as original network information, until the number of pieces of original network information reaches 10000.
Repost collection crawls the same network information by following the repost relationships on the network platform. By reposting, network users join the discussion of a piece of network information, which forms a multi-layer repost network. For a piece of original network information, the different repost node networks are traversed to crawl a large volume of related data.
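The regular-expression collection with a repost top-up can be sketched as follows. This is a minimal in-memory illustration rather than a crawler: the post dictionaries with `id`, `text`, and `reposts` fields, and the function names `build_keyword_pattern` and `collect_posts`, are hypothetical and stand in for whatever a platform API or crawler would return.

```python
import re
from collections import deque


def build_keyword_pattern(keywords):
    """Compile a regular expression that matches any of the given keywords."""
    return re.compile("|".join(re.escape(k) for k in keywords))


def collect_posts(posts, keywords, target):
    """Collect up to `target` posts: regex matches first, then their reposts."""
    pattern = build_keyword_pattern(keywords)
    # Regular-expression collection: keep posts whose text matches a keyword.
    collected = [p for p in posts if pattern.search(p["text"])][:target]
    seen = {p["id"] for p in collected}
    queue = deque(collected)
    # Repost collection: walk the repost relation breadth-first until the
    # preset number is reached or no reposts remain.
    while queue and len(collected) < target:
        post = queue.popleft()
        for repost in post.get("reposts", []):
            if len(collected) >= target:
                break
            if repost["id"] not in seen:
                seen.add(repost["id"])
                collected.append(repost)
                queue.append(repost)
    return collected
```

Reposts are added even when their own text does not match the keyword pattern, mirroring the text's rule that reposts of original network information are themselves treated as original network information.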
After the original network information is determined, the comment information containing emoticons is collected from it. Emoticons can be classified as positive, negative, or neutral and thus express sentiment polarity, so for rumor identification the sentiment polarity expressed by comment information that contains emoticons can be considered more definite.
S103: dividing the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons, segmenting the comment text content into words, deleting the stop words, and determining a first sentiment polarity of each piece of comment information with reference to a sentiment dictionary;
Here, the comment information is first roughly classified according to the sentiment polarity of the emoticons, dividing it into positive comment training data and negative comment training data.
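This rough classification can be sketched as follows. The emoticon table `EMOTICON_POLARITY`, its bracketed tags, and the rule of summing emoticon polarities when a comment contains several emoticons are all assumptions made for illustration; the text only states that emoticons are positive, negative, or neutral.

```python
# Hypothetical emoticon lexicon: emoticon tag -> polarity (+1, 0, or -1).
EMOTICON_POLARITY = {
    "[smile]": 1,
    "[thumbs-up]": 1,
    "[angry]": -1,
    "[doubt]": -1,
    "[ok]": 0,
}


def split_by_emoticon(comments):
    """Split comment texts into (positive, negative) training data.

    Comments whose emoticons sum to zero (neutral or mixed) are left out
    of both sets, which is an assumption of this sketch.
    """
    positive, negative = [], []
    for text in comments:
        total = sum(pol for emo, pol in EMOTICON_POLARITY.items() if emo in text)
        if total > 0:
            positive.append(text)
        elif total < 0:
            negative.append(text)
    return positive, negative
```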
In addition, segmenting the comment text content, deleting the stop words in the comment information, and determining the first sentiment polarity of each piece of comment information with reference to a sentiment dictionary may comprise the following steps:
A1: segmenting the comment text content into words, and deleting the modal particles, conjunctions, and prepositions in the comment information;
Here, segmenting the comment text content yields the part of speech of each word. Parts of speech include verbs, adjectives, adverbs, nouns, modal particles, conjunctions, and prepositions. Modal particles, conjunctions, and prepositions carry no sentiment polarity and can be deleted to reduce the amount of computation.
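Step A1 can be sketched as a filter over part-of-speech-tagged tokens. The tag names in `DROPPED_POS` are hypothetical; a real word segmenter would supply its own tag set, and the segmentation itself is assumed to have been done already.

```python
# Hypothetical part-of-speech tags for the word classes that are deleted.
DROPPED_POS = {"modal_particle", "conjunction", "preposition"}


def drop_function_words(tagged_tokens):
    """Keep only the words whose part of speech may carry sentiment.

    tagged_tokens: list of (word, part_of_speech) pairs from a segmenter.
    """
    return [word for word, pos in tagged_tokens if pos not in DROPPED_POS]
```

The surviving words are the "words not deleted" that feed both the sentiment-value lookup and the word vector used later.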
A2: determining, according to the sentiment dictionary, the sentiment value $k(w_n)$ of each word $w_n$ that was not deleted;
The sentiment value of a word may be taken from the HowNet Chinese sentiment lexicon. This lexicon covers most of the sentiment words used in Chinese and classifies them as positive, negative, or neutral; sentiment values lie in the range [-1, 1]. A word whose value is close to 1 is positive, and a word whose value is close to -1 is negative.
A3: determining the first sentiment polarity $score(e)$ of the comment information according to the sentiment value $k(w_n)$ and the distance $dis(w_n, e)$ between each word and the subject $e$ of the comment text content, where $dis(w_n, e)$ is the number of characters separating the n-th word $w_n$ from the subject $e$; the first sentiment polarity $score(e)$ can be expressed as:
$$score(e) = \sum_{w_n} \frac{k(w_n)}{dis(w_n, e)}. \tag{1}$$
This effectively reduces the weight of sentiment words that lie far from the subject of the comment text and raises the weight of sentiment words that directly modify the subject.
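Formula (1) can be sketched as follows, reading it as each word's sentiment value divided by its distance to the subject. The function name `first_polarity`, the use of character offsets to measure distance, the clamp of the distance to at least 1, and the treatment of out-of-lexicon words as neutral are assumptions of this sketch.

```python
def first_polarity(words, lexicon, subject_pos):
    """Compute score(e) as the sum of k(w_n) / dis(w_n, e).

    words: list of (word, char_offset) pairs for the words not deleted.
    lexicon: dict mapping a word to its sentiment value in [-1, 1].
    subject_pos: character offset of the subject e in the comment text.
    """
    score = 0.0
    for word, pos in words:
        k = lexicon.get(word, 0.0)  # unknown words count as neutral (assumption)
        dis = max(abs(pos - subject_pos), 1)  # avoid division by zero (assumption)
        score += k / dis
    return score
```

For example, with a lexicon `{"good": 0.8, "bad": -0.5}`, a "good" two characters from the subject contributes 0.4, while a "bad" ten characters away contributes only -0.05, so the nearby word dominates the score.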
S104: determine the second sentiment polarity of each piece of comment information according to the first sentiment polarity of each piece of comment information, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence and the comment posting time interval;
The comment influence is obtained from the number of likes on the comment.
Specifically, this step can comprise the following:
The second sentiment polarity of each piece of comment information is determined by the formula
$$\mathrm{Polar}(c) = \sum_{i=1}^{n} \frac{P(\theta_i^{+})}{P(\theta_i^{-})} \cdot \frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} \qquad (2)$$
where $\mathrm{Polar}(c)$ is the second sentiment polarity of the comment information, $\theta_i^{+}$ is the positive comment training data corresponding to the i-th first sentiment polarity, and $\theta_i^{-}$ is the negative comment training data corresponding to the i-th first sentiment polarity.
The likelihood ratio in formula (2) can be decomposed as:
$$\frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} = L \cdot d(t) \cdot \sum_{j=1}^{n} \log \frac{P(w_j \mid \theta_i^{+})}{P(w_j \mid \theta_i^{-})}, \qquad (3)$$
where $L$ is the comment influence of the comment information, $d(t)$ is the comment posting time interval of the comment information, $(w_1, w_2, \ldots, w_n)$ is the word vector formed by the words not deleted from the comment text content of the comment information, and $w_j$ is the j-th not-deleted word in that word vector.
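The right-hand side of formula (3) can be sketched as below. This is only an illustration under stated assumptions: the word probabilities are estimated from tokenized training comments with add-one (Laplace) smoothing, which the text does not specify, and the names `word_probs`, `likelihood_ratio`, `influence` and `interval_weight` are hypothetical. The prior ratio and the sum over i in formula (2) are left out.

```python
import math
from collections import Counter


def word_probs(comments, vocab):
    """Estimate P(w | theta) from tokenized comments (add-one smoothing is an assumption)."""
    counts = Counter(w for comment in comments for w in comment)
    total = sum(counts.values())
    return {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}


def likelihood_ratio(words, p_pos, p_neg, influence, interval_weight):
    """Right-hand side of formula (3): L * d(t) * sum_j log(P(w_j|theta+) / P(w_j|theta-))."""
    log_sum = sum(math.log(p_pos[w] / p_neg[w]) for w in words if w in p_pos)
    return influence * interval_weight * log_sum


# Toy training data standing in for the emoticon-labelled comments.
positive = [["true", "good"], ["good"]]
negative = [["fake", "bad"], ["fake"]]
vocab = {"true", "good", "fake", "bad"}
p_pos = word_probs(positive, vocab)
p_neg = word_probs(negative, vocab)

ratio_good = likelihood_ratio(["good"], p_pos, p_neg, influence=2.0, interval_weight=0.5)
ratio_fake = likelihood_ratio(["fake"], p_pos, p_neg, influence=2.0, interval_weight=0.5)
```

A comment made of words that are more frequent in the positive training data gets a positive value, and one dominated by negative-data words gets a negative value; the influence and time-interval factors only scale the magnitude.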
S105: obtain genuine commenting users according to the registration time, the follower-to-friend ratio and the originality ratio of the network information of the commenting users;
In practice, comments from genuine commenting users are more discriminative and more useful as reference than comments from spam users. Based on an analysis of the behavior of spam users, the following three user features are chosen to tell the two groups apart: the registration time of the commenting user, the follower-to-friend ratio, and the originality ratio of the network information.
The registration time of a commenting user reflects how long the user has been using the network platform. The registration times of genuine commenting users differ markedly from those of spam users, so distinguishing users by registration time effectively screens for authenticity.
The follower-to-friend ratio of a commenting user is the ratio of the user's follower count to the number of accounts the user follows. Spam users tend to follow a large number of accounts, while genuine commenting users rarely follow spam users. As a result, spam users follow very many accounts but have very few followers, which effectively separates genuine commenting users from spam users.
The originality ratio of the network information of a commenting user reflects how the user typically publishes information. Genuine commenting users have a high originality ratio, whereas the content published by spam users consists almost entirely of forwarded content.
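The three features above can be combined into a simple filter, sketched here. The text names the features but gives no thresholds or combination rule, so the cut-off values, the AND rule, and the names `Commenter` and `is_genuine` are all hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class Commenter:
    account_age_days: int   # time since registration
    followers: int          # number of followers
    following: int          # number of accounts the user follows
    original_posts: int     # posts written by the user
    total_posts: int        # original plus forwarded posts


def is_genuine(user, min_age_days=30, min_follow_ratio=0.1, min_original_ratio=0.3):
    """Return True if the user passes all three (hypothetical) thresholds."""
    follow_ratio = user.followers / max(user.following, 1)
    original_ratio = user.original_posts / max(user.total_posts, 1)
    return (user.account_age_days >= min_age_days
            and follow_ratio >= min_follow_ratio
            and original_ratio >= min_original_ratio)


long_time_user = Commenter(400, 120, 150, 80, 100)
likely_spammer = Commenter(3, 2, 2000, 1, 200)
```

Here `long_time_user` passes every check, while `likely_spammer` fails all three: a new account, almost no followers relative to accounts followed, and almost no original posts.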
S106: cluster all the genuine commenting users according to the network features of the friend relationships of the genuine commenting users;
Specifically, this step can comprise the following:
B1: according to the friend relationships of the genuine commenting users, build the adjacency matrix $A = [a_{kq}]_{N \times N}$;
The main diagonal is all 0. For every other entry: if a follow relationship exists between the two commenting users, the entry is 1; otherwise it is 0.
In practice, the commenting users can be regarded as nodes in a network structure, and the follow relationships between commenting users as the edges connecting the nodes. The friend relationship network is defined as an unweighted network in which every edge has weight 1: if commenting-user node k follows user node q, then $a_{kq} = 1$, which represents the edge between the two nodes.
B2: build the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$;
Here $|D_k|$ denotes the degree of commenting user k, that is, the number of commenting users that have a follow relationship with commenting user k.
B3: build the Laplacian matrix $L$ from the adjacency matrix $A = [a_{kq}]_{N \times N}$ and the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$;
The Laplacian matrix can be expressed as $L = D - A$.
B4: according to the preset number of clusters K, solve the Laplacian matrix $L$ to obtain its K smallest non-zero eigenvalues and the K corresponding eigenvectors;
B5: build an N × K matrix from the K eigenvectors;
B6: perform clustering with the K-means algorithm.
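Steps B1 to B3, together with a deliberately simplified two-cluster version of B4 to B6, can be sketched in plain Python as follows. Two things are swapped in for the sake of a short, dependency-free example and are not the method described above: the eigenvector of the smallest non-zero eigenvalue is approximated by power iteration on a shifted Laplacian instead of a general eigen-solver, and the nodes are split by the sign of that vector instead of running K-means on K eigenvectors. The sketch also assumes a connected, undirected graph with at least two nodes; all function names are hypothetical.

```python
def laplacian(n, edges):
    """Steps B1-B3: adjacency matrix A, degrees |D_k|, and L = D - A."""
    A = [[0] * n for _ in range(n)]
    for k, q in edges:
        if k != q:                 # the main diagonal stays 0
            A[k][q] = A[q][k] = 1  # assumption: follow relations treated as undirected
    degree = [sum(row) for row in A]
    return [[(degree[k] if k == q else 0) - A[k][q] for q in range(n)]
            for k in range(n)]


def fiedler_vector(L, iterations=500):
    """Approximate the eigenvector of the smallest non-zero eigenvalue of L."""
    n = len(L)
    # Eigenvalues of L are at most twice the largest degree, so the shifted
    # matrix (shift * I - L) is positive definite and power iteration applies.
    shift = 2 * max(L[k][k] for k in range(n)) + 1
    v = [float(k) for k in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector (eigenvalue 0)
        v = [shift * v[k] - sum(L[k][q] * v[q] for q in range(n)) for k in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v


def bisect(n, edges):
    """Two-cluster split by the sign of the approximate eigenvector."""
    v = fiedler_vector(laplacian(n, edges))
    return [0 if x < 0 else 1 for x in v]


# Two triangles of users joined by a single follow relation (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_L = laplacian(6, toy_edges)
labels = bisect(6, toy_edges)
```

On this toy graph the split separates the two triangles, cutting only the single bridging edge. For K > 2, the procedure in steps B4 to B6 would instead stack K eigenvectors as columns and cluster the rows with K-means.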
In this case a spectral clustering algorithm is adopted. The algorithm has low complexity and is suitable for scenarios with massive data. The goal of spectral clustering is to find a reasonable way of partitioning the graph so that, among the subgraphs formed by the partition, the weight or similarity of the edges connecting different subgraphs is as low as possible, while the weight or similarity of the edges within the same subgraph is as high as possible.
Specifically, RatioCut is defined to minimize the connection between classes:
$$\rho(\kappa_1, \kappa_2) = \sum_{k \in \kappa_1,\, q \in \kappa_2} \frac{a_{kq}}{|\kappa_1|\,|\kappa_2|}. \qquad (4)$$
Here $\kappa_1 \cup \kappa_2 = \kappa$, and $a_{kq}$ is the weight from node k to node q; if the two nodes are not connected, the weight is zero. Using the most important property of RatioCut:
$$\min_{\kappa_1, \kappa_2} \rho(\kappa_1, \kappa_2) \geq \frac{\lambda_2}{K}, \qquad (5)$$
the original NP (nondeterministic polynomial) problem can be converted into solving for the smallest eigenvalues of the Laplacian matrix $L$.
Once the required number of clusters K is determined, the K eigenvectors are arranged as columns to form an N × K matrix. Each row of this matrix is regarded as a vector in K-dimensional space, and the rows are clustered with the K-means algorithm. In the clustering result, the class assigned to each row is the class of the corresponding node in the original network structure, that is, of the corresponding original data point.
The computational complexity of spectral clustering is much lower than that of traditional clustering algorithms such as K-means applied directly to the data, and the difference is particularly evident on high-dimensional data. For a sparse matrix of very high dimension, eigendecomposition is very efficient, and the result is a set of K-dimensional vectors (K is usually not large); this is the result of reducing the dimension with the eigenvectors of the Laplacian matrix. Running K-means on this low-dimensional data requires very little computation.
S107: according to the second sentiment polarity corresponding to each genuine commenting user in a class, normalize the second sentiment polarity of each class to obtain the sentiment polarity weight of each class;
S108: according to the sentiment polarity weights of all classes, determine whether the network data information is a rumor.
Determining whether the network data information is a rumor according to the sentiment polarity weights of all classes can comprise:
summing the sentiment polarity weights of all classes;
if the sum of the sentiment polarity weights is greater than 0, determining that the network data information is not a rumor; if the sum of the sentiment polarity weights is not greater than 0, determining that the network data information is a rumor.
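Steps S107 and S108 can be sketched as below. The text does not spell out the normalization, so dividing each class's summed polarity by the sum of absolute values (giving a weight in [-1, 1]) is an assumption made for this example, as are the names `class_weight` and `is_rumor`. The decision rule itself, a sum not greater than 0 meaning rumor, follows the text.

```python
def class_weight(polarities):
    """Normalize one class's second sentiment polarities to a weight in [-1, 1].

    The normalization scheme is an assumption; the text only says "normalize".
    """
    total = sum(abs(p) for p in polarities)
    return sum(polarities) / total if total else 0.0


def is_rumor(clusters):
    """S108: the information is a rumor if the summed class weights are not greater than 0."""
    return sum(class_weight(c) for c in clusters) <= 0


# Each inner list holds the second sentiment polarities of one class of users.
mostly_positive = [[0.5, 0.2], [0.3, -0.1]]
mostly_negative = [[-0.4, -0.2], [0.1, -0.3]]
```

With this normalization every class contributes at most one unit to the sum regardless of its size, so a large cluster of like-minded users cannot outweigh several smaller independent clusters.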
Referring to Fig. 2, Fig. 2 is a schematic diagram of the clustering result provided by an embodiment of the present invention. As can be seen from the figure, the central node represents the node that published the original network information to be analyzed, and the nodes connected around it are commenting-user nodes. The shade of a node's color corresponds to the polarity of the comment it belongs to. The figure contains 6 classes of commenting users, where a few isolated commenting-user nodes are grouped into one class. The second sentiment polarities of these 6 classes are normalized separately, and the sentiment polarity weights of the 6 classes are then summed. If the resulting value is greater than 0, the network information can be considered a real event; otherwise it can be considered a rumor.
With the embodiment shown in Fig. 1, the selected features are the text features (sentiment polarity, comment time series and comment influence) and the user features (registration time of the commenting user, follower-to-friend ratio and originality ratio of the network information). A network topology is built from the friend relationships of the commenting users, the users are classified with a spectral clustering algorithm, and whether the network information is a rumor is identified from the sum of the sentiment polarity weights of the classes. This makes good use of the topology of the complex network, reduces the number of training features, simplifies operation, and gives better generality.
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of an apparatus for identifying online rumors provided by an embodiment of the present invention. The apparatus can comprise: a receiving unit 301, a collection unit 302, a first sentiment polarity determination unit 303, a second sentiment polarity determination unit 304, a genuine commenting user acquisition unit 305, a clustering unit 306, a polarity weight acquisition unit 307 and a rumor determination unit 308.
The receiving unit 301 is configured to receive network data information input by a user.
The collection unit 302 is configured to collect, according to the network data information, original network information and the comment information containing emoticons under the original network information. The comment information comprises: comment text content, commenting-user information, comment posting time and the number of likes on the comment.
The first sentiment polarity determination unit 303 is configured to divide the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons, segment the comment text content in the comment information into words, delete the stop words in the comment information, and determine, in combination with a sentiment dictionary, the first sentiment polarity of each piece of comment information.
The second sentiment polarity determination unit 304 is configured to determine the second sentiment polarity of each piece of comment information according to the first sentiment polarity of each piece of comment information, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence and the comment posting time interval. The comment influence is obtained from the number of likes on the comment.
The genuine commenting user acquisition unit 305 is configured to obtain genuine commenting users according to the registration time, the follower-to-friend ratio and the originality ratio of the network information of the commenting users.
The clustering unit 306 is configured to cluster all the genuine commenting users according to the network features of the friend relationships of the genuine commenting users.
The polarity weight acquisition unit 307 is configured to normalize the second sentiment polarity of each class according to the second sentiment polarity corresponding to each genuine commenting user in the class, to obtain the sentiment polarity weight of each class.
The rumor determination unit 308 is configured to determine, according to the sentiment polarity weights of all classes, whether the network data information is a rumor.
In practice, the collection unit 302 can comprise: a first original information collection subunit, a second original information collection subunit and a comment information collection subunit (not shown in Fig. 3).
The first original information collection subunit is configured to build a keyword query syntax from the network data information using regular expressions, and to collect a preset number of pieces of original network information from the network. If the number of pieces of original network information collected does not reach the preset number, the second original information collection subunit is triggered.
The second original information collection subunit is configured to collect, according to forwarding relationships, the forwarded network information of the original network information from the network and to treat it as original network information, until the number of pieces of original network information reaches the preset number.
The comment information collection subunit is configured to collect the comment information containing emoticons under the original network information.
In addition, the first sentiment polarity determination unit 303 can comprise: a comment information division subunit, a deletion subunit, a sentiment value determination subunit and a first sentiment polarity determination subunit (not shown in Fig. 3).
The comment information division subunit is configured to divide the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons.
The deletion subunit is configured to segment the comment text content into words and delete the modal particles, conjunctions and prepositions in the comment information.
The sentiment value determination subunit is configured to determine, according to the sentiment dictionary, the sentiment value $k(w_n)$ of each not-deleted word $w_n$. The sentiment value lies in the range [-1, 1].
The first sentiment polarity determination subunit is configured to determine the first sentiment polarity $\mathrm{score}(e)$ of the comment information according to the sentiment value $k(w_n)$ and the distance $\mathrm{dis}(w_n, e)$ between each word and the subject $e$ of the comment text content. The distance $\mathrm{dis}(w_n, e)$ is the number of characters between the n-th word $w_n$ and the subject $e$ of the comment text content, and the first sentiment polarity $\mathrm{score}(e)$ can be expressed as:
$$\mathrm{score}(e) = \sum_{w_n} \frac{k(w_n)}{\mathrm{dis}(w_n, e)}.$$
In practice, the second sentiment polarity determination unit 304 can comprise: a second sentiment polarity determination subunit and a formula decomposition subunit (not shown in Fig. 3).
The second sentiment polarity determination subunit is configured to determine the second sentiment polarity of each piece of comment information by the formula
$$\mathrm{Polar}(c) = \sum_{i=1}^{n} \frac{P(\theta_i^{+})}{P(\theta_i^{-})} \cdot \frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})}$$
where $\mathrm{Polar}(c)$ is the second sentiment polarity of the comment information, $\theta_i^{+}$ is the positive comment training data corresponding to the i-th first sentiment polarity, and $\theta_i^{-}$ is the negative comment training data corresponding to the i-th first sentiment polarity.
The formula decomposition subunit is configured to decompose the likelihood ratio in the above formula as:
$$\frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} = L \cdot d(t) \cdot \sum_{j=1}^{n} \log \frac{P(w_j \mid \theta_i^{+})}{P(w_j \mid \theta_i^{-})}.$$
Here $L$ is the comment influence of the comment information, $d(t)$ is the comment posting time interval of the comment information, $(w_1, w_2, \ldots, w_n)$ is the word vector formed by the words not deleted from the comment text content of the comment information, and $w_j$ is the j-th not-deleted word in that word vector.
In addition, the clustering unit 306 can comprise: an adjacency matrix construction subunit, a degree matrix construction subunit, a Laplacian matrix construction subunit, a solving subunit, a matrix construction subunit and a clustering subunit (not shown in Fig. 3).
The adjacency matrix construction subunit is configured to build the adjacency matrix $A = [a_{kq}]_{N \times N}$ according to the friend relationships of the genuine users. The main diagonal is all 0; for every other entry, if a follow relationship exists between the two users, the entry is 1, otherwise it is 0.
The degree matrix construction subunit is configured to build the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, where $|D_k|$ denotes the degree of user k, that is, the number of users that have a follow relationship with user k.
The Laplacian matrix construction subunit is configured to build the Laplacian matrix $L$ from the adjacency matrix $A = [a_{kq}]_{N \times N}$ and the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$. The Laplacian matrix can be expressed as $L = D - A$.
The solving subunit is configured to solve the Laplacian matrix $L$ according to the preset number of clusters K, to obtain its K smallest non-zero eigenvalues and the K corresponding eigenvectors.
The matrix construction subunit is configured to build an N × K matrix from the K eigenvectors.
The clustering subunit is configured to perform clustering with the K-means algorithm.
In this case, the rumor determination unit 308 can comprise: a polarity weight summation subunit and a rumor determination subunit.
The polarity weight summation subunit is configured to sum the sentiment polarity weights of all classes.
The rumor determination subunit is configured to determine that the network data information is not a rumor if the sum of the sentiment polarity weights is greater than 0, and otherwise to determine that the network data information is a rumor.
With the embodiment shown in Fig. 3, the selected features are the text features (sentiment polarity, comment time series and comment influence) and the user features (registration time of the commenting user, follower-to-friend ratio and originality ratio of the network information). A network topology is built from the friend relationships of the commenting users, the users are classified with a spectral clustering algorithm, and whether the network information is a rumor is identified from the sum of the sentiment polarity weights of the classes. This makes good use of the topology of the complex network, reduces the number of training features, simplifies operation, and gives better generality.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises the element.
The embodiments in this specification are described in a related manner; for the parts that are the same or similar across embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

1. A method for identifying online rumors, characterized in that the method comprises the steps of:
receiving network data information input by a user;
collecting, according to the network data information, original network information and the comment information containing emoticons under the original network information, the comment information comprising: comment text content, commenting-user information, comment posting time and the number of likes on the comment;
dividing the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons, segmenting the comment text content in the comment information into words, deleting the stop words in the comment information, and determining, in combination with a sentiment dictionary, the first sentiment polarity of each piece of comment information;
determining the second sentiment polarity of each piece of comment information according to the first sentiment polarity of each piece of comment information, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence and the comment posting time interval, the comment influence being obtained from the number of likes on the comment;
obtaining genuine commenting users according to the registration time, the follower-to-friend ratio and the originality ratio of the network information of the commenting users;
clustering all the genuine commenting users according to the network features of the friend relationships of the genuine commenting users;
normalizing the second sentiment polarity of each class according to the second sentiment polarity corresponding to each genuine commenting user in the class, to obtain the sentiment polarity weight of each class;
determining, according to the sentiment polarity weights of all classes, whether the network data information is a rumor.
2. The method according to claim 1, characterized in that collecting the original network information according to the network data information comprises:
building a keyword query syntax from the network data information using regular expressions, and collecting a preset number of pieces of the original network information from the network;
if the number of pieces of the original network information collected does not reach the preset number, collecting, according to forwarding relationships, the forwarded network information of the original network information from the network and treating it as the original network information, until the number of pieces of the original network information reaches the preset number.
3. The method according to claim 1, characterized in that segmenting the comment text content in the comment information into words, deleting the stop words in the comment information, and determining, in combination with a sentiment dictionary, the first sentiment polarity of each piece of comment information comprises:
segmenting the comment text content into words, and deleting the modal particles, conjunctions and prepositions in the comment information;
determining, according to the sentiment dictionary, the sentiment value $k(w_n)$ of each not-deleted word $w_n$, the sentiment value lying in the range [-1, 1];
determining the first sentiment polarity $\mathrm{score}(e)$ of the comment information according to the sentiment value $k(w_n)$ and the distance $\mathrm{dis}(w_n, e)$ between each word and the subject $e$ of the comment text content, the distance $\mathrm{dis}(w_n, e)$ being the number of characters between the n-th word $w_n$ and the subject $e$ of the comment text content, and the first sentiment polarity $\mathrm{score}(e)$ being:
$$\mathrm{score}(e) = \sum_{w_n} \frac{k(w_n)}{\mathrm{dis}(w_n, e)}.$$
4. The method according to claim 1 or 3, characterized in that determining the second sentiment polarity of each piece of comment information according to the first sentiment polarity of each piece of comment information, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence and the comment posting time interval comprises:
determining the second sentiment polarity of each piece of comment information by the formula
$$\mathrm{Polar}(c) = \sum_{i=1}^{n} \frac{P(\theta_i^{+})}{P(\theta_i^{-})} \cdot \frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})}$$
where $\mathrm{Polar}(c)$ is the second sentiment polarity of the comment information, $\theta_i^{+}$ is the positive comment training data corresponding to the i-th first sentiment polarity, and $\theta_i^{-}$ is the negative comment training data corresponding to the i-th first sentiment polarity;
the likelihood ratio in the above formula being decomposed as:
$$\frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} = L \cdot d(t) \cdot \sum_{j=1}^{n} \log \frac{P(w_j \mid \theta_i^{+})}{P(w_j \mid \theta_i^{-})},$$
where $L$ is the comment influence of the comment information, $d(t)$ is the comment posting time interval of the comment information, $(w_1, w_2, \ldots, w_n)$ is the word vector formed by the words not deleted from the comment text content of the comment information, and $w_j$ is the j-th not-deleted word in that word vector.
5. The method according to claim 1, characterized in that clustering all the genuine commenting users according to the network features of the friend relationships of the genuine commenting users comprises:
building the adjacency matrix $A = [a_{kq}]_{N \times N}$ according to the friend relationships of the genuine commenting users, where the main diagonal is all 0 and, for every other entry, the entry is 1 if a follow relationship exists between the two commenting users and 0 otherwise;
building the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, where $|D_k|$ denotes the degree of commenting user k, the degree of commenting user k being the number of commenting users that have a follow relationship with commenting user k;
building the Laplacian matrix $L$ from the adjacency matrix $A = [a_{kq}]_{N \times N}$ and the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, the Laplacian matrix being $L = D - A$;
solving the Laplacian matrix $L$ according to the preset number of clusters K, to obtain its K smallest non-zero eigenvalues and the K corresponding eigenvectors;
building an N × K matrix from the K eigenvectors;
performing clustering with the K-means algorithm.
6. The method according to claim 1 or 5, characterized in that determining, according to the sentiment polarity weights of all classes, whether the network data information is a rumor comprises:
summing the sentiment polarity weights of all classes;
if the sum of the sentiment polarity weights is greater than 0, determining that the network data information is not a rumor; otherwise determining that the network data information is a rumor.
7. An apparatus for identifying online rumors, characterized in that the apparatus comprises: a receiving unit, a collection unit, a first sentiment polarity determination unit, a second sentiment polarity determination unit, a genuine commenting user acquisition unit, a clustering unit, a polarity weight acquisition unit and a rumor determination unit;
the receiving unit being configured to receive network data information input by a user;
the collection unit being configured to collect, according to the network data information, original network information and the comment information containing emoticons under the original network information, the comment information comprising: comment text content, commenting-user information, comment posting time and the number of likes on the comment;
the first sentiment polarity determination unit being configured to divide the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons, segment the comment text content in the comment information into words, delete the stop words in the comment information, and determine, in combination with a sentiment dictionary, the first sentiment polarity of each piece of comment information;
the second sentiment polarity determination unit being configured to determine the second sentiment polarity of each piece of comment information according to the first sentiment polarity of each piece of comment information, the positive comment training data, the negative comment training data, the word vector formed by the words not deleted from the comment text content, the comment influence and the comment posting time interval, the comment influence being obtained from the number of likes on the comment;
the genuine commenting user acquisition unit being configured to obtain genuine commenting users according to the registration time, the follower-to-friend ratio and the originality ratio of the network information of the commenting users;
the clustering unit being configured to cluster all the genuine commenting users according to the network features of the friend relationships of the genuine commenting users;
the polarity weight acquisition unit being configured to normalize the second sentiment polarity of each class according to the second sentiment polarity corresponding to each genuine commenting user in the class, to obtain the sentiment polarity weight of each class;
the rumor determination unit being configured to determine, according to the sentiment polarity weights of all classes, whether the network data information is a rumor.
8. The apparatus according to claim 7, characterized in that the first sentiment polarity determination unit comprises: a comment information division subunit, a deletion subunit, a sentiment value determination subunit and a first sentiment polarity determination subunit;
the comment information division subunit being configured to divide the comment information into positive comment training data and negative comment training data according to the sentiment polarity of the emoticons;
the deletion subunit being configured to segment the comment text content into words and delete the modal particles, conjunctions and prepositions in the comment information;
the sentiment value determination subunit being configured to determine, according to the sentiment dictionary, the sentiment value $k(w_n)$ of each not-deleted word $w_n$, the sentiment value lying in the range [-1, 1];
the first sentiment polarity determination subunit being configured to determine the first sentiment polarity $\mathrm{score}(e)$ of the comment information according to the sentiment value $k(w_n)$ and the distance $\mathrm{dis}(w_n, e)$ between each word and the subject $e$ of the comment text content, the distance $\mathrm{dis}(w_n, e)$ being the number of characters between the n-th word $w_n$ and the subject $e$ of the comment text content, and the first sentiment polarity $\mathrm{score}(e)$ being:
$$\mathrm{score}(e) = \sum_{w_n} \frac{k(w_n)}{\mathrm{dis}(w_n, e)}.$$
9. The apparatus according to claim 7 or 8, characterized in that the second sentiment polarity determination unit comprises: a second sentiment polarity determination subunit and a formula decomposition subunit;
the second sentiment polarity determination subunit being configured to determine the second sentiment polarity of each piece of comment information by the formula
$$\mathrm{Polar}(c) = \sum_{i=1}^{n} \frac{P(\theta_i^{+})}{P(\theta_i^{-})} \cdot \frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})}$$
where $\mathrm{Polar}(c)$ is the second sentiment polarity of the comment information, $\theta_i^{+}$ is the positive comment training data corresponding to the i-th first sentiment polarity, and $\theta_i^{-}$ is the negative comment training data corresponding to the i-th first sentiment polarity;
the formula decomposition subunit being configured to decompose the likelihood ratio in the above formula as:
$$\frac{P(c \mid \theta_i^{+})}{P(c \mid \theta_i^{-})} = L \cdot d(t) \cdot \sum_{j=1}^{n} \log \frac{P(w_j \mid \theta_i^{+})}{P(w_j \mid \theta_i^{-})},$$
where $L$ is the comment influence of the comment information, $d(t)$ is the comment posting time interval of the comment information, $(w_1, w_2, \ldots, w_n)$ is the word vector formed by the words not deleted from the comment text content of the comment information, and $w_j$ is the j-th not-deleted word in that word vector.
10. The apparatus according to claim 7, characterized in that the clustering unit comprises: an adjacency matrix construction subunit, a degree matrix construction subunit, a Laplacian matrix construction subunit, a solving subunit, a matrix construction subunit and a clustering subunit;
The adjacency matrix construction subunit is configured to construct, according to the friend relationships of the real comment users, an adjacency matrix $A = [a_{kq}]_{N \times N}$, wherein the main diagonal entries are all 0 and every other entry $a_{kq}$ is 1 if a follow relationship exists between comment users k and q, and 0 otherwise;
The degree matrix construction subunit is configured to construct a degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, wherein $|D_k|$ denotes the degree of comment user k, namely the number of comment users that have a follow relationship with comment user k;
The Laplacian matrix construction subunit is configured to construct a Laplacian matrix $L$ from the adjacency matrix $A = [a_{kq}]_{N \times N}$ and the degree matrix $D = \mathrm{diag}(|D_1|, |D_2|, \ldots, |D_N|)$, the Laplacian matrix being $L = D - A$;
The solving subunit is configured to solve the Laplacian matrix $L$ according to a preset cluster number K, obtaining the K smallest non-zero eigenvalues of the Laplacian matrix $L$ and the K corresponding eigenvectors;
The matrix construction subunit is configured to construct an $N \times K$ matrix from the K eigenvectors;
The clustering subunit is configured to perform clustering using the K-means algorithm.
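The sequence of subunits in claim 10 corresponds to a spectral-clustering pipeline, which can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claim does not name an eigen-solver or a K-means initialisation, so a cyclic Jacobi rotation solver and seeded Lloyd iterations are used here as stand-ins, and applying K-means to the rows of the N × K matrix is the usual spectral-clustering reading rather than something the claim states. All function names and the numeric tolerances are hypothetical.

```python
import math
import random


def laplacian(n, follows):
    """Build L = D - A from undirected follow pairs (k, q) between users 0..n-1."""
    a = [[0.0] * n for _ in range(n)]
    for k, q in follows:
        if k != q:  # main diagonal stays 0
            a[k][q] = a[q][k] = 1.0
    degree = [sum(row) for row in a]
    return [[(degree[i] if i == j else 0.0) - a[i][j] for j in range(n)]
            for i in range(n)]


def jacobi_eigh(matrix, sweeps=100, tol=1e-20):
    """Eigen-decompose a symmetric matrix with cyclic Jacobi rotations.
    Returns (eigenvalues, V); column i of V pairs with eigenvalue i."""
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):  # column rotation: A <- A J and V <- V J
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # row rotation: A <- J^T A
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(n)], v


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations with seeded random initial centres."""
    centers = [p[:] for p in random.Random(seed).sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((x - y) ** 2
                                                  for x, y in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def spectral_clusters(n, follows, k):
    """Return (K smallest non-zero eigenvalues, N x K matrix, cluster labels)."""
    values, vectors = jacobi_eigh(laplacian(n, follows))
    order = sorted((i for i in range(n) if values[i] > 1e-9),
                   key=lambda i: values[i])[:k]
    embedding = [[vectors[row][i] for i in order] for row in range(n)]
    return [values[i] for i in order], embedding, kmeans(embedding, k)


# Hypothetical follow graph: two triangles of users joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
values, embedding, labels = spectral_clusters(6, edges, 2)
print(values, labels)
```

In this toy graph the eigenvector of the smallest non-zero eigenvalue takes one sign on users 0 to 2 and the opposite sign on users 3 to 5, which is the structure the K-means step can exploit. The resulting labels depend on the initial centres, so the seed is fixed only to make the sketch repeatable.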

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510750244.6A CN105354305A (en) 2015-11-05 2015-11-05 Online-rumor identification method and apparatus

Publications (1)

Publication Number Publication Date
CN105354305A true CN105354305A (en) 2016-02-24

Family

ID=55330277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510750244.6A Pending CN105354305A (en) 2015-11-05 2015-11-05 Online-rumor identification method and apparatus

Country Status (1)

Country Link
CN (1) CN105354305A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731770A (en) * 2015-03-23 2015-06-24 中国科学技术大学苏州研究院 Chinese microblog emotion analysis method based on rules and statistical model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
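YEKANG YANG et al.: "Exploiting the topology property of social network for rumor detection", 2015 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING *
ZHANG JIN et al.: "Method for identifying microblog hype accounts based on feature analysis" (基于特征分析的微博炒作账户识别方法), Computer Engineering (计算机工程) *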

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126700A (en) * 2016-07-01 2016-11-16 复旦大学 A kind of analysis method of microblogging gossip propagation
CN106126700B (en) * 2016-07-01 2020-05-12 复旦大学 Analysis method for propagation of microblog rumors
CN107741939A (en) * 2016-10-31 2018-02-27 腾讯科技(深圳)有限公司 A kind of recognition methods of info web and device
CN107741939B (en) * 2016-10-31 2020-05-12 腾讯科技(深圳)有限公司 Webpage information identification method and device
CN106570162A (en) * 2016-11-04 2017-04-19 北京百度网讯科技有限公司 Canard identification method and device based on artificial intelligence
CN106570162B (en) * 2016-11-04 2020-07-28 北京百度网讯科技有限公司 Artificial intelligence-based rumor recognition method and device
CN107180077A (en) * 2017-04-18 2017-09-19 北京交通大学 A kind of social networks rumour detection method based on deep learning
CN108563686A (en) * 2018-03-14 2018-09-21 中国科学院自动化研究所 Social networks rumour recognition methods based on hybrid neural networks and system
CN108614855A (en) * 2018-03-19 2018-10-02 众安信息技术服务有限公司 A kind of rumour recognition methods
CN108681532A (en) * 2018-04-08 2018-10-19 天津大学 A kind of sentiment analysis method towards Chinese microblogging
CN108681532B (en) * 2018-04-08 2022-03-25 天津大学 Sentiment analysis method for Chinese microblog
CN108804608A (en) * 2018-05-30 2018-11-13 武汉烽火普天信息技术有限公司 A kind of microblogging rumour position detection method based on level attention
CN108804608B (en) * 2018-05-30 2021-08-27 武汉烽火普天信息技术有限公司 Microblog rumor position detection method based on level attention
CN109299261A (en) * 2018-09-30 2019-02-01 北京字节跳动网络技术有限公司 Analyze method, apparatus, storage medium and the electronic equipment of rumour data
CN109670542A (en) * 2018-12-11 2019-04-23 田刚 A kind of false comment detection method based on comment external information
CN110084373A (en) * 2019-04-22 2019-08-02 腾讯科技(深圳)有限公司 Information processing method, device, computer readable storage medium and computer equipment
CN110084373B (en) * 2019-04-22 2021-08-24 腾讯科技(深圳)有限公司 Information processing method, information processing device, computer-readable storage medium and computer equipment
CN110866398A (en) * 2020-01-07 2020-03-06 腾讯科技(深圳)有限公司 Comment text processing method and device, storage medium and computer equipment
CN112231562A (en) * 2020-10-15 2021-01-15 北京工商大学 Network rumor identification method and system
CN112231562B (en) * 2020-10-15 2023-07-14 北京工商大学 Network rumor recognition method and system
CN113688202A (en) * 2021-07-30 2021-11-23 杭州网易云音乐科技有限公司 Emotion polarity analysis method and device, electronic equipment and computer storage medium
CN113688202B (en) * 2021-07-30 2024-03-15 杭州网易云音乐科技有限公司 Emotion polarity analysis method and device, electronic equipment and computer storage medium

Similar Documents

Publication Publication Date Title
CN105354305A (en) Online-rumor identification method and apparatus
CN104239539B (en) A kind of micro-blog information filter method merged based on much information
CN105740228B (en) A kind of internet public feelings analysis method and system
CN103678670B (en) Micro-blog hot word and hot topic mining system and method
CN104820629B (en) A kind of intelligent public sentiment accident emergent treatment system and method
CN104778209B (en) A kind of opining mining method for millions scale news analysis
CN106202211B (en) Integrated microblog rumor identification method based on microblog types
CN103927398B (en) The microblogging excavated based on maximum frequent itemsets propagandizes colony's discovery method
CN102591854B (en) For advertisement filtering system and the filter method thereof of text feature
CN103678613B (en) Method and device for calculating influence data
CN103927297B (en) Evidence theory based Chinese microblog credibility evaluation method
CN101609459A (en) A kind of extraction system of affective characteristic words
CN108228853A (en) A kind of microblogging rumour recognition methods and system
CN109829089A (en) Social network user method for detecting abnormality and system based on association map
CN106156372B (en) A kind of classification method and device of internet site
CN102332025A (en) Intelligent vertical search method and system
CN106372072A (en) Location-based recognition method for user relations in mobile social network
CN109446404A (en) A kind of the feeling polarities analysis method and device of network public-opinion
CN101281521A (en) Method and system for filtering sensitive web page based on multiple classifier amalgamation
CN110457404A (en) Social media account-classification method based on complex heterogeneous network
CN102929873A (en) Method and device for extracting searching value terms based on context search
CN103544188A (en) Method and device for pushing mobile internet content based on user preference
CN103631862B (en) Event characteristic evolution excavation method and system based on microblogs
CN107239512A (en) The microblogging comment spam recognition methods of relational network figure is commented in a kind of combination
CN106168953A (en) Blog article towards weak relation social networks recommends method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160224