CN106844765B - Significant information detection method and device based on convolutional neural network - Google Patents

Publication number
CN106844765B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710098500.7A
Other languages
Chinese (zh)
Other versions
CN106844765A (en
Inventor
谭铁牛
王亮
吴书
余峰
刘强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201710098500.7A
Publication of CN106844765A
Application granted
Publication of CN106844765B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Abstract

The invention discloses a significant information detection method and device based on a convolutional neural network. The method comprises the following steps: for the crawled data set, determining the time distribution of each stage of event development and determining time nodes; for each event, dividing all event information corresponding to the event sample into a plurality of parts according to the determined time nodes, splicing the text contents of the event information in each time phase into a paragraph, and generating a paragraph data set; learning an unsupervised expression vector for each paragraph in the paragraph data set according to a distributed expression learning algorithm for paragraphs; and, for an event, inputting the unsupervised expression vector of each paragraph into a deep convolutional neural network model, obtaining the expression from the bottom layer to the high layer of each stage of the event by multilayer convolution operations, extracting the key features of each stage of the event through a k-max pooling operation, and finally classifying the input information through a fully connected layer.

Description

Significant information detection method and device based on convolutional neural network
Technical Field
The invention relates to the technical field of computer processing, in particular to a significant information detection method and device based on a convolutional neural network.
Background
Social media networks have developed rapidly, are widely used, and are easy to access. On one hand, they bring great convenience to users' lives and enrich their experience; on the other hand, the spread of unreal information on social media networks can disturb people's normal lives, mislead public opinion, and endanger public safety and social stability. Identifying unreal information among the vast amount of social media content is therefore increasingly important and urgent, and early detection of unreal information is increasingly practical and effective.
Existing methods for identifying unreal information mainly rely on feature engineering. The hand-crafted features may be derived from user credibility, microblog-level content, event-level content, and aggregation from the microblog level to the event level. They can be roughly divided into categories such as conflicting viewpoints in microblogs, the change in the number of microblog forwards over time, microblog replies, and signal microblogs that express skepticism. However, methods based on hand-crafted features generalize poorly to new situations, and social media is dynamic, variable, and complex, so new situations keep arising for which hand-crafted features are difficult to design.
The CSID model can detect significant information from user-generated content and its posting time on social media, including but not limited to the identification and early detection of rumor information. Generally, a microblog event includes thousands of related microblogs, and the popularity of events varies greatly. First, the time characteristics of unreal information and real information are counted on a data set, where the time characteristics refer to the power-law distribution of microblogs over time. The model then groups the microblogs related to an event according to the corresponding time characteristics. For the different groups of microblog texts, a representation learning method is introduced, and the expression of each group of microblog texts is learned with a paragraph distributed expression learning algorithm (paragraph vector). Finally, a deep convolutional neural network models the high-order interactions among the groups of microblogs, learning from low-order features to high-order features, learning latent representations of each stage of the event, and extracting the important factors. Based on these latent representations, the model's final representation of an event contributes to both the detection and the early detection of unreal information.
Disclosure of Invention
In view of the technical defects of traditional methods based on hand-crafted features, the invention provides a significant information detection method and device based on a convolutional neural network in order to better detect the reliability of information.
According to an aspect of the present invention, a significant information detection method based on a convolutional neural network is provided, which includes the following steps:
step S1, for the crawled data set comprising a plurality of event information, determining the time distribution of each stage of the development of each event corresponding to the event information in the data set, and determining the time nodes corresponding to each time period; the event information in the data set comprises unreal event information and real event information, the event information corresponds to a plurality of events, and each event corresponds to a plurality of unreal event information or a plurality of real event information;
step S2, for each event, dividing all event information corresponding to the event sample into a plurality of parts according to the determined time node, splicing the text content of the event information in each time phase into a paragraph, and generating a paragraph data set;
step S3, learning an unsupervised expression vector of each paragraph in the paragraph data set according to a distribution expression algorithm of the paragraphs;
step S4, for an event, inputting the unsupervised expression vector of each paragraph into a deep convolutional neural network model, obtaining the expression from the bottom layer to the top layer of each stage of the event by utilizing multilayer convolutional operation, fully extracting the key features of each stage of the event through the k-max pooling operation, and finally classifying the input information through a full connection layer; after the deep convolutional neural network model is trained in the step S4 by using all events, a significant information detection model is obtained;
and step S5, classifying and detecting the information to be detected by using the significant information detection model.
Step S1 includes:
determining time stamps of all event information corresponding to the events;
for each event, sequencing the timestamps according to the time sequence;
equally dividing the time corresponding to the earliest time stamp and the latest time stamp into a plurality of time periods;
and determining time nodes corresponding to the multiple time periods.
Step S2 includes:
for each event, dividing the event information corresponding to the event into different time periods according to the time periods determined in step S1 and the time stamp of the event information corresponding to the event;
and splicing the text contents of the event information in each time period into a paragraph to obtain a plurality of paragraphs corresponding to the time periods to form a paragraph data set.
Step S3 includes:
regarding the paragraph data set as a corpus, and learning an unsupervised expression vector of each paragraph by using unsupervised distributed expression learning algorithms for words and paragraphs at the word level and the paragraph level, respectively.
Step S4 includes:
for each event, splicing the unsupervised vector expressions of all paragraphs into a matrix;
and inputting the matrix into a deep convolution neural network model for training.
According to a second aspect of the present invention, there is provided a significant information detection apparatus based on a convolutional neural network, comprising the following modules:
the time node determining module is configured to determine time distribution of each stage of development of each event corresponding to the event information in the data set and determine time nodes corresponding to each time period for the crawled data set comprising a plurality of event information; the event information in the data set comprises unreal event information and real event information, the event information corresponds to a plurality of events, and each event corresponds to a plurality of unreal event information or a plurality of real event information;
the paragraph generation module is configured to divide all event information corresponding to the event sample into a plurality of parts according to the determined time node for each event, and splice the text content of the event information in each time phase into a paragraph to generate a paragraph data set;
a vector generation module configured to learn an unsupervised expression vector for each paragraph in the paragraph dataset according to a distributed expression algorithm for the paragraph;
the model training module is configured to input the unsupervised expression vector of each paragraph into a deep convolutional neural network model for an event, obtain the expression from the bottom layer to the high layer of each stage of the event by utilizing multilayer convolutional operation, fully extract the key features of each stage of the event through the k-max pooling operation, and finally classify the input information through a full connection layer; after the deep convolutional neural network model is trained by using all events, a significant information detection model is obtained;
and the detection module is configured to utilize the significant information detection model to classify and detect the information to be detected.
The time node determination module comprises:
the first determining submodule is configured to determine time stamps of all event information corresponding to the events;
a sorting submodule configured to sort the timestamps in chronological order for each event;
an equally dividing module configured to equally divide the time corresponding to the earliest time stamp and the latest time stamp into a plurality of time periods;
a second determining submodule configured to determine time nodes corresponding to the plurality of time periods.
The paragraph generation module includes:
the time period dividing submodule is configured to divide the event information corresponding to each event into different time periods according to the multiple determined time periods and the timestamp of the event information corresponding to the event;
and the paragraph generation submodule is configured to splice the text content of the event information in each time period into a paragraph, obtain a plurality of paragraphs corresponding to the plurality of time periods, and form a paragraph data set.
The vector generation module comprises:
and the unsupervised learning sub-module is configured to regard the paragraph data set as a corpus, and learn to obtain an unsupervised expression vector of each paragraph by using a distributed expression learning algorithm of unsupervised words and paragraphs on a word level and a paragraph level respectively.
The model training module comprises:
a splicing submodule configured to splice the unsupervised vector expressions of all paragraphs into one matrix for each event;
a training submodule configured to input the matrix to a deep convolutional neural network model for training.
Drawings
FIG. 1 is a schematic diagram of a significant information detection model CSID based on a convolutional neural network in the present invention;
FIG. 2 is a power-law distribution diagram of unreal information and real information on a microblog data set in the invention;
FIG. 3 is a schematic diagram illustrating comparison of early detection effects on a microblog data set by different comparison methods.
Detailed Description
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
The invention discloses a training method for a Convolutional neural network-based Significant Information Detection model (CSID), which can be used for unreal information identification and early detection tasks in a social media network. The model can learn a holistic representation of events that contain numbers of microblogs differing by orders of magnitude. The CSID also models each stage of event development according to the time characteristics of that development, builds semantic expressions from the bottom layer to the high layer, selects key features through a flexible k-max pooling operation, and passes them to a final fully connected layer that learns to classify social media network information. In the model, all microblogs contained in each event are divided into a plurality of groups according to the time phase of event development; an expression is learned for each group of microblogs and sent to a deep convolutional neural network, which finally outputs the probability that the event belongs to unreal information.
CSID model establishment: 1) for a large crawled data set of unreal information and real information, studying the time distribution of each stage of event development over the data set as a whole, and determining the time nodes corresponding to each time period; 2) for each event sample, dividing all microblogs into a plurality of parts according to the determined time nodes, and splicing the text contents of the microblogs in each time phase into a paragraph; 3) converting the whole data set into paragraphs, and learning an unsupervised expression vector of each paragraph according to a distributed expression learning algorithm for paragraphs; 4) for an event sample, inputting the expression vector of each stage into a deep convolutional neural network model, obtaining the expression from the bottom layer to the high layer of each stage of the event by multilayer convolution operations, fully extracting the key features of each stage of the event through a flexible k-max pooling operation, and finally classifying the input information through a fully connected layer; 5) on the test set, visualizing the convolution kernels and the gradients obtained by gradient back-propagation, and analyzing in depth the significant information learned by the model. In experiments on the Sina microblog data set and the Twitter data set, the model predicts more accurately than other existing models.
As shown in fig. 1, an embodiment of the present invention provides a significant information detection method based on a convolutional neural network, where the method includes:
receiving information to be classified;
inputting the information to be classified into a pre-trained significant information detection model;
and the significant information detection model outputs the result that the information to be classified is real information or unreal information.
In an embodiment, the significant information detection model is first trained on existing data. Once the trained model is obtained, newly appearing information is preprocessed in the same way and input into the model, which outputs a probability value representing the probability that the input information is unreal; the larger the output value, the more likely the input information is unreal.
The following describes in detail various problems involved in the technical solutions of the present invention with reference to the accompanying drawings. It should be noted that the described embodiments are only intended to facilitate understanding and do not have any limiting effect on the invention.
To better explain the role of the CSID model in unreal information detection and to verify the effect of the present invention, an experiment on the Sina microblog (Sina Weibo) data set is taken as an example. The experimental data set was divided into a 60% training set, a 30% test set, and a 10% validation set.
The experiment used four evaluation indices: Accuracy, Precision, Recall, and F1-score. Precision and Recall are calculated separately for unreal information and for real information to show the model's ability to detect each kind. The larger the values of the four evaluation indices, the better the model's performance in detecting unreal information.
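The 60% / 30% / 10% division described above can be sketched as follows. This is only an illustration: the function name `split_events`, the random shuffling, and the fixed seed are assumptions, since the source does not say how events were assigned to each subset.

```python
import random


def split_events(events, seed=0):
    """Split events into train / test / validation subsets (60% / 30% / 10%)."""
    shuffled = list(events)
    # Assumption: events are assigned at random; the source does not specify this.
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding on the subset sizes.
    n_train = len(shuffled) * 6 // 10
    n_test = len(shuffled) * 3 // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # the remainder, about 10%
    return train, test, validation
```

Splitting at the event level, rather than at the level of individual microblogs, keeps all posts of one event in the same subset.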
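The four indices can be computed per class as in the minimal sketch below. The function name `classification_metrics` is hypothetical, and the labels "M" (unreal information) and "T" (real information) follow the abbreviations used in the caption of Table 2.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy plus precision, recall and F1 for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    # Guard against empty denominators when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Calling the function once with `positive="M"` and once with `positive="T"` gives the per-class Precision and Recall that the text describes.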
As shown in fig. 1, the specific experimental steps on the Sina microblog data set are as follows:
In step S1, the crawled data set of unreal information and real information contains a plurality of events $E = \{e_i\}$. (An event may be described by multiple pieces of information; for example, a significant event may be described by many microblogs or news items.) The time distribution of the stages of event development is studied over the data set as a whole. First, the timestamps (i.e., the times at which the information was posted) of all microblogs corresponding to all events are collected (microblogs are used as the example here; other information may also be collected) and arranged in chronological order. The time span between the earliest and latest timestamps is then equally divided into M periods (for example, M = 20), and the time nodes corresponding to each period are determined accordingly:

$$T_i = [t_{i-1}, t_i), \quad i = 1, 2, \dots, 20,$$

where $T_i$ denotes the i-th time period, and $t_{i-1}$ and $t_i$ denote the start and end timestamps of the i-th time phase, respectively. In addition, each time node is normalized so that its timestamp lies in the interval [0, 1].
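A minimal sketch of step S1 follows, under stated assumptions. The function names are hypothetical. The text supports two readings of "equal division": periods of equal width, or periods holding equal numbers of microblogs (the later summary says each interval contains the same number of microblogs). The sketch exposes both through the `equal_count` flag rather than choosing one.

```python
def normalize_timestamps(timestamps):
    """Map the raw timestamps of one event onto the interval [0, 1]."""
    lo, hi = min(timestamps), max(timestamps)
    if hi == lo:
        return [0.0 for _ in timestamps]
    return [(t - lo) / (hi - lo) for t in timestamps]


def time_nodes(normalized_ts, num_periods=20, equal_count=False):
    """Return the num_periods + 1 nodes t_0 <= ... <= t_M on the [0, 1] scale."""
    if not equal_count:
        # Assumption 1: periods of equal width over the normalized span.
        return [i / num_periods for i in range(num_periods + 1)]
    # Assumption 2: periods that each hold about the same number of posts,
    # taken as evenly spaced ranks of the sorted timestamps.
    ordered = sorted(normalized_ts)
    last = len(ordered) - 1
    return [ordered[(i * last) // num_periods] for i in range(num_periods + 1)]
```

Period $T_i$ is then the half-open interval between consecutive nodes, `[nodes[i-1], nodes[i])`.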
Step S2: for each event sample containing multiple microblogs, the timestamps $t_j$ of all microblogs in the event are first normalized to the interval [0, 1]. All microblogs are then divided into a plurality of parts according to the time nodes determined in S1, and the text contents of the microblogs in each time phase are spliced into a paragraph; that is, the contents of all microblogs whose timestamps fall in the i-th time phase $T_i$ are spliced into one paragraph.
Step S3: all the microblog text paragraphs obtained in step S2 are regarded as a corpus. Using the unsupervised distributed expression learning algorithms word2vec and para2vec at the word level and the paragraph level, respectively, an expression vector is learned for each word and each paragraph, forming the matrices W and D. Each column of W is the expression vector of a word, and each column of D is the expression vector of a paragraph.
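The grouping in step S2 can be sketched as follows. The function name `build_paragraphs`, joining texts with a single space, and placing a post that falls exactly on the last node into the final period are all assumptions made for illustration.

```python
from bisect import bisect_right


def build_paragraphs(posts, nodes):
    """Group posts by time phase and splice each group into one paragraph.

    posts: list of (normalized_timestamp, text) pairs for one event.
    nodes: ascending time nodes t_0 .. t_M on the same [0, 1] scale.
    Returns M strings; entry i - 1 joins the posts with t in [t_{i-1}, t_i).
    """
    num_periods = len(nodes) - 1
    buckets = [[] for _ in range(num_periods)]
    for t, text in sorted(posts, key=lambda post: post[0]):
        # bisect_right returns i such that nodes[i - 1] <= t < nodes[i].
        idx = bisect_right(nodes, t) - 1
        # Assumption: a post exactly at the last node joins the final period.
        idx = min(max(idx, 0), num_periods - 1)
        buckets[idx].append(text)
    return [" ".join(bucket) for bucket in buckets]
```

A period with no posts yields an empty paragraph, so every event still produces the same number of paragraphs.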
Here N denotes the number of words in a paragraph and the context window width is 2k, i.e., the k words before and the k words after the current word are selected as its context. The algorithm maximizes the joint conditional probability p of all words in the paragraph, mainly using the context words and the memory information stored in the paragraph expression vector; p is computed with a softmax. The output response $y_i$ of the i-th word is obtained from

$$y = b + U^{\top} h(p_j, w_{n-k}, \dots, w_{n+k}; D, W),$$

where $p_j$ is the vector expression of a paragraph, $w_n$ is the vector expression of the n-th word in the paragraph, and $p_j$ and $w_n$ are columns of the matrices D and W, respectively. b and U are the softmax parameters, and h is an averaging or concatenation operation.
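The output response and the softmax described above can be sketched for one prediction step as follows. This is a minimal forward pass only, not a training procedure: the function name is hypothetical, h is taken to be the averaging variant, and U is assumed to have one column per vocabulary word.

```python
import math


def paragraph_vector_probabilities(paragraph_vec, context_vecs, U, b):
    """Softmax over y = b + U^T h for one target position.

    paragraph_vec: the paragraph vector p_j (a column of D).
    context_vecs: the context word vectors w_{n-k} .. w_{n+k} (columns of W).
    U: dim x vocab weight matrix as a list of rows (assumed layout).
    b: bias vector with one entry per vocabulary word.
    """
    vectors = [paragraph_vec] + list(context_vecs)
    dim = len(paragraph_vec)
    # h: here the average of the paragraph vector and the context vectors.
    h = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    y = [b[j] + sum(U[d][j] * h[d] for d in range(dim)) for j in range(len(b))]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(y)
    exps = [math.exp(value - peak) for value in y]
    total = sum(exps)
    return [e / total for e in exps]
```

Training would adjust D, W, U and b so that the probability assigned to the true word at each position is as large as possible.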
Step S4: for an event sample, the paragraph expression vectors $p_j$ obtained in S3 are spliced into a matrix $P \in \mathbb{R}^{d \times n}$, where d and n are the dimensions of P, and the matrix is input into the deep convolutional neural network model. Multilayer convolution operations yield the expression from the bottom layer to the high layer of each stage of the event. The output of a layer in the deep neural network model is called a feature map; the output of a low layer is a low-order feature map and the output of a high layer is a high-order feature map. One element of a feature map is obtained by the following convolution operation:

$$f[i] = \tanh\left(\langle P[:, i:i+\omega-1], C \rangle_F\right),$$

where $P[:, i:i+\omega-1]$ denotes columns i through (i + ω − 1) of the matrix P, ω is the width of the convolution kernel, and C is the convolution weight matrix. The trace after matrix multiplication can be written as the Frobenius inner product:

$$\langle X, Y \rangle_F = \operatorname{Tr}(X Y^{\top}).$$

The key features of each stage of the event are then extracted by a flexible k-max pooling operation, i.e., the k largest elements of a feature map are extracted as a new feature map. Finally, the input information is classified by a fully connected layer.
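The convolution and pooling of step S4 can be sketched for a single kernel as follows. The function names are hypothetical, matrices are plain lists of rows, and keeping the selected values in their original order is an assumption: the text only says that the k largest elements form the new feature map.

```python
import math


def conv_feature_map(P, C):
    """Compute f[i] = tanh(<P[:, i:i+w-1], C>_F) for every valid position i.

    P: d x n input matrix (one column per time phase), as a list of rows.
    C: d x w convolution weight matrix, as a list of rows.
    """
    d, n, w = len(P), len(P[0]), len(C[0])
    feature_map = []
    for i in range(n - w + 1):
        # Frobenius inner product: sum of element-wise products over the window.
        inner = sum(P[r][i + c] * C[r][c] for r in range(d) for c in range(w))
        feature_map.append(math.tanh(inner))
    return feature_map


def k_max_pooling(feature_map, k):
    """Keep the k largest values of a feature map.

    Assumption: the kept values stay in their original (temporal) order.
    """
    k = min(k, len(feature_map))
    ranked = sorted(range(len(feature_map)),
                    key=lambda i: feature_map[i], reverse=True)
    return [feature_map[i] for i in sorted(ranked[:k])]
```

Because the pooled output has length k whatever the input width, the fully connected layer that follows can have a fixed input size.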
The deep convolutional neural network model can be initialized randomly and then trained continuously in S4 to update the parameters of the model.
Step S5: on the test set, the gradient matrix of the label with respect to the input is obtained by gradient back-propagation, and a saliency analysis of the input matrix identifies the microblog content that plays a significant role in the corresponding input. In addition, the convolution kernels of the first convolution layer are visualized and analyzed in depth to obtain the distribution characteristics of the microblog content within an event.
In the data set shown in FIG. 2, for both real information and unreal information, the power-law distribution of the number of microblogs over time is reflected in how the proportion of microblogs in the different stages changes over time. FIG. 3 shows the experimental results of early detection of unreal information.
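As a worked illustration of the gradient used in this kind of saliency analysis, consider only a single feature-map element of the form $f[i] = \tanh(\langle P[:, i:i+\omega-1], C \rangle_F)$. This is a one-layer simplification, not the full back-propagation through the trained network. Since $\frac{d}{dx}\tanh(x) = 1 - \tanh^2(x)$ and the Frobenius inner product is linear in its first argument, the chain rule gives:

```latex
\frac{\partial f[i]}{\partial P[:, i:i+\omega-1]}
  = \left(1 - f[i]^{2}\right) C
```

Input entries under large kernel weights therefore receive large gradients unless the tanh is saturated ($|f[i]|$ close to 1), in which case the gradient shrinks toward zero.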
Table 1 shows the attribute statistics in the Twitter and Weibo datasets
Table 2: identification of unrealistic information (M: unrealistic information, T: real information)
Table 2 shows the experimental results of the proposed CSID method compared to other methods available
The proposed model reveals the power-law distribution over time of the number of microblogs contained in events on social media networks. According to this rule, the time nodes of each stage of an event are determined by equal division over the data set as a whole, and each event is then segmented by these time stages, which ensures that every time interval contains the same number of microblogs and that all events share one time scale. The model can thus learn a more faithful expression of events and fully mine and exploit the temporal regularity of the information distribution. Multilayer convolution operations yield the expression from the bottom layer to the high layer of each stage of the event, so that the high-order interactions and deep semantic expression of each stage can be fully modeled; the flexible k-max pooling operation fully extracts the key features of each stage, making the model better suited to dynamic, complex social media scenes.
The invention addresses the task of significant information detection based on a convolutional neural network and is aimed in particular at real social media settings, which are characterized by large volumes of information, pronounced differences in time span, complex semantic scenes, and dynamic, variable user behavior, so that significant information detection can achieve a more accurate result.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A significant information detection method based on a convolutional neural network comprises the following steps:
step S1, for the crawled data set comprising a plurality of event information, determining the time distribution of each stage of the development of each event corresponding to the event information in the data set, and determining the time nodes corresponding to each time period; the event information in the data set comprises unreal event information and real event information, the data set corresponds to a plurality of events, and each event corresponds to at least one unreal event information and/or at least one real event information;
step S2, for each event, dividing all event information corresponding to the event sample into a plurality of parts according to the determined time node, splicing the text content of the event information in each time phase into a paragraph, and generating a paragraph data set;
step S3, learning an unsupervised expression vector of each paragraph in the paragraph data set according to a distributed expression learning algorithm of the paragraph;
step S4, for an event, inputting the unsupervised expression vector of each paragraph into a deep convolutional neural network model, obtaining the expression from the bottom layer to the top layer of each stage of the event by utilizing multilayer convolutional operation, extracting the key features of each stage of the event through the k-max pooling operation, and finally classifying the input information through a full connection layer; after the deep convolutional neural network model is trained in the step S4 by using all events, a significant information detection model is obtained;
and step S5, classifying and detecting the information to be detected by using the significant information detection model.
2. The method according to claim 1, wherein step S1 includes:
determining time stamps of all event information corresponding to the events;
for each event, sequencing the timestamps according to the time sequence;
equally dividing the time corresponding to the earliest time stamp and the latest time stamp into a plurality of time periods;
and determining time nodes corresponding to the multiple time periods.
3. The method according to claim 1, wherein step S2 includes:
for each event, dividing the event information corresponding to the event into different time periods according to the time periods determined in step S1 and the time stamp of the event information corresponding to the event;
and splicing the text contents of the event information in each time period into a paragraph to obtain a plurality of paragraphs corresponding to the time periods to form a paragraph data set.
4. The method according to claim 1, wherein step S3 includes:
regarding the paragraph data set as a corpus, and learning an unsupervised expression vector of each paragraph by using unsupervised distributed expression learning algorithms for words and paragraphs at the word level and the paragraph level, respectively.
5. The method according to claim 1, wherein step S4 includes:
for each event, splicing the unsupervised vector expressions of all paragraphs into a matrix;
and inputting the matrix into a deep convolution neural network model for training.
6. A significant information detection device based on a convolutional neural network comprises the following modules:
the time node determining module is configured to determine time distribution of each stage of development of each event corresponding to the event information in the data set and determine time nodes corresponding to each time period for the crawled data set comprising a plurality of event information; the event information in the data set comprises unreal event information and real event information, the event information corresponds to a plurality of events, and each event corresponds to a plurality of unreal event information or a plurality of real event information;
the paragraph generation module is configured to divide all event information corresponding to the event sample into a plurality of parts according to the determined time node for each event, and splice the text content of the event information in each time phase into a paragraph to generate a paragraph data set;
a vector generation module configured to learn an unsupervised expression vector for each paragraph in the paragraph dataset according to a distributed expression learning algorithm for the paragraph;
the model training module is configured to, for each event, input the unsupervised expression vector of each paragraph into a deep convolutional neural network model, obtain low-level to high-level expressions of each stage of the event by means of multilayer convolution operations, fully extract the key features of each stage of the event through a k-max pooling operation, and finally classify the input information through a fully connected layer; after the deep convolutional neural network model has been trained on all events, a significant information detection model is obtained;
and the detection module is configured to utilize the significant information detection model to classify and detect the information to be detected.
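The k-max pooling operation named in the model training module can be illustrated as follows. This sketch assumes the common definition of k-max pooling, in which the k largest activations of a feature sequence are kept in their original order; the function name `k_max_pooling` is hypothetical and the patent's exact variant may differ.

```python
def k_max_pooling(values, k):
    """Keep the k largest values of a sequence, preserving their order.

    Preserving order retains the relative position of the strongest
    features across the stages of an event. If k is at least the length
    of the sequence, the sequence is returned unchanged.
    """
    if k >= len(values):
        return list(values)
    # Indices of the k largest values (ties resolved by earliest position).
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    # Re-sort the chosen indices so the output follows the original order.
    return [values[i] for i in sorted(top)]
```

For example, `k_max_pooling([0.1, 0.9, 0.3, 0.7, 0.2], 3)` keeps 0.9, 0.3 and 0.7 in that order, so a fixed-size feature vector can be passed to the fully connected layer regardless of how many time periods the convolution covered.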
7. The apparatus of claim 6, wherein the time node determination module comprises:
a first determining submodule configured to determine the timestamps of all event information corresponding to each event;
a sorting submodule configured to sort the timestamps in chronological order for each event;
an equal division submodule configured to divide the time span between the earliest timestamp and the latest timestamp equally into a plurality of time periods;
a second determining submodule configured to determine time nodes corresponding to the plurality of time periods.
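The four submodules of this claim amount to sorting the timestamps of an event and cutting the span between the earliest and the latest one into equal parts. The sketch below shows one way to do this; the function name `time_nodes` and the use of `datetime` objects are assumptions, and the number of periods is a free parameter that the claim does not fix.

```python
from datetime import datetime


def time_nodes(timestamps, num_periods):
    """Return num_periods + 1 equally spaced boundaries for one event.

    timestamps: datetime objects of all event information of the event.
    The first node is the earliest timestamp, the last node is the latest.
    """
    if num_periods < 1:
        raise ValueError("num_periods must be at least 1")
    if not timestamps:
        raise ValueError("an event needs at least one timestamp")
    ordered = sorted(timestamps)              # chronological order
    start, end = ordered[0], ordered[-1]
    step = (end - start) / num_periods        # timedelta per period
    # Append `end` explicitly so the last node is exactly the latest timestamp.
    return [start + i * step for i in range(num_periods)] + [end]
```

With timestamps spanning 2017-02-22 00:00 to 2017-02-23 00:00 and four periods, the nodes fall every six hours.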
8. The apparatus of claim 6, wherein the paragraph generation module comprises:
a time period dividing submodule configured to, for each event, divide the event information corresponding to the event into different time periods according to the plurality of determined time periods and the timestamps of the event information corresponding to the event;
a paragraph generation submodule configured to splice the text content of the event information in each time period into a paragraph, obtain a plurality of paragraphs corresponding to the plurality of time periods, and form a paragraph data set.
9. The apparatus of claim 6, wherein the vector generation module comprises:
an unsupervised learning submodule configured to treat the paragraph data set as a corpus, and to learn an unsupervised expression vector for each paragraph by applying an unsupervised distributed expression learning algorithm for words and paragraphs at the word level and the paragraph level, respectively.
10. The apparatus of claim 6, wherein the model training module comprises:
a splicing submodule configured to splice the unsupervised vector expressions of all paragraphs into one matrix for each event;
a training submodule configured to input the matrix to a deep convolutional neural network model for training.
CN201710098500.7A 2017-02-22 2017-02-22 Significant information detection method and device based on convolutional neural network Active CN106844765B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710098500.7A CN106844765B (en) 2017-02-22 2017-02-22 Significant information detection method and device based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710098500.7A CN106844765B (en) 2017-02-22 2017-02-22 Significant information detection method and device based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN106844765A CN106844765A (en) 2017-06-13
CN106844765B CN106844765B (en) 2019-12-20

Family

ID=59134861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710098500.7A Active CN106844765B (en) 2017-02-22 2017-02-22 Significant information detection method and device based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN106844765B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3047353C (en) 2017-01-06 2023-05-23 The Toronto-Dominion Bank Learning document embeddings with convolutional neural network architectures
CN107688870B (en) * 2017-08-15 2020-07-24 中国科学院软件研究所 Text stream input-based hierarchical factor visualization analysis method and device for deep neural network
CN108491480B (en) * 2018-03-12 2021-05-11 义语智能科技(上海)有限公司 Rumor detection method and apparatus

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102411611A (en) * 2011-10-15 2012-04-11 西安交通大学 Instant interactive text oriented event identifying and tracking method
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolutional neural network
CN104915448A (en) * 2015-06-30 2015-09-16 中国科学院自动化研究所 Substance and paragraph linking method based on hierarchical convolutional network
CN105608200A (en) * 2015-12-28 2016-05-25 湖南蚁坊软件有限公司 Network public opinion tendency prediction analysis method
CN105740349A (en) * 2016-01-25 2016-07-06 重庆邮电大学 Sentiment classification method capable of combining Doc2vec with convolutional neural network
CN105975497A (en) * 2016-04-27 2016-09-28 清华大学 Automatic microblog topic recommendation method and device
CN106202211A (en) * 2016-06-27 2016-12-07 四川大学 A kind of integrated microblogging rumour recognition methods based on microblogging type

Also Published As

Publication number Publication date
CN106844765A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN108073568A (en) keyword extracting method and device
CN105022754B (en) Object classification method and device based on social network
CN110909164A (en) Text enhancement semantic classification method and system based on convolutional neural network
CN110750640A (en) Text data classification method and device based on neural network model and storage medium
CN107688870B (en) Text stream input-based hierarchical factor visualization analysis method and device for deep neural network
WO2018112696A1 (en) Content pushing method and content pushing system
CN106844765B (en) Significant information detection method and device based on convolutional neural network
CN112948575A (en) Text data processing method, text data processing device and computer-readable storage medium
Ciaburro et al. Python Machine Learning Cookbook: Over 100 recipes to progress from smart data analytics to deep learning using real-world datasets
CN110543474A (en) User behavior analysis method and device based on full-buried point and potential factor model
CN113312480A (en) Scientific and technological thesis level multi-label classification method and device based on graph convolution network
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
Hadioui et al. Machine learning based on big data extraction of massive educational knowledge
Li et al. Incorporating trust relation with PMF to enhance social network recommendation performance
CN109977131A (en) A kind of house type matching system
CN107291686B (en) Method and system for identifying emotion identification
CN111723302A (en) Recommendation method based on collaborative dual-model deep representation learning
CN112433952B (en) Method, system, device and medium for testing fairness of deep neural network model
Hamad et al. Sentiment analysis of restaurant reviews in social media using naïve bayes
Rezaeenour et al. Developing a new hybrid intelligent approach for prediction online news popularity
CN112035607B (en) Method, device and storage medium for matching citation difference based on MG-LSTM
Angdresey et al. Classification and Sentiment Analysis on Tweets of the Ministry of Health Republic of Indonesia
Salmam et al. Prediction in OLAP data cubes
CN114443846A (en) Classification method and device based on multi-level text abnormal composition and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant